ProxmoxVE/misc/error_handler.func
CanbiZ a826769899
Three-tier defaults system | security improvements | error_handler | improved logging | improved container creation | improved architecture (#9540)
* Refactor Core

Refactored misc/alpine-install.func to improve error handling, network checks, and MOTD setup. Added misc/alpine-tools.func and misc/error_handler.func for modular tool installation and error management. Enhanced misc/api.func with detailed exit code explanations and telemetry functions. Updated misc/core.func for better initialization, validation, and execution helpers. Removed misc/create_lxc.sh as part of cleanup.

* Delete config-file.func

* Update install.func

* Refactor stop_all_services function and variable names

Refactor service stopping logic and improve variable handling

* Refactor installation script and update copyright

Updated copyright information and adjusted package installation commands. Enhanced IPv6 disabling logic and improved container customization process.

* Update install.func

* Update license comment format in install.func

* Refactor IPv6 handling and enhance MOTD and SSH

Refactor IPv6 handling and update OS function. Enhance MOTD with additional details and configure SSH settings.

* big core refactor

* Enhance IPv6 configuration menu options

Updated IPv6 Address Management menu options for clarity and added a new option for fully disabling IPv6.

* Update default Node.js version to 24 LTS

* Update misc/alpine-tools.func

Co-authored-by: Michel Roegl-Brunner <73236783+michelroegl-brunner@users.noreply.github.com>

* indention

* remove debugf and duplicate codes

* Update whiptail backtitles and error codes

Removed '[dev]' from whiptail --backtitle strings for consistency. Refactored custom exit codes in build.func and error_handler.func: updated Proxmox error codes, shifted MySQL/MariaDB codes to 260-263, and removed unused MongoDB code. Updated error descriptions to match new codes.

* comments

* Refactor error handling and clean up debug comments

Standardized bash variable checks, removed unnecessary debug and commented code, and clarified error handling logic in container build and setup scripts. These changes improve code readability and maintainability without altering functional behavior.

* Update build.func

* feat: Improve LXC network checks and LINSTOR storage handling

Enhanced LXC container network setup to check for both IPv4 and IPv6 addresses, added connectivity (ping) tests, and provided troubleshooting tips on failure. Updated storage validation to support LINSTOR, including cluster connectivity checks and special handling for LINSTOR template storage.

---------

Co-authored-by: Michel Roegl-Brunner <73236783+michelroegl-brunner@users.noreply.github.com>
2025-12-04 07:52:18 +01:00

318 lines
13 KiB
Bash

#!/usr/bin/env bash
# ------------------------------------------------------------------------------
# ERROR HANDLER - ERROR & SIGNAL MANAGEMENT
# ------------------------------------------------------------------------------
# Copyright (c) 2021-2025 community-scripts ORG
# Author: MickLesk (CanbiZ)
# License: MIT | https://github.com/community-scripts/ProxmoxVE/raw/main/LICENSE
# ------------------------------------------------------------------------------
#
# Provides comprehensive error handling and signal management for all scripts.
# Includes:
# - Exit code explanations (shell, package managers, databases, custom codes)
# - Error handler with detailed logging
# - Signal handlers (EXIT, INT, TERM)
# - Initialization function for trap setup
#
# Usage:
# source <(curl -fsSL .../error_handler.func)
# catch_errors
#
# ------------------------------------------------------------------------------
# ==============================================================================
# SECTION 1: EXIT CODE EXPLANATIONS
# ==============================================================================
# ------------------------------------------------------------------------------
# explain_exit_code()
#
# - Maps numeric exit codes to human-readable error descriptions
# - Supports:
# * Generic/Shell errors (1, 2, 126, 127, 128, 130, 137, 139, 143)
# * Package manager errors (APT, DPKG: 100, 101, 255)
# * Node.js/npm errors (243-249, 254)
# * Python/pip/uv errors (210-212)
# * PostgreSQL errors (231-234)
# * MySQL/MariaDB errors (260-263)
# * MongoDB errors (251-253)
# * Proxmox custom codes (200-209, 213-223, 225)
# - Returns description string for given exit code
# ------------------------------------------------------------------------------
explain_exit_code() {
local code="$1"
case "$code" in
# --- Generic / Shell ---
1) echo "General error / Operation not permitted" ;;
2) echo "Misuse of shell builtins (e.g. syntax error)" ;;
126) echo "Command invoked cannot execute (permission problem?)" ;;
127) echo "Command not found" ;;
128) echo "Invalid argument to exit" ;;
130) echo "Terminated by Ctrl+C (SIGINT)" ;;
137) echo "Killed (SIGKILL / Out of memory?)" ;;
139) echo "Segmentation fault (core dumped)" ;;
143) echo "Terminated (SIGTERM)" ;;
# --- Package manager / APT / DPKG ---
100) echo "APT: Package manager error (broken packages / dependency problems)" ;;
101) echo "APT: Configuration error (bad sources.list, malformed config)" ;;
255) echo "DPKG: Fatal internal error" ;;
# --- Node.js / npm / pnpm / yarn ---
243) echo "Node.js: Out of memory (JavaScript heap out of memory)" ;;
245) echo "Node.js: Invalid command-line option" ;;
246) echo "Node.js: Internal JavaScript Parse Error" ;;
247) echo "Node.js: Fatal internal error" ;;
248) echo "Node.js: Invalid C++ addon / N-API failure" ;;
249) echo "Node.js: Inspector error" ;;
254) echo "npm/pnpm/yarn: Unknown fatal error" ;;
# --- Python / pip / uv ---
210) echo "Python: Virtualenv / uv environment missing or broken" ;;
211) echo "Python: Dependency resolution failed" ;;
212) echo "Python: Installation aborted (permissions or EXTERNALLY-MANAGED)" ;;
# --- PostgreSQL ---
231) echo "PostgreSQL: Connection failed (server not running / wrong socket)" ;;
232) echo "PostgreSQL: Authentication failed (bad user/password)" ;;
233) echo "PostgreSQL: Database does not exist" ;;
234) echo "PostgreSQL: Fatal error in query / syntax" ;;
# --- MySQL / MariaDB ---
260) echo "MySQL/MariaDB: Connection failed (server not running / wrong socket)" ;;
261) echo "MySQL/MariaDB: Authentication failed (bad user/password)" ;;
262) echo "MySQL/MariaDB: Database does not exist" ;;
263) echo "MySQL/MariaDB: Fatal error in query / syntax" ;;
# --- MongoDB ---
251) echo "MongoDB: Connection failed (server not running)" ;;
252) echo "MongoDB: Authentication failed (bad user/password)" ;;
253) echo "MongoDB: Database not found" ;;
# --- Proxmox Custom Codes ---
200) echo "Custom: Failed to create lock file" ;;
201) echo "Custom: Cluster not quorate" ;;
202) echo "Custom: Timeout waiting for template lock (concurrent download in progress)" ;;
203) echo "Custom: Missing CTID variable" ;;
204) echo "Custom: Missing PCT_OSTYPE variable" ;;
205) echo "Custom: Invalid CTID (<100)" ;;
206) echo "Custom: CTID already in use (check 'pct list' and /etc/pve/lxc/)" ;;
207) echo "Custom: Password contains unescaped special characters (-, /, \\, *, etc.)" ;;
208) echo "Custom: Invalid configuration (DNS/MAC/Network format error)" ;;
209) echo "Custom: Container creation failed (check logs for pct create output)" ;;
213) echo "Custom: LXC stack upgrade/retry failed (outdated pve-container - check https://github.com/community-scripts/ProxmoxVE/discussions/8126)" ;;
214) echo "Custom: Not enough storage space" ;;
215) echo "Custom: Container created but not listed (ghost state - check /etc/pve/lxc/)" ;;
216) echo "Custom: RootFS entry missing in config (incomplete creation)" ;;
217) echo "Custom: Storage does not support rootdir (check storage capabilities)" ;;
218) echo "Custom: Template file corrupted or incomplete download (size <1MB or invalid archive)" ;;
220) echo "Custom: Unable to resolve template path" ;;
221) echo "Custom: Template file exists but not readable (check file permissions)" ;;
222) echo "Custom: Template download failed after 3 attempts (network/storage issue)" ;;
223) echo "Custom: Template not available after download (storage sync issue)" ;;
225) echo "Custom: No template available for OS/Version (check 'pveam available')" ;;
# --- Default ---
*) echo "Unknown error" ;;
esac
}
# ==============================================================================
# SECTION 2: ERROR HANDLERS
# ==============================================================================
# ------------------------------------------------------------------------------
# error_handler()
#
# - Main error handler triggered by ERR trap
# - Arguments: exit_code, command, line_number
# - Behavior:
# * Returns silently if exit_code is 0 (success)
# * Sources explain_exit_code() for detailed error description
# * Displays error message with:
# - Line number where error occurred
# - Exit code with explanation
# - Command that failed
# * Shows last 20 lines of SILENT_LOGFILE if available
# * Copies log to container /root for later inspection
# * Exits with original exit code
# ------------------------------------------------------------------------------
error_handler() {
local exit_code=${1:-$?}
local command=${2:-${BASH_COMMAND:-unknown}}
local line_number=${BASH_LINENO[0]:-unknown}
command="${command//\$STD/}"
if [[ "$exit_code" -eq 0 ]]; then
return 0
fi
local explanation
explanation="$(explain_exit_code "$exit_code")"
printf "\e[?25h"
# Use msg_error if available, fallback to echo
if declare -f msg_error >/dev/null 2>&1; then
msg_error "in line ${line_number}: exit code ${exit_code} (${explanation}): while executing command ${command}"
else
echo -e "\n${RD}[ERROR]${CL} in line ${RD}${line_number}${CL}: exit code ${RD}${exit_code}${CL} (${explanation}): while executing command ${YWB}${command}${CL}\n"
fi
if [[ -n "${DEBUG_LOGFILE:-}" ]]; then
{
echo "------ ERROR ------"
echo "Timestamp : $(date '+%Y-%m-%d %H:%M:%S')"
echo "Exit Code : $exit_code ($explanation)"
echo "Line : $line_number"
echo "Command : $command"
echo "-------------------"
} >>"$DEBUG_LOGFILE"
fi
# Get active log file (BUILD_LOG or INSTALL_LOG)
local active_log=""
if declare -f get_active_logfile >/dev/null 2>&1; then
active_log="$(get_active_logfile)"
elif [[ -n "${SILENT_LOGFILE:-}" ]]; then
active_log="$SILENT_LOGFILE"
fi
if [[ -n "$active_log" && -s "$active_log" ]]; then
echo "--- Last 20 lines of silent log ---"
tail -n 20 "$active_log"
echo "-----------------------------------"
# Detect context: Container (INSTALL_LOG set + /root exists) vs Host (BUILD_LOG)
if [[ -n "${INSTALL_LOG:-}" && -d /root ]]; then
# CONTAINER CONTEXT: Copy log and create flag file for host
local container_log="/root/.install-${SESSION_ID:-error}.log"
cp "$active_log" "$container_log" 2>/dev/null || true
# Create error flag file with exit code for host detection
echo "$exit_code" >"/root/.install-${SESSION_ID:-error}.failed" 2>/dev/null || true
if declare -f msg_custom >/dev/null 2>&1; then
msg_custom "📋" "${YW}" "Log saved to: ${container_log}"
else
echo -e "${YW}Log saved to:${CL} ${BL}${container_log}${CL}"
fi
else
# HOST CONTEXT: Show local log path and offer container cleanup
if declare -f msg_custom >/dev/null 2>&1; then
msg_custom "📋" "${YW}" "Full log: ${active_log}"
else
echo -e "${YW}Full log:${CL} ${BL}${active_log}${CL}"
fi
# Offer to remove container if it exists (build errors after container creation)
if [[ -n "${CTID:-}" ]] && command -v pct &>/dev/null && pct status "$CTID" &>/dev/null; then
echo ""
echo -en "${YW}Remove broken container ${CTID}? (Y/n) [auto-remove in 60s]: ${CL}"
if read -t 60 -r response; then
if [[ -z "$response" || "$response" =~ ^[Yy]$ ]]; then
echo -e "\n${YW}Removing container ${CTID}${CL}"
pct stop "$CTID" &>/dev/null || true
pct destroy "$CTID" &>/dev/null || true
echo -e "${GN}${CL} Container ${CTID} removed"
elif [[ "$response" =~ ^[Nn]$ ]]; then
echo -e "\n${YW}Container ${CTID} kept for debugging${CL}"
fi
else
# Timeout - auto-remove
echo -e "\n${YW}No response - auto-removing container${CL}"
pct stop "$CTID" &>/dev/null || true
pct destroy "$CTID" &>/dev/null || true
echo -e "${GN}${CL} Container ${CTID} removed"
fi
fi
fi
fi
exit "$exit_code"
}
# ==============================================================================
# SECTION 3: SIGNAL HANDLERS
# ==============================================================================
# ------------------------------------------------------------------------------
# on_exit()
#
# - EXIT trap handler
# - Cleans up lock files if lockfile variable is set
# - Exits with captured exit code
# - Always runs on script termination (success or failure)
# ------------------------------------------------------------------------------
on_exit() {
local exit_code=$?
[[ -n "${lockfile:-}" && -e "$lockfile" ]] && rm -f "$lockfile"
exit "$exit_code"
}
# ------------------------------------------------------------------------------
# on_interrupt()
#
# - SIGINT (Ctrl+C) trap handler
# - Displays "Interrupted by user" message
# - Exits with code 130 (128 + SIGINT=2)
# ------------------------------------------------------------------------------
on_interrupt() {
if declare -f msg_error >/dev/null 2>&1; then
msg_error "Interrupted by user (SIGINT)"
else
echo -e "\n${RD}Interrupted by user (SIGINT)${CL}"
fi
exit 130
}
# ------------------------------------------------------------------------------
# on_terminate()
#
# - SIGTERM trap handler
# - Displays "Terminated by signal" message
# - Exits with code 143 (128 + SIGTERM=15)
# - Triggered by external process termination
# ------------------------------------------------------------------------------
on_terminate() {
if declare -f msg_error >/dev/null 2>&1; then
msg_error "Terminated by signal (SIGTERM)"
else
echo -e "\n${RD}Terminated by signal (SIGTERM)${CL}"
fi
exit 143
}
# ==============================================================================
# SECTION 4: INITIALIZATION
# ==============================================================================
# ------------------------------------------------------------------------------
# catch_errors()
#
# - Initializes error handling and signal traps
# - Enables strict error handling:
# * set -Ee: Exit on error, inherit ERR trap in functions
# * set -o pipefail: Pipeline fails if any command fails
# * set -u: (optional) Exit on undefined variable (if STRICT_UNSET=1)
# - Sets up traps:
# * ERR → error_handler
# * EXIT → on_exit
# * INT → on_interrupt
# * TERM → on_terminate
# - Call this function early in every script
# ------------------------------------------------------------------------------
catch_errors() {
set -Ee -o pipefail
if [ "${STRICT_UNSET:-0}" = "1" ]; then
set -u
fi
trap 'error_handler' ERR
trap on_exit EXIT
trap on_interrupt INT
trap on_terminate TERM
}