Recovery Script Usage

This page names the current scripts and where each command runs.

Current Restore Scripts

Script	Run from	Purpose
`scripts/restore/00-preflight-access.sh`	Administration workstation	Validate `ssh pve`, `ssh -p 2242 nas`, NAS path, optional guarded `ssh docker`, and target CT identity
`scripts/restore/01-stage-backup.sh`	Administration workstation	Copy the NAS backup to local `/tmp` with tar over SSH and write `.copy-complete`
`scripts/restore/02-validate-artifacts.sh`	Administration workstation	Validate archive shape, database dump shape, and required service artifacts
`scripts/restore/03-restore-docker-services.sh`	Administration workstation via `ssh pve` and `pct exec`	Copy staged backup into CT `101` and run the inner Docker restore
`scripts/restore/04-validate-services.sh`	Administration workstation via `ssh pve` and `pct exec`	Validate networks, PostgreSQL, restored databases, containers, and Traefik availability
`scripts/recover-docker-services.sh`	Inside target Docker LXC, normally invoked by stage 03	Restore Docker networks, compose files, env files, archives, PostgreSQL dumps, and services

The older root-level helper scripts are still relevant:

Script	Run from	Purpose
`scripts/create-docker-host-lxc-101.sh`	Proxmox host as root	Create or update CT `101 docker-host`; refuses an existing wrong hostname
`scripts/bootstrap-debian13-docker-lxc.sh`	Inside the new Docker LXC as root	Install Docker Engine and create `/opt/docker/compose` and `/opt/docker/volumes`
`scripts/proxmox-host-maintenance.sh`	Proxmox host as root	Optional host maintenance; not part of a restore unless planned

Create the Docker LXC

scripts/create-docker-host-lxc-101.sh must run on the Proxmox host as root. From the administration workstation, the simplest safe pattern is to stream the local script over the documented SSH alias:

ssh pve 'bash -s' < scripts/create-docker-host-lxc-101.sh

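Before streaming the script, run this read-only check to confirm the host identity, the required tools, and the current state of CT 101: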

ssh pve 'hostname; command -v pct; command -v pveam; pct status 101 || true; pct config 101 || true'

If CT 101 already exists, the script accepts it only when its hostname is the expected docker-host. A different hostname causes the script to stop before changing that container.
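The hostname guard can be approximated from the workstation before running the create script. This is a minimal sketch, not the script's actual logic: the function name ct_hostname_matches is hypothetical, and it assumes pct config prints a line of the form "hostname: <name>".

```bash
# Sketch only: ct_hostname_matches is an illustrative helper, not part of the repo.
# Reads `pct config <vmid>` output on stdin and succeeds only when the
# hostname line matches the expected value exactly.
ct_hostname_matches() {
  local expected="$1" actual
  actual=$(awk -F': *' '$1 == "hostname" { print $2; exit }')
  [ "$actual" = "$expected" ]
}
```

Example use: ssh pve 'pct config 101' | ct_hostname_matches docker-host || echo "CT 101 has an unexpected hostname" >&2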

To run with explicit settings, copy the answer file to Proxmox, edit it there, then stream the script:

scp scripts/recovery-answer.env.example pve:/root/recovery-answer.env
ssh pve 'nano /root/recovery-answer.env'
ssh pve 'bash -s -- --answer-file /root/recovery-answer.env' < scripts/create-docker-host-lxc-101.sh

The answer file path in that command is a path on the Proxmox host, not on the workstation.

The create script downloads the Debian LXC template when needed. It queries Proxmox template metadata with pveam available --section system, selects the latest debian-13-standard_*_amd64.tar.zst, checks the configured TEMPLATE_STORAGE, and runs pveam download only if the template is missing.
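The selection step can be sketched as a filter over the pveam output. This is an approximation under stated assumptions: the function name latest_debian13_template is hypothetical, the template name is assumed to be the last whitespace-separated field on each line, and sort -V (GNU coreutils, present on Proxmox) is used for version ordering.

```bash
# Sketch only: picks the highest-versioned Debian 13 standard template name
# from `pveam available --section system` output supplied on stdin.
latest_debian13_template() {
  awk '{ print $NF }' \
    | grep -E '^debian-13-standard_.*_amd64\.tar\.zst$' \
    | sort -V \
    | tail -n 1
}
```

Example use: ssh pve 'pveam available --section system' | latest_debian13_template. An empty result means no matching template was listed, which should be treated as a failure rather than passed on to pveam download.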

The created container is unprivileged:

--unprivileged 1

The script always enables the standard Docker-in-unprivileged-LXC features:

nesting=1,keyctl=1

Those are applied at creation and again with pct set, so rerunning the script keeps the container converged.

Broader LXC relaxations are available only when explicitly requested:

RELAXED_LXC_SECURITY=1

That setting appends these Proxmox LXC config lines if they are missing:

lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: a
lxc.cap.drop:

Leave RELAXED_LXC_SECURITY=0 unless Docker workloads require those broader permissions. The script does not currently set a low-port sysctl such as net.ipv4.ip_unprivileged_port_start. For rootful Docker inside the LXC, published ports such as 80 and 443 are normally handled by Docker running as root inside the container. If a restored workload still cannot bind low ports, record the failure and decide whether RELAXED_LXC_SECURITY=1 is justified.
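The "append only if missing" behavior for the relaxed config lines can be sketched as below. The helper name append_line_if_missing is hypothetical and the create script may implement this differently; the sketch only shows why a rerun does not duplicate lines.

```bash
# Sketch only: append a line to a file unless an identical line already exists.
# grep -x matches whole lines and -F treats the text literally, so a line
# such as "lxc.cap.drop:" is not interpreted as a pattern.
append_line_if_missing() {
  local file="$1" line="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

On the Proxmox host the target would be the container config, for example append_line_if_missing /etc/pve/lxc/101.conf 'lxc.apparmor.profile: unconfined'. Calling it twice with the same line leaves a single copy.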

After CT creation succeeds, install Docker from inside the new LXC with scripts/bootstrap-debian13-docker-lxc.sh. That bootstrap script does not download the LXC image; it installs Docker Engine in an already-created Debian container.

Bootstrap Docker in CT 101

After the create script reports CT 101 is ready for bootstrap, run the Docker bootstrap from the administration workstation through Proxmox:

ssh pve 'pct exec 101 -- bash -s' < scripts/bootstrap-debian13-docker-lxc.sh

If the bootstrap needs values from /root/recovery-answer.env, remember that pct exec runs inside CT 101. A file at /root/recovery-answer.env on the Proxmox host is not automatically visible inside the LXC. Push it into the LXC first:

ssh pve 'pct push 101 /root/recovery-answer.env /root/recovery-answer.env --perms 0600'
ssh pve 'pct exec 101 -- bash -s -- --answer-file /root/recovery-answer.env' < scripts/bootstrap-debian13-docker-lxc.sh

For the current script, the answer file is optional because TIMEZONE defaults to Africa/Accra.

The bootstrap script sets LANG=C.UTF-8 and LC_ALL=C.UTF-8 to avoid locale warnings inherited from the workstation. Earlier runs showed unsupported LC_CTYPE=UTF-8 warnings during pct, Perl, and APT operations; those warnings were noisy but not the cause of the Docker install failure.

During the June 17, 2026 restore, the configured repositories of the Debian 13 template did not provide the software-properties-common package that the bootstrap script previously listed. The script no longer installs that package, because Docker's official repository only needs ca-certificates, curl, gnupg, and the keyring-scoped source file the script writes.

Successful bootstrap output includes:

Docker Engine - Community
Docker Compose version
Docker installation verified

Verify the completed bootstrap with:

ssh pve 'pct exec 101 -- docker version; pct exec 101 -- docker compose version; pct exec 101 -- ls -ld /opt/docker/compose /opt/docker/volumes'

If direct SSH to docker-host asks for a password, do not assume one exists. The create script sets a root password only when PASSWORD_FILE is configured before CT creation. Use pct exec or pct enter through Proxmox for recovery work unless direct SSH access is intentionally configured later.

Template Download and IPv6

The template download may prefer IPv6 if the Proxmox host resolves both A and AAAA records for download.proxmox.com. During the June 17, 2026 restore, IPv6 connections repeatedly timed out. The immediate workaround was to stop the download and force IPv4 for the template:

ssh pve 'wget -4 -c -O /var/lib/vz/template/cache/debian-13-standard_13.1-2_amd64.tar.zst http://download.proxmox.com/images/system/debian-13-standard_13.1-2_amd64.tar.zst'

After the download completes and the file has been verified, rerun the create script. It will find the template in local storage and continue.

To make Proxmox prefer IPv4 generally without fully disabling IPv6, review and then uncomment this line in /etc/gai.conf:

precedence ::ffff:0:0/96  100

For wget only, inet4_only = on in /etc/wgetrc forces IPv4. Prefer the narrowest change that solves the download problem.
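Whether the gai.conf preference is actually active can be checked with a small read-only test. This is a sketch; the function name gai_prefers_ipv4 is hypothetical, and it only confirms that the precedence line is present and uncommented.

```bash
# Sketch only: succeeds when the IPv4-mapped precedence line is present
# and not commented out in the given file (default /etc/gai.conf).
gai_prefers_ipv4() {
  local file="${1:-/etc/gai.conf}"
  grep -Eq '^[[:space:]]*precedence[[:space:]]+::ffff:0:0/96[[:space:]]+100([[:space:]]|$)' "$file"
}
```

Because the block only defines a function, it would need to be run on the Proxmox host itself, for example after copying it there, followed by gai_prefers_ipv4 && echo "IPv4 preferred".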

Current Backup Scripts

Script	Run from	Purpose
`scripts/backup/00-prepare-backup-root.sh`	Proxmox host as root	Create the timestamped backup root on the CIFS backup mount
`scripts/backup/01-capture-proxmox.sh`	Proxmox host as root	Capture host, storage, VM, and LXC definitions
`scripts/backup/02-capture-dns.sh`	Proxmox host as root	Capture DNS reference from CT `107` and optional Teleporter file
`scripts/backup/03-capture-docker-definitions.sh`	Proxmox host as root	Capture Docker compose/env/runtime reference from CT `100`
`scripts/backup/04-export-databases.sh`	Proxmox host as root	Export PostgreSQL, MariaDB, and MongoDB logical backups
`scripts/backup/05-archive-applications.sh`	Proxmox host as root	Archive retained application directories with volatile exclusions
`scripts/backup/06-archive-websites.sh`	Proxmox host as root	Archive website working trees when repository state is not enough
`scripts/backup/07-verify-backup.sh`	Proxmox host as root	Verify archives, dumps, and checksums
`scripts/backup/08-encrypt-and-copy-backup.sh`	Proxmox host as root	Create encrypted archive when `age` is configured and copy to a second destination

Answer Files

Restore:

cp scripts/restore/restore-answer.env.example scripts/restore/restore-answer.env
chmod 600 scripts/restore/restore-answer.env

Backup, on Proxmox:

cp scripts/backup/backup-answer.env.example /root/backup-answer.env
chmod 600 /root/backup-answer.env

The answer files are shell syntax. Do not commit filled copies.
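Because an answer file is sourced as shell code, a wrapper can reasonably refuse files with loose permissions before loading them. The sketch below is an assumption about how such a check could look, not the scripts' actual loader; load_answer_file is a hypothetical name.

```bash
# Sketch only: load an answer file after checking that it exists and is
# readable by the owner alone. Sourcing executes the file as shell code,
# so only trusted files should ever be loaded this way.
load_answer_file() {
  local file="$1" mode
  [ -f "$file" ] || { echo "answer file not found: $file" >&2; return 1; }
  # GNU stat first (Debian, Proxmox), BSD stat as a fallback (macOS workstation).
  mode=$(stat -c '%a' "$file" 2>/dev/null || stat -f '%Lp' "$file") || return 1
  case "$mode" in
    600|400) ;;
    *) echo "refusing $file: mode $mode, expected 600" >&2; return 1 ;;
  esac
  . "$file"
}
```

Example use: load_answer_file scripts/restore/restore-answer.env || exit 1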

Dry Runs

Most wrapper scripts support:

--dry-run

Dry-run mode checks arguments and prints actions that would write data. It does not prove that every remote command will succeed, so use it before a real run, not instead of validation.
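The general shape of a dry-run switch is a wrapper around every command that writes data. This is a minimal sketch of that pattern; the variable DRY_RUN and the function run are illustrative names and may not match the wrapper scripts.

```bash
# Sketch only: when DRY_RUN=1, print the command instead of executing it.
DRY_RUN="${DRY_RUN:-0}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'DRY-RUN: %s\n' "$*"
  else
    "$@"
  fi
}
```

With DRY_RUN=1, run mkdir -p /tmp/staged prints the command and creates nothing. This also shows the limit noted above: a printed command has not been tested against the remote host.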

Host Identity Rules

Proxmox commands require pct, qm, and pvesm.
Accepted Proxmox short hostnames default to pve and pve02, because the live SSH alias returned pve on June 16, 2026 while older docs said pve02.
Docker restore commands inside the LXC require docker and docker compose.
NAS validation requires the configured backup directory to exist.
Direct ssh docker is never required.
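The identity rules above can be sketched as two small checks. The function names are hypothetical, and the accepted list is passed in as a space-separated string, which is an assumption about how the scripts store it.

```bash
# Sketch only: succeed when the short hostname is in the accepted list.
host_is_accepted() {
  local actual="$1" accepted="$2" name
  # Word splitting on $accepted is intentional: the list is space-separated.
  for name in $accepted; do
    [ "$name" = "$actual" ] && return 0
  done
  return 1
}

# Sketch only: fail on the first required command that is not on PATH.
require_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || { echo "missing command: $cmd" >&2; return 1; }
  done
}
```

On the Proxmox host this could be combined as host_is_accepted "$(hostname -s)" "pve pve02" && require_commands pct qm pvesm.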

Marker Files

Marker	Meaning
Workstation `.copy-complete`	NAS backup was fully extracted into the local staged directory
LXC `.copy-complete`	Local staged backup was fully streamed into the Docker LXC
`/opt/docker/volumes/postgresql/.logical-restore-complete`	PostgreSQL logical dumps were restored
`/opt/docker/volumes/gitea/.archive-restore-complete`	Forgejo archive data was applied or safely skipped because target data existed
`/opt/docker/volumes/vaultwarden/.archive-restore-complete`	Vaultwarden archive data was applied or safely skipped because target data existed

Markers make reruns converge. Remove or bypass them only after preserving the failed state and deciding that a forced restore is correct.
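A marker-guarded step can be sketched as follows. The helper name run_once is hypothetical, and treating FORCE_RESTORE=1 as the bypass is an assumption based on the variable name used elsewhere on this page.

```bash
# Sketch only: run a step unless its marker exists, and write the marker
# only after the step succeeds, so a failed step is retried on the next run.
run_once() {
  local marker="$1"
  shift
  if [ -e "$marker" ] && [ "${FORCE_RESTORE:-0}" != "1" ]; then
    echo "skip: $marker exists"
    return 0
  fi
  "$@" && : > "$marker"
}
```

Because the marker is written last, an interrupted step leaves no marker behind and the rerun repeats it.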

Restore Troubleshooting Notes

When the staged backup is streamed from macOS into the LXC, GNU tar in Debian may print messages like:

tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'

Those are macOS extended attributes added by the workstation tar implementation. They are not recovery data and can be ignored when tar exits successfully.

During PostgreSQL restore, globals.sql is applied before database dumps. A fresh official PostgreSQL container already has the postgres role, and a rerun after a partial restore may already have application roles. The restore script skips only CREATE ROLE lines for roles that already exist and keeps the later ALTER ROLE statements so restored attributes and password hashes are still applied.
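The role-skipping step can be sketched as a stream filter over globals.sql. This is an approximation, not the restore script's code: filter_existing_roles is a hypothetical name, and it assumes each CREATE ROLE statement sits on its own line and that role names contain no spaces.

```bash
# Sketch only: drop "CREATE ROLE <name>;" lines for roles that already exist
# and pass everything else through, including the later ALTER ROLE lines.
# $1 is a whitespace-separated list of existing role names.
filter_existing_roles() {
  awk -v existing="$1" '
    BEGIN {
      n = split(existing, names, " ")
      for (i = 1; i <= n; i++) have[names[i]] = 1
    }
    /^CREATE ROLE / {
      role = $3
      sub(/;$/, "", role)    # strip the trailing semicolon
      gsub(/"/, "", role)    # strip optional identifier quotes
      if (role in have) next
    }
    { print }
  '
}
```

The list of existing roles could come from a query such as SELECT rolname FROM pg_roles run with psql -At inside the PostgreSQL container.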

Recovered compose files often rely on a sibling .env file for values such as DOMAIN, FQDN, database hostnames, and service ports. Running docker compose -f /path/to/compose.yml from another directory can use the wrong environment source. The restore script therefore runs Compose with --project-directory set to the compose file directory, so each recovered stack loads its own .env.
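The invocation pattern can be sketched as a small wrapper. The function name compose_up and the example path are hypothetical; --project-directory and -f are standard docker compose options.

```bash
# Sketch only: start a recovered stack with the project directory pinned to
# the directory that holds the compose file, so its sibling .env is used.
compose_up() {
  local file="$1" dir
  dir=$(dirname "$file")
  docker compose --project-directory "$dir" -f "$file" up -d
}
```

Example use inside the LXC: compose_up /opt/docker/compose/vaultwarden/compose.yml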

Shell environment variables also override values from a Compose project's .env. During the June 17, 2026 restore, the wrapper-level DOMAIN=kh3group.com overrode Vaultwarden's recovered DOMAIN=https://pass.kh3group.com and caused Vaultwarden to restart with:

DOMAIN variable needs to contain the protocol

The restore script now unsets known recovered-project keys before invoking Docker Compose, so the per-project .env values are used. Avoid running docker compose config in shared terminals or capturing its output in logs, because it prints secret values expanded from .env.
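Clearing inherited variables for a single command can be sketched with a subshell, which leaves the caller's environment untouched. The helper name run_without_keys and the key list in the example are hypothetical; the real script's list of keys is not shown on this page.

```bash
# Sketch only: run a command with the named variables removed from its
# environment. The subshell keeps the unset from leaking into the caller.
# $1 is a whitespace-separated list of variable names.
run_without_keys() {
  local keys="$1" key
  shift
  (
    for key in $keys; do
      unset "$key"
    done
    "$@"
  )
}
```

Example use: run_without_keys "DOMAIN FQDN" docker compose --project-directory "$dir" -f "$file" up -d, where dir and file stand for a recovered stack's directory and compose file.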

The historical Docker host used database hostnames that may not exist in the rebuilt Docker network. During the June 17, 2026 restore, Forgejo and Vaultwarden were recovered with DB_HOST=db2, while the rebuilt PostgreSQL container was named postgresql. The restore script now normalizes these application .env values to the rebuilt service name:

Forgejo DB_HOST=postgresql:5432
Vaultwarden DB_HOST=postgresql
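Rewriting one key in a recovered .env file can be sketched as below. This is not the restore script's code: set_env_value is a hypothetical helper, and it assumes simple KEY=value lines without export prefixes or quoting that must be preserved.

```bash
# Sketch only: replace KEY=... in an env file, or append it if absent.
# Writes to a temporary file next to the original and moves it into place,
# so a failure does not leave a half-written .env behind.
set_env_value() {
  local file="$1" key="$2" value="$3" tmp
  tmp=$(mktemp "${file}.XXXXXX") || return 1
  awk -v k="$key" -v v="$value" '
    index($0, k "=") == 1 { print k "=" v; found = 1; next }
    { print }
    END { if (!found) print k "=" v }
  ' "$file" > "$tmp" && chmod 600 "$tmp" && mv "$tmp" "$file" || { rm -f "$tmp"; return 1; }
}
```

Example use, with a hypothetical path: set_env_value /opt/docker/compose/vaultwarden/.env DB_HOST postgresql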

PostgreSQL custom-format dumps can restore tables with the expected application owner while leaving the database itself owned by postgres, because the fresh database was created by the restore process before pg_restore. Vaultwarden then fails during startup migrations with:

permission denied for schema public

After restoring each application dump, the restore script now sets the database and public schema ownership/grants for the matching application role.
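The ownership fix amounts to a few SQL statements per application database. The sketch below only generates them; ownership_sql is a hypothetical helper, the exact grants the restore script applies may differ, and the database and role names in the example are placeholders.

```bash
# Sketch only: print SQL that hands a restored database and its public
# schema to the application role. The ALTER SCHEMA and GRANT statements
# take effect in whichever database the session is connected to.
ownership_sql() {
  local db="$1" role="$2"
  printf 'ALTER DATABASE "%s" OWNER TO "%s";\n' "$db" "$role"
  printf 'ALTER SCHEMA public OWNER TO "%s";\n' "$role"
  printf 'GRANT ALL ON SCHEMA public TO "%s";\n' "$role"
}
```

Example use inside the LXC: ownership_sql vaultwarden vaultwarden | docker exec -i postgresql psql -U postgres -d vaultwarden -v ON_ERROR_STOP=1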

The Forgejo runner runs as UID/GID 1000:996 in the captured compose file, but the rebuilt host may assign a different GID to the docker group. During the June 17, 2026 restore, /var/run/docker.sock was root:docker with GID 991, so the runner could read .runner but could not use the Docker socket:

permission denied while trying to connect to the Docker daemon socket

The restore script now rewrites the runner compose user group to the current Docker socket GID and normalizes runner-data ownership to 1000:<docker-gid> with mode 0750.
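The two pieces of that fix can be sketched separately: reading the socket's group ID and rewriting the compose user line. Both function names are hypothetical, stat -c is the GNU form available in the Debian LXC, and the sed pattern assumes the compose file uses a line like user: "1000:996".

```bash
# Sketch only: print the numeric group ID that owns the Docker socket.
socket_gid() {
  stat -c '%g' "${1:-/var/run/docker.sock}"
}

# Sketch only: rewrite the group part of a "user: 1000:<gid>" compose line
# read from stdin, leaving indentation and optional quotes unchanged.
rewrite_runner_group() {
  local gid="$1"
  sed -E "s/^([[:space:]]*user:[[:space:]]*[\"']?1000:)[0-9]+/\\1${gid}/"
}
```

Example use: rewrite_runner_group "$(socket_gid)" < compose.yml > compose.yml.new, followed by a review of the diff before replacing the file. The runner data directory would then be given ownership 1000:<docker-gid> and mode 0750 with chown and chmod.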

The restored runner registration file can also point at the public Forgejo URL, for example https://git.kh3group.com. Before Traefik is restored, that public route may return 502 Bad Gateway even while Forgejo is healthy on the Docker backend network. The restore script normalizes the runner registration address to the internal backend URL:

http://forgejo:3000

This lets the runner start without depending on external route validation. Validate the public Forgejo route separately after Traefik is restored.
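The address rewrite can be sketched as a stream edit. This assumes the registration file is JSON with a top-level "address" string, which should be confirmed against the recovered .runner file before relying on it; normalize_runner_address is a hypothetical name.

```bash
# Sketch only: replace the value of the "address" key in JSON read from
# stdin with the internal backend URL (default http://forgejo:3000).
normalize_runner_address() {
  local url="${1:-http://forgejo:3000}"
  sed -E "s|(\"address\"[[:space:]]*:[[:space:]]*\")[^\"]*\"|\\1${url}\"|"
}
```

Example use: normalize_runner_address < .runner > .runner.new, then compare the two files and keep a copy of the original before replacing it.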

If 03-restore-docker-services.sh fails after PostgreSQL has started but before .logical-restore-complete is written, leave FORCE_RESTORE=0, keep the staged backup markers, and rerun the same script after fixing the cause. The script should reuse the staged backup and converge.