Recovery Script Usage
This page names the current scripts and where each command runs.
Current Restore Scripts
| Script | Run from | Purpose |
|---|---|---|
| scripts/restore/00-preflight-access.sh | Administration workstation | Validate ssh pve, ssh -p 2242 nas, NAS path, optional guarded ssh docker, and target CT identity |
| scripts/restore/01-stage-backup.sh | Administration workstation | Copy the NAS backup to local /tmp with tar over SSH and write .copy-complete |
| scripts/restore/02-validate-artifacts.sh | Administration workstation | Validate archive shape, database dump shape, and required service artifacts |
| scripts/restore/03-restore-docker-services.sh | Administration workstation via ssh pve and pct exec | Copy staged backup into CT 101 and run the inner Docker restore |
| scripts/restore/04-validate-services.sh | Administration workstation via ssh pve and pct exec | Validate networks, PostgreSQL, restored databases, containers, and Traefik availability |
| scripts/recover-docker-services.sh | Inside target Docker LXC, normally invoked by stage 03 | Restore Docker networks, compose files, env files, archives, PostgreSQL dumps, and services |
The older root-level helper scripts are still relevant:
| Script | Run from | Purpose |
|---|---|---|
| scripts/create-docker-host-lxc-101.sh | Proxmox host as root | Create or update CT 101 docker-host; refuses an existing wrong hostname |
| scripts/bootstrap-debian13-docker-lxc.sh | Inside the new Docker LXC as root | Install Docker Engine and create /opt/docker/compose and /opt/docker/volumes |
| scripts/proxmox-host-maintenance.sh | Proxmox host as root | Optional host maintenance; not part of a restore unless planned |
Create the Docker LXC
scripts/create-docker-host-lxc-101.sh must run on the Proxmox host as root.
From the administration workstation, the simplest safe pattern is to stream the
local script over the documented SSH alias:
ssh pve 'bash -s' < scripts/create-docker-host-lxc-101.sh
Run this read-only check first:
ssh pve 'hostname; command -v pct; command -v pveam; pct status 101 || true; pct config 101 || true'
If CT 101 already exists, the script accepts it only when its hostname is the
expected docker-host. A different hostname causes the script to stop before
changing that container.
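The hostname guard described above can be sketched as a small shell function. This is an illustration only, not the script's actual code: the function name ct_hostname_matches is hypothetical, and it assumes pct config prints a line of the form hostname: <name>.

```shell
#!/usr/bin/env bash
# Sketch: decide whether an existing container may be reused.
# Reads `pct config <vmid>` output on stdin and compares the hostname line
# with the expected value. The function name is hypothetical.
ct_hostname_matches() {
  local expected="$1" actual
  # Take the value of the first "hostname: <name>" line.
  actual="$(awk -F': *' '$1 == "hostname" { print $2; exit }')"
  [ "$actual" = "$expected" ]
}

# Possible use on the Proxmox host (not executed here):
#   pct config 101 | ct_hostname_matches docker-host \
#     || { echo "CT 101 exists with an unexpected hostname; stopping" >&2; exit 1; }
```

Because the check reads only pct config output, it changes nothing on the container and is safe to run before deciding whether to continue.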
To run with explicit settings, copy the answer file to Proxmox, edit it there, then stream the script:
scp scripts/recovery-answer.env.example pve:/root/recovery-answer.env
ssh pve 'nano /root/recovery-answer.env'
ssh pve 'bash -s -- --answer-file /root/recovery-answer.env' < scripts/create-docker-host-lxc-101.sh
The answer file path in that command is a path on the Proxmox host, not on the workstation.
The create script downloads the Debian LXC template when needed. It queries
Proxmox template metadata with pveam available --section system, selects the
latest debian-13-standard_*_amd64.tar.zst, checks the configured
TEMPLATE_STORAGE, and runs pveam download only if the template is missing.
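A minimal sketch of that selection step, assuming pveam available prints one template per line with the section in the first column and the file name in the second, and that GNU sort -V is available on the host. The function name latest_debian13_template is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: pick the newest debian-13-standard amd64 template name from
# `pveam available --section system` output supplied on stdin.
latest_debian13_template() {
  awk '$2 ~ /^debian-13-standard_.*_amd64\.tar\.zst$/ { print $2 }' \
    | sort -V \
    | tail -n 1
}

# Possible use on the Proxmox host (not executed here):
#   template="$(pveam available --section system | latest_debian13_template)"
#   pveam list "$TEMPLATE_STORAGE" | grep -qF "$template" \
#     || pveam download "$TEMPLATE_STORAGE" "$template"
```

Version sorting matters here because a plain lexical sort would place a release such as 13.10 before 13.2.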
The created container is unprivileged:
--unprivileged 1
The script always enables the standard Docker-in-unprivileged-LXC features:
nesting=1,keyctl=1
Those are applied at creation and again with pct set, so rerunning the script
keeps the container converged.
Broader LXC relaxations are available only when explicitly requested:
RELAXED_LXC_SECURITY=1
That setting appends these Proxmox LXC config lines if they are missing:
lxc.apparmor.profile: unconfined
lxc.cgroup2.devices.allow: a
lxc.cap.drop:
Leave RELAXED_LXC_SECURITY=0 unless Docker workloads require those broader
permissions. The script does not currently set a low-port sysctl such as
net.ipv4.ip_unprivileged_port_start. For rootful Docker inside the LXC,
published ports such as 80 and 443 are normally handled by Docker running as
root inside the container. If a restored workload still cannot bind low ports,
record the failure and decide whether RELAXED_LXC_SECURITY=1 is justified.
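The "append if missing" behaviour can be illustrated with a short helper. This is a sketch under stated assumptions rather than the script's own code: the helper name append_if_missing is hypothetical, and /etc/pve/lxc/101.conf is assumed to be the container's config path on the Proxmox host.

```shell
#!/usr/bin/env bash
# Sketch: append a config line only when an identical line is not present,
# so repeated runs leave the file unchanged.
append_if_missing() {
  local file="$1" line="$2"
  # -x matches whole lines, -F treats the line as a fixed string.
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Possible use on the Proxmox host when RELAXED_LXC_SECURITY=1 (not executed here):
#   conf=/etc/pve/lxc/101.conf   # assumed path
#   append_if_missing "$conf" 'lxc.apparmor.profile: unconfined'
#   append_if_missing "$conf" 'lxc.cgroup2.devices.allow: a'
#   append_if_missing "$conf" 'lxc.cap.drop:'
```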
After CT creation succeeds, install Docker from inside the new LXC with
scripts/bootstrap-debian13-docker-lxc.sh. That bootstrap script does not
download the LXC image; it installs Docker Engine in an already-created Debian
container.
Bootstrap Docker in CT 101
After the create script reports CT 101 is ready for bootstrap, run the Docker
bootstrap from the administration workstation through Proxmox:
ssh pve 'pct exec 101 -- bash -s' < scripts/bootstrap-debian13-docker-lxc.sh
If the bootstrap needs values from /root/recovery-answer.env, remember that
pct exec runs inside CT 101. A file at /root/recovery-answer.env on the
Proxmox host is not automatically visible inside the LXC. Push it into the LXC
first:
ssh pve 'pct push 101 /root/recovery-answer.env /root/recovery-answer.env --perms 0600'
ssh pve 'pct exec 101 -- bash -s -- --answer-file /root/recovery-answer.env' < scripts/bootstrap-debian13-docker-lxc.sh
For the current script, the answer file is optional because TIMEZONE defaults
to Africa/Accra.
The bootstrap script sets LANG=C.UTF-8 and LC_ALL=C.UTF-8 to avoid locale
warnings inherited from the workstation. Earlier runs showed unsupported
LC_CTYPE=UTF-8 warnings during pct, Perl, and APT operations; those warnings
were noisy but not the cause of the Docker install failure.
The Debian 13 template used during the June 17, 2026 restore did not provide the
previously listed software-properties-common package from the configured
repositories. The bootstrap script no longer installs that package because
Docker's official repository only needs ca-certificates, curl, gnupg, and
the keyring-scoped source file the script writes.
Successful bootstrap output includes:
Docker Engine - Community
Docker Compose version
Docker installation verified
Verify the completed bootstrap with:
ssh pve 'pct exec 101 -- docker version; pct exec 101 -- docker compose version; pct exec 101 -- ls -ld /opt/docker/compose /opt/docker/volumes'
If direct SSH to docker-host asks for a password, do not assume one exists.
The create script sets a root password only when PASSWORD_FILE is configured
before CT creation. Use pct exec or pct enter through Proxmox for recovery
work unless direct SSH access is intentionally configured later.
Template Download and IPv6
The template download may prefer IPv6 if the Proxmox host resolves both A and
AAAA records for download.proxmox.com. During the June 17, 2026 restore, IPv6
connections repeatedly timed out. The immediate workaround was to stop the
download and force IPv4 for the template:
ssh pve 'wget -4 -c -O /var/lib/vz/template/cache/debian-13-standard_13.1-2_amd64.tar.zst http://download.proxmox.com/images/system/debian-13-standard_13.1-2_amd64.tar.zst'
After the file has downloaded and been verified, rerun the create script. It will find the template in local storage and continue.
To make Proxmox prefer IPv4 generally without fully disabling IPv6, review and
then uncomment this line in /etc/gai.conf:
precedence ::ffff:0:0/96 100
For wget only, inet4_only = on in /etc/wgetrc forces IPv4. Prefer the
narrowest change that solves the download problem.
Current Backup Scripts
| Script | Run from | Purpose |
|---|---|---|
| scripts/backup/00-prepare-backup-root.sh | Proxmox host as root | Create the timestamped backup root on the CIFS backup mount |
| scripts/backup/01-capture-proxmox.sh | Proxmox host as root | Capture host, storage, VM, and LXC definitions |
| scripts/backup/02-capture-dns.sh | Proxmox host as root | Capture DNS reference from CT 107 and optional Teleporter file |
| scripts/backup/03-capture-docker-definitions.sh | Proxmox host as root | Capture Docker compose/env/runtime reference from CT 100 |
| scripts/backup/04-export-databases.sh | Proxmox host as root | Export PostgreSQL, MariaDB, and MongoDB logical backups |
| scripts/backup/05-archive-applications.sh | Proxmox host as root | Archive retained application directories with volatile exclusions |
| scripts/backup/06-archive-websites.sh | Proxmox host as root | Archive website working trees when repository state is not enough |
| scripts/backup/07-verify-backup.sh | Proxmox host as root | Verify archives, dumps, and checksums |
| scripts/backup/08-encrypt-and-copy-backup.sh | Proxmox host as root | Create encrypted archive when age is configured and copy to a second destination |
Answer Files
Restore:
cp scripts/restore/restore-answer.env.example scripts/restore/restore-answer.env
chmod 600 scripts/restore/restore-answer.env
Backup, on Proxmox:
cp scripts/backup/backup-answer.env.example /root/backup-answer.env
chmod 600 /root/backup-answer.env
The answer files are shell syntax. Do not commit filled copies.
Dry Runs
Most wrapper scripts support:
--dry-run
Dry-run mode checks arguments and prints actions that would write data. It does not prove that every remote command will succeed, so use it before a real run, not instead of validation.
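One common way to implement this kind of dry run is a wrapper that prints a command instead of executing it. The sketch below is illustrative only; the DRY_RUN variable and the run helper are hypothetical names and may not match how the wrapper scripts handle --dry-run.

```shell
#!/usr/bin/env bash
# Sketch: route every state-changing command through run().
# DRY_RUN=1 prints the command; any other value executes it.
DRY_RUN="${DRY_RUN:-0}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'DRY-RUN: %s\n' "$*"
  else
    "$@"
  fi
}

# Possible use (not executed here):
#   run mkdir -p /tmp/restore-stage
#   run touch /tmp/restore-stage/.copy-complete
```

A wrapper like this shows what would be written, but it cannot show whether a remote command would succeed, which is why validation is still needed.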
Host Identity Rules
- Proxmox commands require pct, qm, and pvesm.
- Accepted Proxmox short hostnames default to pve pve02 because the live SSH alias returned pve on June 16, 2026 while older docs said pve02.
- Docker restore commands inside the LXC require docker and docker compose.
- NAS validation requires the configured backup directory to exist.
- Direct ssh docker is never required.
Marker Files
| Marker | Meaning |
|---|---|
| Workstation .copy-complete | NAS backup was fully extracted into the local staged directory |
| LXC .copy-complete | Local staged backup was fully streamed into the Docker LXC |
| /opt/docker/volumes/postgresql/.logical-restore-complete | PostgreSQL logical dumps were restored |
| /opt/docker/volumes/gitea/.archive-restore-complete | Forgejo archive data was applied or safely skipped because target data existed |
| /opt/docker/volumes/vaultwarden/.archive-restore-complete | Vaultwarden archive data was applied or safely skipped because target data existed |
Markers make reruns converge. Remove or bypass them only after preserving the failed state and deciding that a forced restore is correct.
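The marker pattern can be sketched as a guard that skips a step when its marker exists and writes the marker only after the step succeeds. The helper name run_once is hypothetical, and treating FORCE_RESTORE=1 as a bypass is an assumption about how the restore scripts use that variable.

```shell
#!/usr/bin/env bash
# Sketch: run a step at most once, recording success in a marker file.
# Usage: run_once <marker-path> <command> [args...]
run_once() {
  local marker="$1"
  shift
  if [ -e "$marker" ] && [ "${FORCE_RESTORE:-0}" != "1" ]; then
    echo "skip: $marker already exists"
    return 0
  fi
  # The marker is written only when the command exits successfully,
  # so a failed step is retried on the next run.
  "$@" && touch "$marker"
}

# Possible use (not executed here; restore_postgres_dumps is a placeholder):
#   run_once /opt/docker/volumes/postgresql/.logical-restore-complete restore_postgres_dumps
```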
Restore Troubleshooting Notes
When the staged backup is streamed from macOS into the LXC, GNU tar in Debian may print messages like:
tar: Ignoring unknown extended header keyword 'LIBARCHIVE.xattr.com.apple.provenance'
Those are macOS extended attributes added by the workstation tar implementation. They are not recovery data and can be ignored when tar exits successfully.
During PostgreSQL restore, globals.sql is applied before database dumps. A
fresh official PostgreSQL container already has the postgres role, and a
rerun after a partial restore may already have application roles. The restore
script skips only CREATE ROLE lines for roles that already exist and keeps the
later ALTER ROLE statements so restored attributes and password hashes are
still applied.
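A minimal sketch of that filtering, assuming globals.sql contains statements of the form CREATE ROLE name; on single lines, as pg_dumpall --globals-only typically writes them. The function name filter_existing_roles is hypothetical, and role names containing spaces are not handled.

```shell
#!/usr/bin/env bash
# Sketch: drop CREATE ROLE lines for roles that already exist and pass
# everything else (including ALTER ROLE) through unchanged.
# $1: file with one existing role name per line; SQL is read from stdin.
filter_existing_roles() {
  awk -v rolefile="$1" '
    BEGIN { while ((getline r < rolefile) > 0) existing[r] = 1 }
    /^CREATE ROLE / {
      name = $3
      sub(/;$/, "", name)    # strip the trailing semicolon
      gsub(/"/, "", name)    # strip optional identifier quotes
      if (name in existing) next
    }
    { print }
  '
}

# Possible use inside the LXC (not executed here):
#   docker exec postgresql psql -U postgres -Atc 'SELECT rolname FROM pg_roles' > /tmp/roles.txt
#   filter_existing_roles /tmp/roles.txt < globals.sql | docker exec -i postgresql psql -U postgres
```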
Recovered compose files often rely on a sibling .env file for values such as
DOMAIN, FQDN, database hostnames, and service ports. Running
docker compose -f /path/to/compose.yml from another directory can use the
wrong environment source. The restore script therefore runs Compose with
--project-directory set to the compose file directory, so each recovered stack
loads its own .env.
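As an illustration, a wrapper along these lines derives the project directory from the compose file path. It is a sketch, not the restore script's code: compose_in_project and DOCKER_BIN are hypothetical names, and the example path is assumed.

```shell
#!/usr/bin/env bash
# Sketch: always run Compose with the project directory set to the
# directory that holds the compose file, so its sibling .env is used.
# DOCKER_BIN is a hypothetical override, handy for dry testing.
compose_in_project() {
  local compose_file="$1"
  shift
  local project_dir
  project_dir="$(dirname "$compose_file")"
  "${DOCKER_BIN:-docker}" compose \
    --project-directory "$project_dir" \
    -f "$compose_file" "$@"
}

# Possible use inside the LXC (not executed here; path is an assumption):
#   compose_in_project /opt/docker/compose/vaultwarden/compose.yml up -d
```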
Shell environment variables also override values from a Compose project's
.env. During the June 17, 2026 restore, the wrapper-level DOMAIN=kh3group.com
overrode Vaultwarden's recovered DOMAIN=https://pass.kh3group.com and caused
Vaultwarden to restart with:
DOMAIN variable needs to contain the protocol
The restore script now unsets known recovered-project keys before invoking
Docker Compose, so the per-project .env values are used. Avoid running
docker compose config in shared terminals or logs because it expands secret
values from .env.
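The unsetting step might look like the sketch below, which clears the listed variables in a subshell so the caller's environment is left alone. The helper name run_without_keys and the key list are illustrative assumptions, not the script's actual list.

```shell
#!/usr/bin/env bash
# Sketch: run a command with selected variables removed from its
# environment. $1 is a space-separated list of variable names.
run_without_keys() {
  local keys="$1"
  shift
  (
    # Word splitting on $keys is intentional: one name per word.
    for key in $keys; do
      unset "$key"
    done
    "$@"
  )
}

# Possible use inside the LXC (not executed here; key list is an assumption):
#   run_without_keys "DOMAIN FQDN DB_HOST" \
#     docker compose --project-directory /opt/docker/compose/vaultwarden up -d
```

Running the command in a subshell keeps wrapper-level values such as DOMAIN available to the rest of the restore script.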
The historical Docker host used database hostnames that may not exist in the
rebuilt Docker network. During the June 17, 2026 restore, Forgejo and
Vaultwarden were recovered with DB_HOST=db2, while the rebuilt PostgreSQL
container was named postgresql. The restore script now normalizes these
application .env values to the rebuilt service name:
Forgejo DB_HOST=postgresql:5432
Vaultwarden DB_HOST=postgresql
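A normalization of this kind can be sketched as a helper that replaces an existing KEY=value line or appends one when the key is absent. The name set_env_value and the example paths are hypothetical, and values containing | or & would need escaping before being passed to sed.

```shell
#!/usr/bin/env bash
# Sketch: set KEY=VALUE in a .env file, replacing an existing line for the
# key or appending a new one. File permissions are preserved because the
# original file is overwritten in place rather than replaced.
set_env_value() {
  local file="$1" key="$2" value="$3" tmp
  if grep -q "^${key}=" "$file"; then
    tmp="$(mktemp)"
    sed "s|^${key}=.*|${key}=${value}|" "$file" > "$tmp" && cat "$tmp" > "$file"
    local rc=$?
    rm -f "$tmp"
    return "$rc"
  fi
  printf '%s=%s\n' "$key" "$value" >> "$file"
}

# Possible use inside the LXC (not executed here; paths are assumptions):
#   set_env_value /opt/docker/compose/forgejo/.env DB_HOST postgresql:5432
#   set_env_value /opt/docker/compose/vaultwarden/.env DB_HOST postgresql
```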
PostgreSQL custom-format dumps can restore tables with the expected application
owner while leaving the database itself owned by postgres, because the fresh
database was created by the restore process before pg_restore. Vaultwarden
then fails during startup migrations with:
permission denied for schema public
After restoring each application dump, the restore script now sets the database
and public schema ownership/grants for the matching application role.
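The ownership fix could be expressed as a few SQL statements generated per application. This sketch assumes the database and role names are known and that the statements are run while connected to the application database as a superuser. The function name ownership_sql is hypothetical, and the exact grants the restore script applies may differ.

```shell
#!/usr/bin/env bash
# Sketch: print SQL that hands a restored database and its public schema
# to the application role. $1: database name, $2: role name.
ownership_sql() {
  local db="$1" role="$2"
  cat <<SQL
ALTER DATABASE "$db" OWNER TO "$role";
ALTER SCHEMA public OWNER TO "$role";
GRANT ALL ON SCHEMA public TO "$role";
SQL
}

# Possible use inside the LXC (not executed here):
#   ownership_sql vaultwarden vaultwarden \
#     | docker exec -i postgresql psql -U postgres -d vaultwarden
```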
The Forgejo runner runs as UID/GID 1000:996 in the captured compose file, but
the rebuilt host may assign a different GID to the docker group. During the
June 17, 2026 restore, /var/run/docker.sock was root:docker with GID 991,
so the runner could read .runner but could not use the Docker socket:
permission denied while trying to connect to the Docker daemon socket
The restore script now rewrites the runner compose user group to the current
Docker socket GID and normalizes runner-data ownership to 1000:<docker-gid>
with mode 0750.
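A sketch of the group rewrite, assuming the compose file carries a line such as user: "1000:996" and that the socket GID is read with GNU stat. The function name rewrite_runner_group and the example paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: replace the group part of a `user: 1000:<gid>` line in a compose
# file with the supplied GID, leaving the rest of the file untouched.
rewrite_runner_group() {
  local compose_file="$1" gid="$2" tmp
  tmp="$(mktemp)"
  sed -E "s/^([[:space:]]*user:[[:space:]]*[\"']?1000:)[0-9]+/\\1${gid}/" \
    "$compose_file" > "$tmp" && cat "$tmp" > "$compose_file"
  local rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Possible use inside the LXC (not executed here; paths are assumptions):
#   gid="$(stat -c '%g' /var/run/docker.sock)"
#   rewrite_runner_group /opt/docker/compose/forgejo-runner/compose.yml "$gid"
#   chown -R "1000:${gid}" /opt/docker/volumes/forgejo-runner/runner-data
#   chmod 0750 /opt/docker/volumes/forgejo-runner/runner-data
```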
The restored runner registration file can also point at the public Forgejo URL,
for example https://git.kh3group.com. Before Traefik is restored, that public
route may return 502 Bad Gateway even while Forgejo is healthy on the Docker
backend network. The restore script normalizes the runner registration address
to the internal backend URL:
http://forgejo:3000
This lets the runner start without depending on external route validation. Validate the public Forgejo route separately after Traefik is restored.
If 03-restore-docker-services.sh fails after PostgreSQL has started but before
.logical-restore-complete is written, leave FORCE_RESTORE=0, keep the
staged backup markers, and rerun the same script after fixing the cause. The
script should reuse the staged backup and converge.