✨ Add automated session logging system

- Created logs/ structure (sessions, operations, incidents)
- Added session-start/log/end scripts
- Installed Git hooks for auto-logging commits/pushes
- Added shell integration for zsh
- Created CHANGELOG.md
- Documented today's session (2026-01-10)
2026-01-10 04:53:17 -08:00
parent e67882fd15
commit 744c149300
260 changed files with 6364 additions and 68 deletions
--- a/INFRASTRUCTURE.md
+++ b/INFRASTRUCTURE.md
@@ -1607,3 +1607,199 @@ ps aux | awk '$3 > 50'

 ---

+
+### Incident #4: ALL PostgreSQL Images Show Malware — NODE1 Host Compromise Suspected (Jan 10, 2026)
+
+**Timeline:**
+- **Jan 10, 2026**: Testing postgres:16-alpine — malware artifacts found
+- **Jan 10, 2026**: Testing postgres:14 (non-alpine) — malware artifacts found  
+- **Jan 10, 2026**: Testing postgres:16 (Debian) — malware artifacts found
+
+**Confirmed "Compromised" Images (on NODE1):**
+```bash
+# ALL of these show malware artifacts when run on NODE1:
+❌ postgres:15-alpine  # Incident #3
+❌ postgres:16-alpine  # NEW
+❌ postgres:14         # NEW (non-alpine!)
+❌ postgres:16         # NEW (Debian base!)
+```
+
+**Malware Artifacts (IOC):**
+```bash
+/tmp/httpd           # ~10MB, crypto miner (xmrig variant)
+/tmp/.perf.c/        # perfctl malware staging directory
+```
+
+**🔴 CRITICAL ASSESSMENT:**
+
+**This is NOT "all Docker Hub official images are infected".**
+
+**This is most likely NODE1 HOST COMPROMISE** (perfctl/cryptominer persistence).
+
+**Evidence supporting HOST compromise (not image compromise):**
+
+| Evidence | Explanation |
+|----------|-------------|
+| `/tmp/.perf.c/` directory | Classic perfctl malware staging directory |
+| `/tmp/httpd` ~10MB | Typical xmrig miner with Apache masquerade |
+| ALL postgres variants affected | Identical artifacts in four independently built official images (Alpine and Debian bases) are highly improbable as an upstream infection |
+| NODE1 had 3 previous incidents | Already compromised (Incidents #1, #2, #3) |
+| `tmpfs noexec` didn't help | Malware runs from HOST, not container |
+| Same IOCs across different images | Infection happens post-pull, not in image |
+
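The "post-pull, not in image" row can be checked without executing anything from the image: list the image's filesystem and look for the IOC paths. A minimal sketch, assuming it is run on a clean host; `scan_listing` and `scan_image` are hypothetical helper names, not scripts from this repository.

```bash
#!/usr/bin/env bash
# Hypothetical helpers: look for the Incident #4 IOC paths in an image's
# filesystem without starting a container from it.

# Read a tar-style file listing on stdin and print entries matching the IOCs
# (/tmp/httpd and /tmp/.perf.c). Returns non-zero when nothing matches.
scan_listing() {
  grep -E '^(\./|/)?tmp/(httpd$|\.perf\.c(/|$))'
}

# Create (but never start) a container from the image, export its filesystem,
# and scan the listing. The container is removed afterwards.
scan_image() {
  local image="$1" cid status
  cid="$(docker create "$image")" || return 2
  docker export "$cid" | tar -tf - | scan_listing
  status=$?
  docker rm "$cid" >/dev/null
  return "$status"
}
```

Usage: `scan_image postgres@sha256:<digest>` prints any matching paths and returns 0 only if the IOCs are part of the image itself. No matches on a clean host would be consistent with the host-compromise hypothesis.
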
+**Probable Attack Vector (perfctl family):**
+- Initial compromise via Incident #1 or #2 (daarion-web container)
+- Persistence mechanism survived container/image cleanup
+- Malware hooks into Docker daemon or uses cron/systemd
+- Infects ANY new container on startup via:
+  - Modified docker daemon
+  - LD_PRELOAD injection
+  - Kernel module
+  - Cron job that monitors new containers
+
+**🔬 VERIFICATION PROCEDURE (REQUIRED):**
+
+```bash
+# ═══════════════════════════════════════════════════════════════
+# STEP 1: Get image digest from NODE1
+# ═══════════════════════════════════════════════════════════════
+ssh root@144.76.224.179 "docker inspect --format='{{index .RepoDigests 0}}' postgres:16"
+# Example output: postgres@sha256:abc123...
+
+# ═══════════════════════════════════════════════════════════════
+# STEP 2: On CLEAN host (MacBook/NODE2), pull SAME digest
+# ═══════════════════════════════════════════════════════════════
+# On your MacBook (NOT NODE1!):
+docker pull postgres:16@sha256:<digest_from_step1>
+
+# ═══════════════════════════════════════════════════════════════
+# STEP 3: Run on clean host and check /tmp
+# ═══════════════════════════════════════════════════════════════
+docker run --rm -it postgres:16@sha256:<digest> sh -c "ls -la /tmp/ && find /tmp -type f"
+
+# EXPECTED RESULTS:
+# - If /tmp is EMPTY on clean host → IMAGE IS CLEAN → NODE1 IS COMPROMISED
+# - If /tmp has httpd/.perf.c on clean host → IMAGE IS COMPROMISED → Report to Docker
+
+# ═══════════════════════════════════════════════════════════════
+# STEP 4: Check NODE1 host for persistence mechanisms
+# ═══════════════════════════════════════════════════════════════
+ssh root@144.76.224.179 << 'REMOTE_CHECK'
+echo "=== CRON ==="
+crontab -l 2>/dev/null
+cat /etc/crontab
+ls -la /etc/cron.d/
+
+echo "=== SYSTEMD ==="
+systemctl list-units --type=service | grep -iE "perf|miner|http|crypto"
+
+echo "=== LD_PRELOAD ==="
+cat /etc/ld.so.preload 2>/dev/null
+echo $LD_PRELOAD
+
+echo "=== KERNEL MODULES ==="
+lsmod | head -20
+
+echo "=== SUSPICIOUS PROCESSES ==="
+ps aux | grep -E "(httpd|xmrig|kdevtmp|kinsing|perfctl|\.perf)" | grep -v grep
+
+echo "=== NETWORK TO MINING POOLS ==="
+ss -anp | grep -E ":(3333|4444|5555|8080|8888)\b" | head -10
+
+echo "=== SSH AUTHORIZED KEYS ==="
+cat /root/.ssh/authorized_keys
+
+echo "=== DOCKER DAEMON CONFIG ==="
+cat /etc/docker/daemon.json 2>/dev/null
+REMOTE_CHECK
+```
+
+**🔴 DECISION MATRIX:**
+
+| Verification Result | Conclusion | Action |
+|---------------------|------------|--------|
+| Clean host: no malware | **NODE1 COMPROMISED** | Full rebuild of NODE1 |
+| Clean host: same malware | **Docker Hub compromised** | Report to Docker Security |
+
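The matrix above is a two-way branch on what the clean host finds in `/tmp`, so it can be sketched as a small filter. This is an illustration only; `classify_clean_host_result` is a hypothetical name, and its verdict is meaningful only if the `docker run` on the clean host actually succeeded (an empty listing from a failed run would be misread as "no malware").

```bash
#!/usr/bin/env bash
# Hypothetical helper encoding the decision matrix.
# Input (stdin): file listing of /tmp captured on the CLEAN host.
# Output: the conclusion and action from the matrix.
classify_clean_host_result() {
  if grep -qE '(^|/)tmp/(httpd$|\.perf\.c(/|$))'; then
    echo "Docker Hub compromised: report to Docker Security"
  else
    echo "NODE1 COMPROMISED: full rebuild of NODE1"
  fi
}
```

Usage on the clean host: `docker run --rm postgres:16@sha256:<digest> find /tmp -type f | classify_clean_host_result`.
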
+**If NODE1 Confirmed Compromised (most likely):**
+
+1. 🔴 **STOP using NODE1 immediately** for any workloads
+2. 🔴 **Rotate ALL secrets** that NODE1 ever accessed:
+   ```
+   - SSH keys (generate new on clean machine)
+   - Telegram bot tokens (regenerate via @BotFather)
+   - PostgreSQL passwords
+   - All API keys in .env
+   - JWT secrets
+   - Neo4j credentials
+   - Redis password (if any)
+   ```
+3. 🔴 **Full OS reinstall** (not cleanup!):
+   - Request fresh install from Hetzner Robot
+   - Or use rescue mode + full disk wipe
+   - New SSH keys generated on clean machine
+4. 🟡 **Verify images on clean host BEFORE deploying to new NODE1**
+5. 🟢 **Implement proper security controls** (see Prevention below)
+
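For the rotation step, replacement secrets should be generated on a clean machine rather than on NODE1. A minimal sketch, assuming a POSIX-like system with `/dev/urandom`; `gen_secret` and the variable names in the comments are placeholders, not this project's real keys.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print a random hex secret of N bytes (default 32),
# read from the kernel CSPRNG. Run this on a clean machine, never on NODE1.
gen_secret() {
  local bytes="${1:-32}"
  # -v keeps od from collapsing repeated lines into "*"
  head -c "$bytes" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
  echo
}

# Example: new values for .env entries that NODE1 could read.
# The variable names below are placeholders.
# printf 'POSTGRES_PASSWORD=%s\n' "$(gen_secret 24)"
# printf 'JWT_SECRET=%s\n' "$(gen_secret 32)"

# New SSH key pair, also generated on the clean machine
# (the file name and comment are illustrative):
# ssh-keygen -t ed25519 -a 100 -f ~/.ssh/node1_new -C "node1-rebuild"
```

Tokens issued by third parties (for example the Telegram bot tokens) cannot be generated locally and still have to be regenerated at the provider.
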
+**Alternative PostgreSQL Sources (if Docker Hub suspected):**
+```bash
+# GitHub Container Registry (GHCR)
+# NOTE: unverified path; confirm this repository exists and who publishes it before pulling
+docker pull ghcr.io/docker-library/postgres:16-alpine
+
+# Quay.io (Red Hat operated)
+docker pull quay.io/fedora/postgresql-16
+
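+# Build from the official Dockerfile (the build still pulls its base image from a registry)
+git clone https://github.com/docker-library/postgres.git
+cd postgres/16
+ls    # Alpine directories carry the Alpine release in their name; pick one
+cd <alpine_dir>
+docker build -t postgres:16-alpine-verified .
+# Then scan with Trivy before use (it reports known CVEs; it is not a malware scanner)
+trivy image postgres:16-alpine-verified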
+```
+
+**NODE1 Persistence Locations to Check:**
+```bash
+# File-based persistence
+/etc/cron.d/*
+/etc/crontab
+/var/spool/cron/*
+/etc/systemd/system/*.service
+/etc/init.d/*
+/etc/rc.local
+/root/.bashrc
+/root/.profile
+/etc/ld.so.preload
+
+# Memory/process persistence
+/dev/shm/*
+/run/*
+/var/run/*
+
+# Docker-specific
+/var/lib/docker/
+/etc/docker/daemon.json
+~/.docker/config.json
+
+# Kernel-level (advanced)
+/lib/modules/*/
+/proc/modules
+```
+
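The locations above can be triaged with two small checks: recently modified files, and a non-empty `ld.so.preload`. A minimal sketch with hypothetical helper names. Treat the output as a hint only, since malware can reset modification times and tools on a compromised host may return falsified results.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for triaging the persistence locations listed above.

# Print regular files under the given paths modified within the last N days.
# Usage: recent_files DAYS PATH...
recent_files() {
  local days="$1"; shift
  find "$@" -xdev -type f -mtime "-$days" 2>/dev/null
}

# Succeed (and print the contents) when an ld.so.preload file is non-empty;
# on most systems this file is absent or empty.
preload_entries() {
  local file="${1:-/etc/ld.so.preload}"
  [ -s "$file" ] && cat "$file"
}
```

Usage: `recent_files 30 /etc/cron.d /etc/systemd/system /etc/init.d` lists files changed in the last 30 days; `preload_entries` returns non-zero when `/etc/ld.so.preload` is missing or empty.
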
+**References:**
+- perfctl malware: https://blog.exatrack.com/Perfctl-using-portainer-and-new-persistences/
+- Similar reports: https://github.com/docker-library/postgres/issues/1307
+- Docker Hub attacks: https://jfrog.com/blog/attacks-on-docker-with-millions-of-malicious-repositories-spread-malware-and-phishing-scams/
+
+**Lessons Learned (Incident #4 Specific):**
+1. 🔴 **Host compromise masquerades as image compromise**: always verify on a clean host
+2. 🟡 **Previous incidents leave persistence**: cleanup is not enough, a rebuild is required
+3. 🟢 **perfctl-style persistence outlives containers**: it survives container restarts and image deletions
+4. 🔵 **Multiple images "infected" = host problem**: identical artifacts in independently built images point at the host, not the registry
+5. 🟣 **NODE1 is UNTRUSTED**: do not use it until full rebuild + verification
+
+**Current Status:**
+- ⏳ **Verification pending** — Need to test same digest on clean host
+- 🔴 **NODE1 unsafe** — Do not deploy PostgreSQL or any new containers
+- 🟡 **Secrets rotation needed** — Assume all NODE1 secrets compromised
+
+---