Commit Graph

67 Commits

Author SHA1 Message Date
Apple
de7533f97e docs: add session preflight and expand lint scope batch5 2026-02-16 03:53:56 -08:00
Apple
b722e28338 docs: add local scheduled maintenance runner (no auto-push) 2026-02-16 02:37:29 -08:00
Apple
5f2fd7905f docs: sync consolidation and session starter 2026-02-16 02:32:27 -08:00
Apple
3146e74ce8 docs: sync consolidation and session starter 2026-02-16 02:32:27 -08:00
Apple
0d8582d552 docs(node1): add safe deploy workflow and snapshot
Document canonical sync between GitHub and NODA1 and add a snapshot script to capture runtime state without editing production by hand.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-10 05:33:32 -08:00
Apple
a46a70c014 fix(ops): Add network aliases and stabilize DNS for NODA1
- docker-compose.node1.yml: Add network aliases (router, gateway,
  memory-service, qdrant, nats, neo4j) to eliminate manual
  `docker network connect --alias` commands
- docker-compose.node1.yml: ROUTER_URL now uses env variable with
  fallback: ${ROUTER_URL:-http://router:8000}
- docker-compose.node1.yml: Increase router healthcheck start_period
  to 30s and retries to 5
- .gitignore: Add noda1-credentials.local.mdc (local-only SSH creds)
- scripts/node1/verify_agents.sh: Improved output with agent list
- docs: Add NODA1-AGENT-VERIFICATION.md, NODA1-AGENT-ARCHITECTURE.md,
  NODA1-VERIFICATION-REPORT-2026-02-03.md
- config/README.md: How to add new agents
- .cursor/rules/, .cursor/skills/: NODA1 operations skill for Cursor

Root cause fixed: Gateway could not resolve 'router' DNS name when
Router container was named 'dagi-staging-router' without alias.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-03 05:55:56 -08:00
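The `${ROUTER_URL:-http://router:8000}` fallback mentioned in the commit above uses shell-style defaulting, where the default applies when the variable is unset or empty. A minimal sketch of the same behavior for a Python service follows. The helper name `resolve_router_url` is hypothetical and is not taken from the repository.

```python
import os

# Default taken from the commit message; the helper itself is illustrative only.
DEFAULT_ROUTER_URL = "http://router:8000"


def resolve_router_url(env=None):
    """Mimic ${ROUTER_URL:-default}: fall back when the variable is unset OR empty."""
    env = os.environ if env is None else env
    # `or` treats both a missing key (None) and an empty string as "not set".
    return env.get("ROUTER_URL") or DEFAULT_ROUTER_URL
```

Passing a plain dict instead of `os.environ` keeps the helper easy to test. Note that `os.environ.get("ROUTER_URL", default)` alone would not match the `:-` form, because it would keep an empty string.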
Apple
3ecb43dafc feat(P0): Add JetStream streams, DLQ, timeout policy 2026-01-28 07:11:09 -08:00
Apple
a3923cd96f feat(P0/P1/P2): Add E2E agent prober, version pinning, prometheus fixes 2026-01-28 07:06:07 -08:00
Apple
0c8bef82f4 feat: Add Alateya, Clan, Eonarch agents + fix gateway-router connection
## Agents Added
- Alateya: R&D, biotech, innovations
- Clan (Spirit): Community spirit agent
- Eonarch: Consciousness evolution agent

## Changes
- docker-compose.node1.yml: Added tokens for all 3 new agents
- gateway-bot/http_api.py: Added configs and webhook endpoints
- gateway-bot/clan_prompt.txt: New prompt file
- gateway-bot/eonarch_prompt.txt: New prompt file

## Fixes
- Fixed ROUTER_URL from :9102 to :8000 (internal container port)
- All 9 Telegram agents now working

## Documentation
- Created PROJECT-MASTER-INDEX.md - single entry point
- Added various status documents and scripts

Tokens configured:
- Helion, NUTRA, Agromatrix (existing)
- Alateya, Clan, Eonarch (new)
- Druid, GreenFood, DAARWIZZ (configured)
2026-01-28 06:40:34 -08:00
Apple
5290287058 feat: implement TTS, Document processing, and Memory Service /facts API
- TTS: xtts-v2 integration with voice cloning support
- Document: docling integration for PDF/DOCX/PPTX processing
- Memory Service: added /facts/upsert, /facts/{key}, /facts endpoints
- Added required dependencies (TTS, docling)
2026-01-17 08:16:37 -08:00
Apple
1231647f94 🛡️ Add comprehensive Security Hardening Plan
- Created SECURITY-HARDENING-PLAN.md with 6 security levels
- Added setup-node1-security.sh for automated hardening
- Added scan-image.sh for pre-deployment image scanning
- Created docker-compose.secure.yml template
- Includes: Trivy, fail2ban, UFW, auditd, rkhunter, chkrootkit
- Network isolation, egress filtering, process monitoring
- Incident response procedures and recovery playbook
2026-01-10 05:05:21 -08:00
Apple
744c149300 Add automated session logging system
- Created logs/ structure (sessions, operations, incidents)
- Added session-start/log/end scripts
- Installed Git hooks for auto-logging commits/pushes
- Added shell integration for zsh
- Created CHANGELOG.md
- Documented today's session (2026-01-10)
2026-01-10 04:53:17 -08:00
Apple
d77a4769c6 🔒 security(daarion-web): Hardening after crypto-mining incidents
## Root Cause Analysis
- Found CRITICAL RCE vulnerability in Next.js 15.0.3 (GHSA-9qr9-h5gf-34mp)
- 10 vulnerabilities total including SSRF, DoS, Auth Bypass
- Attack vector: exposed port 3000 + vulnerable Next.js → remote code execution

## Security Fixes
- Upgraded Next.js: 15.0.3 → 15.5.9 (0 vulnerabilities)
- Upgraded eslint-config-next: 15.0.3 → 15.5.9

## Hardening (New Files)
- apps/web/Dockerfile.secure: Multi-stage build, read-only FS, no shell
- docker-compose.web.secure.yml: Resource limits, cap_drop ALL, localhost bind
- scripts/rebuild-daarion-web-secure.sh: Local secure rebuild with Trivy scan
- scripts/deploy-daarion-web-node1.sh: Production deployment to NODE1
- SECURITY-REBUILD-REPORT.md: Full incident analysis and remediation report

## Key Security Measures
- restart: "no" (until verified)
- ports: 127.0.0.1:3000 (localhost only, use Nginx reverse proxy)
- read_only: true
- cap_drop: ALL
- resources.limits: 1 CPU, 512M RAM
- no-new-privileges: true

## Related Incidents
- Incident #1 (Jan 8): catcal, G4NQXBp miners
- Incident #2 (Jan 9): softirq, vrarhpb miners
- Hetzner AbuseID: 10F3971:2A

Co-authored-by: Cursor Agent <agent@cursor.sh>
2026-01-09 02:08:13 -08:00
Apple
b2caee4e0e fix: CRITICAL - Prevent infinite DROP DATABASE loop
ROOT CAUSE: Monitor was doing DROP DATABASE when NODE2 agents were missing,
but the backup didn't have NODE2 agents, causing an infinite loop.

FIX:
- FULL RECOVERY (DROP DATABASE) only when MicroDAOs < 5 (critical data loss)
- SOFT RECOVERY (just sync agents) when MicroDAOs exist but agents missing
- Prefer backup with NODE2 agents (full_backup_with_node2*.sql)
- Never DROP DATABASE if MicroDAOs exist

This prevents the daily data loss issue.
2025-12-05 02:41:43 -08:00
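The recovery rule described in the commit above can be sketched as a small decision function. This is an illustration under stated assumptions, not the monitor script's actual code: the names `Recovery` and `choose_recovery` are hypothetical, and the sketch follows the stated `MicroDAOs < 5` threshold for a full recovery.

```python
from enum import Enum

# Threshold taken from the commit message ("MicroDAOs < 5").
MIN_MICRODAOS = 5


class Recovery(Enum):
    NONE = "none"  # nothing to do
    SOFT = "soft"  # only sync the missing agents
    FULL = "full"  # DROP DATABASE and restore from backup


def choose_recovery(microdao_count, agents_missing):
    """Pick the least destructive recovery that addresses the observed state."""
    if microdao_count < MIN_MICRODAOS:
        # Critical data loss: the only case where a full restore is allowed.
        return Recovery.FULL
    if agents_missing:
        # Data is present, so never drop the database; just re-sync agents.
        return Recovery.SOFT
    return Recovery.NONE
```

Because missing agents alone can no longer trigger a full restore, a backup that lacks NODE2 agents cannot cause the restore to repeat indefinitely.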
Apple
02a0ea9540 fix: Add NODE2 agent count check to prevent data loss
- Check for at least 45 NODE2 agents (out of 50 expected)
- This prevents false positives when only core agents exist
- Better detection of actual data loss
2025-12-05 02:36:36 -08:00
Apple
06fe0c5204 fix: Improve database recovery process
- Fix empty variable handling in data checks
- Terminate active connections before dropping database
- Increase agent threshold to 50 (9 core + 50 NODE2)
- Add better logging for agent sync verification
2025-12-05 02:35:57 -08:00
Apple
db3b74e1ba fix: Integrate asset URL fix into recovery process and update docs 2025-12-03 10:13:19 -08:00
Apple
51fdd0d5da feat: Add script to fix asset URLs after restore 2025-12-03 10:12:21 -08:00
Apple
94889783a3 fix: Restore asset URLs (logos/banners) after database recovery
- Update monitor-db-stability.sh to fix asset URLs after restore
- Convert old /assets/ URLs to MinIO format
- Clear invalid banner URLs
2025-12-03 10:12:16 -08:00
Apple
19e8436a02 fix: Add database stability monitoring and improve PostgreSQL config
- Add monitor-db-stability.sh for automatic recovery
- Improve PostgreSQL shutdown settings to prevent data loss
- Add checkpoint and WAL settings for better persistence
2025-12-03 09:59:41 -08:00
Apple
7ac2f9c958 fix: Remove setup-node2-agents.sh that was creating test agents
- This script was trying to assign test agents (ag_atlas, etc.) to NODE2
- Use sync-node2-dagi-agents.py instead for loading real agents
- Test agents are now automatically removed by health check
2025-12-02 13:58:58 -08:00
Apple
6a76cffb88 fix: Add automatic removal of test agents in health check
- Add remove-test-agents.sh script
- Integrate test agent removal into db-health-check.sh
- Prevents test agents (ag_atlas, ag_oracle, ag_builder, ag_greeter) from reappearing
2025-12-02 13:57:28 -08:00
Apple
b27bfc1df5 feat: Add script to restore assets to MinIO and update DB URLs 2025-12-02 13:45:14 -08:00
Apple
488dd13af2 fix: Add database persistence and health check scripts
- Add apply-migrations.sh for automatic migration application
- Add ensure-db-persistence.sh for database integrity checks
- Add db-health-check.sh for periodic health monitoring
- Improve PostgreSQL configuration in docker-compose.db.yml
- Add proper shutdown settings to prevent data loss
2025-12-02 13:41:03 -08:00
Apple
fca48b3eb0 feat(node2): Complete NODE2 setup - guardian, agents, swapper models
- Node-guardian running on MacBook and updating metrics
- NODE2 agents (Atlas, Greeter, Oracle, Builder Bot) assigned to node-2-macbook-m4max
- Swapper models displaying correctly (8 models)
- DAGI Router agents showing with correct status (3 active, 1 stale)
- Router health check using node_cache for remote nodes
2025-12-02 07:07:58 -08:00
Apple
88188ed693 fix(node2): Use node_cache router_healthy for DAGI Router agents status
- Fix get_dagi_router_agents to use router_healthy from node_cache first
- Fallback to direct API call only if cache is unavailable
- This fixes NODE2 agents showing as 'stale' when router is actually healthy
- Fix CITY_SERVICE_URL in scripts (remove /api/city, use /api)
2025-12-02 07:02:08 -08:00
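The cache-first lookup described in the commit above can be sketched as follows. The function name `router_is_healthy`, the dict-shaped `node_cache` row, and the `probe` callable are assumptions made for illustration; only the `router_healthy` field name comes from the commit message.

```python
def router_is_healthy(node_cache, probe):
    """Prefer the cached router_healthy flag; call the router only as a fallback.

    node_cache: dict-like row for the node, or None when no cache entry exists.
    probe: zero-argument callable that performs the direct API health call.
    """
    cached = node_cache.get("router_healthy") if node_cache else None
    if cached is not None:
        # Cache is available: trust it and skip the direct call entirely.
        return bool(cached)
    # Cache unavailable: fall back to asking the router directly.
    return bool(probe())
```

Checking `is not None` rather than truthiness matters here: a cached `False` is a valid answer and should not trigger the direct call.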
Apple
80123fd1be feat(node2): Add scripts and docs for NODE2 guardian setup
- Add start-node2-guardian.sh script for easy launch
- Add setup-node2-agents.sh to update node_id for NODE2 agents
- Add NODE2_GUARDIAN_QUICKSTART.md with detailed instructions
- Update agents node_id to node-2-macbook-m4max
2025-12-02 06:59:48 -08:00
Apple
b79db5b2a4 feat(assets): Add NGINX config and migration scripts for MinIO assets
- Add NGINX reverse proxy config for assets.daarion.space
- Add script to migrate assets from /static/uploads to MinIO
- Add script to update asset URLs in database after migration
2025-12-02 02:11:26 -08:00
Apple
8e8f95e9ef feat(db-hardening): Add database persistence, backups, and MinIO assets storage
Database Hardening:
- Add docker-compose.db.yml with persistent PostgreSQL volume
- Add automatic DB backups every 12h (7 days, 4 weeks, 6 months retention)
- Add MinIO S3-compatible storage for assets

Assets Migration:
- Add MinIO client (lib/assets_client.py) for upload/delete
- Update upload endpoint to use MinIO (with local fallback)
- Add migration 043_asset_urls_to_text.sql for full HTTPS URLs
- Simplify normalizeAssetUrl for S3 URLs

Recovery:
- Add seed_full_city_reset.py for emergency city recovery
- Add DB_RESTORE.md with backup restore instructions
- Add SEED_RECOVERY.md with recovery procedures
- Add INFRA_ASSETS_MINIO.md with MinIO setup guide

Task: TASK_PHASE_DATABASE_HARDENING_AND_ASSETS_MIGRATION_v1
2025-12-02 01:56:39 -08:00
Apple
0039be5dc0 feat(rooms): Add city-lobby with DAARWIZZ + fix API proxy
- Created city-lobby room as main public chat with DAARWIZZ
- Fixed /api/city/rooms proxy to use correct backend path (/api/v1/city/rooms)
- Updated district rooms with zone keys (leadership, system, engineering, etc.)
- Set MicroDAO lobbies as primary rooms
- Created seed_city_rooms.py script
- Created TASK_PHASE_CITY_ROOMS_AND_PUBLIC_CHAT_v1.md

Total: 35 rooms, 31 public, 10 districts
2025-12-01 08:47:37 -08:00
Apple
2f8e471e03 feat(node2): Full DAGI integration - 50 agents synced
- Created sync-node2-dagi-agents.py script to sync agents from agents_city_mapping.yaml
- Synced 50 DAGI agents across 10 districts:
  - Leadership Hall (4): Solarius, Sofia, PrimeSynth, Nexor
  - System Control (6): Monitor, Strategic Sentinels, Vindex, Helix, Aurora, Arbitron
  - Engineering Lab (5): ByteForge, Vector, ChainWeaver, Cypher, Canvas
  - Marketing Hub (6): Roxy, Mira, Tempo, Harmony, Faye, Storytelling
  - Finance Office (4): Financial Analyst, Accountant, Budget Planner, Tax Advisor
  - Web3 District (5): Smart Contract Dev, DeFi Analyst, Tokenomics Expert, NFT Specialist, DAO Governance
  - Security Bunker (7): Shadelock, Exor, Penetration Tester, Security Monitor, Incident Responder, Shadelock Forensics, Exor Forensics
  - Vision Studio (4): Iris, Lumen, Spectra, Video Analyzer
  - R&D Lab (6): ProtoMind, LabForge, TestPilot, ModelScout, BreakPoint, GrowCell
  - Memory Vault (3): Somnia, Memory Manager, Knowledge Indexer
- Fixed Swapper config to use swapper_config_node2.yaml with 8 models
- Created TASK_PHASE_NODE2_FULL_DAGI_INTEGRATION_v1.md

NODE2 now shows:
- 50 agents in DAGI Router Card
- 8 models in Swapper Service (gpt-oss, phi3, starcoder2, mistral-nemo, gemma2, deepseek-coder, qwen2.5-coder, deepseek-r1)
- Full isolation from NODE1
2025-12-01 08:31:25 -08:00
Apple
a818f2ac2f feat: add router health metrics to node_cache and node-guardian
- Add migration 042_node_cache_router_metrics.sql
- Node guardian now collects router health and sends in heartbeat
- City-service uses cached router_healthy from node_cache
- This allows NODE2 router status to be displayed correctly
2025-12-01 08:03:46 -08:00
Apple
9b9a72ffbd feat: full node isolation - use node-specific swapper_url and router_url from DB
- Add migration 041_node_local_endpoints.sql
- Add get_node_endpoints() to repo_city.py
- Update routes_city.py to use DB endpoints instead of hardcoded URLs
- Update node-guardian-loop.py to use NODE_SWAPPER_URL/NODE_ROUTER_URL env vars
- Update launchd plist for NODE2 with router URL
2025-12-01 08:01:53 -08:00
Apple
b25e002db6 feat: add logging for node isolation debugging in node-guardian 2025-12-01 07:35:37 -08:00
Apple
9e7b1f25ef fix: add node button visibility, fix node-guardian swapper health check, fix banner URL transform 2025-12-01 07:08:36 -08:00
Apple
8e14750f8b fix: discover_node_state.py global variable scope, add generated node state files 2025-12-01 06:50:48 -08:00
Apple
f5c58358a0 feat: add 'Додати ноду' ("Add node") button to Node Directory, create /nodes/register page, add node discovery script 2025-12-01 06:47:27 -08:00
Apple
909258fdcb fix: DAGI Router agents logic, MicroDAO logo URL handling 2025-12-01 06:03:08 -08:00
Apple
e3accd4df0 feat: DAGI Router v2 - new endpoints, hooks, and UI card 2025-12-01 05:21:43 -08:00
Apple
1a81cf75f1 feat: add unified API proxy layer, debug endpoint, and systemd service for node-guardian 2025-12-01 03:43:06 -08:00
Apple
b3e3c6417d fix: update Swapper endpoints (/health, /models), remove upload size limits, auto-convert images 2025-12-01 03:03:27 -08:00
Apple
135e8ed83c fix: suppress expected swapper connection errors in guardian loop 2025-11-30 15:41:42 -08:00
Apple
4f123ae79b fix: update deploy script to avoid container conflicts 2025-11-30 15:25:37 -08:00
Apple
cbaaed5e23 fix: fix backend files and add swapper functionality 2025-11-30 15:19:11 -08:00
Apple
fd814b2059 feat: implement Swapper metrics collection and UI 2025-11-30 15:12:49 -08:00
Apple
5b5160ad8b fix: correct API endpoints in node-guardian-loop script 2025-11-30 15:01:06 -08:00
Apple
4ae9ee4d70 fix: allow healthy status in invariants check 2025-11-30 14:53:53 -08:00
Apple
a8617df1d0 fix: check /health instead of /healthz in invariants script 2025-11-30 14:53:17 -08:00
Apple
5c1d7d15f9 fix: correct API endpoints in verification scripts 2025-11-30 14:51:59 -08:00
Apple
6d4f9ec7c5 feat: add post-deploy verification checklist and script 2025-11-30 14:47:27 -08:00