Apple
a46a70c014
fix(ops): Add network aliases and stabilize DNS for NODA1
...
- docker-compose.node1.yml: Add network aliases (router, gateway,
memory-service, qdrant, nats, neo4j) to eliminate manual
`docker network connect --alias` commands
- docker-compose.node1.yml: ROUTER_URL now uses env variable with
fallback: ${ROUTER_URL:-http://router:8000}
- docker-compose.node1.yml: Increase router healthcheck start_period
to 30s and retries to 5
- .gitignore: Add noda1-credentials.local.mdc (local-only SSH creds)
- scripts/node1/verify_agents.sh: Improved output with agent list
- docs: Add NODA1-AGENT-VERIFICATION.md, NODA1-AGENT-ARCHITECTURE.md,
NODA1-VERIFICATION-REPORT-2026-02-03.md
- config/README.md: How to add new agents
- .cursor/rules/, .cursor/skills/: NODA1 operations skill for Cursor
Root cause fixed: Gateway could not resolve 'router' DNS name when
Router container was named 'dagi-staging-router' without alias.
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-03 05:55:56 -08:00
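The `${ROUTER_URL:-http://router:8000}` fallback in a46a70c014 can be mirrored on the application side. The following is a minimal sketch, not code from the repository: `resolve_router_url` and `DEFAULT_ROUTER_URL` are hypothetical names, and the only values taken from the commit are the variable name and the default URL. Note that `:-` falls back when the variable is unset or empty, which the sketch reproduces.

```python
import os
from typing import Mapping, Optional

# Default taken from the commit message; the constant name is illustrative.
DEFAULT_ROUTER_URL = "http://router:8000"


def resolve_router_url(env: Optional[Mapping[str, str]] = None) -> str:
    """Return ROUTER_URL, falling back when it is unset or empty.

    This mirrors the `${VAR:-default}` form, which treats an empty
    value the same as a missing one.
    """
    source = os.environ if env is None else env
    value = source.get("ROUTER_URL", "")
    return value if value else DEFAULT_ROUTER_URL
```

Passing a plain dict instead of reading `os.environ` keeps the helper easy to test without touching the process environment.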
Apple
3ecb43dafc
feat(P0): Add JetStream streams, DLQ, timeout policy
2026-01-28 07:11:09 -08:00
Apple
a3923cd96f
feat(P0/P1/P2): Add E2E agent prober, version pinning, prometheus fixes
2026-01-28 07:06:07 -08:00
Apple
0c8bef82f4
feat: Add Alateya, Clan, Eonarch agents + fix gateway-router connection
...
## Agents Added
- Alateya: R&D, biotech, innovations
- Clan (Spirit): Community spirit agent
- Eonarch: Consciousness evolution agent
## Changes
- docker-compose.node1.yml: Added tokens for all 3 new agents
- gateway-bot/http_api.py: Added configs and webhook endpoints
- gateway-bot/clan_prompt.txt: New prompt file
- gateway-bot/eonarch_prompt.txt: New prompt file
## Fixes
- Fixed ROUTER_URL from :9102 to :8000 (internal container port)
- All 9 Telegram agents now working
## Documentation
- Created PROJECT-MASTER-INDEX.md - single entry point
- Added various status documents and scripts
Tokens configured:
- Helion, NUTRA, Agromatrix (existing)
- Alateya, Clan, Eonarch (new)
- Druid, GreenFood, DAARWIZZ (configured)
2026-01-28 06:40:34 -08:00
Apple
5290287058
feat: implement TTS, Document processing, and Memory Service /facts API
...
- TTS: xtts-v2 integration with voice cloning support
- Document: docling integration for PDF/DOCX/PPTX processing
- Memory Service: added /facts/upsert, /facts/{key}, /facts endpoints
- Added required dependencies (TTS, docling)
2026-01-17 08:16:37 -08:00
Apple
1231647f94
🛡️ Add comprehensive Security Hardening Plan
...
- Created SECURITY-HARDENING-PLAN.md with 6 security levels
- Added setup-node1-security.sh for automated hardening
- Added scan-image.sh for pre-deployment image scanning
- Created docker-compose.secure.yml template
- Includes: Trivy, fail2ban, UFW, auditd, rkhunter, chkrootkit
- Network isolation, egress filtering, process monitoring
- Incident response procedures and recovery playbook
2026-01-10 05:05:21 -08:00
Apple
744c149300
✨ Add automated session logging system
...
- Created logs/ structure (sessions, operations, incidents)
- Added session-start/log/end scripts
- Installed Git hooks for auto-logging commits/pushes
- Added shell integration for zsh
- Created CHANGELOG.md
- Documented today's session (2026-01-10)
2026-01-10 04:53:17 -08:00
Apple
d77a4769c6
🔒 security(daarion-web): Hardening after crypto-mining incidents
...
## Root Cause Analysis
- Found CRITICAL RCE vulnerability in Next.js 15.0.3 (GHSA-9qr9-h5gf-34mp)
- 10 vulnerabilities total including SSRF, DoS, Auth Bypass
- Attack vector: exposed port 3000 + vulnerable Next.js → remote code execution
## Security Fixes
- Upgraded Next.js: 15.0.3 → 15.5.9 (0 vulnerabilities)
- Upgraded eslint-config-next: 15.0.3 → 15.5.9
## Hardening (New Files)
- apps/web/Dockerfile.secure: Multi-stage build, read-only FS, no shell
- docker-compose.web.secure.yml: Resource limits, cap_drop ALL, localhost bind
- scripts/rebuild-daarion-web-secure.sh: Local secure rebuild with Trivy scan
- scripts/deploy-daarion-web-node1.sh: Production deployment to NODE1
- SECURITY-REBUILD-REPORT.md: Full incident analysis and remediation report
## Key Security Measures
- restart: "no" (until verified)
- ports: 127.0.0.1:3000 (localhost only, use Nginx reverse proxy)
- read_only: true
- cap_drop: ALL
- resources.limits: 1 CPU, 512M RAM
- no-new-privileges: true
## Related Incidents
- Incident #1 (Jan 8): catcal, G4NQXBp miners
- Incident #2 (Jan 9): softirq, vrarhpb miners
- Hetzner AbuseID: 10F3971:2A
Co-authored-by: Cursor Agent <agent@cursor.sh>
2026-01-09 02:08:13 -08:00
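The "Key Security Measures" listed in d77a4769c6 lend themselves to an automated check before deployment. The sketch below is an assumption-laden illustration, not the project's tooling: `hardening_gaps` is a hypothetical helper, the service definition is assumed to be already parsed into a dict (for example from docker-compose.web.secure.yml), and the placement of `no-new-privileges` under `security_opt` and of the limits under `deploy.resources.limits` is assumed from standard Compose conventions rather than confirmed by the commit.

```python
from typing import Any, Dict, List


def hardening_gaps(service: Dict[str, Any]) -> List[str]:
    """List the hardening measures a parsed Compose service is missing."""
    gaps: List[str] = []

    # restart: "no" until the rebuilt image has been verified
    if service.get("restart") != "no":
        gaps.append('restart is not "no"')

    # every published port must bind to localhost only
    ports = service.get("ports", [])
    if any(not str(port).startswith("127.0.0.1:") for port in ports):
        gaps.append("a port is published on a non-localhost address")

    if service.get("read_only") is not True:
        gaps.append("read_only is not true")

    if "ALL" not in service.get("cap_drop", []):
        gaps.append("cap_drop does not include ALL")

    # assumed location of the no-new-privileges flag
    if "no-new-privileges:true" not in service.get("security_opt", []):
        gaps.append("no-new-privileges is not set")

    # assumed location of the CPU and memory limits
    limits = service.get("deploy", {}).get("resources", {}).get("limits")
    if not limits:
        gaps.append("resource limits are not set")

    return gaps
```

An empty result means every listed measure is present; the function only checks for presence and does not verify the specific limit values (1 CPU, 512M RAM).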
Apple
b2caee4e0e
fix: CRITICAL - Prevent infinite DROP DATABASE loop
...
ROOT CAUSE: Monitor was doing DROP DATABASE when NODE2 agents were missing,
but the backup didn't have NODE2 agents, causing an infinite loop.
FIX:
- FULL RECOVERY (DROP DATABASE) only when MicroDAOs < 5 (critical data loss)
- SOFT RECOVERY (just sync agents) when MicroDAOs exist but agents missing
- Prefer backup with NODE2 agents (full_backup_with_node2*.sql)
- Never DROP DATABASE if MicroDAOs exist
This prevents the daily data loss issue.
2025-12-05 02:41:43 -08:00
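The recovery policy described in b2caee4e0e reduces to a small decision function. This is a sketch of the stated rules only, not the actual monitor script: `choose_recovery` and the `Recovery` enum are hypothetical names, and the 45-agent threshold is borrowed from the neighbouring commit 02a0ea9540 on the assumption that the two checks work together.

```python
from enum import Enum


class Recovery(Enum):
    NONE = "none"  # data looks intact, nothing to do
    SOFT = "soft"  # sync agents only, never drops the database
    FULL = "full"  # DROP DATABASE and restore from backup


MIN_MICRODAOS = 5       # below this the commit treats data loss as critical
MIN_NODE2_AGENTS = 45   # assumed from commit 02a0ea9540 (45 of 50 expected)


def choose_recovery(microdao_count: int, node2_agent_count: int) -> Recovery:
    """Pick a recovery mode from the two counts the monitor inspects."""
    if microdao_count < MIN_MICRODAOS:
        # Only critical MicroDAO loss justifies a full restore.
        return Recovery.FULL
    if node2_agent_count < MIN_NODE2_AGENTS:
        # MicroDAOs exist, so missing agents are re-synced in place.
        return Recovery.SOFT
    return Recovery.NONE
```

Because missing NODE2 agents alone can no longer select `FULL`, a backup that lacks those agents cannot retrigger the drop on the next monitor run.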
Apple
02a0ea9540
fix: Add NODE2 agent count check to prevent data loss
...
- Check for at least 45 NODE2 agents (out of 50 expected)
- This prevents false positives when only core agents exist
- Better detection of actual data loss
2025-12-05 02:36:36 -08:00
Apple
06fe0c5204
fix: Improve database recovery process
...
- Fix empty variable handling in data checks
- Terminate active connections before dropping database
- Increase agent threshold to 50 (9 core + 50 NODE2)
- Add better logging for agent sync verification
2025-12-05 02:35:57 -08:00
Apple
db3b74e1ba
fix: Integrate asset URL fix into recovery process and update docs
2025-12-03 10:13:19 -08:00
Apple
51fdd0d5da
feat: Add script to fix asset URLs after restore
2025-12-03 10:12:21 -08:00
Apple
94889783a3
fix: Restore asset URLs (logos/banners) after database recovery
...
- Update monitor-db-stability.sh to fix asset URLs after restore
- Convert old /assets/ URLs to MinIO format
- Clear invalid banner URLs
2025-12-03 10:12:16 -08:00
Apple
19e8436a02
fix: Add database stability monitoring and improve PostgreSQL config
...
- Add monitor-db-stability.sh for automatic recovery
- Improve PostgreSQL shutdown settings to prevent data loss
- Add checkpoint and WAL settings for better persistence
2025-12-03 09:59:41 -08:00
Apple
7ac2f9c958
fix: Remove setup-node2-agents.sh that was creating test agents
...
- This script was trying to assign test agents (ag_atlas, etc.) to NODE2
- Use sync-node2-dagi-agents.py instead for loading real agents
- Test agents are now automatically removed by health check
2025-12-02 13:58:58 -08:00
Apple
6a76cffb88
fix: Add automatic removal of test agents in health check
...
- Add remove-test-agents.sh script
- Integrate test agent removal into db-health-check.sh
- Prevents test agents (ag_atlas, ag_oracle, ag_builder, ag_greeter) from reappearing
2025-12-02 13:57:28 -08:00
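The test-agent cleanup from 6a76cffb88 can be pictured as a filter over agent IDs. The sketch below is illustrative only: the four IDs come from the commit message, while `split_test_agents` is a hypothetical helper and the real remove-test-agents.sh presumably works against the database directly.

```python
from typing import Iterable, List, Tuple

# IDs named in the commit message.
TEST_AGENT_IDS = frozenset({"ag_atlas", "ag_oracle", "ag_builder", "ag_greeter"})


def split_test_agents(agent_ids: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (agents to keep, test agents to remove), preserving order."""
    keep: List[str] = []
    remove: List[str] = []
    for agent_id in agent_ids:
        (remove if agent_id in TEST_AGENT_IDS else keep).append(agent_id)
    return keep, remove
```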
Apple
b27bfc1df5
feat: Add script to restore assets to MinIO and update DB URLs
2025-12-02 13:45:14 -08:00
Apple
488dd13af2
fix: Add database persistence and health check scripts
...
- Add apply-migrations.sh for automatic migration application
- Add ensure-db-persistence.sh for database integrity checks
- Add db-health-check.sh for periodic health monitoring
- Improve PostgreSQL configuration in docker-compose.db.yml
- Add proper shutdown settings to prevent data loss
2025-12-02 13:41:03 -08:00
Apple
fca48b3eb0
feat(node2): Complete NODE2 setup - guardian, agents, swapper models
...
- Node-guardian running on MacBook and updating metrics
- NODE2 agents (Atlas, Greeter, Oracle, Builder Bot) assigned to node-2-macbook-m4max
- Swapper models displaying correctly (8 models)
- DAGI Router agents showing with correct status (3 active, 1 stale)
- Router health check using node_cache for remote nodes
2025-12-02 07:07:58 -08:00
Apple
88188ed693
fix(node2): Use node_cache router_healthy for DAGI Router agents status
...
- Fix get_dagi_router_agents to use router_healthy from node_cache first
- Fallback to direct API call only if cache is unavailable
- This fixes NODE2 agents showing as 'stale' when router is actually healthy
- Fix CITY_SERVICE_URL in scripts (remove /api/city, use /api)
2025-12-02 07:02:08 -08:00
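The cache-first lookup described in 88188ed693 could look roughly like this. It is a sketch under stated assumptions, not the body of `get_dagi_router_agents`: `router_is_healthy` is a hypothetical helper, the cache row is assumed to be a mapping with a `router_healthy` field (the field name is from the commit), and `probe` stands in for the direct API call.

```python
from typing import Callable, Mapping, Optional


def router_is_healthy(
    cache_row: Optional[Mapping[str, object]],
    probe: Callable[[], bool],
) -> bool:
    """Prefer router_healthy from node_cache; probe only if it is unavailable."""
    if cache_row is not None and cache_row.get("router_healthy") is not None:
        # Remote nodes report health via heartbeat, so the cache is authoritative.
        return bool(cache_row["router_healthy"])
    try:
        # Fallback: direct call, which may be unreachable for remote nodes.
        return probe()
    except Exception:
        # An unreachable router is reported as unhealthy rather than raising.
        return False
```

Consulting the cache first matters for NODE2 because a direct call from the city service may fail even when the router is healthy, which is the "stale" symptom the commit describes.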
Apple
80123fd1be
feat(node2): Add scripts and docs for NODE2 guardian setup
...
- Add start-node2-guardian.sh script for easy launch
- Add setup-node2-agents.sh to update node_id for NODE2 agents
- Add NODE2_GUARDIAN_QUICKSTART.md with detailed instructions
- Update agents node_id to node-2-macbook-m4max
2025-12-02 06:59:48 -08:00
Apple
b79db5b2a4
feat(assets): Add NGINX config and migration scripts for MinIO assets
...
- Add NGINX reverse proxy config for assets.daarion.space
- Add script to migrate assets from /static/uploads to MinIO
- Add script to update asset URLs in database after migration
2025-12-02 02:11:26 -08:00
Apple
8e8f95e9ef
feat(db-hardening): Add database persistence, backups, and MinIO assets storage
...
Database Hardening:
- Add docker-compose.db.yml with persistent PostgreSQL volume
- Add automatic DB backups every 12h (7 days, 4 weeks, 6 months retention)
- Add MinIO S3-compatible storage for assets
Assets Migration:
- Add MinIO client (lib/assets_client.py) for upload/delete
- Update upload endpoint to use MinIO (with local fallback)
- Add migration 043_asset_urls_to_text.sql for full HTTPS URLs
- Simplify normalizeAssetUrl for S3 URLs
Recovery:
- Add seed_full_city_reset.py for emergency city recovery
- Add DB_RESTORE.md with backup restore instructions
- Add SEED_RECOVERY.md with recovery procedures
- Add INFRA_ASSETS_MINIO.md with MinIO setup guide
Task: TASK_PHASE_DATABASE_HARDENING_AND_ASSETS_MIGRATION_v1
2025-12-02 01:56:39 -08:00
Apple
0039be5dc0
feat(rooms): Add city-lobby with DAARWIZZ + fix API proxy
...
- Created city-lobby room as main public chat with DAARWIZZ
- Fixed /api/city/rooms proxy to use correct backend path (/api/v1/city/rooms)
- Updated district rooms with zone keys (leadership, system, engineering, etc.)
- Set MicroDAO lobbies as primary rooms
- Created seed_city_rooms.py script
- Created TASK_PHASE_CITY_ROOMS_AND_PUBLIC_CHAT_v1.md
Total: 35 rooms, 31 public, 10 districts
2025-12-01 08:47:37 -08:00
Apple
2f8e471e03
feat(node2): Full DAGI integration - 50 agents synced
...
- Created sync-node2-dagi-agents.py script to sync agents from agents_city_mapping.yaml
- Synced 50 DAGI agents across 10 districts:
  - Leadership Hall (4): Solarius, Sofia, PrimeSynth, Nexor
  - System Control (6): Monitor, Strategic Sentinels, Vindex, Helix, Aurora, Arbitron
  - Engineering Lab (5): ByteForge, Vector, ChainWeaver, Cypher, Canvas
  - Marketing Hub (6): Roxy, Mira, Tempo, Harmony, Faye, Storytelling
  - Finance Office (4): Financial Analyst, Accountant, Budget Planner, Tax Advisor
  - Web3 District (5): Smart Contract Dev, DeFi Analyst, Tokenomics Expert, NFT Specialist, DAO Governance
  - Security Bunker (7): Shadelock, Exor, Penetration Tester, Security Monitor, Incident Responder, Shadelock Forensics, Exor Forensics
  - Vision Studio (4): Iris, Lumen, Spectra, Video Analyzer
  - R&D Lab (6): ProtoMind, LabForge, TestPilot, ModelScout, BreakPoint, GrowCell
  - Memory Vault (3): Somnia, Memory Manager, Knowledge Indexer
- Fixed Swapper config to use swapper_config_node2.yaml with 8 models
- Created TASK_PHASE_NODE2_FULL_DAGI_INTEGRATION_v1.md
NODE2 now shows:
- 50 agents in DAGI Router Card
- 8 models in Swapper Service (gpt-oss, phi3, starcoder2, mistral-nemo, gemma2, deepseek-coder, qwen2.5-coder, deepseek-r1)
- Full isolation from NODE1
2025-12-01 08:31:25 -08:00
Apple
a818f2ac2f
feat: add router health metrics to node_cache and node-guardian
...
- Add migration 042_node_cache_router_metrics.sql
- Node guardian now collects router health and sends in heartbeat
- City-service uses cached router_healthy from node_cache
- This allows NODE2 router status to be displayed correctly
2025-12-01 08:03:46 -08:00
Apple
9b9a72ffbd
feat: full node isolation - use node-specific swapper_url and router_url from DB
...
- Add migration 041_node_local_endpoints.sql
- Add get_node_endpoints() to repo_city.py
- Update routes_city.py to use DB endpoints instead of hardcoded URLs
- Update node-guardian-loop.py to use NODE_SWAPPER_URL/NODE_ROUTER_URL env vars
- Update launchd plist for NODE2 with router URL
2025-12-01 08:01:53 -08:00
Apple
b25e002db6
feat: add logging for node isolation debugging in node-guardian
2025-12-01 07:35:37 -08:00
Apple
9e7b1f25ef
fix: add node button visibility, fix node-guardian swapper health check, fix banner URL transform
2025-12-01 07:08:36 -08:00
Apple
8e14750f8b
fix: discover_node_state.py global variable scope, add generated node state files
2025-12-01 06:50:48 -08:00
Apple
f5c58358a0
feat: add 'Додати ноду' ("Add node") button to Node Directory, create /nodes/register page, add node discovery script
2025-12-01 06:47:27 -08:00
Apple
909258fdcb
fix: DAGI Router agents logic, MicroDAO logo URL handling
2025-12-01 06:03:08 -08:00
Apple
e3accd4df0
feat: DAGI Router v2 - new endpoints, hooks, and UI card
2025-12-01 05:21:43 -08:00
Apple
1a81cf75f1
feat: add unified API proxy layer, debug endpoint, and systemd service for node-guardian
2025-12-01 03:43:06 -08:00
Apple
b3e3c6417d
fix: update Swapper endpoints (/health, /models), remove upload size limits, auto-convert images
2025-12-01 03:03:27 -08:00
Apple
135e8ed83c
fix: suppress expected swapper connection errors in guardian loop
2025-11-30 15:41:42 -08:00
Apple
4f123ae79b
fix: update deploy script to avoid container conflicts
2025-11-30 15:25:37 -08:00
Apple
cbaaed5e23
fix: fix backend files and add swapper functionality
2025-11-30 15:19:11 -08:00
Apple
fd814b2059
feat: implement Swapper metrics collection and UI
2025-11-30 15:12:49 -08:00
Apple
5b5160ad8b
fix: correct API endpoints in node-guardian-loop script
2025-11-30 15:01:06 -08:00
Apple
4ae9ee4d70
fix: allow healthy status in invariants check
2025-11-30 14:53:53 -08:00
Apple
a8617df1d0
fix: check /health instead of /healthz in invariants script
2025-11-30 14:53:17 -08:00
Apple
5c1d7d15f9
fix: correct API endpoints in verification scripts
2025-11-30 14:51:59 -08:00
Apple
6d4f9ec7c5
feat: add post-deploy verification checklist and script
2025-11-30 14:47:27 -08:00
Apple
b2240f5314
fix: make deploy script robust
2025-11-30 14:07:28 -08:00
Apple
d71da0bae3
fix: restore DB script and migrations
2025-11-30 14:06:45 -08:00
Apple
534bd72183
ops: add DB restore and deploy script
2025-11-30 14:05:55 -08:00
Apple
bca81dc719
feat: Node Self-Healing, DAGI Audit, Agent Prompts, Infra Invariants
...
### Backend (city-service)
- Node Registry + Self-Healing API (migration 039)
- Improved get_all_nodes() with robust fallback for node_registry/node_cache
- Agent Prompts Runtime API for DAGI Router integration
- DAGI Router Audit endpoints (phantom/stale detection)
- Node Agents API (Guardian/Steward)
- Node metrics extended (CPU/GPU/RAM/Disk)
### Frontend (apps/web)
- Node Directory with improved error handling
- Node Cabinet with metrics cards
- DAGI Router Card component
- Node Metrics Card component
- useDAGIAudit hook
### Scripts
- check-invariants.py - deploy verification
- node-bootstrap.sh - node self-registration
- node-guardian-loop.py - continuous self-healing
- dagi_agent_audit.py - DAGI audit utility
### Migrations
- 034: Agent prompts seed
- 035: Agent DAGI audit
- 036: Node metrics extended
- 037: Node agents complete
- 038: Agent prompts full coverage
- 039: Node registry self-healing
### Tests
- test_infra_smoke.py
- test_agent_prompts_runtime.py
- test_dagi_router_api.py
### Documentation
- DEPLOY_CHECKLIST_2024_11_30.md
- Multiple TASK_PHASE docs
2025-11-30 13:52:01 -08:00
Apple
e46d026cf2
debug: add logging to mark_test_entities.py
2025-11-28 09:25:04 -08:00