119 Commits

Author SHA1 Message Date
Apple
f7bf935a21 NODE3: Memory Service migrated from Docker to K8s
Some checks failed
Build and Deploy Docs / build-and-deploy (push) Has been cancelled
- NODE3 added to the K3s cluster as a worker (llm80-che-1-1)
- Memory Service runs in K8s on NODE3 (pod: memory-service-node3-*)
- Docker container stopped and removed
- Updated MEMORY-MODULE-STATUS.md v3.1.0
2026-01-10 09:26:59 -08:00
Apple
116bf5f3f3 Memory Service running on all nodes + Cohere API configured
Some checks failed
Build and Deploy Docs / build-and-deploy (push) Has been cancelled
- NODE1: Memory Service in K8s (port 30800)
- NODE2: Memory Service in Docker (port 8001)
- NODE3: Memory Service in Docker (port 8001)
- All nodes: Cohere API configured for embeddings
- NODE2: ComfyUI verified (macOS App, port 8000)
- Updated MEMORY-MODULE-STATUS.md v3.0.0
2026-01-10 09:13:20 -08:00
Apple
6b02349300 🧠 Update Memory Module Status v2.1.0
Some checks failed
Build and Deploy Docs / build-and-deploy (push) Has been cancelled
- NODE2: PostgreSQL + Agent Memory Schema
- NODE3: ComfyUI installed (v0.8.2, PyTorch+CUDA)
- All nodes now have full memory stack
- Added critical TODOs: Memory Service & Cohere API
2026-01-10 09:00:17 -08:00
Apple
f4ccf7c570 🧠 Complete Memory Stack setup across all nodes
Some checks failed
Build and Deploy Docs / build-and-deploy (push) Has been cancelled
- NODE1: Neo4j (K8s), NVIDIA RTX 4000 + CUDA 13.1
- NODE2: Fixed Neo4j & Qdrant containers
- NODE3: Full stack (PostgreSQL + Qdrant + Neo4j)
- Updated MEMORY-MODULE-STATUS.md v2.0.0
2026-01-10 08:26:42 -08:00
Apple
8aee29d42d 📊 Add Memory Module Status Report across all nodes
Some checks failed
Build and Deploy Docs / build-and-deploy (push) Has been cancelled
2026-01-10 08:11:12 -08:00
Apple
1c247ea40c 📝 Update context docs with session logging system
Some checks failed
Build and Deploy Docs / build-and-deploy (push) Has been cancelled
- Added Session Logging System section to INFRASTRUCTURE.md
- Added Git Multi-Remote configuration (GitHub + Gitea + GitLab)
- Updated version to 2.5.0
- Added logging commands reference
- Updated infrastructure_quick_ref.ipynb with new features
- Added SSH tunnel instructions for GitLab access
2026-01-10 04:58:01 -08:00
Apple
744c149300 Add automated session logging system
Some checks failed
Build and Deploy Docs / build-and-deploy (push) Has been cancelled
- Created logs/ structure (sessions, operations, incidents)
- Added session-start/log/end scripts
- Installed Git hooks for auto-logging commits/pushes
- Added shell integration for zsh
- Created CHANGELOG.md
- Documented today's session (2026-01-10)
2026-01-10 04:53:17 -08:00
Apple
778907cf0e docs: add NODE3 (Threadripper PRO + RTX 3090) to infrastructure
Added NODE3 - AI/ML Workstation Specification:

Hardware:
- CPU: AMD Ryzen Threadripper PRO 5975WX (32 cores / 64 threads, 3.6 GHz base)
- RAM: 128GB DDR4
- GPU: NVIDIA GeForce RTX 3090 24GB GDDR6X
  - 10496 CUDA cores
  - CUDA 13.0, Driver 580.95.05
- Storage: Samsung SSD 990 PRO 4TB NVMe
  - Root: 100GB (27% used)
  - Available for expansion: 3.5TB

System:
- Hostname: llm80-che-1-1
- IP: 80.77.35.151:33147
- OS: Ubuntu 24.04.3 LTS (Noble Numbat)
- Container Runtime: MicroK8s + containerd
- Uptime: 24/7

Security Status: CLEAN (verified 2026-01-09)
- No crypto miners detected
- 0 zombie processes
- CPU load: 0.17 (very low)
- GPU utilization: 0% (ready for workloads)

Services Running:
- Port 3000 - Unknown service (needs investigation)
- Port 8080 - Unknown service (needs investigation)
- Port 11434 - Ollama (localhost only)
- Port 27017/27019 - MongoDB (localhost only)
- Kubernetes API: 16443
- K8s services: 10248-10259, 25000

Recommended Use Cases:
- 🤖 Large LLM inference (Llama 70B, Qwen 72B, Mixtral 8x22B)
- 🧠 Model training and fine-tuning
- 🎨 Stable Diffusion XL image generation
- 🔬 AI/ML research and experimentation
- 🚀 Kubernetes-based AI service orchestration

Files Updated:
- INFRASTRUCTURE.md v2.4.0
- docs/infrastructure_quick_ref.ipynb v2.3.0

NODE3 is the most powerful node in the infrastructure:
- Most CPU cores: 32c/64t (vs 16c M4 Max)
- Most RAM: 128GB (vs 64GB)
- Dedicated GPU: RTX 3090 24GB VRAM
- Largest storage: 4TB NVMe (vs 2TB)

Co-Authored-By: Warp <agent@warp.dev>
2026-01-09 05:53:16 -08:00
Apple
21691aa042 docs: document Security Incident #2 - recurring container compromise
Security Incident #2 Emergency Response (Jan 9, 2026):
- Documented second compromise with NEW crypto miners (softirq, vrarhpb)
- Root cause: Docker image auto-restarted after server reboot
- Emergency mitigation completed (processes killed, container/images removed, load normalized)
- Created comprehensive rebuild task document: TASK_REBUILD_DAARION_WEB.md
- Updated INFRASTRUCTURE.md v2.3.0 with Incident #2 timeline and lessons learned
- Updated infrastructure_quick_ref.ipynb v2.2.0 with security status

Critical Changes:
- daarion-web container permanently disabled until secure rebuild
- Docker images DELETED (not just container stopped)
- Enhanced firewall rules (SSH rate limiting, port scan blocking)
- Retry test registered with Hetzner
- System load normalized: 30+ → 4.19
- Zombie processes cleaned: 1499 → 5

Files Created/Updated:
1. TASK_REBUILD_DAARION_WEB.md - Detailed rebuild instructions for Cursor agent
2. INFRASTRUCTURE.md - Added Incident #2 to Security section
3. docs/infrastructure_quick_ref.ipynb - Updated security status and version

Lessons Learned:
- ALWAYS delete Docker images, not just containers
- Auto-restart policies are dangerous for compromised containers
- Complete removal = container + image + restart policy change

Status: Emergency mitigation complete, statement submission pending (deadline: 2026-01-09 12:54 UTC)

Hetzner Incident ID: 10F3971:2A (AbuseID)

Co-Authored-By: Warp <agent@warp.dev>
2026-01-09 02:08:13 -08:00
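The "complete removal" lesson from commit 21691aa042 (container + image + restart policy change) can be sketched as an ordered list of Docker CLI calls. This is a minimal illustration, not the procedure actually run during the incident: the function name `removal_commands` and the image name in the usage note are hypothetical, and only standard `docker update`, `docker stop`, `docker rm`, and `docker rmi` subcommands are assumed.

```python
def removal_commands(container, image):
    """Return docker CLI invocations for a full removal, in order.

    The restart policy is disabled first so the container cannot
    come back after a daemon or host restart while cleanup is running.
    """
    return [
        ["docker", "update", "--restart=no", container],  # restart policy change
        ["docker", "stop", container],                    # stop the running container
        ["docker", "rm", container],                      # remove the container
        ["docker", "rmi", image],                         # remove the image as well
    ]
```

Each command list could be passed to `subprocess.run(cmd, check=True)`, for example with `removal_commands("daarion-web", "daarion-web:latest")`, where the image tag is an assumed placeholder.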
Apple
a1091b03a3 docs: add Cursor Agent SSH access instructions for NODE1
- Add detailed SSH connection guide for Cursor agents
- Include common commands, safety checks, and troubleshooting
- Add interactive session example and best practices
- Update INFRASTRUCTURE.md with section for Cursor agents
- Update infrastructure_quick_ref.ipynb with SSH access configuration
- Provide complete workflow examples for remote operations

Co-Authored-By: Warp <agent@warp.dev>
2026-01-09 02:08:13 -08:00
Apple
e829fe66f2 docs: security incident resolution & firewall implementation
- Document network scanning incident (Dec 6 2025 - Jan 8 2026)
- Add firewall rules to prevent internal network access
- Deploy monitoring script for scanning attempts
- Update INFRASTRUCTURE.md v2.2.0 with Security section
- Update infrastructure_quick_ref.ipynb v2.1.0
- Root cause: compromised daarion-web container with crypto miner
- Resolution: container removed, firewall applied, monitoring deployed

Co-Authored-By: Warp <agent@warp.dev>
2026-01-09 02:08:13 -08:00
GitHub Action
e3a8b7464a docs: auto-update repository information [skip ci] 2025-12-08 09:30:23 +00:00
Apple
ad3026e32d docs: Document root cause of daily data loss and fix 2025-12-05 02:42:44 -08:00
Apple
70b528f5cf docs: Add documentation for periodic data loss fix 2025-12-05 02:36:49 -08:00
Apple
db3b74e1ba fix: Integrate asset URL fix into recovery process and update docs 2025-12-03 10:13:19 -08:00
Apple
83b7e8f372 docs: Add database stability fix documentation 2025-12-03 10:00:11 -08:00
Apple
0c75ded63a docs: Update test agents fix documentation with removed script info 2025-12-02 13:59:15 -08:00
Apple
9995e4ef75 docs: Add test agents fix documentation 2025-12-02 13:57:44 -08:00
Apple
d128caacf6 docs: Add assets restoration guide 2025-12-02 13:45:57 -08:00
Apple
8c801c1dab docs: Add database persistence summary 2025-12-02 13:43:16 -08:00
Apple
2bc00b99a8 docs: Add database persistence documentation and improve docker-compose 2025-12-02 13:42:45 -08:00
Apple
c968705ec7 docs: Add task for completing branding banners MVP
- Add task to verify upload flow for banners
- Document fallback options for banner_url == null
- Add troubleshooting guide
- Document branding assets guide requirements
2025-12-02 09:13:38 -08:00
Apple
742c238b3b docs: Add manual test plan for assets proxy debugging 2025-12-02 09:00:35 -08:00
Apple
51571b3e61 docs: Add assets proxy fix report with HEAD method support 2025-12-02 08:51:23 -08:00
Apple
55634eac9b docs: Add assets proxy debug report 2025-12-02 08:37:40 -08:00
Apple
1ca6a4f55a feat: Complete assets proxy implementation with documentation
- Add comprehensive documentation in docs/ASSETS_PROXY.md
- Add contract comments in normalizeAssetUrl and proxy_asset
- Verify all components use normalizeAssetUrl
- Verify ENV variables are correctly set
- Add troubleshooting guide
2025-12-02 08:36:55 -08:00
Apple
fca48b3eb0 feat(node2): Complete NODE2 setup - guardian, agents, swapper models
- Node-guardian running on MacBook and updating metrics
- NODE2 agents (Atlas, Greeter, Oracle, Builder Bot) assigned to node-2-macbook-m4max
- Swapper models displaying correctly (8 models)
- DAGI Router agents showing with correct status (3 active, 1 stale)
- Router health check using node_cache for remote nodes
2025-12-02 07:07:58 -08:00
Apple
80123fd1be feat(node2): Add scripts and docs for NODE2 guardian setup
- Add start-node2-guardian.sh script for easy launch
- Add setup-node2-agents.sh to update node_id for NODE2 agents
- Add NODE2_GUARDIAN_QUICKSTART.md with detailed instructions
- Update agents node_id to node-2-macbook-m4max
2025-12-02 06:59:48 -08:00
Apple
ace183e136 feat: Add MicroDAO Dashboard with activity feed and statistics
- Add microdao_activity table for news/updates/events
- Add statistics columns to microdaos table
- Implement dashboard API endpoints
- Create UI components (HeaderCard, ActivitySection, TeamSection)
- Add seed data for DAARION DAO
- Update backend models and repositories
- Add frontend types and API client
2025-12-02 06:37:16 -08:00
Apple
f95810e8a7 fix(nodes): Normalize Router/Swapper endpoints and fix NODE2 display
Major changes:
- Normalize get_node_endpoints to use ENV vars (ROUTER_BASE_URL, SWAPPER_BASE_URL)
- Remove node_id-based URL selection logic
- Add fallback direct API call in get_node_swapper_detail
- Fix Swapper API endpoint (/models instead of /api/v1/models)
- Add router_healthy and router_version to node_heartbeat fallback
- Add ENV vars to docker-compose for Router/Swapper URLs

Documentation:
- Add TASK_PHASE_NODE2_ROUTER_SWAPPER_FIX.md with full task description
- Add NODE2_GUARDIAN_SETUP.md with setup instructions

This fixes:
- Swapper models not showing for NODE1 and NODE2
- DAGI Router agents not showing for NODE2
- Router/Swapper showing as Down/Degraded when they're actually up
2025-12-02 03:13:01 -08:00
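The endpoint normalization described in commit f95810e8a7 can be sketched roughly as below. This is a minimal sketch under assumptions: the real `get_node_endpoints` signature, return shape, and default URLs are not shown in the commit message, so the `env` parameter, the dictionary keys, and both default values are hypothetical. Only the variable names `ROUTER_BASE_URL` and `SWAPPER_BASE_URL` and the `/models` path come from the message.

```python
import os

# Hypothetical defaults; the actual values are not given in the commit message.
DEFAULT_ROUTER_BASE_URL = "http://localhost:9102"
DEFAULT_SWAPPER_BASE_URL = "http://localhost:8890"


def get_node_endpoints(env=None):
    """Return Router/Swapper URLs taken from environment variables.

    There is no node_id-based branching: every node uses the same code path
    and differs only in its environment.
    """
    env = os.environ if env is None else env
    router = env.get("ROUTER_BASE_URL", DEFAULT_ROUTER_BASE_URL).rstrip("/")
    swapper = env.get("SWAPPER_BASE_URL", DEFAULT_SWAPPER_BASE_URL).rstrip("/")
    return {
        "router_url": router,
        "swapper_url": swapper,
        # The commit moves the Swapper models path from /api/v1/models to /models.
        "swapper_models_url": f"{swapper}/models",
    }
```

Passing a plain dictionary as `env` keeps the function easy to test without touching the process environment.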
Apple
5061070d57 docs(assets): Add DNS setup and migration instructions 2025-12-02 02:14:07 -08:00
Apple
d24a23ec96 fix(db-hardening): Add lib __init__.py and improve MinIO import error handling 2025-12-02 01:57:27 -08:00
Apple
8e8f95e9ef feat(db-hardening): Add database persistence, backups, and MinIO assets storage
Database Hardening:
- Add docker-compose.db.yml with persistent PostgreSQL volume
- Add automatic DB backups every 12h (7 days, 4 weeks, 6 months retention)
- Add MinIO S3-compatible storage for assets

Assets Migration:
- Add MinIO client (lib/assets_client.py) for upload/delete
- Update upload endpoint to use MinIO (with local fallback)
- Add migration 043_asset_urls_to_text.sql for full HTTPS URLs
- Simplify normalizeAssetUrl for S3 URLs

Recovery:
- Add seed_full_city_reset.py for emergency city recovery
- Add DB_RESTORE.md with backup restore instructions
- Add SEED_RECOVERY.md with recovery procedures
- Add INFRA_ASSETS_MINIO.md with MinIO setup guide

Task: TASK_PHASE_DATABASE_HARDENING_AND_ASSETS_MIGRATION_v1
2025-12-02 01:56:39 -08:00
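The retention figures in commit 8e8f95e9ef (7 days, 4 weeks, 6 months) can be read in more than one way. The sketch below assumes one common interpretation: keep every backup from the last 7 days, the newest backup of each ISO week for the last 4 weeks, and the newest backup of each calendar month for roughly the last 6 months (taken here as 183 days). The function name `backups_to_keep` is hypothetical, and the actual backup tooling may prune differently.

```python
from datetime import datetime, timedelta


def backups_to_keep(timestamps, now):
    """Pick backups to retain under an assumed 7-day / 4-week / 6-month policy."""
    keep = set()
    weekly = {}   # (iso_year, iso_week) -> newest backup seen in that week
    monthly = {}  # (year, month) -> newest backup seen in that month
    for ts in sorted(timestamps):  # ascending, so later entries overwrite earlier ones
        age = now - ts
        if age < timedelta(0):
            continue  # ignore timestamps from the future
        if age <= timedelta(days=7):
            keep.add(ts)
        if age <= timedelta(weeks=4):
            weekly[tuple(ts.isocalendar())[:2]] = ts
        if age <= timedelta(days=183):
            monthly[(ts.year, ts.month)] = ts
    keep.update(weekly.values())
    keep.update(monthly.values())
    return sorted(keep)
```

Anything not returned by the function would be a candidate for deletion; with backups every 12 hours, the daily tier dominates the retained set.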
Apple
dddf51affe feat(microdao-rooms): Add MicroDAO rooms creation/deletion and agent chat
Backend:
- POST /city/microdao/{slug}/rooms - create new room for MicroDAO
- DELETE /city/microdao/{slug}/rooms/{room_id} - soft-delete room
- POST /city/agents/{agent_id}/ensure-room - create personal agent room

Frontend:
- MicrodaoRoomsSection: Added create room modal with name, description, type
- MicrodaoRoomsSection: Added delete room functionality for managers
- Agent page: Added 'Поговорити' ("Talk") button to open chat in City Room

Models:
- Added CreateMicrodaoRoomRequest model

Task: TASK_PHASE_MICRODAO_ROOMS_AND_PUBLIC_CHAT_v3
2025-12-01 10:09:28 -08:00
Apple
649d07ee29 feat(rooms): Fix NaN online stats + Add DAARWIZZ CTA on homepage
- Fixed NaN in online stats by using nullish coalescing (?? 0)
- Added members_online, zone, room_type to /api/v1/city/rooms response
- Added DAARWIZZ chat CTA section on homepage with link to city-lobby
- Created task files for next phases:
  - TASK_PHASE_CITY_ROOMS_FINISH_v2.md
  - TASK_PHASE_AGENT_MANAGEMENT_v1.md
  - TASK_PHASE_CITIZENS_DIRECTORY_v1.md
2025-12-01 09:19:07 -08:00
Apple
0039be5dc0 feat(rooms): Add city-lobby with DAARWIZZ + fix API proxy
- Created city-lobby room as main public chat with DAARWIZZ
- Fixed /api/city/rooms proxy to use correct backend path (/api/v1/city/rooms)
- Updated district rooms with zone keys (leadership, system, engineering, etc.)
- Set MicroDAO lobbies as primary rooms
- Created seed_city_rooms.py script
- Created TASK_PHASE_CITY_ROOMS_AND_PUBLIC_CHAT_v1.md

Total: 35 rooms, 31 public, 10 districts
2025-12-01 08:47:37 -08:00
Apple
2f8e471e03 feat(node2): Full DAGI integration - 50 agents synced
- Created sync-node2-dagi-agents.py script to sync agents from agents_city_mapping.yaml
- Synced 50 DAGI agents across 10 districts:
  - Leadership Hall (4): Solarius, Sofia, PrimeSynth, Nexor
  - System Control (6): Monitor, Strategic Sentinels, Vindex, Helix, Aurora, Arbitron
  - Engineering Lab (5): ByteForge, Vector, ChainWeaver, Cypher, Canvas
  - Marketing Hub (6): Roxy, Mira, Tempo, Harmony, Faye, Storytelling
  - Finance Office (4): Financial Analyst, Accountant, Budget Planner, Tax Advisor
  - Web3 District (5): Smart Contract Dev, DeFi Analyst, Tokenomics Expert, NFT Specialist, DAO Governance
  - Security Bunker (7): Shadelock, Exor, Penetration Tester, Security Monitor, Incident Responder, Shadelock Forensics, Exor Forensics
  - Vision Studio (4): Iris, Lumen, Spectra, Video Analyzer
  - R&D Lab (6): ProtoMind, LabForge, TestPilot, ModelScout, BreakPoint, GrowCell
  - Memory Vault (3): Somnia, Memory Manager, Knowledge Indexer
- Fixed Swapper config to use swapper_config_node2.yaml with 8 models
- Created TASK_PHASE_NODE2_FULL_DAGI_INTEGRATION_v1.md

NODE2 now shows:
- 50 agents in DAGI Router Card
- 8 models in Swapper Service (gpt-oss, phi3, starcoder2, mistral-nemo, gemma2, deepseek-coder, qwen2.5-coder, deepseek-r1)
- Full isolation from NODE1
2025-12-01 08:31:25 -08:00
Apple
8e14750f8b fix: discover_node_state.py global variable scope, add generated node state files 2025-12-01 06:50:48 -08:00
Apple
f5c58358a0 feat: add 'Додати ноду' ("Add node") button to Node Directory, create /nodes/register page, add node discovery script 2025-12-01 06:47:27 -08:00
Apple
e3accd4df0 feat: DAGI Router v2 - new endpoints, hooks, and UI card 2025-12-01 05:21:43 -08:00
Apple
d4e20ea513 feat: add MicroDAO branding and Agent avatar upload UI 2025-12-01 02:26:02 -08:00
GitHub Action
f0d113e234 docs: auto-update repository information [skip ci] 2025-12-01 09:30:48 +00:00
Apple
281c79f916 feat: implement swapper metrics and node cabinet ui 2025-11-30 15:40:41 -08:00
Apple
fd814b2059 feat: implement Swapper metrics collection and UI 2025-11-30 15:12:49 -08:00
Apple
6d4f9ec7c5 feat: add post-deploy verification checklist and script 2025-11-30 14:47:27 -08:00
Apple
1830109a95 feat: Agent System Prompts MVP (B) - database, backend API, and frontend integration 2025-11-30 14:04:48 -08:00
Apple
bca81dc719 feat: Node Self-Healing, DAGI Audit, Agent Prompts, Infra Invariants
### Backend (city-service)
- Node Registry + Self-Healing API (migration 039)
- Improved get_all_nodes() with robust fallback for node_registry/node_cache
- Agent Prompts Runtime API for DAGI Router integration
- DAGI Router Audit endpoints (phantom/stale detection)
- Node Agents API (Guardian/Steward)
- Node metrics extended (CPU/GPU/RAM/Disk)

### Frontend (apps/web)
- Node Directory with improved error handling
- Node Cabinet with metrics cards
- DAGI Router Card component
- Node Metrics Card component
- useDAGIAudit hook

### Scripts
- check-invariants.py - deploy verification
- node-bootstrap.sh - node self-registration
- node-guardian-loop.py - continuous self-healing
- dagi_agent_audit.py - DAGI audit utility

### Migrations
- 034: Agent prompts seed
- 035: Agent DAGI audit
- 036: Node metrics extended
- 037: Node agents complete
- 038: Agent prompts full coverage
- 039: Node registry self-healing

### Tests
- test_infra_smoke.py
- test_agent_prompts_runtime.py
- test_dagi_router_api.py

### Documentation
- DEPLOY_CHECKLIST_2024_11_30.md
- Multiple TASK_PHASE docs
2025-11-30 13:52:01 -08:00
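Commit bca81dc719 mentions phantom/stale detection in the DAGI Router audit without defining the terms. The sketch below assumes one plausible reading: a "phantom" agent is known to the router but missing from the database, and a "stale" agent is stored in the database but no longer present in the router. The function name `audit_agents` and both definitions are assumptions, not the behavior of the actual audit endpoints.

```python
def audit_agents(router_agents, db_agents):
    """Compare agent IDs known to the router with those stored in the DB."""
    router_ids = set(router_agents)
    db_ids = set(db_agents)
    return {
        "active": sorted(router_ids & db_ids),   # present on both sides
        "phantom": sorted(router_ids - db_ids),  # assumed: router only
        "stale": sorted(db_ids - router_ids),    # assumed: DB only
    }
```

Under this reading, a result with one entry under "stale" and three under "active" would match the "3 active, 1 stale" status reported in a later commit, although the real classification rules may differ.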
Apple
0c7836af5a docs: MicroDAO Rooms Integration report 2025-11-30 12:03:07 -08:00
Apple
a7adddb60d feat: MicroDAO Rooms Integration
Backend:
- GET /city/microdao/{slug}/agents - list agents with roles
- Seed: 6 rooms for DAARION, 3+ rooms for each District

Task doc: TASK_PHASE_MICRODAO_ROOMS_INTEGRATION_v1.md
2025-11-30 11:48:44 -08:00
Apple
6908569ac7 docs: District Portals report
Verified on daarion.space:
- /districts shows 3 districts with cards
- /districts/soul shows lead agent, core team, rooms
- /soul, /greenfood, /energy-union shortcuts work
- All data from DB (no hardcodes)
2025-11-30 11:43:13 -08:00