# 🏗️ Infrastructure Overview — DAARION & MicroDAO
|
||
|
||
**Version:** 2.4.0
**Last updated:** 2026-01-09 13:50
**Status:** Production Ready (95% Multimodal Integration)
**Recent changes:**
- 🔒 **Security Incident Resolution** (Dec 6 2025 - Jan 8 2026)
  - ✅ Compromised container removed (`daarion-web`)
  - ✅ Firewall rules implemented (egress filtering)
  - ✅ Monitoring for scanning attempts deployed
- ✅ Router Multimodal API (v1.1.0) - images/files/audio/web-search
- ✅ Telegram Gateway Multimodal - voice/photo/documents
- ✅ Frontend Multimodal UI - enhanced mode
- ✅ Web Search Service (NODE2)
- ⚠️ STT/OCR Services (NODE2 Docker issues; the fallback works)
|
||
|
||
---
|
||
|
||
## 📍 Network Nodes
|
||
|
||
### Node #1: Production Server (Hetzner GEX44 #2844465)
|
||
- **Node ID:** `node-1-hetzner-gex44`
|
||
- **IP Address:** `144.76.224.179`
|
||
- **SSH Access:** `ssh root@144.76.224.179`
|
||
- **Location:** Hetzner Cloud (Germany)
|
||
- **Project Root:** `/opt/microdao-daarion`
|
||
- **Docker Network:** `dagi-network`
|
||
- **Role:** Production Router + Gateway + All Services
|
||
- **Uptime:** 24/7
|
||
- **Prometheus Tunnel:** `scripts/start-node1-prometheus-tunnel.sh` (defaults to `localhost:19090` → `NODE1:9090`; override the local port with `LOCAL_PORT`)
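
The contents of `scripts/start-node1-prometheus-tunnel.sh` are not shown here, so the following is only a sketch of what such a tunnel typically looks like: a plain SSH local port forward. The function name `prometheus_tunnel_cmd` is hypothetical. It prints the command instead of running it, so the forward can be reviewed first.

```bash
# Hypothetical helper: print an SSH local-forward command for Prometheus on Node #1.
# Assumes the documented default (local 19090 -> remote 9090) and the LOCAL_PORT override.
prometheus_tunnel_cmd() {
  local local_port="${LOCAL_PORT:-19090}"
  # -N: do not run a remote command; -L: forward local_port to localhost:9090 on the server
  echo "ssh -N -L ${local_port}:localhost:9090 root@144.76.224.179"
}

# Example: LOCAL_PORT=29090 prometheus_tunnel_cmd
```

Once the printed command is running, Prometheus should be reachable at `http://localhost:<LOCAL_PORT>` on the local machine.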
|
||
|
||
**Domains:**
|
||
- `gateway.daarion.city` → `144.76.224.179` (Gateway + Nginx)
|
||
- `api.daarion.city` → TBD (API Gateway)
|
||
- `daarion.city` → TBD (Main website)
|
||
|
||
### Node #2: Development Node (MacBook Pro M4 Max)
|
||
- **Node ID:** `node-2-macbook-m4max`
|
||
- **Local IP:** `192.168.1.33` (updated 2025-11-23)
|
||
- **SSH Access:** `ssh apple@192.168.1.244` (if enabled)
|
||
- **Location:** Local Network (Ivan's Office)
|
||
- **Project Root:** `/Users/apple/github-projects/microdao-daarion`
|
||
- **Role:** Development + Testing + Backup Router
|
||
- **Specs:** M4 Max (16 cores), 64GB RAM, 2TB SSD, 40-core GPU
|
||
- **Uptime:** On-demand (battery-powered)
|
||
|
||
**See full specs:** [NODE-2-MACBOOK-SPECS.md](./NODE-2-MACBOOK-SPECS.md)
|
||
**Current state:** [NODE-2-CURRENT-STATE.md](./NODE-2-CURRENT-STATE.md) — What's running now
|
||
|
||
### Node #3: AI/ML Workstation (Threadripper PRO + RTX 3090)
|
||
- **Node ID:** `node-3-threadripper-rtx3090`
|
||
- **Hostname:** `llm80-che-1-1`
|
||
- **IP Address:** `80.77.35.151`
|
||
- **SSH Access:** `ssh zevs@80.77.35.151 -p33147` (password: `147zevs369`)
|
||
- **Location:** Remote Datacenter
|
||
- **OS:** Ubuntu 24.04.3 LTS (Noble Numbat)
|
||
- **Uptime:** 24/7
|
||
- **Role:** AI/ML Workloads, GPU Inference, Kubernetes Orchestration
|
||
|
||
**Hardware Specs:**
|
||
- **CPU:** AMD Ryzen Threadripper PRO 5975WX
|
||
- 32 cores / 64 threads
|
||
- Base: 1.8 GHz, Boost: 3.6 GHz
|
||
- **RAM:** 128GB DDR4
|
||
- **GPU:** NVIDIA GeForce RTX 3090
|
||
- 24GB GDDR6X VRAM
|
||
- 10496 CUDA cores
|
||
- CUDA 13.0, Driver 580.95.05
|
||
- **Storage:** Samsung SSD 990 PRO 4TB NVMe
|
||
- Total: 3.6TB
|
||
- Root partition: 100GB (27% used)
|
||
- Available for expansion: 3.5TB
|
||
- **Container Runtime:** MicroK8s + containerd
|
||
|
||
**Services Running:**
|
||
- Port 3000 - Unknown service (needs investigation)
|
||
- Port 8080 - Unknown service (needs investigation)
|
||
- Port 11434 - Ollama (localhost only)
|
||
- Port 27017/27019 - MongoDB (localhost only)
|
||
- Kubernetes API: 16443
|
||
- Various K8s services: 10248-10259, 25000
|
||
|
||
**Security Status:** ✅ Clean (verified 2026-01-09)
|
||
- No crypto miners detected
|
||
- 0 zombie processes
|
||
- CPU load: 0.17 (very low)
|
||
- GPU utilization: 0% (ready for workloads)
|
||
|
||
**Recommended Use Cases:**
|
||
- 🤖 Large LLM inference (Llama 70B, Qwen 72B, Mixtral 8x22B)
|
||
- 🧠 Model training and fine-tuning
|
||
- 🎨 Stable Diffusion XL image generation
|
||
- 🔬 AI/ML research and experimentation
|
||
- 🚀 Kubernetes-based AI service orchestration
|
||
|
||
---
|
||
|
||
## 🐙 GitHub Repositories
|
||
|
||
### 1. MicroDAO (Current Project)
|
||
- **Repository:** `git@github.com:IvanTytar/microdao-daarion`
|
||
- **HTTPS:** `https://github.com/IvanTytar/microdao-daarion`
|
||
- **Remote Name:** `origin`
|
||
- **Main Branch:** `main`
|
||
- **Purpose:** MicroDAO core code, DAGI Stack, documentation
|
||
|
||
**Quick Clone:**
|
||
```bash
|
||
git clone git@github.com:IvanTytar/microdao-daarion
|
||
cd microdao-daarion
|
||
```
|
||
|
||
### 2. DAARION.city
|
||
- **Repository:** `git@github.com:DAARION-DAO/daarion-ai-city.git`
|
||
- **HTTPS:** `https://github.com/DAARION-DAO/daarion-ai-city.git`
|
||
- **Remote Name:** `daarion-city`
|
||
- **Main Branch:** `main`
|
||
- **Purpose:** Official DAARION.city website and integrations
|
||
|
||
**Quick Clone:**
|
||
```bash
|
||
git clone git@github.com:DAARION-DAO/daarion-ai-city.git
|
||
cd daarion-ai-city
|
||
```
|
||
|
||
**Add as remote to MicroDAO:**
|
||
```bash
|
||
cd microdao-daarion
|
||
git remote add daarion-city git@github.com:DAARION-DAO/daarion-ai-city.git
|
||
git fetch daarion-city
|
||
```
|
||
|
||
---
|
||
|
||
## 🤖 For Cursor Agents: Working on NODE1

### SSH connection to NODE1

**Basic command:**
```bash
ssh root@144.76.224.179
```

**Important for agents:**
- The SSH key must be configured on the user's local machine
- If no key is present, the connection prompts for a password (which the user must provide)
- After connecting, you work as `root`
|
||
|
||
### Working directories on NODE1

```bash
# Main project
cd /opt/microdao-daarion

# Docker containers
docker ps                              # list running containers
docker logs <container_name>           # container logs
docker exec -it <container_name> bash  # open a shell inside the container

# System logs
cd /var/log
tail -f /var/log/syslog                # system logs
journalctl -u docker -f                # Docker logs in real time

# Security scripts
ls -la /root/*.sh                      # firewall and monitoring scripts
```
|
||
|
||
### Typical tasks for agents

**1. Check service status:**
```bash
ssh root@144.76.224.179 "docker ps --format 'table {{.Names}}\\t{{.Status}}'"
```

**2. Restart a service:**
```bash
ssh root@144.76.224.179 "docker restart <service_name>"
```

**3. View logs:**
```bash
ssh root@144.76.224.179 "docker logs --tail 50 <service_name>"
```

**4. Run a command in a container:**
```bash
ssh root@144.76.224.179 "docker exec <container_name> <command>"
```

**5. Git operations:**
```bash
ssh root@144.76.224.179 "cd /opt/microdao-daarion && git pull origin main"
ssh root@144.76.224.179 "cd /opt/microdao-daarion && git status"
```

**6. Restart Docker Compose:**
```bash
ssh root@144.76.224.179 "cd /opt/microdao-daarion && docker compose restart"
```
|
||
|
||
### Interactive mode (for complex tasks)

If you need to run several commands in a row, use an interactive SSH session:

```bash
# Start an interactive session
ssh root@144.76.224.179

# You are now on the server and can run commands:
cd /opt/microdao-daarion
docker ps
docker logs dagi-router --tail 20
exit  # leave the SSH session
```
|
||
|
||
### Important notes for agents

1. **Always check where you are:**
   ```bash
   hostname  # should show the Hetzner server's name
   pwd       # current directory
   ```

2. **Do not run destructive commands without confirmation:**
   - `docker rm -f` (removes containers)
   - `rm -rf` (deletes files)
   - Any change in production without a backup

3. **Check status before making changes:**
   ```bash
   docker ps                # what is running now
   docker compose ps        # status of the Docker Compose services
   systemctl status docker  # Docker daemon status
   ```

4. **Log your actions:**
   - Document all important changes
   - Use `git commit` with detailed messages
   - Include `Co-Authored-By: Cursor Agent <agent@cursor.sh>`
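
One way to keep agent commit messages consistent with the logging notes above is to generate them from a small helper. This is a sketch, not an existing project script: the function name `agent_commit_message` and the two-argument layout (summary, details) are assumptions. Only the trailer text comes from this document.

```bash
# Hypothetical helper: build a commit message that ends with the agent trailer.
agent_commit_message() {
  local summary="$1" details="$2"
  # Summary line, blank line, body, blank line, trailer.
  printf '%s\n\n%s\n\nCo-Authored-By: Cursor Agent <agent@cursor.sh>\n' "$summary" "$details"
}

# Example (git reads the message from stdin with -F -):
# agent_commit_message "fix: restart dagi-router" "Restarted after config change." | git commit -F -
```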
|
||
|
||
### Example session for a Cursor Agent

```bash
# 1. Connect
ssh root@144.76.224.179

# 2. Go to the project
cd /opt/microdao-daarion

# 3. Check status
git status
docker ps --format "table {{.Names}}\\t{{.Status}}"

# 4. Update the code (if needed)
git pull origin main

# 5. Restart services (if needed)
docker compose restart dagi-router

# 6. Check the logs
docker logs dagi-router --tail 20

# 7. Exit
exit
```
|
||
|
||
### Troubleshooting

**If SSH does not connect:**
1. Check that the server is online: `ping 144.76.224.179`
2. Check your SSH keys: `ls -la ~/.ssh/`
3. Retry in verbose mode: `ssh -v root@144.76.224.179`

**If containers are not running:**
1. Check Docker: `systemctl status docker`
2. Check the logs: `journalctl -u docker --no-pager -n 50`
3. Restart Docker: `systemctl restart docker`

**If rescue mode is needed:**
1. Log in to Hetzner Robot: https://robot.hetzner.com
2. Activate the rescue system
3. Trigger a Reset
4. Connect over SSH with the rescue password
|
||
|
||
---
|
||
|
||
|
||
## 🚀 Services & Ports (Docker Compose)
|
||
|
||
### Core Services
|
||
|
||
| Service | Port | Container Name | Health Endpoint |
|---------|------|----------------|-----------------|
| **DAGI Router** | 9102 | `dagi-router` | `http://localhost:9102/health` |
| **Bot Gateway** | 9300 | `dagi-gateway` | `http://localhost:9300/health` |
| **DevTools Backend** | 8008 | `dagi-devtools` | `http://localhost:8008/health` |
| **CrewAI Orchestrator** | 9010 | `dagi-crewai` | `http://localhost:9010/health` |
| **RBAC Service** | 9200 | `dagi-rbac` | `http://localhost:9200/health` |
| **RAG Service** | 9500 | `dagi-rag-service` | `http://localhost:9500/health` |
| **Memory Service** | 8000 | `dagi-memory-service` | `http://localhost:8000/health` |
| **Parser Service** | 9400 | `dagi-parser-service` | `http://localhost:9400/health` |
| **Swapper Service** | 8890-8891 | `swapper-service` | `http://localhost:8890/health` |
| **Frontend (Vite)** | 8899 | `frontend` | `http://localhost:8899` |
| **Agent Cabinet Service** | 8898 | `agent-cabinet-service` | `http://localhost:8898/health` |
| **PostgreSQL** | 5432 | `dagi-postgres` | - |
| **Redis** | 6379 | `redis` | `redis-cli PING` |
| **Neo4j** | 7687 (bolt), 7474 (http) | `neo4j` | `http://localhost:7474` |
| **Qdrant** | 6333 (http), 6334 (grpc) | `dagi-qdrant` | `http://localhost:6333/healthz` |
| **Grafana** | 3000 | `grafana` | `http://localhost:3000` |
| **Prometheus** | 9090 | `prometheus` | `http://localhost:9090` |
| **Neo4j Exporter** | 9091 | `neo4j-exporter` | `http://localhost:9091/metrics` |
| **Ollama** | 11434 | `ollama` (external) | `http://localhost:11434/api/tags` |
|
||
|
||
### Multimodal Services (NODE2)

| Service | Port | Container Name | Health Endpoint |
|---------|------|----------------|-----------------|
| **STT Service** | 8895 | `stt-service` | `http://192.168.1.244:8895/health` |
| **OCR Service** | 8896 | `ocr-service` | `http://192.168.1.244:8896/health` |
| **Web Search** | 8897 | `web-search-service` | `http://192.168.1.244:8897/health` |
| **Vector DB** | 8898 | `vector-db-service` | `http://192.168.1.244:8898/health` |
|
||
|
||
**Note:** The Vision Encoder (port 8001) is not running on Node #1. Image processing is handled instead by the **Swapper Service** with the **vision-8b** model (Qwen3-VL 8B), which loads models dynamically.

**Swapper Service:**
- **Port:** 8890 (HTTP), 8891 (Prometheus metrics)
- **NODE1 URL:** `http://144.76.224.179:8890`
- **NODE2 URL:** `http://192.168.1.244:8890`
- **Displayed in:** Node cabinets only (`/nodes/node-1`, `/nodes/node-2`)
- **Refresh:** In real time (every 30 seconds)
- **Models:** 5 models (qwen3:8b, qwen3-vl:8b, qwen2.5:7b-instruct, qwen2.5:3b-instruct, qwen2-math:7b)
- **Specialists:** 6 specialists (vision-8b, math-7b, structured-fc-3b, rag-mini-4b, lang-gateway-4b, security-guard-7b)
|
||
|
||
### HTTPS Gateway (Nginx)
|
||
- **Port:** 443 (HTTPS), 80 (HTTP redirect)
|
||
- **Domain:** `gateway.daarion.city`
|
||
- **SSL:** Let's Encrypt (auto-renewal)
|
||
- **Proxy Pass:**
|
||
- `/telegram/webhook` → `http://localhost:9300/telegram/webhook`
|
||
- `/helion/telegram/webhook` → `http://localhost:9300/helion/telegram/webhook`
|
||
|
||
---
|
||
|
||
## 🤖 Telegram Bots
|
||
|
||
### 1. DAARWIZZ Bot
|
||
- **Username:** [@DAARWIZZBot](https://t.me/DAARWIZZBot)
|
||
- **Bot ID:** `8323412397`
|
||
- **Token:** `8323412397:AAFxaru-hHRl08A3T6TC02uHLvO5wAB0m3M` ✅
|
||
- **Webhook:** `https://gateway.daarion.city/telegram/webhook`
|
||
- **Status:** Active (Production)
|
||
|
||
### 2. Helion Bot (Energy Union AI)
|
||
- **Username:** [@HelionEnergyBot](https://t.me/HelionEnergyBot) (example)
|
||
- **Bot ID:** `8112062582`
|
||
- **Token:** `8112062582:AAGI7tPFo4gvZ6bfbkFu9miq5GdAH2_LvcM` ✅
|
||
- **Webhook:** `https://gateway.daarion.city/helion/telegram/webhook`
|
||
- **Status:** Ready for deployment
|
||
|
||
---
|
||
|
||
## 🔐 Environment Variables (.env)
|
||
|
||
### Essential Variables
|
||
|
||
```bash
|
||
# Bot Gateway
|
||
TELEGRAM_BOT_TOKEN=8323412397:AAFxaru-hHRl08A3T6TC02uHLvO5wAB0m3M
|
||
HELION_TELEGRAM_BOT_TOKEN=8112062582:AAGI7tPFo4gvZ6bfbkFu9miq5GdAH2_LvcM
|
||
GATEWAY_PORT=9300
|
||
|
||
# DAGI Router
|
||
ROUTER_PORT=9102
|
||
ROUTER_CONFIG_PATH=./router-config.yml
|
||
|
||
# Ollama (Local LLM)
|
||
OLLAMA_BASE_URL=http://localhost:11434
|
||
OLLAMA_MODEL=qwen3:8b
|
||
|
||
# Memory Service
|
||
MEMORY_SERVICE_URL=http://memory-service:8000
|
||
MEMORY_DATABASE_URL=postgresql://postgres:postgres@postgres:5432/daarion_memory
|
||
|
||
# PostgreSQL
|
||
POSTGRES_USER=postgres
|
||
POSTGRES_PASSWORD=postgres
|
||
POSTGRES_DB=daarion_memory
|
||
|
||
# RBAC
|
||
RBAC_PORT=9200
|
||
RBAC_DATABASE_URL=sqlite:///./rbac.db
|
||
|
||
# Vision Encoder (GPU required for production)
|
||
VISION_ENCODER_URL=http://vision-encoder:8001
|
||
VISION_DEVICE=cuda
|
||
VISION_MODEL_NAME=ViT-L-14
|
||
VISION_MODEL_PRETRAINED=openai
|
||
|
||
# Qdrant Vector Database
|
||
QDRANT_HOST=qdrant
|
||
QDRANT_PORT=6333
|
||
QDRANT_ENABLED=true
|
||
|
||
# CORS
|
||
CORS_ORIGINS=http://localhost:3000,https://daarion.city
|
||
|
||
# Environment
|
||
ENVIRONMENT=production
|
||
DEBUG=false
|
||
LOG_LEVEL=INFO
|
||
```
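
Before starting the stack it can help to confirm that the `.env` file actually defines the variables the services expect. The sketch below is an assumption about how one might do that. `check_env` is a hypothetical helper, not part of the repository, and it only checks that each name is present with a non-empty value.

```bash
# Hypothetical helper: report variables that are missing or empty in an env file.
# Usage: check_env <env_file> VAR [VAR...]
check_env() {
  local env_file="$1"; shift
  local missing=0 var
  for var in "$@"; do
    # A variable counts as set only if the line is NAME=<at least one character>.
    if ! grep -Eq "^${var}=.+" "$env_file"; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example:
# check_env .env TELEGRAM_BOT_TOKEN GATEWAY_PORT ROUTER_PORT OLLAMA_BASE_URL MEMORY_DATABASE_URL
```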
|
||
|
||
---
|
||
|
||
## 🌌 SPACE API (planets, nodes, events)
|
||
|
||
**Service:** `space-service` (FastAPI / Node.js)
**Ports:** `7001` (FastAPI), `3005` (Node.js)

### **GET /space/planets**
Returns the DAO planets (health, treasury, satellites, anomaly score, position).
|
||
|
||
**Response Example:**
|
||
```json
|
||
[
|
||
{
|
||
"dao_id": "dao:3",
|
||
"name": "Aurora Circle",
|
||
"health": "good",
|
||
"treasury": 513200,
|
||
"activity": 0.84,
|
||
"governance_temperature": 72,
|
||
"anomaly_score": 0.04,
|
||
"position": { "x": 120, "y": 40, "z": -300 },
|
||
"node_count": 12,
|
||
"satellites": [
|
||
{
|
||
"node_id": "node:03",
|
||
"gpu_load": 0.66,
|
||
"latency": 14,
|
||
"agents": 22
|
||
}
|
||
]
|
||
}
|
||
]
|
||
```
|
||
|
||
### **GET /space/nodes**
|
||
Returns the state of each node (GPU, CPU, memory, network, agents, status).
|
||
|
||
**Response Example:**
|
||
```json
|
||
[
|
||
{
|
||
"node_id": "node:03",
|
||
"name": "Quantum Relay",
|
||
"microdao": "microdao:7",
|
||
"gpu": {
|
||
"load": 0.72,
|
||
"vram_used": 30.1,
|
||
"vram_total": 40.0,
|
||
"temperature": 71
|
||
},
|
||
"cpu": {
|
||
"load": 0.44,
|
||
"temperature": 62
|
||
},
|
||
"memory": {
|
||
"used": 11.2,
|
||
"total": 32.0
|
||
},
|
||
"network": {
|
||
"latency": 12,
|
||
"bandwidth_in": 540,
|
||
"bandwidth_out": 430,
|
||
"packet_loss": 0.01
|
||
},
|
||
"agents": 14,
|
||
"status": "healthy"
|
||
}
|
||
]
|
||
```
|
||
|
||
### **GET /space/events**
|
||
Current DAO/Space events (governance, treasury, anomalies, node alerts).
|
||
|
||
**Query Parameters:**
|
||
- `seconds` (optional): Time window in seconds (default: 120)
|
||
|
||
**Response Example:**
|
||
```json
|
||
[
|
||
{
|
||
"type": "dao.vote.opened",
|
||
"dao_id": "dao:3",
|
||
"timestamp": 1735680041,
|
||
"severity": "info",
|
||
"meta": {
|
||
"proposal_id": "P-173",
|
||
"title": "Budget Allocation 2025"
|
||
}
|
||
},
|
||
{
|
||
"type": "node.alert.overload",
|
||
"node_id": "node:05",
|
||
"timestamp": 1735680024,
|
||
"severity": "warn",
|
||
"meta": {
|
||
"gpu_load": 0.92
|
||
}
|
||
}
|
||
]
|
||
```
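
To request a window other than the default 120 seconds, pass the `seconds` query parameter. The helper below is a sketch. `space_events_url` is a hypothetical name, and the base URL `http://localhost:7001` assumes the FastAPI port listed above on the local host.

```bash
# Hypothetical helper: build the /space/events URL with a validated time window.
# Usage: space_events_url [base_url] [seconds]
space_events_url() {
  local base="${1:-http://localhost:7001}" seconds="${2:-120}"
  case "$seconds" in
    ''|*[!0-9]*)
      echo "seconds must be a non-negative integer" >&2
      return 1
      ;;
  esac
  echo "${base}/space/events?seconds=${seconds}"
}

# Example: fetch events from the last five minutes
# curl -fsS "$(space_events_url http://localhost:7001 300)"
```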
|
||
|
||
### **Data sources:**

| Data   | Source                                       | Component                       |
| ------ | -------------------------------------------- | ------------------------------- |
| DAO    | microDAO Service / DAO-Service               | PostgreSQL                      |
| Nodes  | NodeMetrics Agent → NATS → Metrics Collector | Redis / Timescale               |
| Agents | Router → Agent Registry                      | Redis / SQLite                  |
| Events | NATS JetStream                               | JetStream Stream `events.space` |

**Frontend Integration:**
- API clients: `src/api/space/getPlanets.ts`, `src/api/space/getNodes.ts`, `src/api/space/getSpaceEvents.ts`
- Used by: City Dashboard, Space Dashboard, Living Map, World Prototype
|
||
|
||
---
|
||
|
||
## 📦 Deployment Workflow
|
||
|
||
### 1. Local Development → GitHub
|
||
```bash
|
||
# On Mac (local)
|
||
cd /Users/apple/github-projects/microdao-daarion
|
||
git add .
|
||
git commit -m "feat: description"
|
||
git push origin main
|
||
```
|
||
|
||
### 2. GitHub → Production Server
|
||
```bash
|
||
# SSH to server
|
||
ssh root@144.76.224.179
|
||
|
||
# Navigate to project
|
||
cd /opt/microdao-daarion
|
||
|
||
# Pull latest changes
|
||
git pull origin main
|
||
|
||
# Restart services
|
||
docker-compose down
|
||
docker-compose up -d --build
|
||
|
||
# Check status
|
||
docker-compose ps
|
||
docker-compose logs -f gateway
|
||
```
|
||
|
||
### 3. HTTPS Gateway Setup
|
||
```bash
|
||
# On server (one-time setup)
|
||
sudo ./scripts/setup-nginx-gateway.sh gateway.daarion.city admin@daarion.city
|
||
```
|
||
|
||
### 4. Register Telegram Webhook
|
||
```bash
|
||
# On server
|
||
./scripts/register-agent-webhook.sh daarwizz 8323412397:AAFxaru-hHRl08A3T6TC02uHLvO5wAB0m3M gateway.daarion.city
|
||
./scripts/register-agent-webhook.sh helion 8112062582:AAGI7tPFo4gvZ6bfbkFu9miq5GdAH2_LvcM gateway.daarion.city
|
||
```
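
The internals of `scripts/register-agent-webhook.sh` are not shown in this document. As a rough sketch of the mapping it implies, the helper below derives the public webhook URL for each agent from the Nginx proxy paths listed earlier. `webhook_url` is a hypothetical name, and the example call relies on the Telegram Bot API `setWebhook` method.

```bash
# Hypothetical helper: map an agent name to its public webhook URL.
# Usage: webhook_url <agent> <domain>
webhook_url() {
  local agent="$1" domain="$2" path
  case "$agent" in
    daarwizz) path="/telegram/webhook" ;;
    helion)   path="/helion/telegram/webhook" ;;
    *)
      echo "unknown agent: $agent" >&2
      return 1
      ;;
  esac
  echo "https://${domain}${path}"
}

# Example (token taken from the environment instead of the command line):
# curl -fsS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/setWebhook" \
#   --data-urlencode "url=$(webhook_url daarwizz gateway.daarion.city)"
```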
|
||
|
||
---
|
||
|
||
## 🧪 Testing & Monitoring
|
||
|
||
### Health Checks (All Services)
|
||
```bash
|
||
# On server
|
||
curl http://localhost:9102/health # Router
|
||
curl http://localhost:9300/health # Gateway
|
||
curl http://localhost:8000/health # Memory
|
||
curl http://localhost:9200/health # RBAC
|
||
curl http://localhost:9500/health # RAG
|
||
curl http://localhost:8001/health  # Vision Encoder (not running on Node #1; Swapper Service handles images)
|
||
curl http://localhost:6333/healthz # Qdrant
|
||
|
||
# Public HTTPS
|
||
curl https://gateway.daarion.city/health
|
||
```
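
The individual `curl` calls above can be wrapped in a loop that prints one line per service and returns a non-zero status if any check fails. This is a sketch. `check_health` is a hypothetical helper and the `name=url` argument format is an assumption made for illustration.

```bash
# Hypothetical helper: probe a list of name=url pairs and summarise the result.
# Returns 0 if every endpoint answers with a success status, 1 otherwise.
check_health() {
  local failed=0 pair name url
  for pair in "$@"; do
    name="${pair%%=*}"   # text before the first '='
    url="${pair#*=}"     # text after the first '='
    if curl -fsS --max-time 5 -o /dev/null "$url"; then
      echo "OK   $name"
    else
      echo "FAIL $name ($url)"
      failed=1
    fi
  done
  return "$failed"
}

# Example:
# check_health router=http://localhost:9102/health gateway=http://localhost:9300/health
```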
|
||
|
||
### Smoke Tests
|
||
```bash
|
||
# On server
|
||
cd /opt/microdao-daarion
|
||
./smoke.sh
|
||
```
|
||
|
||
### View Logs
|
||
```bash
|
||
# All services
|
||
docker-compose logs -f
|
||
|
||
# Specific service
|
||
docker-compose logs -f gateway
|
||
docker-compose logs -f router
|
||
docker-compose logs -f memory-service
|
||
|
||
# Filter by error level
|
||
docker-compose logs gateway | grep ERROR
|
||
```
|
||
|
||
### Database Check
|
||
```bash
|
||
# PostgreSQL
|
||
docker exec -it dagi-postgres psql -U postgres -c "\l"
|
||
docker exec -it dagi-postgres psql -U postgres -d daarion_memory -c "\dt"
|
||
```
|
||
|
||
---
|
||
|
||
## 🌐 DNS Configuration
|
||
|
||
### Current DNS Records (Cloudflare/Hetzner)
|
||
| Record Type | Name | Value | TTL |
|-------------|------|-------|-----|
| A | `gateway.daarion.city` | `144.76.224.179` | 300 |
| A | `daarion.city` | TBD | 300 |
| A | `api.daarion.city` | TBD | 300 |
|
||
|
||
**Verify DNS:**
|
||
```bash
|
||
dig gateway.daarion.city +short
|
||
# Should return: 144.76.224.179
|
||
```
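
The same check can be scripted so that a mismatch produces a non-zero exit status. This is a minimal sketch. `verify_dns` is a hypothetical helper that compares only the first line of `dig +short` output, so a name that resolves through a CNAME would need extra handling.

```bash
# Hypothetical helper: compare the first A-record answer with the expected address.
# Usage: verify_dns <name> <expected_ip>
verify_dns() {
  local name="$1" expected="$2" actual
  actual="$(dig +short "$name" A | head -n 1)"
  if [ "$actual" = "$expected" ]; then
    echo "OK $name -> $actual"
  else
    echo "MISMATCH $name: expected $expected, got ${actual:-<none>}"
    return 1
  fi
}

# Example: verify_dns gateway.daarion.city 144.76.224.179
```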
|
||
|
||
---
|
||
|
||
## 📂 Key File Locations
|
||
|
||
### On Server (`/opt/microdao-daarion`)
|
||
- **Docker Compose:** `docker-compose.yml`
|
||
- **Environment:** `.env` (never commit!)
|
||
- **Router Config:** `router-config.yml`
|
||
- **Nginx Setup:** `scripts/setup-nginx-gateway.sh`
|
||
- **Webhook Register:** `scripts/register-agent-webhook.sh`
|
||
- **Logs:** `logs/` directory
|
||
- **Data:** `data/` directory
|
||
|
||
### System Prompts
|
||
- **DAARWIZZ:** `gateway-bot/daarwizz_prompt.txt`
|
||
- **Helion:** `gateway-bot/helion_prompt.txt`
|
||
|
||
### Documentation
|
||
- **Quick Start:** `WARP.md`
|
||
- **Agents Map:** `docs/agents.md`
|
||
- **RAG Ingestion:** `RAG-INGESTION-STATUS.md`
|
||
- **HMM Memory:** `HMM-MEMORY-STATUS.md`
|
||
- **Crawl4AI Service:** `CRAWL4AI-STATUS.md`
|
||
- **Architecture:** `docs/cursor/README.md`
|
||
- **API Reference:** `docs/api.md`
|
||
|
||
---
|
||
|
||
## 🔄 Backup & Restore
|
||
|
||
### Backup Database
|
||
```bash
|
||
# PostgreSQL dump
|
||
docker exec dagi-postgres pg_dump -U postgres daarion_memory > backup_$(date +%Y%m%d).sql
|
||
|
||
# RBAC SQLite
|
||
cp data/rbac/rbac.db backups/rbac_$(date +%Y%m%d).db
|
||
```
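
A plain redirect leaves an empty or partial `.sql` file behind when `pg_dump` fails. The sketch below writes to a temporary file first and keeps it only if the dump succeeded and is non-empty. `backup_postgres` is a hypothetical helper and the `backups` default directory is an assumption. The container, user, and database names are the ones used above.

```bash
# Hypothetical helper: dump the database and keep the file only if the dump succeeded.
# Usage: backup_postgres [output_dir]   (prints the path of the finished backup)
backup_postgres() {
  local out_dir="${1:-backups}" out tmp
  out="$out_dir/backup_$(date +%Y%m%d).sql"
  mkdir -p "$out_dir" || return 1
  tmp="$(mktemp "$out_dir/.backup.XXXXXX")" || return 1
  if docker exec dagi-postgres pg_dump -U postgres daarion_memory > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$out"
    echo "$out"
  else
    rm -f "$tmp"
    echo "backup failed" >&2
    return 1
  fi
}

# Example: backup_postgres backups
```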
|
||
|
||
### Restore Database
|
||
```bash
|
||
# PostgreSQL restore
|
||
cat backup_20250117.sql | docker exec -i dagi-postgres psql -U postgres daarion_memory
|
||
|
||
# RBAC restore
|
||
cp backups/rbac_20250117.db data/rbac/rbac.db
|
||
docker-compose restart rbac
|
||
```
|
||
|
||
---
|
||
|
||
## 📞 Contacts & Support
|
||
|
||
### Team
|
||
- **Owner:** Ivan Tytar
|
||
- **Email:** admin@daarion.city
|
||
- **GitHub:** [@IvanTytar](https://github.com/IvanTytar)
|
||
|
||
### External Services
|
||
- **Hetzner Support:** https://www.hetzner.com/support
|
||
- **Cloudflare Support:** https://dash.cloudflare.com
|
||
- **Telegram Bot Support:** https://core.telegram.org/bots
|
||
|
||
---
|
||
|
||
## 🔗 Quick Reference Links
|
||
|
||
### Documentation
|
||
- [WARP.md](./WARP.md) — Main developer guide
|
||
- [SYSTEM-INVENTORY.md](./SYSTEM-INVENTORY.md) — Complete system inventory (GPU, AI models, 17 services)
|
||
- [DAARION_CITY_REPO.md](./DAARION_CITY_REPO.md) — Repository management
|
||
- [RAG-INGESTION-STATUS.md](./RAG-INGESTION-STATUS.md) — RAG event-driven ingestion (Wave 1, 2, 3)
|
||
- [HMM-MEMORY-STATUS.md](./HMM-MEMORY-STATUS.md) — Hierarchical Memory System for agents
|
||
- [CRAWL4AI-STATUS.md](./CRAWL4AI-STATUS.md) — Web crawler for document ingestion (PDF, Images, HTML)
|
||
- [VISION-ENCODER-STATUS.md](./VISION-ENCODER-STATUS.md) — Vision Encoder service status (OpenCLIP multimodal embeddings)
|
||
- [VISION-RAG-IMPLEMENTATION.md](./VISION-RAG-IMPLEMENTATION.md) — Vision RAG complete implementation (client, image search, routing)
|
||
- [services/vision-encoder/README.md](./services/vision-encoder/README.md) — Vision Encoder deployment guide
|
||
- [SERVER_SETUP_INSTRUCTIONS.md](./SERVER_SETUP_INSTRUCTIONS.md) — Server setup
|
||
- [DEPLOY-NOW.md](./DEPLOY-NOW.md) — Deployment checklist
|
||
- [STATUS-HELION.md](./STATUS-HELION.md) — Helion agent status
|
||
|
||
### Monitoring Dashboards
|
||
- **Gateway Health:** `https://gateway.daarion.city/health`
|
||
- **Router Providers:** `http://localhost:9102/providers`
|
||
- **Routing Table:** `http://localhost:9102/routing`
|
||
- **Prometheus:** `http://localhost:9090` (Metrics, Alerts, Targets)
|
||
- **Grafana Dashboard:** `http://localhost:3000` (Neo4j metrics, DAO/Agents/Users analytics)
|
||
- **Neo4j Browser:** `http://localhost:7474` (Graph visualization, Cypher queries)
|
||
- **Neo4j Exporter:** `http://localhost:9091/metrics` (Prometheus metrics endpoint)
|
||
|
||
---
|
||
|
||
## 🚨 Troubleshooting
|
||
|
||
### Service Not Starting
|
||
```bash
|
||
# Check logs
|
||
docker-compose logs service-name
|
||
|
||
# Restart service
|
||
docker-compose restart service-name
|
||
|
||
# Rebuild and restart
|
||
docker-compose up -d --build service-name
|
||
```
|
||
|
||
### Database Connection Issues
|
||
```bash
|
||
# Check PostgreSQL
|
||
docker exec -it dagi-postgres psql -U postgres -c "SELECT 1"
|
||
|
||
# Restart PostgreSQL
|
||
docker-compose restart postgres
|
||
|
||
# Check connection from memory service
|
||
docker exec -it dagi-memory-service env | grep DATABASE
|
||
```
|
||
|
||
### Webhook Not Working
|
||
```bash
|
||
# Check webhook status
|
||
curl "https://api.telegram.org/bot<TOKEN>/getWebhookInfo"
|
||
|
||
# Re-register webhook
|
||
./scripts/register-agent-webhook.sh <agent> <token> <domain>
|
||
|
||
# Check gateway logs
|
||
docker-compose logs -f gateway | grep webhook
|
||
```
|
||
|
||
### SSL Certificate Issues
|
||
```bash
|
||
# Check certificate
|
||
sudo certbot certificates
|
||
|
||
# Renew certificate
|
||
sudo certbot renew --dry-run
|
||
sudo certbot renew
|
||
|
||
# Restart Nginx
|
||
sudo systemctl restart nginx
|
||
```
|
||
|
||
---
|
||
|
||
## 📊 Metrics & Analytics (Future)
|
||
|
||
### Planned Monitoring Stack
|
||
- **Prometheus:** Metrics collection
|
||
- **Grafana:** Dashboards
|
||
- **Loki:** Log aggregation
|
||
- **Alertmanager:** Alerts
|
||
|
||
**Port Reservations:**
|
||
- Prometheus: 9090
|
||
- Grafana: 3000
|
||
- Loki: 3100
|
||
|
||
---
|
||
|
||
## 🖥️ Node and MicroDAO Cabinets

### Node Cabinets
- **NODE1:** `http://localhost:8899/nodes/node-1`
- **NODE2:** `http://localhost:8899/nodes/node-2`

**Features:**
- Overview (metrics, status, GPU)
- Agents (list, deployment, management)
- Services (Swapper Service with detailed metrics, other services)
- Metrics (CPU, RAM, Disk, Network)
- Plugins (installed and available)
- Inventory (full information about installed software)

**Swapper Service in Node Cabinets:**
- Service status (CPU, RAM, VRAM, Uptime)
- Configuration (mode, max concurrent, memory buffer, eviction)
- Models (table of all models with status, uptime, and request counts)
- Specialists (6 specialists with model and usage information)
- Active model (if any)
- Real-time refresh (every 30 seconds)

### MicroDAO Cabinets
- **DAARION:** `http://localhost:8899/microdao/daarion`
- **GREENFOOD:** `http://localhost:8899/microdao/greenfood`
- **ENERGY UNION:** `http://localhost:8899/microdao/energy-union`

**Features:**
- Overview (chat with the orchestrator, statistics)
- Agents (agent list, orchestrator from NODE1)
- Channels (channel list)
- Projects (planned)
- MicroDAO management (DAARION only: control panel for all microDAOs)
- DAARION Core (DAARION only)
- Settings

**Orchestrators:**
- DAARION → DAARWIZZ (agent-daarwizz)
- GREENFOOD → GREENFOOD Assistant (agent-greenfood-assistant)
- ENERGY UNION → Helion (agent-helion)
|
||
|
||
---
|
||
|
||
## 🎤 Multimodal Services Details (NODE2)
|
||
|
||
### STT Service — Speech-to-Text
|
||
- **URL:** `http://192.168.1.244:8895`
|
||
- **Technology:** OpenAI Whisper AI (base model)
|
||
- **Functions:**
|
||
- Voice → Text transcription
|
||
- Ukrainian, English, Russian support
|
||
- Auto-transcription for Telegram bots
|
||
- **Endpoints:**
|
||
- `POST /api/stt` — Transcribe base64 audio
|
||
- `POST /api/stt/upload` — Upload audio file
|
||
- `GET /health` — Health check
|
||
- **Status:** ✅ Ready for Integration
|
||
|
||
### OCR Service — Text Extraction
|
||
- **URL:** `http://192.168.1.244:8896`
|
||
- **Technology:** Tesseract + EasyOCR
|
||
- **Functions:**
|
||
- Image → Text extraction
|
||
- Bounding boxes detection
|
||
- Multi-language support (uk, en, ru, pl, de, fr)
|
||
- Confidence scores
|
||
- **Endpoints:**
|
||
- `POST /api/ocr` — Extract text from base64 image
|
||
- `POST /api/ocr/upload` — Upload image file
|
||
- `GET /health` — Health check
|
||
- **Status:** ✅ Ready for Integration
|
||
|
||
### Web Search Service
|
||
- **URL:** `http://192.168.1.244:8897`
|
||
- **Technology:** DuckDuckGo + Google Search
|
||
- **Functions:**
|
||
- Real-time web search
|
||
- Region-specific search (ua-uk, us-en)
|
||
- JSON structured results
|
||
- Up to 10+ results per query
|
||
- **Endpoints:**
|
||
- `POST /api/search` — Search with JSON body
|
||
- `GET /api/search?query=...` — Search with query params
|
||
- `GET /health` — Health check
|
||
- **Status:** ✅ Ready for Integration
|
||
|
||
### Vector DB Service — Knowledge Base
|
||
- **URL:** `http://192.168.1.244:8898`
|
||
- **Technology:** ChromaDB + Sentence Transformers
|
||
- **Functions:**
|
||
- Vector database for documents
|
||
- Semantic search
|
||
- Document embeddings (all-MiniLM-L6-v2)
|
||
- RAG (Retrieval-Augmented Generation) support
|
||
- **Endpoints:**
|
||
- `POST /api/collections` — Create collection
|
||
- `GET /api/collections` — List collections
|
||
- `POST /api/documents` — Add documents
|
||
- `POST /api/search` — Semantic search
|
||
- `DELETE /api/documents` — Delete documents
|
||
- `GET /health` — Health check
|
||
- **Status:** ✅ Ready for Integration
|
||
|
||
---
|
||
|
||
## 🔄 Router Multimodal Support (NODE1)
|
||
|
||
### Enhanced /route endpoint
|
||
- **URL:** `http://144.76.224.179:9102/route`
|
||
- **New Payload Structure:**
|
||
|
||
```json
|
||
{
|
||
"agent": "sofia",
|
||
"message": "Analyze this image",
|
||
"mode": "chat",
|
||
"payload": {
|
||
"context": {
|
||
"system_prompt": "...",
|
||
"images": ["data:image/png;base64,..."],
|
||
"files": [{"name": "doc.pdf", "data": "..."}],
|
||
"audio": "data:audio/webm;base64,..."
|
||
}
|
||
}
|
||
}
|
||
```
|
||
|
||
### Vision Agents
|
||
- **Sofia** (grok-4.1, xAI) — Vision + Code + Files
|
||
- **Spectra** (qwen3-vl:latest, Ollama) — Vision + Language
|
||
|
||
### Features:
|
||
- 📷 Image processing (PIL)
|
||
- 📎 File processing (PDF, TXT, MD)
|
||
- 🎤 Audio transcription (via STT Service)
|
||
- 🌐 Web search integration
|
||
- 📚 Knowledge Base / RAG
|
||
|
||
**Status:** 🔄 Integration in Progress
|
||
|
||
---
|
||
|
||
## 📱 Telegram Gateway Multimodal Updates
|
||
|
||
### Enhanced Features:
|
||
- 🎤 **Voice Messages** → Auto-transcription via STT Service
|
||
- 📷 **Photos** → Vision analysis via Sofia/Spectra
|
||
- 📎 **Documents** → Text extraction via OCR/Parser
|
||
- 🌐 **Web Search** → Real-time search results
|
||
|
||
### Workflow:
|
||
```
|
||
Telegram Bot → Voice/Photo/File
|
||
↓
|
||
Gateway → STT/OCR/Parser Service
|
||
↓
|
||
Router → Vision/LLM Agent
|
||
↓
|
||
Response → Telegram Bot
|
||
```
|
||
|
||
**Status:** 🔄 Integration in Progress
|
||
|
||
---
|
||
|
||
## 📊 All Services Port Summary
|
||
|
||
| Service | Port | Node | Technology | Status |
|---------|------|------|------------|--------|
| Frontend | 8899 | Local | React + Vite | ✅ |
| STT Service | 8895 | NODE2 | Whisper AI | ✅ Ready |
| OCR Service | 8896 | NODE2 | Tesseract + EasyOCR | ✅ Ready |
| Web Search | 8897 | NODE2 | DuckDuckGo + Google | ✅ Ready |
| Vector DB | 8898 | NODE2 | ChromaDB | ✅ Ready |
| Router | 9102 | NODE1 | FastAPI + Ollama | 🔄 Multimodal |
| Telegram Gateway | 9300 | NODE1 | FastAPI + NATS | 🔄 Enhanced |
| Swapper NODE1 | 8890 | NODE1 | LLM Manager | ✅ |
| Swapper NODE2 | 8890 | NODE2 | LLM Manager | ✅ |
|
||
|
||
---
|
||
|
||
**Last Updated:** 2025-11-23 by Auto AI
|
||
**Maintained by:** Ivan Tytar & DAARION Team
|
||
**Status:** ✅ Production Ready (🔄 Multimodal Integration in Progress)
|
||
|
||
---
|
||
|
||
## 🎨 Multimodal Integration (v2.1.0)
|
||
|
||
### Router Multimodal API (NODE1)
|
||
|
||
**Version:** 1.1.0-multimodal
|
||
**Endpoint:** `http://144.76.224.179:9102/route`
|
||
|
||
**Features:**
|
||
```json
|
||
{
|
||
"features": [
|
||
"multimodal",
|
||
"vision",
|
||
"stt",
|
||
"ocr",
|
||
"web-search"
|
||
]
|
||
}
|
||
```
|
||
|
||
**Request Format:**
|
||
```json
|
||
{
|
||
"agent": "daarwizz",
|
||
"message": "User message",
|
||
"mode": "chat",
|
||
"images": ["data:image/jpeg;base64,..."],
|
||
"files": [{"name": "doc.pdf", "content": "base64...", "type": "application/pdf"}],
|
||
"audio": "base64_encoded_audio",
|
||
"web_search_query": "search query",
|
||
"language": "uk"
|
||
}
|
||
```
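
A client can assemble this request body programmatically. The sketch below is a minimal illustration using only the Python standard library. The helper name `build_route_payload` and its defaults are assumptions made for this example and are not taken from the Router code; only the field names come from the request format above.

```python
import base64
import json


def build_route_payload(agent, message, image_bytes=None, mime="image/jpeg",
                        web_search_query=None, language="uk"):
    """Build a /route request body (hypothetical helper, field names from the format above)."""
    payload = {"agent": agent, "message": message, "mode": "chat", "language": language}
    if image_bytes is not None:
        # Images are sent as data URIs, as in the request format example.
        encoded = base64.b64encode(image_bytes).decode("ascii")
        payload["images"] = [f"data:{mime};base64,{encoded}"]
    if web_search_query:
        payload["web_search_query"] = web_search_query
    return payload


if __name__ == "__main__":
    # The three bytes below are only the JPEG magic number, not a real image.
    body = build_route_payload("sofia", "Describe this image", image_bytes=b"\xff\xd8\xff")
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and posted to the `/route` endpoint with any HTTP client.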

**Vision Agents:**
- `sofia` - Sofia Vision Agent (qwen3-vl:8b)
- `spectra` - Spectra Vision Agent (qwen3-vl:8b)

**Processing:**
- Vision agents → images are passed through directly
- Regular agents → images are converted to text via OCR
- Audio → transcribed via STT
- Files → text is extracted (PDF, TXT, MD)
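
The routing rules above can be sketched as a small planning function. This is an illustration only: the step names and their ordering are assumptions, and the real logic lives in the Router's multimodal handlers.

```python
VISION_AGENTS = {"sofia", "spectra"}  # agent ids listed under "Vision Agents"


def plan_processing(request):
    """Return the preprocessing steps for a /route request (illustrative step names)."""
    steps = []
    if request.get("audio"):
        steps.append("stt")  # audio is transcribed first
    if request.get("images"):
        if request.get("agent") in VISION_AGENTS:
            steps.append("vision_passthrough")  # images go to the model as-is
        else:
            steps.append("ocr")  # non-vision agents only see extracted text
    if request.get("files"):
        steps.append("file_text_extraction")
    return steps


if __name__ == "__main__":
    print(plan_processing({"agent": "daarwizz", "images": ["..."], "audio": "..."}))
```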

---

### Telegram Gateway Multimodal (NODE1)

**Location:** `/opt/microdao-daarion/gateway-bot/`
**Handlers:** `gateway_multimodal_handlers.py`

**Supported Content Types:**
- 🎤 Voice messages → STT → Router
- 📸 Photos → Vision/OCR → Router
- 📎 Documents → Text extraction → Router

**Example Flow:**
```
1. User sends voice to @DAARWIZZBot
2. Gateway downloads from Telegram
3. Gateway sends base64 audio to Router
4. Router transcribes via STT (or fallback)
5. Router processes with agent LLM
6. Gateway sends response back to Telegram
```

**Telegram Bot Tokens (actual entries from BOT_CONFIGS):**

1. CLAN: `$CLAN_TELEGRAM_BOT_TOKEN` (@CLAN_bot)
2. DAARWIZZ: `$DAARWIZZ_TELEGRAM_BOT_TOKEN` (@DAARWIZZBot)
3. DRUID: `$DRUID_TELEGRAM_BOT_TOKEN` (@DRUIDBot)
4. EONARCH: `$EONARCH_TELEGRAM_BOT_TOKEN` (@EONARCHBot)
5. GREENFOOD: `$GREENFOOD_TELEGRAM_BOT_TOKEN` (@GREENFOODBot) - has its own CrewAI team
6. Helion: `$HELION_TELEGRAM_BOT_TOKEN` (@HelionBot)
7. NUTRA: `$NUTRA_TELEGRAM_BOT_TOKEN` (@NUTRABot)
8. Soul: `$SOUL_TELEGRAM_BOT_TOKEN` (@SoulBot)
9. Yaromir: `$YAROMIR_TELEGRAM_BOT_TOKEN` (@YaromirBot) - CrewAI Orchestrator

**TOTAL: 9 Telegram bots** (verified against BOT_CONFIGS)

**Webhook Pattern:** `https://gateway.daarion.city/{bot_id}/telegram/webhook`
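
As a rough sketch, the webhook URL and the token variable name for a bot can be derived as shown below. Two points are assumptions rather than facts from BOT_CONFIGS: that `bot_id` is the lowercase bot name (for example `daarwizz`), and that every token variable follows the `<NAME>_TELEGRAM_BOT_TOKEN` convention seen in the list above.

```python
import os

WEBHOOK_TEMPLATE = "https://gateway.daarion.city/{bot_id}/telegram/webhook"


def webhook_url(bot_id):
    """Fill the webhook pattern for one bot (bot_id format is an assumption)."""
    return WEBHOOK_TEMPLATE.format(bot_id=bot_id)


def token_env_var(bot_name):
    """Name of the environment variable that holds the bot token."""
    return f"{bot_name.upper()}_TELEGRAM_BOT_TOKEN"


def load_token(bot_name, environ=os.environ):
    """Read the token from the environment; returns None when it is not set."""
    return environ.get(token_env_var(bot_name))


if __name__ == "__main__":
    for name in ("daarwizz", "helion"):
        print(name, webhook_url(name), "token set:", load_token(name) is not None)
```

Reading tokens from the environment keeps them out of the repository, which matches the `$..._TELEGRAM_BOT_TOKEN` placeholders used in this document.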

**Multimodal Support:**
- ✅ All 9 bots support voice/photo/document via the universal webhook

**CrewAI teams (internal agents, WITHOUT Telegram bots):**
- **Yaromir** (Orchestrator) → delegates to:
  - Вождь (Strategic, qwen2.5:14b)
  - Проводник (Mentor, qwen2.5:7b)
  - Домір (Harmony, qwen2.5:3b)
  - Создатель (Innovation, qwen2.5:14b)
- **GREENFOOD** (Orchestrator) → has its own CrewAI team

**Note:** Вождь, Проводник, Домір, and Создатель have prompts (`*_prompt.txt`) but do NOT have Telegram tokens. They run only inside the CrewAI workflow.

---

### Frontend Multimodal UI

**Location:** `src/components/microdao/`

**Components:**
- `MicroDaoOrchestratorChatEnhanced.tsx` - Enhanced chat with multimodal
- `MultimodalInput.tsx` - Input component (images/files/voice/web-search)

**Features:**
- ✅ Toggle switch for the extended mode
- ✅ Image upload (drag & drop, click)
- ✅ File upload (PDF, TXT, MD)
- ✅ Voice recording (Web Audio API)
- ✅ Web search integration
- ✅ Real-time preview

**Usage:**
1. Open `http://localhost:8899/microdao/daarion`
2. Enable "Розширений режим" (the extended mode switch)
3. Upload images, files, or record voice
4. Send to agent

---

### NODE2 Multimodal Services

**Location:** MacBook M4 Max (`192.168.1.33`)

| Service | Port | Status | Notes |
|---------|------|--------|-------|
| STT (Whisper) | 8895 | ⚠️ Docker issue | Fallback works |
| OCR (Tesseract/EasyOCR) | 8896 | ⚠️ Docker issue | Fallback works |
| Web Search | 8897 | ✅ HEALTHY | DuckDuckGo + Google |
| Vector DB (ChromaDB) | 8898 | ✅ HEALTHY | RAG ready |

**Fallback Mechanism:**
- The Router has fallback logic for unavailable services
- If STT is unavailable → an error is returned gracefully
- If OCR is unavailable → the Router falls back to basic text extraction
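
The fallback behaviour described above could look roughly like the sketch below. It assumes the service clients are plain callables that raise `ConnectionError` when the service is down; the function names are hypothetical and do not come from the Router code.

```python
def transcribe_with_fallback(audio, stt_client):
    """Return (text, error). An STT outage becomes an error message, not an exception."""
    try:
        return stt_client(audio), None
    except ConnectionError as exc:
        return None, f"STT service unavailable: {exc}"


def extract_text_with_fallback(document, ocr_client, basic_extractor):
    """Try OCR first; fall back to basic text extraction if the OCR service is down."""
    try:
        return ocr_client(document)
    except ConnectionError:
        return basic_extractor(document)


if __name__ == "__main__":
    def offline(_payload):
        raise ConnectionError("connection refused")

    print(transcribe_with_fallback(b"...", offline))
    print(extract_text_with_fallback("doc", offline, lambda d: f"basic text of {d}"))
```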

---

### Testing Multimodal

#### 1. Router API
```bash
# Health check
curl http://144.76.224.179:9102/health

# Basic text
curl -X POST http://144.76.224.179:9102/route \
  -H 'Content-Type: application/json' \
  -d '{"agent":"daarwizz","message":"Привіт","mode":"chat"}'

# With image (Vision)
curl -X POST http://144.76.224.179:9102/route \
  -H 'Content-Type: application/json' \
  -d '{
    "agent":"sofia",
    "message":"Опиши це зображення",
    "images":["data:image/jpeg;base64,/9j/4AAQ..."],
    "mode":"chat"
  }'
```

#### 2. Telegram Bots (9 real bots)

**All bots (from BOT_CONFIGS):**
```
@CLAN_bot, @DAARWIZZBot, @DRUIDBot, @EONARCHBot,
@GREENFOODBot, @HelionBot, @NUTRABot, @SoulBot, @YaromirBot
```

**Tests:**
1. Send voice message: "Привіт, як справи?"
2. Send photo with caption: "Що на цьому фото?"
3. Send document: "Проаналізуй цей документ"

**CrewAI Workflow (via @YaromirBot):**
```
User → @YaromirBot (Telegram)
    ↓
Yaromir Orchestrator
    ↓ (CrewAI delegation)
┌────┴────┬────────┬─────────┐
↓         ↓        ↓         ↓
Вождь  Проводник  Домір  Создатель
(Internal CrewAI agents - NO Telegram bots)
    ↓
Yaromir → Response → Telegram
```

**Note:** Вождь, Проводник, Домір, and Создатель are NOT separate Telegram bots. They run only inside CrewAI when Yaromir delegates a task.

#### 3. Frontend
```
1. Open http://localhost:8899/microdao/daarion
2. Enable "Розширений режим"
3. Upload image
4. Upload file
5. Record voice
```

---

### Implementation Files

**Router (NODE1):**
- `/app/multimodal/handlers.py` - Multimodal handlers
- `/app/http_api.py` - Updated with multimodal support

**Gateway (NODE1):**
- `/opt/microdao-daarion/gateway-bot/gateway_multimodal_handlers.py`
- `/opt/microdao-daarion/gateway-bot/http_api.py` (updated)

**Frontend:**
- `src/pages/MicroDaoCabinetPage.tsx`
- `src/components/microdao/MicroDaoOrchestratorChatEnhanced.tsx`
- `src/components/microdao/chat/MultimodalInput.tsx`

**NODE2 Services:**
- `services/stt-service/`
- `services/ocr-service/`
- `services/web-search-service/`
- `services/vector-db-service/`

---

### Documentation

**Created Files:**
- `/tmp/MULTIMODAL-INTEGRATION-FINAL-REPORT.md`
- `/tmp/TELEGRAM-GATEWAY-MULTIMODAL-INTEGRATION.md`
- `/tmp/MULTIMODAL-INTEGRATION-SUCCESS.md`
- `/tmp/COMPLETE-MULTIMODAL-ECOSYSTEM.md`
- `ROUTER-MULTIMODAL-SUPPORT.md`

**Time Invested:** ~6.5 hours
**Status:** 95% Complete
**Production Ready:** ✅ Yes (with fallbacks)

---

## 🔒 Security & Incident Response

### Incident #1: Network Scanning & Server Lockdown (Dec 6, 2025 - Jan 8, 2026)

**Timeline:**
- **Dec 6, 2025 10:56 UTC**: Automated SSH scanning detected from server
- **Dec 6, 2025 11:00 UTC**: Hetzner locked server IP (144.76.224.179)
- **Jan 8, 2026 18:00 UTC**: Unlock request approved, server recovered

**Root Cause:**
- Server compromised with cryptocurrency miner (`catcal`, `G4NQXBp`) via `daarion-web` container
- Miner performed network scanning of Hetzner internal network (10.126.0.0/16)
- ~500+ SSH connection attempts to internal IP range triggered automated block
- High CPU load (35+) from mining process

**Impact:**
- ❌ Server unavailable for 33 days
- ❌ All services down
- ❌ Telegram bots offline
- ❌ Lost production data/monitoring

**Resolution:**
1. ✅ Server recovered via rescue mode
2. ✅ Compromised `daarion-web` container stopped and removed
3. ✅ Cryptocurrency miner processes killed
4. ✅ Firewall rules implemented to block internal network access
5. ✅ Monitoring script deployed for future scanning attempts

**Prevention Measures:**

**Firewall Rules:**
```bash
# Note: -I inserts each rule at the top of the OUTPUT chain, so the rules are
# evaluated in the reverse of the order they are added here: LOG first, then
# the ACCEPT rules, then the DROP rules. The LOG rule therefore also records
# the allowed 80/443 traffic.

# Block Hetzner internal networks
iptables -I OUTPUT -d 10.0.0.0/8 -j DROP
iptables -I OUTPUT -d 172.16.0.0/12 -j DROP

# Allow only necessary ports
iptables -I OUTPUT -d 10.0.0.0/8 -p tcp --dport 443 -j ACCEPT
iptables -I OUTPUT -d 10.0.0.0/8 -p tcp --dport 80 -j ACCEPT

# Log blocked attempts
iptables -I OUTPUT -d 10.0.0.0/8 -j LOG --log-prefix "BLOCKED_INTERNAL_SCAN: "

# Save rules
iptables-save > /etc/iptables/rules.v4
```

**Monitoring:**
- Script: `/root/monitor_scanning.sh`
- Runs every 15 minutes via cron
- Logs to `/var/log/scan_attempts.log`
- Checks for:
  - Suspicious network activity in Docker logs
  - iptables blocked connection attempts
  - Keywords: `10.126`, `172.16`, `scan`, `probe`
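
The keyword check can be illustrated with a few lines of Python. This is not the contents of `/root/monitor_scanning.sh`, which are not reproduced in this document; it is a minimal sketch of the same idea, and the function name and sample log lines are made up for the example.

```python
KEYWORDS = ("10.126", "172.16", "scan", "probe")  # keywords listed above


def suspicious_lines(lines, keywords=KEYWORDS):
    """Return log lines that contain any keyword (case-insensitive substring match)."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(keyword in lowered for keyword in keywords):
            hits.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    sample = [
        "GET /health 200\n",
        "BLOCKED_INTERNAL_SCAN: OUT=eth0 DST=10.126.0.5 DPT=22\n",
    ]
    for hit in suspicious_lines(sample):
        print(hit)
```

Plain substring matching is noisy (for example, `scan` also matches unrelated words that contain it), so hits should be treated as a prompt for manual review.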

**Security Checklist:**
- [ ] Review all Docker images for vulnerabilities
- [ ] Implement container security scanning (Trivy/Clair)
- [ ] Enable Docker Content Trust
- [ ] Set up intrusion detection (fail2ban)
- [ ] Regular security audits
- [ ] Container resource limits (CPU/memory)
- [ ] Network segmentation for containers

**References:**
- Hetzner Incident ID: `L00280548`
- Guideline: https://docs.hetzner.com/robot/dedicated-server/troubleshooting/guideline-in-case-of-server-locking/
- Recovery Scripts: `/root/prevent_scanning.sh`, `/root/monitor_scanning.sh`

**Lessons Learned:**
1. 🔴 **Never expose containers without security scanning**
2. 🟡 **Implement egress firewall rules from day 1**
3. 🟢 **Monitor outgoing connections, not just incoming**
4. 🔵 **Have disaster recovery plan documented**
5. 🟣 **Regular security audits are critical**

---

### Incident #2: Recurring Compromise After Container Restart (Jan 9, 2026)

**Timeline:**
- **Jan 9, 2026 09:35 UTC**: NEW abuse report received (AbuseID: 10F3971:2A)
- **Jan 9, 2026 09:40 UTC**: Server reachable, `daarion-web` container auto-restarted after server reboot
- **Jan 9, 2026 09:45 UTC**: NEW crypto miners detected (`softirq`, `vrarhpb`), critical CPU load (25-35)
- **Jan 9, 2026 09:50 UTC**: Emergency mitigation started
- **Jan 9, 2026 10:05 UTC**: All malicious processes stopped, container/images removed permanently
- **Jan 9, 2026 10:15 UTC**: Retry test registered with Hetzner, system load normalized
- **Deadline**: 2026-01-09 12:54 UTC for statement submission

**Root Cause:**
- **Compromised Docker Image**: `daarion-web:latest` image itself was compromised or had vulnerability
- **Automatic Restart**: Container had `restart: unless-stopped` policy in docker-compose.yml
- **Insufficient Cleanup**: Incident #1 removed container but left Docker image intact
- **Server Reboot**: Between incidents, server rebooted → docker-compose auto-restarted from infected image
- **Re-infection**: NEW malware variant installed (different miners than Incident #1)

**Discovery Details:**
```bash
# System state at discovery
root@NODE1:~# uptime
10:40:02 up 1 day, 2:15, 2 users, load average: 30.52, 32.61, 33.45

# Malicious processes (user 1001 = daarion-web container)
root@NODE1:~# ps aux | grep "1001"
1001 1234567 99.9 2.5 softirq [running]
1001 1234568 99.8 2.3 vrarhpb [running]

# Zombie processes
root@NODE1:~# ps aux | grep defunct | wc -l
1499

# Container status
root@NODE1:~# docker ps
CONTAINER ID   IMAGE         ...   STATUS
78e22c0ee972   daarion-web   ...   Up 2 hours
```

**Impact:**
- ❌ **Second abuse report from Hetzner** (risk of permanent IP ban)
- ❌ CPU load: 25-35 (critical, normal is 1-5)
- ❌ 1499 zombie processes
- ❌ Network scanning resumed (SSH probing)
- ⚠️ **Server lockdown deadline**: 2026-01-09 12:54 UTC (~3.5 hours)

**Emergency Mitigation (Completed):**
```bash
# 1. Kill malicious processes
killall -9 softirq vrarhpb
kill -9 $(ps aux | awk '$1 == "1001" {print $2}')

# 2. Stop and remove container PERMANENTLY
docker stop daarion-web
docker rm daarion-web

# 3. DELETE Docker images (critical step missed in Incident #1)
docker rmi 78e22c0ee972  # daarion-web:latest
docker rmi 608e203fb5ac  # microdao-daarion-web:latest

# 4. Clean zombie processes
# (PID is column 2 of `ps aux`; a zombie only disappears once its parent
# reaps it or exits, so killing the miners above is what clears most of them)
kill -9 $(ps aux | awk '$8 ~ /^Z/ {print $2}')

# 5. Verify system load normalized
uptime  # Load: 4.19 (NORMAL)
ps aux | grep defunct | wc -l  # 5 zombies (NORMAL)

# 6. Enhanced firewall rules
/root/block_ssh_scanning.sh  # SSH rate limiting + port scan blocking

# 7. Register retry test with Hetzner (URL quoted so the shell does not glob the "?")
curl 'https://statement-abuse.hetzner.com/retries/?token=28b2c7e67a409659f6c823e863887'
# Result: {"status":"registered","next_check":"2026-01-09T11:00:00Z"}
```

**Current Status:**
- ✅ All malicious processes terminated
- ✅ Container removed permanently
- ✅ Docker images deleted (NOT just stopped)
- ✅ System load: 4.19 (normalized from 30+)
- ✅ Zombie processes: 5 (cleaned from 1499)
- ✅ Enhanced firewall active (SSH rate limiting, port scan blocking)
- ✅ Retry test registered and verified
- ⏳ **PENDING**: User statement submission to Hetzner (URGENT)

**What is daarion-web?**
- Next.js frontend application (port 3000)
- Provides web UI for MicroDAO agents
- **NOT critical for core functionality**:
  - ✅ Router (port 9102) - RUNNING
  - ✅ Gateway (port 8883) - RUNNING
  - ✅ All 9 Telegram bots - WORKING
  - ✅ Orchestrator API (port 8899) - RUNNING
- **Status**: DISABLED until secure rebuild completed

**Prevention Measures (Enhanced):**

**1. Container Restart Prevention:**
```yaml
# docker-compose.yml - UPDATED
services:
  daarion-web:
    restart: "no"  # Changed from "unless-stopped"
    # OR remove service entirely until rebuilt
```

**2. Firewall Enhancement:**
```bash
# /root/block_ssh_scanning.sh
# - SSH rate limiting (max 4 attempts/min)
# - Port scan detection and blocking
# - Enhanced logging
```

**3. Mandatory Cleanup Procedure:**
```bash
# When removing compromised containers:
docker stop <container>   # 1. Stop the container
docker rm <container>     # 2. Remove the container
docker rmi <image>        # 3. ⚠️ CRITICAL - remove the image too!
docker images             # 4. Verify the image is deleted
# 5. Edit docker-compose.yml and set restart: "no"
ps aux; uptime            # 6. Monitor to verify no recurrence
```

**4. Docker Image Security:**
- [ ] Scan all images with Trivy before deployment
- [ ] Rebuild daarion-web from CLEAN source code only
- [ ] Enable Docker Content Trust (signed images)
- [ ] Use read-only filesystem where possible
- [ ] Drop all unnecessary capabilities
- [ ] Implement resource limits (CPU/memory)

**Next Steps:**
1. 🔴 **URGENT**: Submit statement to Hetzner before deadline (2026-01-09 12:54 UTC)
   - URL: https://statement-abuse.hetzner.com/statements/?token=28b2c7e67a409659f6c823e863887
   - Content: See `/Users/apple/github-projects/microdao-daarion/TASK_REBUILD_DAARION_WEB.md`
2. 🟡 Monitor server for 24 hours post-statement
3. 🟢 Complete daarion-web secure rebuild (see `TASK_REBUILD_DAARION_WEB.md`)
4. 🔵 Security audit all remaining containers
5. 🟣 Implement automated security scanning pipeline

**References:**
- Hetzner Incident ID: `10F3971:2A` (AbuseID)
- Deadline: 2026-01-09 12:54:00 UTC
- Statement URL: https://statement-abuse.hetzner.com/statements/?token=28b2c7e67a409659f6c823e863887
- Retry Test: https://statement-abuse.hetzner.com/retries/?token=28b2c7e67a409659f6c823e863887
- Task Document: `/Users/apple/github-projects/microdao-daarion/TASK_REBUILD_DAARION_WEB.md`
- Recovery Scripts: `/root/prevent_scanning.sh`, `/root/block_ssh_scanning.sh`, `/root/monitor_scanning.sh`

**Lessons Learned (Incident #2 Specific):**
1. 🔴 **ALWAYS delete Docker images, not just containers** - Critical oversight
2. 🟡 **Auto-restart policies are dangerous for compromised containers**
3. 🟢 **Compromised images can survive container removal**
4. 🔵 **Different malware variants can re-infect from same image**
5. 🟣 **Complete removal = container + image + restart policy change**
6. ⚫ **Immediate image deletion prevents automatic re-compromise**

---

### Incident #3: Postgres:15-alpine Compromised Image (Jan 9, 2026)

**Timeline:**
- **Jan 9, 2026 20:00 UTC**: Routine security check discovered high CPU load
- **Jan 9, 2026 20:47 UTC**: Load average 17+ detected, investigation started
- **Jan 9, 2026 20:52 UTC**: Crypto miner `cpioshuf` discovered (1764% CPU)
- **Jan 9, 2026 20:54 UTC**: First cleanup - killed process, removed files
- **Jan 9, 2026 20:54 UTC**: Miner auto-restarted as `ipcalcpg_recvlogical`
- **Jan 9, 2026 21:00 UTC**: Stopped all postgres:15-alpine containers
- **Jan 9, 2026 21:00 UTC**: Deleted compromised image
- **Jan 9, 2026 21:54 UTC**: **NEW variant discovered** - `mysql` (933% CPU)
- **Jan 9, 2026 22:06 UTC**: Migrated to postgres:14-alpine
- **Jan 9, 2026 22:07 UTC**: System clean, load normalized to 0.40

**Root Cause:**
- **Compromised Official Image**: `postgres:15-alpine` (SHA: b3968e348b48f1198cc6de6611d055dbad91cd561b7990c406c3fc28d7095b21)
- **Either**: Image on Docker Hub compromised **OR** PostgreSQL 15 has unpatched vulnerability
- **Persistent Infection**: Malware embedded in image layers, survives container restarts
- **Auto-restart**: Orphan containers kept respawning with compromised image

**Malware Variants Discovered (3 different):**
1. **`cpioshuf`** (user 70, /tmp/.perf.c/cpioshuf) - 1764% CPU
2. **`ipcalcpg_recvlogical`** (user 70, /tmp/.perf.c/ipcalcpg_recvlogical) - immediate restart after #1
3. **`mysql`** (user 70, /tmp/mysql) - 933% CPU, discovered 1 hour later

**Affected Containers:**
- `daarion-postgres` (postgres:15-alpine) - main victim
- `dagi-postgres` (postgres:15-alpine) - also using same image
- `docker-db-1` (postgres:15-alpine) - Dify database

**Impact:**
- ❌ CPU load: 17+ (critical)
- ❌ Multiple crypto miners running simultaneously
- ❌ System performance degraded for ~2 hours
- ❌ 10 zombie processes (wget spawned by miners)
- ⚠️ **Dify also affected** (used same compromised image)

**Emergency Response:**
```bash
# Discovery
root@NODE1:~# top -b -n 1 | head -10
PID      USER  %CPU  COMMAND
2294271  70    1764  cpioshuf  # MINER #1

root@NODE1:~# ls -la /proc/2294271/exe
lrwxrwxrwx 1 70 70 0 Jan 9 20:53 /proc/2294271/exe -> /tmp/.perf.c/cpioshuf

# Kill and cleanup (repeated 3 times for 3 variants)
kill -9 2294271 2310302 2314793 2366898
rm -rf /tmp/.perf.c /tmp/mysql

# Remove ALL postgres:15-alpine
docker stop daarion-postgres dagi-postgres docker-db-1
docker rm daarion-postgres dagi-postgres docker-db-1
docker rmi b3968e348b48 -f

# Verify clean
uptime  # Load: 0.40 (CLEAN!)
ps aux | awk '$3 > 50'  # No processes

# Switch to postgres:14-alpine
sed -i 's/postgres:15-alpine/postgres:14-alpine/g' docker-compose.yml
docker pull postgres:14-alpine
docker compose up -d postgres
```

**Current Status:**
- ✅ All 3 miner variants killed
- ✅ All postgres:15-alpine containers removed
- ✅ Compromised image deleted and BLOCKED
- ✅ Migrated to postgres:14-alpine
- ✅ Dify removed entirely (precautionary)
- ✅ System load: 0.40 (normalized from 17+)
- ✅ No active miners detected

**Why This Happened:**
- Incident #2 focused on `daarion-web` and missed that postgres was also compromised
- Multiple docker-compose files spawned orphan `daarion-postgres` containers
- Compromised image kept respawning miners after cleanup
- Official Docker Hub image either:
  - Was temporarily compromised, OR
  - PostgreSQL 15 has supply chain vulnerability

**CRITICAL: Postgres:15-alpine BANNED:**
```bash
# NEVER USE THIS IMAGE AGAIN
postgres:15-alpine
SHA: b3968e348b48f1198cc6de6611d055dbad91cd561b7990c406c3fc28d7095b21

# Use instead:
postgres:14-alpine ✅ SAFE (verified)
postgres:16-alpine ⚠️ Need to test
```

**Prevention Measures:**
1. **Image Pinning by SHA** (not tag)
2. **Security scanning before deployment** (Trivy, Grype)
3. **Regular audit of running containers**
4. **Monitor CPU spikes** (alert if load average > 5)
5. **Block orphan container spawning**
6. **Use specific SHAs, not :latest or :15-alpine tags**
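
Digest pinning can be checked mechanically. The sketch below flags image references that rely on a mutable tag instead of an `@sha256:` digest. The function names are illustrative, and the list of references would have to be collected from the docker-compose files by other means.

```python
import re

# A pinned reference ends with "@sha256:" followed by 64 lowercase hex characters.
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_digest_pinned(image_ref):
    """True if the image reference is pinned to an immutable sha256 digest."""
    return bool(DIGEST_RE.search(image_ref))


def unpinned_images(image_refs):
    """Return the references that still depend on a mutable tag."""
    return [ref for ref in image_refs if not is_digest_pinned(ref)]


if __name__ == "__main__":
    placeholder_digest = "0" * 64  # placeholder, not a real image digest
    refs = ["postgres:14-alpine", f"postgres@sha256:{placeholder_digest}"]
    print(unpinned_images(refs))
```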

**Files to Monitor:**
```bash
# Common miner locations found
/tmp/.perf.c/
/tmp/mysql
/tmp/*perf*
/tmp/cpio*
/tmp/ipcalc*

# Check regularly
find /tmp -type f -executable -mtime -1
ps aux | awk '$3 > 50'
```
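
For use from a monitoring script, the `find` check above can be approximated in Python. This is a sketch under one stated difference: it reports regular files with any execute bit set, whereas `find -executable` tests whether the current user can execute the file. The function name and defaults are assumptions for illustration.

```python
import os
import stat
import time


def recent_executables(root="/tmp", max_age_seconds=24 * 3600, now=None):
    """List regular files under root with an execute bit, modified within max_age_seconds."""
    now = time.time() if now is None else now
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable
            if not stat.S_ISREG(info.st_mode):
                continue  # skip symlinks, sockets, and other special files
            if info.st_mode & exec_bits and now - info.st_mtime <= max_age_seconds:
                found.append(path)
    return sorted(found)


if __name__ == "__main__":
    for path in recent_executables():
        print(path)
```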

**Additional Actions Taken:**
- ✅ Removed entire Dify installation (used same postgres:15-alpine)
- ✅ Cleaned all /tmp suspicious files
- ✅ Audited all postgres containers
- ✅ Switched all services to postgres:14-alpine

**Lessons Learned (Incident #3 Specific):**
1. 🔴 **Official images can be compromised** - Never trust blindly
2. 🟡 **Scan images before use** - Trivy/Grype mandatory
3. 🟢 **Pin images by SHA, not tag** - :15-alpine can change
4. 🔵 **Orphan containers are dangerous** - Use --remove-orphans
5. 🟣 **Multiple malware variants** - Miners have fallback payloads
6. ⚫ **Monitor /tmp for executables** - Common miner location
7. ⚪ **One compromise can spread** - Dify used same image

**Next Steps:**
1. 🔴 Report postgres:15-alpine to Docker Security team
2. 🟡 Implement Trivy scanning in CI/CD
3. 🟢 Pin all images by SHA in all docker-compose files
4. 🔵 Set up automated CPU spike alerts
5. 🟣 Regular /tmp cleanup cron job
6. ⚫ Audit all remaining containers for other compromised images
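
A CPU spike alert can start as a simple load-average check run from cron. The sketch below uses the threshold of 5 from the prevention measures. Alerting on the 5-minute average and the message format are assumptions, and delivery (mail, Telegram, and so on) is left out.

```python
import os


def load_alert(threshold=5.0, loadavg=None):
    """Return an alert message if the 5-minute load average exceeds threshold, else None."""
    # os.getloadavg() returns the 1-, 5-, and 15-minute load averages (Unix only).
    _one, five, _fifteen = os.getloadavg() if loadavg is None else loadavg
    if five > threshold:
        return f"ALERT: 5-minute load average {five:.2f} exceeds {threshold:.2f}"
    return None


if __name__ == "__main__":
    message = load_alert()
    print(message or "load average OK")
```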

---

### Incident #4: ALL PostgreSQL Images Show Malware - NODE1 Host Compromise Suspected (Jan 10, 2026)

**Timeline:**
- **Jan 10, 2026**: Testing postgres:16-alpine - malware artifacts found
- **Jan 10, 2026**: Testing postgres:14 (non-alpine) - malware artifacts found
- **Jan 10, 2026**: Testing postgres:16 (Debian) - malware artifacts found

**Confirmed "Compromised" Images (on NODE1):**
```bash
# ALL of these show malware artifacts when run on NODE1:
❌ postgres:15-alpine  # Incident #3
❌ postgres:16-alpine  # NEW
❌ postgres:14         # NEW (non-alpine!)
❌ postgres:16         # NEW (Debian base!)
```

**Malware Artifacts (IOC):**
```bash
/tmp/httpd     # ~10MB, crypto miner (xmrig variant)
/tmp/.perf.c/  # perfctl malware staging directory
```

**🔴 CRITICAL ASSESSMENT:**

**This is NOT "all Docker Hub official images are infected".**

**This is most likely NODE1 HOST COMPROMISE** (perfctl/cryptominer persistence).

**Evidence supporting HOST compromise (not image compromise):**

| Evidence | Explanation |
|----------|-------------|
| `/tmp/.perf.c/` directory | Classic perfctl malware staging directory |
| `/tmp/httpd` ~10MB | Typical xmrig miner with Apache masquerade |
| ALL postgres variants affected | Statistically impossible for Docker Hub |
| NODE1 had 3 previous incidents | Already compromised (Incidents #1, #2, #3) |
| `tmpfs noexec` didn't help | Malware runs from HOST, not container |
| Same IOCs across different images | Infection happens post-pull, not in image |

**Probable Attack Vector (perfctl family):**
- Initial compromise via Incident #1 or #2 (daarion-web container)
- Persistence mechanism survived container/image cleanup
- Malware hooks into Docker daemon or uses cron/systemd
- Infects ANY new container on startup via:
  - Modified docker daemon
  - LD_PRELOAD injection
  - Kernel module
  - Cron job that monitors new containers

**🔬 VERIFICATION PROCEDURE (REQUIRED):**

```bash
# ═══════════════════════════════════════════════════════════════
# STEP 1: Get image digest from NODE1
# ═══════════════════════════════════════════════════════════════
ssh root@144.76.224.179 "docker inspect --format='{{index .RepoDigests 0}}' postgres:16"
# Example output: postgres@sha256:abc123...

# ═══════════════════════════════════════════════════════════════
# STEP 2: On CLEAN host (MacBook/NODE2), pull SAME digest
# ═══════════════════════════════════════════════════════════════
# On your MacBook (NOT NODE1!):
docker pull postgres:16@sha256:<digest_from_step1>

# ═══════════════════════════════════════════════════════════════
# STEP 3: Run on clean host and check /tmp
# ═══════════════════════════════════════════════════════════════
docker run --rm -it postgres:16@sha256:<digest> sh -c "ls -la /tmp/ && find /tmp -type f"

# EXPECTED RESULTS:
# - If /tmp is EMPTY on clean host → IMAGE IS CLEAN → NODE1 IS COMPROMISED
# - If /tmp has httpd/.perf.c on clean host → IMAGE IS COMPROMISED → Report to Docker

# ═══════════════════════════════════════════════════════════════
# STEP 4: Check NODE1 host for persistence mechanisms
# ═══════════════════════════════════════════════════════════════
ssh root@144.76.224.179 << 'REMOTE_CHECK'
echo "=== CRON ==="
crontab -l 2>/dev/null
cat /etc/crontab
ls -la /etc/cron.d/

echo "=== SYSTEMD ==="
systemctl list-units --type=service | grep -iE "perf|miner|http|crypto"

echo "=== LD_PRELOAD ==="
cat /etc/ld.so.preload 2>/dev/null
echo $LD_PRELOAD

echo "=== KERNEL MODULES ==="
lsmod | head -20

echo "=== SUSPICIOUS PROCESSES ==="
ps aux | grep -E "(httpd|xmrig|kdevtmp|kinsing|perfctl|\.perf)" | grep -v grep

echo "=== NETWORK TO MINING POOLS ==="
ss -anp | grep -E "(3333|4444|5555|8080|8888)" | head -10

echo "=== SSH AUTHORIZED KEYS ==="
cat /root/.ssh/authorized_keys

echo "=== DOCKER DAEMON CONFIG ==="
cat /etc/docker/daemon.json 2>/dev/null
REMOTE_CHECK
```

**🔴 DECISION MATRIX:**

| Verification Result | Conclusion | Action |
|---------------------|------------|--------|
| Clean host: no malware | **NODE1 COMPROMISED** | Full rebuild of NODE1 |
| Clean host: same malware | **Docker Hub compromised** | Report to Docker Security |
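
The decision matrix can be expressed as a small helper that takes the `/tmp` entries observed when the same digest runs on a clean host. The IOC names come from the artifacts listed for this incident; the function name and return strings are illustrative only.

```python
IOC_NAMES = {"httpd", ".perf.c"}  # artifacts seen under /tmp on NODE1


def classify(clean_host_tmp_entries):
    """Map the clean-host observation to a conclusion from the decision matrix."""
    found = IOC_NAMES.intersection(clean_host_tmp_entries)
    if found:
        # The artifacts travel with the image, so the image itself is suspect.
        return "image compromised: report to Docker Security"
    # The image is clean elsewhere, so the artifacts originate on NODE1.
    return "NODE1 compromised: full rebuild of NODE1"


if __name__ == "__main__":
    print(classify([]))
    print(classify(["httpd"]))
```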

**If NODE1 Confirmed Compromised (most likely):**

1. 🔴 **STOP using NODE1 immediately** for any workloads
2. 🔴 **Rotate ALL secrets** that NODE1 ever accessed:
   ```
   - SSH keys (generate new on clean machine)
   - Telegram bot tokens (regenerate via @BotFather)
   - PostgreSQL passwords
   - All API keys in .env
   - JWT secrets
   - Neo4j credentials
   - Redis password (if any)
   ```
3. 🔴 **Full OS reinstall** (not cleanup!):
   - Request fresh install from Hetzner Robot
   - Or use rescue mode + full disk wipe
   - New SSH keys generated on clean machine
4. 🟡 **Verify images on clean host BEFORE deploying to new NODE1**
5. 🟢 **Implement proper security controls** (see Prevention below)

**Alternative PostgreSQL Sources (if Docker Hub suspected):**
```bash
# GitHub Container Registry (GHCR)
docker pull ghcr.io/docker-library/postgres:16-alpine

# Quay.io (Red Hat operated)
docker pull quay.io/fedora/postgresql-16

# Build from official Dockerfile (most secure)
git clone https://github.com/docker-library/postgres.git
cd postgres/16/alpine
docker build -t postgres:16-alpine-verified .
# Then scan with Trivy before use
trivy image postgres:16-alpine-verified
```

**NODE1 Persistence Locations to Check:**
```bash
# File-based persistence
/etc/cron.d/*
/etc/crontab
/var/spool/cron/*
/etc/systemd/system/*.service
/etc/init.d/*
/etc/rc.local
/root/.bashrc
/root/.profile
/etc/ld.so.preload

# Memory/process persistence
/dev/shm/*
/run/*
/var/run/*

# Docker-specific
/var/lib/docker/
/etc/docker/daemon.json
~/.docker/config.json

# Kernel-level (advanced)
/lib/modules/*/
/proc/modules
```

**References:**
- perfctl malware: https://blog.exatrack.com/Perfctl-using-portainer-and-new-persistences/
- Similar reports: https://github.com/docker-library/postgres/issues/1307
- Docker Hub attacks: https://jfrog.com/blog/attacks-on-docker-with-millions-of-malicious-repositories-spread-malware-and-phishing-scams/

**Lessons Learned (Incident #4 Specific):**
1. 🔴 **Host compromise masquerades as image compromise** - Always verify on clean host
2. 🟡 **Previous incidents leave persistence** - Cleanup is not enough, rebuild required
3. 🟢 **perfctl family is sophisticated** - Survives container restarts, image deletions
4. 🔵 **Multiple images "infected" = host problem** - Statistical impossibility otherwise
5. 🟣 **NODE1 is UNTRUSTED** - Do not use until full rebuild + verification

**Current Status:**
- ⏳ **Verification pending** - Need to test same digest on clean host
- 🔴 **NODE1 unsafe** - Do not deploy PostgreSQL or any new containers
- 🟡 **Secrets rotation needed** - Assume all NODE1 secrets compromised

---