- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
- FastAPI app with text/image embedding endpoints (768-dim)
- Docker support with NVIDIA GPU runtime
- Port 8001, health checks, model info API
- Qdrant Vector Database integration
- Port 6333/6334 (HTTP/gRPC)
- Image embeddings storage (768-dim, Cosine distance)
- Auto collection creation
- Vision RAG implementation
- VisionEncoderClient (Python client for API)
- Image Search module (text-to-image, image-to-image)
- Vision RAG routing in DAGI Router (mode: image_search)
- VisionEncoderProvider integration
- Documentation (5000+ lines)
- SYSTEM-INVENTORY.md - Complete system inventory
- VISION-ENCODER-STATUS.md - Service status
- VISION-RAG-IMPLEMENTATION.md - Implementation details
- vision_encoder_deployment_task.md - Deployment checklist
- services/vision-encoder/README.md - Deployment guide
- Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook
- Testing
- test-vision-encoder.sh - Smoke tests (6 tests)
- Unit tests for client, image search, routing
- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)
Status: Production Ready ✅
🧠 HMM Memory System — Status
Version: 1.0.0
Last updated: 2025-01-17
Status: ✅ Implementation Complete
🎯 Overview
HMM (Hierarchical Multi-Modal Memory) is a three-tier memory system for agents with automatic summarization and vector search. It provides contextual memory for dialogues with automatic token management.
Documentation:
- HMM Memory Implementation Task — detailed task with TODO list
- HMM Memory Summary — implementation summary
✅ Implementation Complete
Completion date: 2025-01-17
Core Modules
1. HMM Memory Module
Location Options:
- ✅ gateway-bot/hmm_memory.py — Gateway Bot implementation (complete)
- ✅ services/memory/memory.py — Router Service implementation (complete)
Router Implementation (services/memory/memory.py):
ShortMemory:
- ✅ Last N messages (default: 20)
- ✅ Redis backend with in-memory fallback
- ✅ FIFO queue to cap the size
- ✅ Functions: add_message(), get_recent(), clear()
MediumMemory:
- ✅ Dialogue summaries (last 20)
- ✅ Redis list with timestamps
- ✅ Automatic rotation of old summaries
- ✅ Functions: add_summary(), get_summaries(), clear()
LongMemory:
- ✅ Vector memory (ChromaDB or RAG Service)
- ✅ Similarity search
- ✅ Fallback to the RAG Service API
- ✅ Functions: add_memory(), search()
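LongMemory's similarity search can be illustrated with a dependency-free toy sketch. Real deployments use ChromaDB embeddings or the RAG Service; the bag-of-words cosine scoring below is only a stand-in, and the class name is hypothetical:

```python
import math
from collections import Counter

def _vec(text: str) -> Counter:
    # Toy "embedding": bag-of-words counts (stand-in for a real embedding model)
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ToyLongMemory:
    """Minimal add_memory()/search() shape mirroring the LongMemory API."""
    def __init__(self):
        self._items = []  # list of (text, vector) pairs

    def add_memory(self, text: str) -> None:
        self._items.append((text, _vec(text)))

    def search(self, query: str, top_k: int = 5) -> list:
        q = _vec(query)
        ranked = sorted(self._items, key=lambda it: _cosine(q, it[1]), reverse=True)
        return [text for text, _ in ranked[:top_k]]
```

With `add_memory("User likes pizza")` stored, `search("pizza", top_k=1)` returns that memory first, which is the behaviour the reminder flow below relies on.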
GraphMemory (Neo4j):
- ✅ Graph memory for relationships between dialogues
- ✅ Nodes: User, Agent, DAO, Dialog, Summary, Topic
- ✅ Relationships: PARTICIPATED_IN, ABOUT, CONTAINS, MENTIONS
- ✅ Feature flag: GRAPH_MEMORY_ENABLED
- ✅ Fallback when Neo4j is unavailable
- ✅ Functions:
  - upsert_dialog_context() — ingest summaries into the graph
  - query_relevant_summaries_for_dialog() — recent summaries
  - query_related_context_for_user() — user context
  - query_summaries_by_dao() — DAO summaries
  - query_summaries_by_topic() — search by topic
Infrastructure:
- ✅ Automatic initialization of all backends
- ✅ Graceful fallback on errors
- ✅ Connection pooling for Redis
- ✅ TTL for short/medium memory
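The in-memory fallback behaviour of ShortMemory can be sketched as below. This is a minimal illustration assuming the documented defaults; the Redis-backed path and exact signatures in hmm_memory.py may differ:

```python
from collections import defaultdict, deque

class InMemoryShortMemory:
    """FIFO short-term memory: keeps the last max_size messages per (dao_id, user_id)."""
    def __init__(self, max_size: int = 20):
        self.max_size = max_size
        # deque(maxlen=...) silently drops the oldest entry on overflow (FIFO cap)
        self._store = defaultdict(lambda: deque(maxlen=max_size))

    def add_message(self, dao_id: str, user_id: str, role: str, text: str) -> None:
        self._store[(dao_id, user_id)].append({"role": role, "text": text})

    def get_recent(self, dao_id: str, user_id: str) -> list:
        return list(self._store[(dao_id, user_id)])

    def clear(self, dao_id: str, user_id: str) -> None:
        self._store.pop((dao_id, user_id), None)
```

The `deque(maxlen=...)` choice keeps the FIFO rotation O(1) per message, which matters because the cap is enforced on every incoming message.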
2. Dialogue Management (gateway-bot/dialogue.py)
Functions:
continue_dialogue():
- ✅ Continues a dialogue with automatic summarization
- ✅ Token check (max 24k)
- ✅ Context assembly (summaries + short memory)
- ✅ Calls Router → LLM
- ✅ Saves the response
smart_reply():
- ✅ Smart reply with automatic RAG
- ✅ Detects reminder requests ("What did I say about...", "Remind me...")
- ✅ Searches long memory when needed
- ✅ Fallback to continue_dialogue()
summarize_dialogue():
- ✅ Summarization via LLM
- ✅ Emotion detection
- ✅ Key-point extraction
- ✅ Saves to medium and long memory
Helper Functions:
- ✅ _detect_reminder_request() — detects reminder requests
- ✅ _estimate_tokens() — approximate token count
- ✅ _should_summarize() — checks whether summarization is needed
3. Configuration & Dependencies
Environment Variables (docker-compose.yml):
environment:
  - REDIS_URL=redis://redis:6379/0
  - CHROMA_PATH=/data/chroma
  - RAG_SERVICE_URL=http://rag-service:9500
  - ROUTER_URL=http://router:9102
  - HMM_SHORT_MEMORY_SIZE=20
  - HMM_MEDIUM_MEMORY_SIZE=20
  - HMM_MAX_TOKENS=24000
  # Neo4j Graph Memory
  - NEO4J_URI=bolt://neo4j:7687
  - NEO4J_USER=neo4j
  - NEO4J_PASSWORD=password
  - GRAPH_MEMORY_ENABLED=true
Dependencies:
gateway-bot/requirements.txt:
- ✅ redis>=5.0.0 — Short/Medium Memory
- ✅ chromadb>=0.4.0 — Long Memory (local)
- ✅ httpx>=0.24.0 — RAG Service API calls
- ✅ pydantic>=2.0.0 — Data validation
services/memory/requirements.txt:
- ✅ redis>=5.0.0 — Short/Medium Memory
- ✅ chromadb>=0.4.0 — Long Memory (local)
- ✅ httpx>=0.24.0 — RAG Service API calls
- ✅ neo4j>=5.15.0 — Graph Memory
Docker (gateway-bot/Dockerfile):
- ✅ Updated to use requirements.txt
- ✅ Multi-stage build for optimization
- ✅ Python 3.11 base image
🏗️ Architecture
Memory Hierarchy
┌─────────────────────────────────────────┐
│ User Message │
└──────────────┬──────────────────────────┘
│
▼
┌──────────────────────┐
│ smart_reply() │
│ - Detect reminder │
│ - Load short mem │
└──────────┬───────────┘
│
┌───────┴───────┐
│ │
▼ ▼
Reminder? Normal
│ │
▼ ▼
┌─────────────┐ ┌──────────────┐
│ Long Memory │ │ Short Memory │
│ RAG Search │ │ Recent msgs │
└──────┬──────┘ └──────┬───────┘
│ │
└────────┬───────┘
▼
┌─────────────┐
│ Token Check │
│ > 24k? │
└──────┬──────┘
│
┌──────┴──────┐
│ │
Yes No
│ │
▼ ▼
┌────────────┐ ┌─────────────┐
│ Summarize │ │ Continue │
│ Dialogue │ │ Dialogue │
└─────┬──────┘ └──────┬──────┘
│ │
▼ │
┌────────────┐ │
│ Medium Mem │ │
│ Save Sum │ │
└─────┬──────┘ │
│ │
└────────┬───────┘
▼
┌─────────────┐
│ Router/LLM │
│ Generate │
└──────┬──────┘
▼
┌─────────────┐
│ Short Memory│
│ Save Reply │
└─────────────┘
Data Flow
1. Normal Message:
user_message
→ smart_reply()
→ load short_memory
→ check tokens
→ if < 24k: continue_dialogue()
→ if > 24k: summarize_dialogue() → continue_dialogue()
→ Router/LLM
→ save to short_memory
→ return response
2. Reminder Request:
user_message ("What did I say about X?")
→ smart_reply()
→ detect_reminder_request() → True
→ search long_memory(query="X")
→ retrieve relevant memories
→ continue_dialogue(context=memories)
→ Router/LLM
→ return response
3. Summarization Trigger:
tokens > 24k
→ summarize_dialogue(short_memory)
→ Router/LLM (summarize)
→ save to medium_memory
→ save to long_memory (vector)
→ clear old short_memory
→ continue with new context
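The branch between the reminder path and the normal path hinges on _detect_reminder_request(). A phrase-matching sketch follows; the trigger phrases and the route() helper are illustrative, not the actual list in dialogue.py:

```python
# Illustrative trigger phrases; the real list lives in dialogue.py
REMINDER_PATTERNS = (
    "what did i say about",
    "remind me",
    "do you remember",
)

def _detect_reminder_request(message: str) -> bool:
    text = message.lower()
    return any(pattern in text for pattern in REMINDER_PATTERNS)

def route(message: str) -> str:
    """Pick the memory path, mirroring the data flows above (hypothetical helper)."""
    if _detect_reminder_request(message):
        return "long_memory_search"   # flow 2: search vector memory first
    return "short_memory"             # flow 1: normal dialogue context
```

Substring matching keeps the check cheap enough to run on every message; a production version would likely add language-specific phrases (the bot serves Ukrainian users) and fuzzier matching.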
📦 File Structure
Gateway Bot Implementation
gateway-bot/
├── hmm_memory.py # Core HMM Memory module
│ ├── ShortMemory # Redis/in-memory recent messages
│ ├── MediumMemory # Redis summaries
│ └── LongMemory # ChromaDB/RAG vector search
│
├── dialogue.py # Dialogue management
│ ├── continue_dialogue() # Main dialogue flow
│ ├── smart_reply() # Smart reply with RAG
│ ├── summarize_dialogue() # LLM summarization
│ └── helper functions # Token estimation, reminder detection
│
├── http_api.py # HTTP endpoints (to be updated)
│ └── /telegram/webhook # Message handler
│
├── requirements.txt # Python dependencies
├── Dockerfile # Docker build config
└── README.md # Module documentation
Router Service Implementation
services/
├── memory/
│ ├── memory.py # Core Memory classes
│ │ ├── ShortMemory # Redis/in-memory fallback
│ │ ├── MediumMemory # Redis List summaries
│ │ ├── LongMemory # ChromaDB or RAG Service
│ │ └── Memory # Factory class
│ ├── graph_memory.py # Neo4j Graph Memory
│ │ ├── GraphMemory # Neo4j driver + queries
│ │ └── 5 query methods # Graph traversal
│ ├── init_neo4j.py # Neo4j schema initialization
│ └── __init__.py
│
├── dialogue/ # To be implemented
│ ├── service.py # Dialogue management
│ │ ├── continue_dialogue()
│ │ └── smart_reply()
│ └── __init__.py
│
└── router/
├── router_app.py # Main router (to be updated)
│ ├── POST /route
│ ├── POST /v1/dialogue/continue # To add
│ ├── GET /v1/memory/debug/{id} # To add
│ └── POST /v1/memory/search # To add
└── types.py # RouterRequest (add dialog_id)
🕸️ Neo4j Graph Memory Model
Node Types
User — system user
- Properties: user_id, name, created_at
Agent — AI agent
- Properties: agent_id, name, type
DAO — MicroDAO
- Properties: dao_id, name, created_at
Dialog — dialogue
- Properties: dialog_id, started_at, last_message_at
Summary — dialogue summary
- Properties: summary_id, text, emotion, created_at
Topic — topic/keyword
- Properties: topic, mentioned_count
Relationship Types
PARTICIPATED_IN — User/Agent → Dialog
- The user/agent participated in the dialogue
ABOUT — Dialog → DAO
- The dialogue took place in the context of a DAO
CONTAINS — Dialog → Summary
- The dialogue contains a summary
MENTIONS — Summary → Topic
- The summary mentions a topic
Example Graph
(User:tg:123)
│
└─[PARTICIPATED_IN]→ (Dialog:d1)
│
├─[ABOUT]→ (DAO:greenfood)
│
└─[CONTAINS]→ (Summary:s1)
│
├─[MENTIONS]→ (Topic:pizza)
└─[MENTIONS]→ (Topic:delivery)
Cypher Queries
1. Get recent summaries for dialog:
MATCH (d:Dialog {dialog_id: $dialog_id})-[:CONTAINS]->(s:Summary)
RETURN s ORDER BY s.created_at DESC LIMIT 10
2. Get related context for user:
MATCH (u:User {user_id: $user_id})-[:PARTICIPATED_IN]->(d:Dialog)
-[:CONTAINS]->(s:Summary)
RETURN s ORDER BY s.created_at DESC LIMIT 20
3. Search summaries by topic:
MATCH (s:Summary)-[:MENTIONS]->(t:Topic)
WHERE t.topic CONTAINS $topic
RETURN s ORDER BY s.created_at DESC
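Query 1 can be run from Python with the official neo4j driver. A sketch, using the connection settings from the environment variables above; the function name is hypothetical, and the import is deferred so the snippet loads even where the driver is absent:

```python
RECENT_SUMMARIES_CYPHER = (
    "MATCH (d:Dialog {dialog_id: $dialog_id})-[:CONTAINS]->(s:Summary) "
    "RETURN s ORDER BY s.created_at DESC LIMIT 10"
)

def fetch_recent_summaries(uri: str, user: str, password: str, dialog_id: str) -> list:
    # Deferred import: neo4j>=5.15.0 per services/memory/requirements.txt
    from neo4j import GraphDatabase
    driver = GraphDatabase.driver(uri, auth=(user, password))
    try:
        with driver.session() as session:
            result = session.run(RECENT_SUMMARIES_CYPHER, dialog_id=dialog_id)
            # Each record's "s" is a Node; dict() extracts its properties
            return [dict(record["s"]) for record in result]
    finally:
        driver.close()
```

Parameterizing `$dialog_id` (rather than string-formatting it into the query) lets Neo4j cache the query plan and avoids Cypher injection.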
🔧 Configuration Details
Redis Configuration
Short Memory:
- Key pattern: hmm:short:{dao_id}:{user_id}
- Type: List (FIFO)
- Max size: 20 messages (configurable)
- TTL: 7 days
Medium Memory:
- Key pattern: hmm:medium:{dao_id}:{user_id}
- Type: List of JSON
- Max size: 20 summaries
- TTL: 30 days
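The key patterns and the capped FIFO-with-TTL behaviour map onto a small sequence of Redis commands (LPUSH + LTRIM + EXPIRE). A sketch of the key builders and that command sequence; the `push_commands` helper is hypothetical and only models the redis-py calls (`r.lpush`, `r.ltrim`, `r.expire`) without a live client:

```python
SHORT_TTL_SECONDS = 7 * 24 * 3600    # 7 days
MEDIUM_TTL_SECONDS = 30 * 24 * 3600  # 30 days

def short_key(dao_id: str, user_id: str) -> str:
    return f"hmm:short:{dao_id}:{user_id}"

def medium_key(dao_id: str, user_id: str) -> str:
    return f"hmm:medium:{dao_id}:{user_id}"

def push_commands(key: str, payload: str, max_size: int = 20, ttl: int = SHORT_TTL_SECONDS) -> list:
    """Redis command sequence for a size-capped FIFO list with a sliding TTL."""
    return [
        ("LPUSH", key, payload),          # newest entry goes to the head
        ("LTRIM", key, 0, max_size - 1),  # keep only the newest max_size entries
        ("EXPIRE", key, ttl),             # refresh TTL on every write
    ]
```

Refreshing EXPIRE on every write makes the TTL idle-based: a key only expires after 7 (or 30) days with no activity for that (dao_id, user_id) pair.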
ChromaDB Configuration
Collection: hmm_long_memory
- Distance metric: Cosine similarity
- Embedding model: Automatic (via ChromaDB)
- Metadata fields:
  - dao_id — DAO identifier
  - user_id — User identifier
  - timestamp — Creation time
  - emotion — Detected emotion
  - key_points — List of key topics
RAG Service Integration
Endpoint: POST /search
- Request: { "query": "user query text", "dao_id": "dao-id", "user_id": "user-id", "top_k": 5 }
- Response: { "results": [ {"text": "...", "score": 0.95, "metadata": {...}} ] }
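A client for this endpoint can be sketched as below. The payload builder mirrors the request contract above and is testable standalone; the async httpx call is a hedged sketch (the service's actual client code may differ), with the import deferred so the snippet stands alone:

```python
RAG_SEARCH_PATH = "/search"

def build_search_payload(query: str, dao_id: str, user_id: str, top_k: int = 5) -> dict:
    """Request body matching the POST /search contract documented above."""
    return {"query": query, "dao_id": dao_id, "user_id": user_id, "top_k": top_k}

async def rag_search(base_url: str, payload: dict) -> list:
    # Deferred import: httpx>=0.24.0 per requirements.txt
    import httpx
    async with httpx.AsyncClient(base_url=base_url, timeout=10.0) as client:
        resp = await client.post(RAG_SEARCH_PATH, json=payload)
        resp.raise_for_status()
        return resp.json()["results"]
```

With `RAG_SERVICE_URL` from the environment, a call would look like `await rag_search("http://rag-service:9500", build_search_payload("pizza", dao_id, user_id))`.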
🧪 Testing
Unit Tests (To be implemented)
tests/test_hmm_memory.py:
# Test ShortMemory
- test_add_message()
- test_get_recent()
- test_fifo_rotation()
- test_redis_fallback()
# Test MediumMemory
- test_add_summary()
- test_get_summaries()
- test_rotation()
# Test LongMemory
- test_add_memory()
- test_search()
- test_rag_fallback()
tests/test_dialogue.py:
# Test dialogue functions
- test_continue_dialogue()
- test_smart_reply()
- test_summarize_dialogue()
- test_detect_reminder()
- test_token_estimation()
Integration Tests
Test Scenario 1: Normal Dialogue
# 1. Send message
response = smart_reply(
user_id="test_user",
dao_id="test_dao",
message="Hello!"
)
# 2. Verify short memory updated
assert len(short_memory.get_recent(...)) == 2 # user + assistant
# 3. Verify response
assert "Hello" in response
Test Scenario 2: Reminder Request
# 1. Add some memories
long_memory.add_memory(text="User likes pizza", ...)
# 2. Ask reminder
response = smart_reply(
user_id="test_user",
dao_id="test_dao",
message="What did I say about pizza?"
)
# 3. Verify long memory searched
assert "pizza" in response
Test Scenario 3: Auto-summarization
# 1. Add many messages (>24k tokens)
for i in range(100):
short_memory.add_message(...)
# 2. Send message
response = smart_reply(...)
# 3. Verify summarization triggered
assert len(medium_memory.get_summaries(...)) > 0
assert len(short_memory.get_recent(...)) < 100
E2E Test via Gateway
# 1. Send normal message
curl -X POST http://localhost:9300/telegram/webhook \
-H "Content-Type: application/json" \
-d '{
"message": {
"from": {"id": 123, "username": "test"},
"chat": {"id": 123},
"text": "Hello bot!"
}
}'
# 2. Send many messages to trigger summarization
for i in {1..50}; do
curl -X POST http://localhost:9300/telegram/webhook ...
done
# 3. Send reminder request
curl -X POST http://localhost:9300/telegram/webhook \
-H "Content-Type: application/json" \
-d '{
"message": {
"from": {"id": 123, "username": "test"},
"chat": {"id": 123},
"text": "What did I say about pizza?"
}
}'
# 4. Verify response contains relevant context
🚀 Integration Status
1. Gateway Bot Integration
✅ Modules Created:
- gateway-bot/hmm_memory.py
- gateway-bot/dialogue.py
⏳ To be integrated:
- gateway-bot/http_api.py — update the /telegram/webhook handler
2. Router Service Integration
✅ Modules Created:
- services/memory/memory.py — Core Memory classes
  - ShortMemory (Redis/in-memory)
  - MediumMemory (Redis List)
  - LongMemory (ChromaDB or RAG Service)
  - Memory (factory class)
- services/memory/graph_memory.py — Neo4j Graph Memory
  - GraphMemory (Neo4j driver)
  - 5 query methods for graph traversal
  - Feature flag support
- services/memory/init_neo4j.py — Neo4j initialization
  - Constraints creation
  - Indexes creation
⏳ To be implemented:
- services/dialogue/service.py — Dialogue management
  - continue_dialogue()
  - smart_reply()
- API endpoints in router_app.py or http_api.py:
  - POST /v1/dialogue/continue
  - GET /v1/memory/debug/{dialog_id}
  - POST /v1/memory/search
- Update the RouterRequest model with dialog_id
- Configuration and environment variables
- Tests
📝 Documentation:
- docs/cursor/hmm_memory_router_task.md — Detailed implementation task
🎯 Features:
- ✅ Neo4j not used (left for future)
- ✅ Fallback modes (works without Redis/ChromaDB)
- ✅ RAG Service as ChromaDB alternative
- ✅ Ready for Router integration
Gateway Bot Integration (Original)
Integration Steps
1. Update http_api.py:
# Before:
async def telegram_webhook(update: TelegramUpdate):
message = update.message.text
response = await router_client.route_request(...)
return response
# After:
from dialogue import smart_reply
async def telegram_webhook(update: TelegramUpdate):
message = update.message.text
user_id = f"tg:{update.message.from_.id}"
dao_id = get_dao_id(update) # from context or default
# Use smart_reply instead of direct router call
response = await smart_reply(
user_id=user_id,
dao_id=dao_id,
message=message
)
return response
2. Initialize HMM Memory on startup:
# http_api.py
from hmm_memory import ShortMemory, MediumMemory, LongMemory
@asynccontextmanager
async def lifespan(app: FastAPI):
# Initialize memories
global short_memory, medium_memory, long_memory
short_memory = ShortMemory(redis_url=settings.REDIS_URL)
medium_memory = MediumMemory(redis_url=settings.REDIS_URL)
long_memory = LongMemory(
chroma_path=settings.CHROMA_PATH,
rag_service_url=settings.RAG_SERVICE_URL
)
yield
# Cleanup
await short_memory.close()
await medium_memory.close()
3. Update Docker Compose:
Already done ✅ — environment variables added.
4. Test:
# Restart gateway service
docker-compose restart gateway
# Check logs
docker-compose logs -f gateway | grep "HMM Memory"
# Send test message via Telegram
📊 Monitoring
Metrics to Track
Memory Usage:
- Short memory size (messages per user)
- Medium memory size (summaries per user)
- Long memory collection size
Performance:
- Token estimation time
- Summarization time
- RAG search latency
- Redis response time
Business Metrics:
- Summarization trigger rate
- Reminder request rate
- Average dialogue length (before summarization)
Monitoring Commands
# Redis stats
docker exec -it redis redis-cli INFO memory
# Check short memory keys
docker exec -it redis redis-cli KEYS "hmm:short:*"
# Check medium memory keys
docker exec -it redis redis-cli KEYS "hmm:medium:*"
# ChromaDB stats (if using local)
curl http://localhost:8000/api/v1/collections/hmm_long_memory
📊 Neo4j Visualization & Monitoring
Grafana Dashboard
Status: ✅ Implemented
Setup:
- ✅ Grafana added to docker-compose.yml
- ✅ Automatic Neo4j data source provisioning
- ✅ Pre-configured dashboard with 4 panels
- ✅ Automatic dashboard loading on startup
Dashboard Panels:
- Entity Counts — number of DAOs/agents/users
- Average Agents per DAO — average number of agents
- Users Distribution by DAO — distribution of users
- Summary Activity Over Time — summary activity over time
File Structure:
grafana/
├── provisioning/
│ ├── datasources/
│ │ └── neo4j.yml # Auto Neo4j connection
│ └── dashboards/
│ └── default.yml # Dashboard config
└── dashboards/
└── dao-agents-users-overview.json # Dashboard JSON
Access:
- URL: http://localhost:3000
- Default credentials: admin / admin
- Dashboard: Home → Dashboards → "DAO Agents Users Overview"
Quick Start:
# Start Grafana and Neo4j
docker-compose up -d grafana neo4j
# Check logs
docker-compose logs -f grafana
# Open browser
open http://localhost:3000
Neo4j Bloom (Graph Visualization)
Status: ✅ Configuration documented
What is Bloom:
- Visual graph exploration tool
- Natural language queries
- Interactive graph visualization
- Built-in Neo4j Browser (Community Edition)
- Neo4j Bloom (Enterprise Edition)
Access:
- Neo4j Browser: http://localhost:7474
- Bloom: http://localhost:7474/bloom (Enterprise only)
- Credentials: neo4j / password
Bloom Perspective Configuration:
Node Styles:
- User — 👤 blue, user_id as caption
- Agent — 🤖 green, name as caption
- DAO — 🏢 orange, dao_id as caption
- Dialog — 💬 purple, dialog_id as caption
- Summary — 📝 gray, summary_id as caption
- Topic — 🏷️ yellow, topic as caption
Search Phrases Examples:
- "Show me all users"
  MATCH (u:User) RETURN u LIMIT 50
- "Find dialogs for {user}"
  MATCH (u:User {user_id: $user})-[:PARTICIPATED_IN]->(d:Dialog) RETURN u, d
- "What topics does {user} discuss?"
  MATCH (u:User {user_id: $user})-[:PARTICIPATED_IN]->(d:Dialog)-[:CONTAINS]->(s:Summary)-[:MENTIONS]->(t:Topic) RETURN u, t, COUNT(t) AS mentions
- "Show me {dao} activity"
  MATCH (dao:DAO {dao_id: $dao})<-[:ABOUT]-(d:Dialog) RETURN dao, d LIMIT 20
- "Who talks about {topic}?"
  MATCH (t:Topic {topic: $topic})<-[:MENTIONS]-(s:Summary)<-[:CONTAINS]-(d:Dialog)<-[:PARTICIPATED_IN]-(u:User) RETURN t, u, COUNT(u) AS conversations
Documentation:
- README_NEO4J_VISUALIZATION.md — Quick start guide
- docs/cursor/neo4j_visualization_task.md — Implementation task
- docs/cursor/neo4j_bloom_perspective.md — Bloom configuration
Quick Start:
# Start Neo4j
docker-compose up -d neo4j
# Wait for startup (check logs)
docker-compose logs -f neo4j
# Open Neo4j Browser
open http://localhost:7474
# Login and explore graph
# Use Cypher queries from Neo4j Graph Memory Model section
🔔 Prometheus Monitoring & Alerting
Neo4j Prometheus Exporter
Status: ✅ Implemented
Service: services/neo4j-exporter/
- ✅ neo4j_exporter/main.py — FastAPI exporter with /metrics endpoint
- ✅ Dockerfile — container build
- ✅ requirements.txt — dependencies (fastapi, prometheus-client, neo4j)
Metrics Collected:
1. Health Metrics:
- neo4j_up — Neo4j availability (1 = up, 0 = down)
- neo4j_exporter_scrape_duration_seconds — scrape duration
- neo4j_exporter_errors_total{type} — exporter errors
- neo4j_cypher_query_duration_seconds{query} — Cypher query duration
2. Graph Metrics:
- neo4j_nodes_total{label} — node counts by label (User, Agent, DAO, Dialog, Summary, Topic)
- neo4j_relationships_total{type} — relationship counts by type
3. Business Metrics:
- neo4j_summaries_per_day{day} — summaries per day (last 7 days)
- neo4j_active_daos_last_7d — active DAOs over the last 7 days
- neo4j_avg_agents_per_dao — average number of agents per DAO
- neo4j_avg_users_per_dao — average number of users per DAO
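These gauges are served on /metrics in the Prometheus text exposition format. A dependency-free formatter sketch of what such lines look like (the exporter itself presumably uses prometheus-client, which handles this rendering; the function here is only illustrative):

```python
from typing import Optional

def format_gauge(name: str, value: float, labels: Optional[dict] = None) -> str:
    """Render one sample in the Prometheus text exposition format."""
    if labels:
        label_str = ",".join(f'{k}="{v}"' for k, v in labels.items())
        return f"{name}{{{label_str}}} {value}"
    return f"{name} {value}"

# Example lines such a /metrics endpoint would emit:
lines = [
    format_gauge("neo4j_up", 1),
    format_gauge("neo4j_nodes_total", 42, {"label": "User"}),
    format_gauge("neo4j_relationships_total", 17, {"type": "MENTIONS"}),
]
```

The `{label}`/`{type}` parts of the metric names above are Prometheus labels, so one metric family (e.g. neo4j_nodes_total) carries one time series per node label.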
Access:
- Exporter: http://localhost:9091/metrics
- Prometheus: http://localhost:9090
- Grafana: http://localhost:3000 (Prometheus data source auto-configured)
Quick Start:
# Start exporter, Prometheus, Neo4j
docker-compose up -d neo4j-exporter prometheus neo4j
# Check exporter metrics
curl http://localhost:9091/metrics
# Open Prometheus
open http://localhost:9090
# Check targets status: Status → Targets
Prometheus Configuration
File: prometheus/prometheus.yml
Scrape Configs:
scrape_configs:
  - job_name: 'neo4j-exporter'
    static_configs:
      - targets: ['neo4j-exporter:9091']
    scrape_interval: 15s
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
Alerting Rules: alerting/neo4j_alerts.yml (11 rules)
Alertmanager: Optional (can be added for notifications)
Alerting Rules
Status: ✅ Implemented
File: alerting/neo4j_alerts.yml
3 Groups, 11 Rules:
1. Health Alerts (4 rules) — Critical
Neo4jDown:
- Critical alert when Neo4j has been unavailable for > 2 minutes
- Severity: critical
- Action: check Neo4j logs, restart the service
Neo4jExporterHighErrors:
- Alert when the exporter reports > 5 errors within 5 minutes
- Severity: warning
- Action: check exporter logs, verify Neo4j connectivity
Neo4jSlowQueries:
- Alert when Cypher queries take > 2 seconds
- Severity: warning
- Action: optimize queries, add indexes
Neo4jExporterDown:
- Alert when the exporter has been unavailable for > 2 minutes
- Severity: warning
- Action: restart the exporter container
2. Business Alerts (5 rules) — Monitoring
NoSummariesCreatedToday:
- Alert if no summaries have been created today
- Severity: warning
- Action: check the dialogue service, verify the memory system
NoActiveDAOsLast7Days:
- Alert if there have been no active DAOs for 7 days
- Severity: info
- Action: marketing campaign, user onboarding
LowAgentsPerDAO:
- Alert if the average number of agents is < 1
- Severity: info
- Action: promote agent creation, onboarding flows
LowUsersPerDAO:
- Alert if the average number of users is < 2
- Severity: info
- Action: user acquisition, engagement campaigns
StalledGrowth:
- Alert if summary growth stalls (< 5% change) over 3 days
- Severity: info
- Action: analyze trends, engagement campaigns
3. Capacity Alerts (2 rules) — Planning
FastNodeGrowth:
- Alert when node count grows > 20% within an hour
- Severity: info
- Action: monitor capacity, scale Neo4j
FastRelationshipGrowth:
- Alert when relationship count grows > 20% within an hour
- Severity: info
- Action: plan storage expansion
Grafana Dashboard (Prometheus)
File: grafana/dashboards/neo4j-prometheus-metrics.json
9 Panels:
- Neo4j Health Status — Up/Down status
- Exporter Scrape Duration — Performance monitoring
- Nodes by Label — Graph size over time
- Relationships by Type — Graph structure
- Summaries per Day — Activity trend
- Active DAOs (Last 7 Days) — Engagement
- Average Agents per DAO — Configuration metric
- Average Users per DAO — Adoption metric
- Query Duration — Performance optimization
Access: Grafana → Dashboards → "Neo4j Prometheus Metrics"
Documentation
- README_NEO4J_EXPORTER.md — Quick start guide
- docs/cursor/neo4j_prometheus_exporter_task.md — Implementation task
- docs/cursor/neo4j_alerting_rules_task.md — Alerting rules documentation
📝 Next Steps
Phase 1: Router Service Integration (Current Priority)
- Create Dialogue Service — services/dialogue/service.py
  - continue_dialogue() — main dialogue flow with auto-summarization
  - smart_reply() — smart reply with RAG search
  - Integrate GraphMemory:
    - Call graph_memory.upsert_dialog_context() after summarization
    - Call graph_memory.query_relevant_summaries_for_dialog() for context
- Add API Endpoints — update router_app.py
  - POST /v1/dialogue/continue — continue dialogue
  - GET /v1/memory/debug/{dialog_id} — debug memory state
  - POST /v1/memory/search — search in long memory
  - GET /v1/memory/graph/{dialog_id} — graph visualization data
- Update RouterRequest — add a dialog_id field
- Configuration — add environment variables
- Initialize Neo4j Schema — run init_neo4j.py on startup
- Tests — unit + integration tests for all memory layers
Phase 2: Gateway Bot Integration
- Integrate with Gateway Bot — update gateway-bot/http_api.py
- Unit tests — test all memory functions
- Integration tests — test the full dialogue flow
- E2E smoke test — test via Telegram webhook
Phase 3: Enhancements
- Accurate token counting — use tiktoken for exact counts
- Emotion detection — better emotion analysis in summarization
- Memory analytics — dashboard for memory usage
- User preferences — per-user memory settings
- ✅ Neo4j Visualization — Grafana dashboard + Bloom configuration (complete)
- Graph-based recommendations — suggest related dialogues/topics
- Additional Grafana panels — more insights and metrics
Phase 4: Advanced Features
- Memory search API — External API for memory queries
- Cross-user memory — Team/DAO level memory via graph
- Memory export — Export user memory for GDPR
- Memory versioning — Track memory changes over time
- Graph ML — Graph embeddings for better context retrieval
- Temporal queries — Time-based graph traversal
🔗 Related Documentation
- INFRASTRUCTURE.md — Server infrastructure
- RAG-INGESTION-STATUS.md — RAG system status
- WARP.md — Developer guide
- docs/cursor/hmm_memory_implementation_task.md — HMM Memory (Gateway Bot)
- docs/cursor/hmm_memory_router_task.md — HMM Memory (Router Service)
- docs/cursor/neo4j_graph_memory_task.md — Neo4j Graph Memory
- docs/cursor/HMM_MEMORY_SUMMARY.md — Implementation summary
Status: ✅ Core Modules Complete
✅ Gateway Bot: hmm_memory.py, dialogue.py complete
✅ Router Service: memory.py, graph_memory.py, init_neo4j.py complete
⏳ Next: Dialogue Service + API endpoints + Neo4j integration
Last Updated: 2025-01-17 by WARP AI
Maintained by: Ivan Tytar & DAARION Team