- Vision Encoder Service (OpenCLIP ViT-L/14, GPU-accelerated)
- FastAPI app with text/image embedding endpoints (768-dim)
- Docker support with NVIDIA GPU runtime
- Port 8001, health checks, model info API
- Qdrant Vector Database integration
- Port 6333/6334 (HTTP/gRPC)
- Image embeddings storage (768-dim, Cosine distance)
- Auto collection creation
- Vision RAG implementation
- VisionEncoderClient (Python client for API)
- Image Search module (text-to-image, image-to-image)
- Vision RAG routing in DAGI Router (mode: image_search)
- VisionEncoderProvider integration
- Documentation (5000+ lines)
- SYSTEM-INVENTORY.md - Complete system inventory
- VISION-ENCODER-STATUS.md - Service status
- VISION-RAG-IMPLEMENTATION.md - Implementation details
- vision_encoder_deployment_task.md - Deployment checklist
- services/vision-encoder/README.md - Deployment guide
- Updated WARP.md, INFRASTRUCTURE.md, Jupyter Notebook
- Testing
- test-vision-encoder.sh - Smoke tests (6 tests)
- Unit tests for client, image search, routing
- Services: 17 total (added Vision Encoder + Qdrant)
- AI Models: 3 (qwen3:8b, OpenCLIP ViT-L/14, BAAI/bge-m3)
- GPU Services: 2 (Vision Encoder, Ollama)
- VRAM Usage: ~10 GB (concurrent)
Status: Production Ready ✅
1055 lines
31 KiB
Markdown
1055 lines
31 KiB
Markdown
# 🧠 HMM Memory System — Status
|
||
|
||
**Версія:** 1.0.0
|
||
**Останнє оновлення:** 2025-01-17
|
||
**Статус:** ✅ Implementation Complete
|
||
|
||
---
|
||
|
||
## 🎯 Overview
|
||
|
||
**HMM (Hierarchical Multi-Modal Memory)** — триярусна система пам'яті для агентів з автоматичною самарізацією та векторним пошуком. Система забезпечує контекстну пам'ять для діалогів з автоматичним управлінням токенами.
|
||
|
||
**Документація:**
|
||
- [HMM Memory Implementation Task](./docs/cursor/hmm_memory_implementation_task.md) — Детальне завдання з TODO
|
||
- [HMM Memory Summary](./docs/cursor/HMM_MEMORY_SUMMARY.md) — Підсумок реалізації
|
||
|
||
---
|
||
|
||
## ✅ Implementation Complete
|
||
|
||
**Дата завершення:** 2025-01-17
|
||
|
||
### Core Modules
|
||
|
||
#### 1. HMM Memory Module
|
||
|
||
**Location Options:**
|
||
- ✅ `gateway-bot/hmm_memory.py` — Gateway Bot implementation (complete)
|
||
- ✅ `services/memory/memory.py` — Router Service implementation (complete)
|
||
|
||
**Router Implementation (`services/memory/memory.py`):**
|
||
|
||
**ShortMemory:**
|
||
- ✅ Останні N повідомлень (default: 20)
|
||
- ✅ Redis backend з in-memory fallback
|
||
- ✅ FIFO queue для обмеження розміру
|
||
- ✅ Функції: `add_message()`, `get_recent()`, `clear()`
|
||
|
||
**MediumMemory:**
|
||
- ✅ Самарі діалогів (останні 20)
|
||
- ✅ Redis list зі збереженням часу
|
||
- ✅ Автоматична ротація старих самарі
|
||
- ✅ Функції: `add_summary()`, `get_summaries()`, `clear()`
|
||
|
||
**LongMemory:**
|
||
- ✅ Векторна пам'ять (ChromaDB або RAG Service)
|
||
- ✅ Пошук по схожості
|
||
- ✅ Fallback до RAG Service API
|
||
- ✅ Функції: `add_memory()`, `search()`
|
||
|
||
**GraphMemory (Neo4j):**
|
||
- ✅ Графова пам'ять для зв'язків між діалогами
|
||
- ✅ Вузли: User, Agent, DAO, Dialog, Summary, Topic
|
||
- ✅ Зв'язки: PARTICIPATED_IN, ABOUT, CONTAINS, MENTIONS
|
||
- ✅ Feature flag: `GRAPH_MEMORY_ENABLED`
|
||
- ✅ Fallback якщо Neo4j недоступний
|
||
- ✅ Функції:
|
||
- `upsert_dialog_context()` — інжест самарі в граф
|
||
- `query_relevant_summaries_for_dialog()` — останні самарі
|
||
- `query_related_context_for_user()` — контекст користувача
|
||
- `query_summaries_by_dao()` — самарі DAO
|
||
- `query_summaries_by_topic()` — пошук за темою
|
||
|
||
**Infrastructure:**
|
||
- ✅ Автоматична ініціалізація всіх backends
|
||
- ✅ Graceful fallback при помилках
|
||
- ✅ Connection pooling для Redis
|
||
- ✅ TTL для short/medium memory
|
||
|
||
---
|
||
|
||
#### 2. Dialogue Management (`gateway-bot/dialogue.py`)
|
||
|
||
**Functions:**
|
||
|
||
**`continue_dialogue()`:**
|
||
- ✅ Продовження діалогу з автоматичною самарізацією
|
||
- ✅ Перевірка токенів (max 24k)
|
||
- ✅ Формування контексту (самарі + short memory)
|
||
- ✅ Виклик Router → LLM
|
||
- ✅ Збереження відповіді
|
||
|
||
**`smart_reply()`:**
|
||
- ✅ Розумна відповідь з автоматичним RAG
|
||
- ✅ Виявлення запитів нагадування ("Що я казав про...", "Нагадай мені...")
|
||
- ✅ Пошук у long memory при потребі
|
||
- ✅ Fallback до `continue_dialogue()`
|
||
|
||
**`summarize_dialogue()`:**
|
||
- ✅ Самарізація через LLM
|
||
- ✅ Визначення емоцій
|
||
- ✅ Виділення ключових моментів
|
||
- ✅ Збереження в medium та long memory
|
||
|
||
**Helper Functions:**
|
||
- ✅ `_detect_reminder_request()` — виявлення запитів нагадування
|
||
- ✅ `_estimate_tokens()` — приблизний підрахунок токенів
|
||
- ✅ `_should_summarize()` — перевірка необхідності самарізації
|
||
|
||
---
|
||
|
||
#### 3. Configuration & Dependencies
|
||
|
||
**Environment Variables (`docker-compose.yml`):**
|
||
```yaml
|
||
environment:
|
||
- REDIS_URL=redis://redis:6379/0
|
||
- CHROMA_PATH=/data/chroma
|
||
- RAG_SERVICE_URL=http://rag-service:9500
|
||
- ROUTER_URL=http://router:9102
|
||
- HMM_SHORT_MEMORY_SIZE=20
|
||
- HMM_MEDIUM_MEMORY_SIZE=20
|
||
- HMM_MAX_TOKENS=24000
|
||
# Neo4j Graph Memory
|
||
- NEO4J_URI=bolt://neo4j:7687
|
||
- NEO4J_USER=neo4j
|
||
- NEO4J_PASSWORD=password
|
||
- GRAPH_MEMORY_ENABLED=true
|
||
```
|
||
|
||
**Dependencies:**
|
||
|
||
`gateway-bot/requirements.txt`:
|
||
- ✅ `redis>=5.0.0` — Short/Medium Memory
|
||
- ✅ `chromadb>=0.4.0` — Long Memory (local)
|
||
- ✅ `httpx>=0.24.0` — RAG Service API calls
|
||
- ✅ `pydantic>=2.0.0` — Data validation
|
||
|
||
`services/memory/requirements.txt`:
|
||
- ✅ `redis>=5.0.0` — Short/Medium Memory
|
||
- ✅ `chromadb>=0.4.0` — Long Memory (local)
|
||
- ✅ `httpx>=0.24.0` — RAG Service API calls
|
||
- ✅ `neo4j>=5.15.0` — Graph Memory
|
||
|
||
**Docker (`gateway-bot/Dockerfile`):**
|
||
- ✅ Updated to use `requirements.txt`
|
||
- ✅ Multi-stage build for optimization
|
||
- ✅ Python 3.11 base image
|
||
|
||
---
|
||
|
||
## 🏗️ Architecture
|
||
|
||
### Memory Hierarchy
|
||
|
||
```
|
||
┌─────────────────────────────────────────┐
|
||
│ User Message │
|
||
└──────────────┬──────────────────────────┘
|
||
│
|
||
▼
|
||
┌──────────────────────┐
|
||
│ smart_reply() │
|
||
│ - Detect reminder │
|
||
│ - Load short mem │
|
||
└──────────┬───────────┘
|
||
│
|
||
┌───────┴───────┐
|
||
│ │
|
||
▼ ▼
|
||
Reminder? Normal
|
||
│ │
|
||
▼ ▼
|
||
┌─────────────┐ ┌──────────────┐
|
||
│ Long Memory │ │ Short Memory │
|
||
│ RAG Search │ │ Recent msgs │
|
||
└──────┬──────┘ └──────┬───────┘
|
||
│ │
|
||
└────────┬───────┘
|
||
▼
|
||
┌─────────────┐
|
||
│ Token Check │
|
||
│ > 24k? │
|
||
└──────┬──────┘
|
||
│
|
||
┌──────┴──────┐
|
||
│ │
|
||
Yes No
|
||
│ │
|
||
▼ ▼
|
||
┌────────────┐ ┌─────────────┐
|
||
│ Summarize │ │ Continue │
|
||
│ Dialogue │ │ Dialogue │
|
||
└─────┬──────┘ └──────┬──────┘
|
||
│ │
|
||
▼ │
|
||
┌────────────┐ │
|
||
│ Medium Mem │ │
|
||
│ Save Sum │ │
|
||
└─────┬──────┘ │
|
||
│ │
|
||
└────────┬───────┘
|
||
▼
|
||
┌─────────────┐
|
||
│ Router/LLM │
|
||
│ Generate │
|
||
└──────┬──────┘
|
||
▼
|
||
┌─────────────┐
|
||
│ Short Memory│
|
||
│ Save Reply │
|
||
└─────────────┘
|
||
```
|
||
|
||
### Data Flow
|
||
|
||
**1. Normal Message:**
|
||
```python
|
||
user_message
|
||
→ smart_reply()
|
||
→ load short_memory
|
||
→ check tokens
|
||
→ if < 24k: continue_dialogue()
|
||
→ if > 24k: summarize_dialogue() → continue_dialogue()
|
||
→ Router/LLM
|
||
→ save to short_memory
|
||
→ return response
|
||
```
|
||
|
||
**2. Reminder Request:**
|
||
```python
|
||
user_message ("Що я казав про X?")
|
||
→ smart_reply()
|
||
→ detect_reminder_request() → True
|
||
→ search long_memory(query="X")
|
||
→ retrieve relevant memories
|
||
→ continue_dialogue(context=memories)
|
||
→ Router/LLM
|
||
→ return response
|
||
```
|
||
|
||
**3. Summarization Trigger:**
|
||
```python
|
||
tokens > 24k
|
||
→ summarize_dialogue(short_memory)
|
||
→ Router/LLM (summarize)
|
||
→ save to medium_memory
|
||
→ save to long_memory (vector)
|
||
→ clear old short_memory
|
||
→ continue with new context
|
||
```
|
||
|
||
---
|
||
|
||
## 📦 File Structure
|
||
|
||
### Gateway Bot Implementation
|
||
|
||
```
|
||
gateway-bot/
|
||
├── hmm_memory.py # Core HMM Memory module
|
||
│ ├── ShortMemory # Redis/in-memory recent messages
|
||
│ ├── MediumMemory # Redis summaries
|
||
│ └── LongMemory # ChromaDB/RAG vector search
|
||
│
|
||
├── dialogue.py # Dialogue management
|
||
│ ├── continue_dialogue() # Main dialogue flow
|
||
│ ├── smart_reply() # Smart reply with RAG
|
||
│ ├── summarize_dialogue() # LLM summarization
|
||
│ └── helper functions # Token estimation, reminder detection
|
||
│
|
||
├── http_api.py # HTTP endpoints (to be updated)
|
||
│ └── /telegram/webhook # Message handler
|
||
│
|
||
├── requirements.txt # Python dependencies
|
||
├── Dockerfile # Docker build config
|
||
└── README.md # Module documentation
|
||
```
|
||
|
||
### Router Service Implementation
|
||
|
||
```
|
||
services/
|
||
├── memory/
|
||
│ ├── memory.py # Core Memory classes
|
||
│ │ ├── ShortMemory # Redis/in-memory fallback
|
||
│ │ ├── MediumMemory # Redis List summaries
|
||
│ │ ├── LongMemory # ChromaDB or RAG Service
|
||
│ │ └── Memory # Factory class
|
||
│ ├── graph_memory.py # Neo4j Graph Memory
|
||
│ │ ├── GraphMemory # Neo4j driver + queries
|
||
│ │ └── 5 query methods # Graph traversal
|
||
│ ├── init_neo4j.py # Neo4j schema initialization
|
||
│ └── __init__.py
|
||
│
|
||
├── dialogue/ # To be implemented
|
||
│ ├── service.py # Dialogue management
|
||
│ │ ├── continue_dialogue()
|
||
│ │ └── smart_reply()
|
||
│ └── __init__.py
|
||
│
|
||
└── router/
|
||
├── router_app.py # Main router (to be updated)
|
||
│ ├── POST /route
|
||
│ ├── POST /v1/dialogue/continue # To add
|
||
│ ├── GET /v1/memory/debug/{id} # To add
|
||
│ └── POST /v1/memory/search # To add
|
||
└── types.py # RouterRequest (add dialog_id)
|
||
```
|
||
|
||
---
|
||
|
||
## 🕸️ Neo4j Graph Memory Model
|
||
|
||
### Node Types
|
||
|
||
**User** — Користувач системи
|
||
- Properties: `user_id`, `name`, `created_at`
|
||
|
||
**Agent** — AI агент
|
||
- Properties: `agent_id`, `name`, `type`
|
||
|
||
**DAO** — MicroDAO
|
||
- Properties: `dao_id`, `name`, `created_at`
|
||
|
||
**Dialog** — Діалог
|
||
- Properties: `dialog_id`, `started_at`, `last_message_at`
|
||
|
||
**Summary** — Самарі діалогу
|
||
- Properties: `summary_id`, `text`, `emotion`, `created_at`
|
||
|
||
**Topic** — Тема/ключове слово
|
||
- Properties: `topic`, `mentioned_count`
|
||
|
||
### Relationship Types
|
||
|
||
**PARTICIPATED_IN** — User/Agent → Dialog
|
||
- Користувач/агент брав участь у діалозі
|
||
|
||
**ABOUT** — Dialog → DAO
|
||
- Діалог відбувався в контексті DAO
|
||
|
||
**CONTAINS** — Dialog → Summary
|
||
- Діалог містить самарі
|
||
|
||
**MENTIONS** — Summary → Topic
|
||
- Самарі згадує тему
|
||
|
||
### Example Graph
|
||
|
||
```
|
||
(User:tg:123)
|
||
│
|
||
└─[PARTICIPATED_IN]→ (Dialog:d1)
|
||
│
|
||
├─[ABOUT]→ (DAO:greenfood)
|
||
│
|
||
└─[CONTAINS]→ (Summary:s1)
|
||
│
|
||
├─[MENTIONS]→ (Topic:pizza)
|
||
└─[MENTIONS]→ (Topic:delivery)
|
||
```
|
||
|
||
### Cypher Queries
|
||
|
||
**1. Get recent summaries for dialog:**
|
||
```cypher
|
||
MATCH (d:Dialog {dialog_id: $dialog_id})-[:CONTAINS]->(s:Summary)
|
||
RETURN s ORDER BY s.created_at DESC LIMIT 10
|
||
```
|
||
|
||
**2. Get related context for user:**
|
||
```cypher
|
||
MATCH (u:User {user_id: $user_id})-[:PARTICIPATED_IN]->(d:Dialog)
|
||
-[:CONTAINS]->(s:Summary)
|
||
RETURN s ORDER BY s.created_at DESC LIMIT 20
|
||
```
|
||
|
||
**3. Search summaries by topic:**
|
||
```cypher
|
||
MATCH (s:Summary)-[:MENTIONS]->(t:Topic)
|
||
WHERE t.topic CONTAINS $topic
|
||
RETURN s ORDER BY s.created_at DESC
|
||
```
|
||
|
||
---
|
||
|
||
## 🔧 Configuration Details
|
||
|
||
### Redis Configuration
|
||
|
||
**Short Memory:**
|
||
- **Key pattern:** `hmm:short:{dao_id}:{user_id}`
|
||
- **Type:** List (FIFO)
|
||
- **Max size:** 20 messages (configurable)
|
||
- **TTL:** 7 days
|
||
|
||
**Medium Memory:**
|
||
- **Key pattern:** `hmm:medium:{dao_id}:{user_id}`
|
||
- **Type:** List of JSON
|
||
- **Max size:** 20 summaries
|
||
- **TTL:** 30 days
|
||
|
||
### ChromaDB Configuration
|
||
|
||
**Collection:** `hmm_long_memory`
|
||
- **Distance metric:** Cosine similarity
|
||
- **Embedding model:** Automatic (via ChromaDB)
|
||
- **Metadata fields:**
|
||
- `dao_id`: DAO identifier
|
||
- `user_id`: User identifier
|
||
- `timestamp`: Creation time
|
||
- `emotion`: Detected emotion
|
||
- `key_points`: List of key topics
|
||
|
||
### RAG Service Integration
|
||
|
||
**Endpoint:** `POST /search`
|
||
- **Request:**
|
||
```json
|
||
{
|
||
"query": "user query text",
|
||
"dao_id": "dao-id",
|
||
"user_id": "user-id",
|
||
"top_k": 5
|
||
}
|
||
```
|
||
- **Response:**
|
||
```json
|
||
{
|
||
"results": [
|
||
{"text": "...", "score": 0.95, "metadata": {...}}
|
||
]
|
||
}
|
||
```
|
||
|
||
---
|
||
|
||
## 🧪 Testing
|
||
|
||
### Unit Tests (To be implemented)
|
||
|
||
**`tests/test_hmm_memory.py`:**
|
||
```bash
|
||
# Test ShortMemory
|
||
- test_add_message()
|
||
- test_get_recent()
|
||
- test_fifo_rotation()
|
||
- test_redis_fallback()
|
||
|
||
# Test MediumMemory
|
||
- test_add_summary()
|
||
- test_get_summaries()
|
||
- test_rotation()
|
||
|
||
# Test LongMemory
|
||
- test_add_memory()
|
||
- test_search()
|
||
- test_rag_fallback()
|
||
```
|
||
|
||
**`tests/test_dialogue.py`:**
|
||
```bash
|
||
# Test dialogue functions
|
||
- test_continue_dialogue()
|
||
- test_smart_reply()
|
||
- test_summarize_dialogue()
|
||
- test_detect_reminder()
|
||
- test_token_estimation()
|
||
```
|
||
|
||
### Integration Tests
|
||
|
||
**Test Scenario 1: Normal Dialogue**
|
||
```python
|
||
# 1. Send message
|
||
response = smart_reply(
|
||
user_id="test_user",
|
||
dao_id="test_dao",
|
||
message="Hello!"
|
||
)
|
||
|
||
# 2. Verify short memory updated
|
||
assert len(short_memory.get_recent(...)) == 2 # user + assistant
|
||
|
||
# 3. Verify response
|
||
assert "Hello" in response
|
||
```
|
||
|
||
**Test Scenario 2: Reminder Request**
|
||
```python
|
||
# 1. Add some memories
|
||
long_memory.add_memory(text="User likes pizza", ...)
|
||
|
||
# 2. Ask reminder
|
||
response = smart_reply(
|
||
user_id="test_user",
|
||
dao_id="test_dao",
|
||
message="What did I say about pizza?"
|
||
)
|
||
|
||
# 3. Verify long memory searched
|
||
assert "pizza" in response
|
||
```
|
||
|
||
**Test Scenario 3: Auto-summarization**
|
||
```python
|
||
# 1. Add many messages (>24k tokens)
|
||
for i in range(100):
|
||
short_memory.add_message(...)
|
||
|
||
# 2. Send message
|
||
response = smart_reply(...)
|
||
|
||
# 3. Verify summarization triggered
|
||
assert len(medium_memory.get_summaries(...)) > 0
|
||
assert len(short_memory.get_recent(...)) < 100
|
||
```
|
||
|
||
### E2E Test via Gateway
|
||
|
||
```bash
|
||
# 1. Send normal message
|
||
curl -X POST http://localhost:9300/telegram/webhook \
|
||
-H "Content-Type: application/json" \
|
||
-d '{
|
||
"message": {
|
||
"from": {"id": 123, "username": "test"},
|
||
"chat": {"id": 123},
|
||
"text": "Hello bot!"
|
||
}
|
||
}'
|
||
|
||
# 2. Send many messages to trigger summarization
|
||
for i in {1..50}; do
|
||
curl -X POST http://localhost:9300/telegram/webhook ...
|
||
done
|
||
|
||
# 3. Send reminder request
|
||
curl -X POST http://localhost:9300/telegram/webhook \
|
||
-H "Content-Type: application/json" \
|
||
-d '{
|
||
"message": {
|
||
"from": {"id": 123, "username": "test"},
|
||
"chat": {"id": 123},
|
||
"text": "What did I say about pizza?"
|
||
}
|
||
}'
|
||
|
||
# 4. Verify response contains relevant context
|
||
```
|
||
|
||
---
|
||
|
||
## 🚀 Integration Status
|
||
|
||
### 1. Gateway Bot Integration
|
||
|
||
**✅ Modules Created:**
|
||
- `gateway-bot/hmm_memory.py`
|
||
- `gateway-bot/dialogue.py`
|
||
|
||
**⏳ To be integrated:**
|
||
- `gateway-bot/http_api.py` — Update `/telegram/webhook` handler
|
||
|
||
---
|
||
|
||
### 2. Router Service Integration
|
||
|
||
**✅ Modules Created:**
|
||
- `services/memory/memory.py` — Core Memory classes
|
||
- `ShortMemory` (Redis/in-memory)
|
||
- `MediumMemory` (Redis List)
|
||
- `LongMemory` (ChromaDB or RAG Service)
|
||
- `Memory` (Factory class)
|
||
- `services/memory/graph_memory.py` — Neo4j Graph Memory
|
||
- `GraphMemory` (Neo4j driver)
|
||
- 5 query methods for graph traversal
|
||
- Feature flag support
|
||
- `services/memory/init_neo4j.py` — Neo4j initialization
|
||
- Constraints creation
|
||
- Indexes creation
|
||
|
||
**⏳ To be implemented:**
|
||
- `services/dialogue/service.py` — Dialogue management
|
||
- `continue_dialogue()`
|
||
- `smart_reply()`
|
||
- API endpoints in `router_app.py` or `http_api.py`:
|
||
- `POST /v1/dialogue/continue`
|
||
- `GET /v1/memory/debug/{dialog_id}`
|
||
- `POST /v1/memory/search`
|
||
- Update `RouterRequest` model with `dialog_id`
|
||
- Configuration and environment variables
|
||
- Tests
|
||
|
||
**📝 Documentation:**
|
||
- [docs/cursor/hmm_memory_router_task.md](./docs/cursor/hmm_memory_router_task.md) — Detailed implementation task
|
||
|
||
**🎯 Features:**
|
||
- ✅ Neo4j not used (left for future)
|
||
- ✅ Fallback modes (works without Redis/ChromaDB)
|
||
- ✅ RAG Service as ChromaDB alternative
|
||
- ✅ Ready for Router integration
|
||
|
||
---
|
||
|
||
### Gateway Bot Integration (Original)
|
||
|
||
### Integration Steps
|
||
|
||
**1. Update `http_api.py`:**
|
||
|
||
```python
|
||
# Before:
|
||
async def telegram_webhook(update: TelegramUpdate):
|
||
message = update.message.text
|
||
response = await router_client.route_request(...)
|
||
return response
|
||
|
||
# After:
|
||
from dialogue import smart_reply
|
||
|
||
async def telegram_webhook(update: TelegramUpdate):
|
||
message = update.message.text
|
||
user_id = f"tg:{update.message.from_.id}"
|
||
dao_id = get_dao_id(update) # from context or default
|
||
|
||
# Use smart_reply instead of direct router call
|
||
response = await smart_reply(
|
||
user_id=user_id,
|
||
dao_id=dao_id,
|
||
message=message
|
||
)
|
||
|
||
return response
|
||
```
|
||
|
||
**2. Initialize HMM Memory on startup:**
|
||
|
||
```python
|
||
# http_api.py
|
||
from hmm_memory import ShortMemory, MediumMemory, LongMemory
|
||
|
||
@asynccontextmanager
|
||
async def lifespan(app: FastAPI):
|
||
# Initialize memories
|
||
global short_memory, medium_memory, long_memory
|
||
|
||
short_memory = ShortMemory(redis_url=settings.REDIS_URL)
|
||
medium_memory = MediumMemory(redis_url=settings.REDIS_URL)
|
||
long_memory = LongMemory(
|
||
chroma_path=settings.CHROMA_PATH,
|
||
rag_service_url=settings.RAG_SERVICE_URL
|
||
)
|
||
|
||
yield
|
||
|
||
# Cleanup
|
||
await short_memory.close()
|
||
await medium_memory.close()
|
||
```
|
||
|
||
**3. Update Docker Compose:**
|
||
|
||
Already done ✅ — environment variables added.
|
||
|
||
**4. Test:**
|
||
|
||
```bash
|
||
# Restart gateway service
|
||
docker-compose restart gateway
|
||
|
||
# Check logs
|
||
docker-compose logs -f gateway | grep "HMM Memory"
|
||
|
||
# Send test message via Telegram
|
||
```
|
||
|
||
---
|
||
|
||
## 📊 Monitoring
|
||
|
||
### Metrics to Track
|
||
|
||
**Memory Usage:**
|
||
- Short memory size (messages per user)
|
||
- Medium memory size (summaries per user)
|
||
- Long memory collection size
|
||
|
||
**Performance:**
|
||
- Token estimation time
|
||
- Summarization time
|
||
- RAG search latency
|
||
- Redis response time
|
||
|
||
**Business Metrics:**
|
||
- Summarization trigger rate
|
||
- Reminder request rate
|
||
- Average dialogue length (before summarization)
|
||
|
||
### Monitoring Commands
|
||
|
||
```bash
|
||
# Redis stats
|
||
docker exec -it redis redis-cli INFO memory
|
||
|
||
# Check short memory keys
|
||
docker exec -it redis redis-cli KEYS "hmm:short:*"
|
||
|
||
# Check medium memory keys
|
||
docker exec -it redis redis-cli KEYS "hmm:medium:*"
|
||
|
||
# ChromaDB stats (if using local)
|
||
curl http://localhost:8000/api/v1/collections/hmm_long_memory
|
||
```
|
||
|
||
---
|
||
|
||
## 📊 Neo4j Visualization & Monitoring
|
||
|
||
### Grafana Dashboard
|
||
|
||
**Status:** ✅ Implemented
|
||
|
||
**Setup:**
|
||
- ✅ Grafana added to `docker-compose.yml`
|
||
- ✅ Automatic Neo4j data source provisioning
|
||
- ✅ Pre-configured dashboard with 4 panels
|
||
- ✅ Automatic dashboard loading on startup
|
||
|
||
**Dashboard Panels:**
|
||
1. **Entity Counts** — Кількість DAO/агентів/користувачів
|
||
2. **Average Agents per DAO** — Середня кількість агентів
|
||
3. **Users Distribution by DAO** — Розподіл користувачів
|
||
4. **Summary Activity Over Time** — Активність самарі за часом
|
||
|
||
**File Structure:**
|
||
```
|
||
grafana/
|
||
├── provisioning/
|
||
│ ├── datasources/
|
||
│ │ └── neo4j.yml # Auto Neo4j connection
|
||
│ └── dashboards/
|
||
│ └── default.yml # Dashboard config
|
||
└── dashboards/
|
||
└── dao-agents-users-overview.json # Dashboard JSON
|
||
```
|
||
|
||
**Access:**
|
||
- **URL:** `http://localhost:3000`
|
||
- **Default credentials:** `admin / admin`
|
||
- **Dashboard:** Home → Dashboards → "DAO Agents Users Overview"
|
||
|
||
**Quick Start:**
|
||
```bash
|
||
# Start Grafana and Neo4j
|
||
docker-compose up -d grafana neo4j
|
||
|
||
# Check logs
|
||
docker-compose logs -f grafana
|
||
|
||
# Open browser
|
||
open http://localhost:3000
|
||
```
|
||
|
||
---
|
||
|
||
### Neo4j Bloom (Graph Visualization)
|
||
|
||
**Status:** ✅ Configuration documented
|
||
|
||
**What is Bloom:**
|
||
- Visual graph exploration tool
|
||
- Natural language queries
|
||
- Interactive graph visualization
|
||
- Built-in Neo4j Browser (Community Edition)
|
||
- Neo4j Bloom (Enterprise Edition)
|
||
|
||
**Access:**
|
||
- **Neo4j Browser:** `http://localhost:7474`
|
||
- **Bloom:** `http://localhost:7474/bloom` (Enterprise only)
|
||
- **Credentials:** `neo4j / password`
|
||
|
||
**Bloom Perspective Configuration:**
|
||
|
||
**Node Styles:**
|
||
- **User** — 👤 Blue color, `user_id` as caption
|
||
- **Agent** — 🤖 Green color, `name` as caption
|
||
- **DAO** — 🏢 Orange color, `dao_id` as caption
|
||
- **Dialog** — 💬 Purple color, `dialog_id` as caption
|
||
- **Summary** — 📝 Gray color, `summary_id` as caption
|
||
- **Topic** — 🏷️ Yellow color, `topic` as caption
|
||
|
||
**Search Phrases Examples:**
|
||
1. **"Show me all users"**
|
||
- `MATCH (u:User) RETURN u LIMIT 50`
|
||
|
||
2. **"Find dialogs for {user}"**
|
||
- `MATCH (u:User {user_id: $user})-[:PARTICIPATED_IN]->(d:Dialog) RETURN u, d`
|
||
|
||
3. **"What topics does {user} discuss?"**
|
||
- `MATCH (u:User {user_id: $user})-[:PARTICIPATED_IN]->(d:Dialog)-[:CONTAINS]->(s:Summary)-[:MENTIONS]->(t:Topic) RETURN u, t, COUNT(t) AS mentions`
|
||
|
||
4. **"Show me {dao} activity"**
|
||
- `MATCH (dao:DAO {dao_id: $dao})<-[:ABOUT]-(d:Dialog) RETURN dao, d LIMIT 20`
|
||
|
||
5. **"Who talks about {topic}?"**
|
||
- `MATCH (t:Topic {topic: $topic})<-[:MENTIONS]-(s:Summary)<-[:CONTAINS]-(d:Dialog)<-[:PARTICIPATED_IN]-(u:User) RETURN t, u, COUNT(u) AS conversations`
|
||
|
||
**Documentation:**
|
||
- [README_NEO4J_VISUALIZATION.md](./README_NEO4J_VISUALIZATION.md) — Quick start guide
|
||
- [docs/cursor/neo4j_visualization_task.md](./docs/cursor/neo4j_visualization_task.md) — Implementation task
|
||
- [docs/cursor/neo4j_bloom_perspective.md](./docs/cursor/neo4j_bloom_perspective.md) — Bloom configuration
|
||
|
||
**Quick Start:**
|
||
```bash
|
||
# Start Neo4j
|
||
docker-compose up -d neo4j
|
||
|
||
# Wait for startup (check logs)
|
||
docker-compose logs -f neo4j
|
||
|
||
# Open Neo4j Browser
|
||
open http://localhost:7474
|
||
|
||
# Login and explore graph
|
||
# Use Cypher queries from Neo4j Graph Memory Model section
|
||
```
|
||
|
||
---
|
||
|
||
## 🔔 Prometheus Monitoring & Alerting
|
||
|
||
### Neo4j Prometheus Exporter
|
||
|
||
**Status:** ✅ Implemented
|
||
|
||
**Service:** `services/neo4j-exporter/`
|
||
- ✅ `neo4j_exporter/main.py` — FastAPI exporter with `/metrics` endpoint
|
||
- ✅ `Dockerfile` — Container build
|
||
- ✅ `requirements.txt` — Dependencies (fastapi, prometheus-client, neo4j)
|
||
|
||
**Metrics Collected:**
|
||
|
||
**1. Health Metrics:**
|
||
- `neo4j_up` — Доступність Neo4j (1 = up, 0 = down)
|
||
- `neo4j_exporter_scrape_duration_seconds` — Тривалість scrape
|
||
- `neo4j_exporter_errors_total{type}` — Помилки exporter
|
||
- `neo4j_cypher_query_duration_seconds{query}` — Тривалість Cypher запитів
|
||
|
||
**2. Graph Metrics:**
|
||
- `neo4j_nodes_total{label}` — Кількість вузлів по labels (User, Agent, DAO, Dialog, Summary, Topic)
|
||
- `neo4j_relationships_total{type}` — Кількість зв'язків по типах
|
||
|
||
**3. Business Metrics:**
|
||
- `neo4j_summaries_per_day{day}` — Самарі по днях (останні 7 днів)
|
||
- `neo4j_active_daos_last_7d` — Активні DAO за 7 днів
|
||
- `neo4j_avg_agents_per_dao` — Середня кількість агентів на DAO
|
||
- `neo4j_avg_users_per_dao` — Середня кількість користувачів на DAO
|
||
|
||
**Access:**
|
||
- **Exporter:** `http://localhost:9091/metrics`
|
||
- **Prometheus:** `http://localhost:9090`
|
||
- **Grafana:** `http://localhost:3000` (Prometheus data source auto-configured)
|
||
|
||
**Quick Start:**
|
||
```bash
|
||
# Start exporter, Prometheus, Neo4j
|
||
docker-compose up -d neo4j-exporter prometheus neo4j
|
||
|
||
# Check exporter metrics
|
||
curl http://localhost:9091/metrics
|
||
|
||
# Open Prometheus
|
||
open http://localhost:9090
|
||
|
||
# Check targets status: Status → Targets
|
||
```
|
||
|
||
---
|
||
|
||
### Prometheus Configuration
|
||
|
||
**File:** `prometheus/prometheus.yml`
|
||
|
||
**Scrape Configs:**
|
||
```yaml
|
||
scrape_configs:
|
||
- job_name: 'neo4j-exporter'
|
||
static_configs:
|
||
- targets: ['neo4j-exporter:9091']
|
||
scrape_interval: 15s
|
||
|
||
- job_name: 'prometheus'
|
||
static_configs:
|
||
- targets: ['localhost:9090']
|
||
```
|
||
|
||
**Alerting Rules:** `alerting/neo4j_alerts.yml` (11 rules)
|
||
|
||
**Alertmanager:** Optional (can be added for notifications)
|
||
|
||
---
|
||
|
||
### Alerting Rules
|
||
|
||
**Status:** ✅ Implemented
|
||
|
||
**File:** `alerting/neo4j_alerts.yml`
|
||
|
||
**3 Groups, 11 Rules:**
|
||
|
||
#### **1. Health Alerts (4 rules) — Critical**
|
||
|
||
**Neo4jDown:**
|
||
- Критичний alert коли Neo4j недоступний > 2 хвилин
|
||
- Severity: `critical`
|
||
- Action: Check Neo4j logs, restart service
|
||
|
||
**Neo4jExporterHighErrors:**
|
||
- Alert коли exporter має > 5 помилок за 5 хвилин
|
||
- Severity: `warning`
|
||
- Action: Check exporter logs, verify Neo4j connectivity
|
||
|
||
**Neo4jSlowQueries:**
|
||
- Alert коли Cypher запити > 2 секунд
|
||
- Severity: `warning`
|
||
- Action: Optimize queries, add indexes
|
||
|
||
**Neo4jExporterDown:**
|
||
- Alert коли exporter недоступний > 2 хвилин
|
||
- Severity: `warning`
|
||
- Action: Restart exporter container
|
||
|
||
#### **2. Business Alerts (5 rules) — Monitoring**
|
||
|
||
**NoSummariesCreatedToday:**
|
||
- Alert якщо жодної самарі не створено сьогодні
|
||
- Severity: `warning`
|
||
- Action: Check dialogue service, verify memory system
|
||
|
||
**NoActiveDAOsLast7Days:**
|
||
- Alert якщо жодного активного DAO за 7 днів
|
||
- Severity: `info`
|
||
- Action: Marketing campaign, user onboarding
|
||
|
||
**LowAgentsPerDAO:**
|
||
- Alert якщо середня кількість агентів < 1
|
||
- Severity: `info`
|
||
- Action: Promote agent creation, onboarding flows
|
||
|
||
**LowUsersPerDAO:**
|
||
- Alert якщо середня кількість користувачів < 2
|
||
- Severity: `info`
|
||
- Action: User acquisition, engagement campaigns
|
||
|
||
**StalledGrowth:**
|
||
- Alert якщо немає росту самарі (< 5% change) за 3 дні
|
||
- Severity: `info`
|
||
- Action: Analyze trends, engagement campaigns
|
||
|
||
#### **3. Capacity Alerts (2 rules) — Planning**
|
||
|
||
**FastNodeGrowth:**
|
||
- Alert коли вузли ростуть > 20% за годину
|
||
- Severity: `info`
|
||
- Action: Monitor capacity, scale Neo4j
|
||
|
||
**FastRelationshipGrowth:**
|
||
- Alert коли зв'язки ростуть > 20% за годину
|
||
- Severity: `info`
|
||
- Action: Plan storage expansion
|
||
|
||
---
|
||
|
||
### Grafana Dashboard (Prometheus)
|
||
|
||
**File:** `grafana/dashboards/neo4j-prometheus-metrics.json`
|
||
|
||
**9 Panels:**
|
||
1. **Neo4j Health Status** — Up/Down status
|
||
2. **Exporter Scrape Duration** — Performance monitoring
|
||
3. **Nodes by Label** — Graph size over time
|
||
4. **Relationships by Type** — Graph structure
|
||
5. **Summaries per Day** — Activity trend
|
||
6. **Active DAOs (Last 7 Days)** — Engagement
|
||
7. **Average Agents per DAO** — Configuration metric
|
||
8. **Average Users per DAO** — Adoption metric
|
||
9. **Query Duration** — Performance optimization
|
||
|
||
**Access:** Grafana → Dashboards → "Neo4j Prometheus Metrics"
|
||
|
||
---
|
||
|
||
### Documentation
|
||
|
||
- [README_NEO4J_EXPORTER.md](./README_NEO4J_EXPORTER.md) — Quick start guide
|
||
- [docs/cursor/neo4j_prometheus_exporter_task.md](./docs/cursor/neo4j_prometheus_exporter_task.md) — Implementation task
|
||
- [docs/cursor/neo4j_alerting_rules_task.md](./docs/cursor/neo4j_alerting_rules_task.md) — Alerting rules documentation
|
||
|
||
---
|
||
|
||
## 📝 Next Steps
|
||
|
||
### Phase 1: Router Service Integration (Current Priority)
|
||
- [ ] **Create Dialogue Service** — `services/dialogue/service.py`
|
||
- [ ] `continue_dialogue()` — Main dialogue flow with auto-summarization
|
||
- [ ] `smart_reply()` — Smart reply with RAG search
|
||
- [ ] **Integrate GraphMemory:**
|
||
- [ ] Call `graph_memory.upsert_dialog_context()` після самарізації
|
||
- [ ] Call `graph_memory.query_relevant_summaries_for_dialog()` для контексту
|
||
- [ ] **Add API Endpoints** — Update `router_app.py`
|
||
- [ ] `POST /v1/dialogue/continue` — Continue dialogue
|
||
- [ ] `GET /v1/memory/debug/{dialog_id}` — Debug memory state
|
||
- [ ] `POST /v1/memory/search` — Search in long memory
|
||
- [ ] `GET /v1/memory/graph/{dialog_id}` — Graph visualization data
|
||
- [ ] **Update RouterRequest** — Add `dialog_id` field
|
||
- [ ] **Configuration** — Add environment variables
|
||
- [ ] **Initialize Neo4j Schema** — Run `init_neo4j.py` on startup
|
||
- [ ] **Tests** — Unit + integration tests for all memory layers
|
||
|
||
### Phase 2: Gateway Bot Integration
|
||
- [ ] **Integrate with Gateway Bot** — Update `gateway-bot/http_api.py`
|
||
- [ ] **Unit tests** — Test all memory functions
|
||
- [ ] **Integration tests** — Test full dialogue flow
|
||
- [ ] **E2E smoke test** — Test via Telegram webhook
|
||
|
||
### Phase 2: Enhancements
|
||
- [ ] **Accurate token counting** — Use `tiktoken` for exact count
|
||
- [ ] **Emotion detection** — Better emotion analysis in summarization
|
||
- [ ] **Memory analytics** — Dashboard for memory usage
|
||
- [ ] **User preferences** — Per-user memory settings
|
||
- ✅ **Neo4j Visualization** — Grafana dashboard + Bloom configuration (complete)
|
||
- [ ] **Graph-based recommendations** — Suggest related dialogues/topics
|
||
- [ ] **Additional Grafana panels** — More insights and metrics
|
||
|
||
### Phase 3: Advanced Features
|
||
- [ ] **Memory search API** — External API for memory queries
|
||
- [ ] **Cross-user memory** — Team/DAO level memory via graph
|
||
- [ ] **Memory export** — Export user memory for GDPR
|
||
- [ ] **Memory versioning** — Track memory changes over time
|
||
- [ ] **Graph ML** — Graph embeddings for better context retrieval
|
||
- [ ] **Temporal queries** — Time-based graph traversal
|
||
|
||
---
|
||
|
||
## 🔗 Related Documentation
|
||
|
||
- [INFRASTRUCTURE.md](./INFRASTRUCTURE.md) — Server infrastructure
|
||
- [RAG-INGESTION-STATUS.md](./RAG-INGESTION-STATUS.md) — RAG system status
|
||
- [WARP.md](./WARP.md) — Developer guide
|
||
- [docs/cursor/hmm_memory_implementation_task.md](./docs/cursor/hmm_memory_implementation_task.md) — HMM Memory (Gateway Bot)
|
||
- [docs/cursor/hmm_memory_router_task.md](./docs/cursor/hmm_memory_router_task.md) — HMM Memory (Router Service)
|
||
- [docs/cursor/neo4j_graph_memory_task.md](./docs/cursor/neo4j_graph_memory_task.md) — Neo4j Graph Memory
|
||
- [docs/cursor/HMM_MEMORY_SUMMARY.md](./docs/cursor/HMM_MEMORY_SUMMARY.md) — Implementation summary
|
||
|
||
---
|
||
|
||
**Статус:** ✅ Core Modules Complete
|
||
**✅ Gateway Bot:** `hmm_memory.py`, `dialogue.py` complete
|
||
**✅ Router Service:** `memory.py`, `graph_memory.py`, `init_neo4j.py` complete
|
||
**⏳ Next:** Dialogue Service + API endpoints + Neo4j integration
|
||
**Last Updated:** 2025-01-17 by WARP AI
|
||
**Maintained by:** Ivan Tytar & DAARION Team
|