Node Registry Service

Version: 0.1.0-stub
Status: 🟡 Stub Implementation (Infrastructure Ready)
Port: 9205 (Internal only)

Central registry for DAGI network nodes (Node #1, Node #2, Node #N).


Overview

Node Registry Service provides:

  • Node Registration: register new nodes in the DAGI network
  • Heartbeat Tracking: monitor node health and availability
  • Node Discovery: query available nodes and their capabilities
  • Profile Management: store node profiles (LLM configs, services, capabilities)

Current Implementation

Completed (Infrastructure)

  • FastAPI application with /health and /metrics endpoints
  • Docker container configuration
  • PostgreSQL database schema
  • docker-compose integration
  • Deployment script for Node #1

🚧 To Be Implemented (by Cursor)

  • Full REST API endpoints
  • Node registration logic
  • Heartbeat mechanism
  • Database integration (SQLAlchemy models)
  • Prometheus metrics export
  • Node discovery algorithms

Quick Start

Local Development

# Install dependencies
cd services/node-registry
pip install -r requirements.txt

# Set environment variables
export NODE_REGISTRY_DB_HOST=localhost
export NODE_REGISTRY_DB_PORT=5432
export NODE_REGISTRY_DB_NAME=node_registry
export NODE_REGISTRY_DB_USER=node_registry_user
export NODE_REGISTRY_DB_PASSWORD=your_password
export NODE_REGISTRY_HTTP_PORT=9205
export NODE_REGISTRY_ENV=development
export NODE_REGISTRY_LOG_LEVEL=debug

# Run service
python -m app.main

The service starts on http://localhost:9205.

Docker

# Build image
docker-compose build node-registry

# Start service
docker-compose up -d node-registry

# Check logs
docker-compose logs -f node-registry

# Check health
curl http://localhost:9205/health

Deploy to Node #1 (Production)

# From Node #2 (MacBook)
./scripts/deploy-node-registry.sh

This will:

  1. Initialize PostgreSQL database
  2. Configure environment variables
  3. Build Docker image
  4. Start service
  5. Configure firewall rules (internal access only)
  6. Verify deployment

API Endpoints

Health & Monitoring

GET /health

Health check endpoint (used by Docker, Prometheus, etc.)

Response:

{
  "status": "healthy",
  "service": "node-registry",
  "version": "0.1.0-stub",
  "environment": "production",
  "uptime_seconds": 3600.5,
  "timestamp": "2025-01-17T14:30:00Z",
  "database": {
    "connected": true,
    "host": "postgres",
    "port": 5432,
    "database": "node_registry"
  }
}
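
A caller usually needs a single yes/no answer from this payload. The following is a minimal sketch of how a client might interpret it, assuming that "healthy" means both the top-level status and the database connection are good; the function name is_healthy is hypothetical and not part of the service.

```python
import json


def is_healthy(payload: dict) -> bool:
    """Return True only if the service and its database both report OK."""
    database = payload.get("database") or {}
    return payload.get("status") == "healthy" and database.get("connected") is True


# Trimmed copy of the response body shown above.
sample = json.loads("""
{
  "status": "healthy",
  "service": "node-registry",
  "database": {"connected": true, "host": "postgres", "port": 5432}
}
""")

print(is_healthy(sample))
```

Treating a missing database block as unhealthy is a design choice of this sketch; a stricter or looser rule may suit the real service better.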

GET /metrics

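Metrics endpoint. The stub currently returns a JSON summary; export in the Prometheus text format is still listed under "To Be Implemented".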

Response:

{
  "service": "node-registry",
  "uptime_seconds": 3600.5,
  "total_nodes": 2,
  "active_nodes": 1,
  "timestamp": "2025-01-17T14:30:00Z"
}
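
Prometheus scrapes a line-based text format rather than JSON, so the planned exporter will need to translate these fields. The sketch below shows one possible translation of the numeric fields into gauge samples. The metric names (node_registry_uptime_seconds and so on) and the function name are assumptions made for illustration; the README does not define them.

```python
def to_prometheus_text(metrics: dict) -> str:
    """Render numeric fields of the JSON stub as Prometheus gauge samples."""
    # Hypothetical metric names; the README does not specify any.
    mapping = {
        "uptime_seconds": "node_registry_uptime_seconds",
        "total_nodes": "node_registry_total_nodes",
        "active_nodes": "node_registry_active_nodes",
    }
    lines = []
    for key, name in mapping.items():
        if key in metrics:
            lines.append(f"# TYPE {name} gauge")
            lines.append(f"{name} {metrics[key]}")
    # The text format expects a trailing newline after the last sample.
    return "\n".join(lines) + "\n"


stub_response = {
    "service": "node-registry",
    "uptime_seconds": 3600.5,
    "total_nodes": 2,
    "active_nodes": 1,
}
print(to_prometheus_text(stub_response))
```

In a real implementation a maintained client library would probably be preferable to hand-built strings; the sketch only shows the shape of the output.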

Node Management (Stub - To Be Implemented)

POST /api/v1/nodes/register

Register a new node

Status: 501 Not Implemented (stub)

POST /api/v1/nodes/{node_id}/heartbeat

Update node heartbeat

Status: 501 Not Implemented (stub)

GET /api/v1/nodes

List all registered nodes

Status: 501 Not Implemented (stub)

GET /api/v1/nodes/{node_id}

Get specific node information

Status: 501 Not Implemented (stub)


Database Schema

Tables

nodes

Core node registry

| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key |
| node_id | VARCHAR(255) | Unique node identifier (e.g. node-1-hetzner-gex44) |
| node_name | VARCHAR(255) | Human-readable name |
| node_role | VARCHAR(50) | production, development, backup |
| node_type | VARCHAR(50) | router, gateway, worker, etc. |
| ip_address | INET | Public IP |
| local_ip | INET | Local network IP |
| hostname | VARCHAR(255) | DNS hostname |
| status | VARCHAR(50) | online, offline, maintenance, degraded |
| last_heartbeat | TIMESTAMP | Last heartbeat time |
| registered_at | TIMESTAMP | Registration timestamp |
| updated_at | TIMESTAMP | Last update timestamp |
| metadata | JSONB | Additional node metadata |
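
Since node_role and status are stored as free-form VARCHAR columns, the application layer will probably have to enforce the allowed values itself. The following is a rough sketch of such a check using a plain dataclass with a subset of the columns. The class name NodeRecord and the default status of "offline" are assumptions; the eventual implementation is expected to use SQLAlchemy and Pydantic models instead.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Optional

# Allowed values taken from the column descriptions in the table above.
NODE_ROLES = {"production", "development", "backup"}
NODE_STATUSES = {"online", "offline", "maintenance", "degraded"}


@dataclass
class NodeRecord:
    node_id: str
    node_name: str
    node_role: str
    node_type: str  # open-ended ("router, gateway, worker, etc."), not validated
    status: str = "offline"  # assumed default until the first heartbeat
    ip_address: Optional[str] = None
    metadata: dict = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.node_role not in NODE_ROLES:
            raise ValueError(f"unknown node_role: {self.node_role!r}")
        if self.status not in NODE_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")
        if self.ip_address is not None:
            # Raises ValueError if the string is not a valid IPv4/IPv6 address.
            ipaddress.ip_address(self.ip_address)


node = NodeRecord(
    node_id="node-1-hetzner-gex44",
    node_name="Node #1",
    node_role="production",
    node_type="router",
)
```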

node_profiles

Node capabilities and configurations

| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key |
| node_id | UUID | Foreign key to nodes.id |
| profile_name | VARCHAR(255) | Profile identifier |
| profile_type | VARCHAR(50) | llm, service, capability |
| config | JSONB | Profile configuration |
| enabled | BOOLEAN | Profile active status |
| created_at | TIMESTAMP | Creation timestamp |
| updated_at | TIMESTAMP | Last update timestamp |

heartbeat_log

Historical heartbeat data

| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key |
| node_id | UUID | Foreign key to nodes.id |
| timestamp | TIMESTAMP | Heartbeat timestamp |
| status | VARCHAR(50) | Node status at heartbeat |
| metrics | JSONB | System metrics (CPU, RAM, etc.) |
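
The heartbeat mechanism is not implemented yet, but the schema suggests that a node's availability could be derived from nodes.last_heartbeat. The sketch below illustrates one simple rule: a node counts as online if its last heartbeat is recent enough. The 90-second timeout and the function name derive_status are assumptions, not values defined by this service.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Assumed timeout; the README does not specify a heartbeat interval.
HEARTBEAT_TIMEOUT = timedelta(seconds=90)


def derive_status(
    last_heartbeat: Optional[datetime],
    now: Optional[datetime] = None,
    timeout: timedelta = HEARTBEAT_TIMEOUT,
) -> str:
    """Map the age of the last heartbeat to 'online' or 'offline'."""
    if last_heartbeat is None:
        return "offline"  # node registered but never reported in
    now = now or datetime.now(timezone.utc)
    return "online" if now - last_heartbeat <= timeout else "offline"
```

The maintenance and degraded states from the nodes table would need extra input (for example an operator flag or the reported metrics) and are left out of this sketch.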

Environment Variables

| Variable | Default | Description |
|---|---|---|
| NODE_REGISTRY_DB_HOST | postgres | PostgreSQL host |
| NODE_REGISTRY_DB_PORT | 5432 | PostgreSQL port |
| NODE_REGISTRY_DB_NAME | node_registry | Database name |
| NODE_REGISTRY_DB_USER | node_registry_user | Database user |
| NODE_REGISTRY_DB_PASSWORD | - | Database password (required) |
| NODE_REGISTRY_HTTP_PORT | 9205 | HTTP server port |
| NODE_REGISTRY_ENV | production | Environment (development/production) |
| NODE_REGISTRY_LOG_LEVEL | info | Log level (debug/info/warning/error) |
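
As an illustration of how these variables and defaults could be consumed, here is a small standard-library sketch. The Settings class and load_settings function are hypothetical names; the actual application may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    db_host: str
    db_port: int
    db_name: str
    db_user: str
    db_password: str
    http_port: int
    env: str
    log_level: str


def load_settings(environ: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from the environment, using the defaults in the table above."""
    password = environ.get("NODE_REGISTRY_DB_PASSWORD")
    if not password:
        # The password has no default and is marked as required.
        raise RuntimeError("NODE_REGISTRY_DB_PASSWORD is required")
    return Settings(
        db_host=environ.get("NODE_REGISTRY_DB_HOST", "postgres"),
        db_port=int(environ.get("NODE_REGISTRY_DB_PORT", "5432")),
        db_name=environ.get("NODE_REGISTRY_DB_NAME", "node_registry"),
        db_user=environ.get("NODE_REGISTRY_DB_USER", "node_registry_user"),
        db_password=password,
        http_port=int(environ.get("NODE_REGISTRY_HTTP_PORT", "9205")),
        env=environ.get("NODE_REGISTRY_ENV", "production"),
        log_level=environ.get("NODE_REGISTRY_LOG_LEVEL", "info"),
    )


# Passing a plain dict keeps the example independent of the real environment.
settings = load_settings({"NODE_REGISTRY_DB_PASSWORD": "your_password"})
```

Failing fast on a missing password means a misconfigured container stops at startup instead of at the first database query.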

Security

Network Access

  • Port 9205: Internal network only (Node #1, Node #2, DAGI nodes)
  • Public Access: Blocked by firewall (UFW rules)
  • Authentication: To be implemented (API keys, JWT)

Firewall Rules (Node #1)

# Allow from local network
ufw allow from 192.168.1.0/24 to any port 9205 proto tcp

# Allow from Docker network
ufw allow from 172.16.0.0/12 to any port 9205 proto tcp

# Deny from external
ufw deny 9205/tcp
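
To reason about which source addresses these rules admit, the same two networks can be checked with Python's ipaddress module. This is only a way to sanity-check the CIDR ranges; it does not talk to UFW, and the function name is_allowed is hypothetical.

```python
import ipaddress

# The two source networks allowed by the UFW rules above.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.168.1.0/24"),  # local network
    ipaddress.ip_network("172.16.0.0/12"),   # Docker network range
]


def is_allowed(source_ip: str) -> bool:
    """Return True if source_ip falls inside one of the allowed networks."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_NETWORKS)


for candidate in ("192.168.1.42", "172.17.0.2", "8.8.8.8"):
    print(candidate, is_allowed(candidate))
```

Note that 172.16.0.0/12 spans 172.16.0.0 through 172.31.255.255, which is wider than a single Docker bridge network.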

Database Initialization

Manual Setup

# On Node #1
ssh root@144.76.224.179

# Copy SQL script to container
docker cp services/node-registry/migrations/init_node_registry.sql dagi-postgres:/tmp/

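# Run initialization (-f reads the file from inside the container;
# a host-side "<" redirect would look for /tmp/init_node_registry.sql on the host)
docker exec dagi-postgres psql -U postgres -f /tmp/init_node_registry.sql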

# Verify
docker exec dagi-postgres psql -U postgres -d node_registry -c "\dt"

Via Deployment Script

The deploy-node-registry.sh script automatically:

  1. Checks whether the database exists
  2. Creates the database and user if needed
  3. Generates a secure password
  4. Saves the password to .env
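
The script's contents are not reproduced here. Purely as an illustration of steps 3 and 4, a password could be generated and formatted as a .env entry along these lines in Python; the function names are hypothetical, and the real script may use a different tool entirely.

```python
import secrets


def generate_db_password(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


def env_line(password: str) -> str:
    """Format the password as a line suitable for a .env file."""
    return f"NODE_REGISTRY_DB_PASSWORD={password}"


password = generate_db_password()
line = env_line(password)
```

URL-safe output avoids characters such as quotes or spaces that would need escaping in a .env file or a connection string.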

Monitoring & Health

Docker Health Check

docker inspect dagi-node-registry | grep -A 5 Health

Prometheus Scraping

Add to prometheus.yml:

scrape_configs:
  - job_name: 'node-registry'
    static_configs:
      - targets: ['node-registry:9205']
    scrape_interval: 30s

Grafana Dashboard

Add panel with query:

up{job="node-registry"}

Development

Testing Locally

# Run with development settings
export NODE_REGISTRY_ENV=development
python -m app.main

# Access interactive API docs (macOS; on Linux use xdg-open or a browser)
open http://localhost:9205/docs

Adding New Endpoints

  1. Edit app/main.py
  2. Add route with @app.get() or @app.post()
  3. Add Pydantic models for request/response
  4. Implement database logic (when ready)
  5. Test via /docs or curl
  6. Update this README

Troubleshooting

Service won't start

# Check logs
docker logs dagi-node-registry

# Check database connection
docker exec dagi-postgres pg_isready

# Check environment variables
docker exec dagi-node-registry env | grep NODE_REGISTRY

Database connection error

# Verify database exists
docker exec dagi-postgres psql -U postgres -l | grep node_registry

# Verify user exists
docker exec dagi-postgres psql -U postgres -c "\du" | grep node_registry_user

# Test connection
docker exec dagi-postgres psql -U node_registry_user -d node_registry -c "SELECT 1"

Port not accessible

# Check firewall rules
sudo ufw status | grep 9205

# Check if service is listening
netstat -tlnp | grep 9205

# Test from Node #2
curl http://144.76.224.179:9205/health

Next Steps (for Cursor)

  1. Implement Database Layer
    • SQLAlchemy models for nodes, profiles, heartbeat
    • Database connection pool
    • Migration system (Alembic)

  2. Implement API Endpoints
    • Node registration with validation
    • Heartbeat updates with metrics
    • Node listing with filters
    • Profile CRUD operations

  3. Add Authentication
    • API key-based auth
    • JWT tokens for inter-node communication
    • Rate limiting

  4. Add Monitoring
    • Prometheus metrics export
    • Health check improvements
    • Performance metrics

  5. Add Tests
    • Unit tests (pytest)
    • Integration tests
    • API endpoint tests
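
For the API key item in step 3, one detail worth getting right early is comparing keys in constant time. The sketch below shows that idea with the standard library only; the function name and the way the expected key is supplied are assumptions, and none of this exists in the service yet.

```python
import hmac


def api_key_is_valid(presented: str, expected: str) -> bool:
    """Compare two API keys without leaking timing information."""
    # hmac.compare_digest takes time independent of where the inputs differ.
    return hmac.compare_digest(presented.encode("utf-8"), expected.encode("utf-8"))


# Hypothetical key, for illustration only.
expected_key = "example-key-for-illustration"
print(api_key_is_valid("example-key-for-illustration", expected_key))
print(api_key_is_valid("wrong-key", expected_key))
```

A plain == comparison would also work functionally, but it can return sooner for keys that differ early, which is the behavior compare_digest is designed to avoid.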


Last Updated: 2025-01-17
Maintained by: Ivan Tytar & DAARION Team
Status: 🟡 Infrastructure Ready — Awaiting Cursor implementation