snapshot: NODE1 production state 2026-02-09
Complete snapshot of /opt/microdao-daarion/ from NODE1 (144.76.224.179).
This represents the actual running production code that has diverged
significantly from the previous main branch.
Key changes from old main:
- Gateway (http_api.py): expanded from ~40KB to 164KB with full agent support
- Router: new /v1/agents/{id}/infer endpoint with vision + DeepSeek routing
- Behavior Policy: SOWA v2.2 (3-level: FULL/ACK/SILENT)
- Agent Registry: config/agent_registry.yml as single source of truth
- 13 agents configured (was 3)
- Memory service integration
- CrewAI teams and roles
Excluded from snapshot: venv/, .env, data/, backups, .tgz archives
Co-authored-by: Cursor <cursoragent@cursor.com>
@@ -1,29 +0,0 @@
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    procps \
    findutils \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Set working directory
WORKDIR /app

# Copy requirements
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Copy agent code
COPY security_agent.py .

# Create log directory
RUN mkdir -p /var/log && chmod 777 /var/log

# Run as non-root user (but needs access to host processes)
# Note: Will need --privileged or proper capabilities in docker-compose
ENV PYTHONUNBUFFERED=1

CMD ["python", "security_agent.py"]
@@ -1,317 +0,0 @@
# 🤖 AI Security Agent - Intelligent Crypto Miner Detection

AI-powered security agent that uses a local LLM (Ollama qwen3:8b) to detect and mitigate cryptocurrency mining malware on NODE1.

## Features

### 🔍 Intelligent Detection
- **LLM-powered analysis**: Uses Ollama qwen3:8b for contextual threat analysis
- **Multi-signal detection**: CPU usage, process names, network connections, filesystem
- **Known miner signatures**: Detects patterns from previous incidents
- **Fallback rules**: Works even if the LLM is unavailable

### ⚡ Auto-Mitigation
- **Automatic response**: Kills malicious processes (>=70% confidence)
- **File cleanup**: Removes suspicious executables from /tmp
- **Selective action**: Lower-confidence threats are logged for manual review

### 📊 Monitoring
- **Real-time scanning**: Continuous monitoring every 5 minutes
- **Smart optimization**: Skips LLM analysis if the system is clean
- **Comprehensive logging**: Detailed logs at `/var/log/ai-security-agent.log`

## Known Threats Detected

From previous incidents on NODE1:

**Incident #3 (postgres:15-alpine):**
- `cpioshuf` - 1764% CPU
- `ipcalcpg_recvlogical` - Auto-restart variant
- `mysql` - 933% CPU

**Incident #4 (postgres:16-alpine):**
- `bzip2egrep` - 1694% CPU
- `flockresize` - 1628% CPU

**Common patterns:**
- Hidden directories: `/tmp/.perf.c/`
- Process masquerading: Disguised as `postgres`, `mysql`, etc.
- High CPU usage: >1000% (multi-threaded mining)
- Mining pool connections: Ports 3333, 4444, 5555, 7777, 8888, 9999, 14444
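The port pattern above is a simple set-membership test. A minimal sketch (the helper name `is_suspicious_remote` is illustrative; the port list matches this README and `security_agent.py`):

```python
# Common mining-pool ports, as listed above and checked in security_agent.py.
MINING_POOL_PORTS = {3333, 4444, 5555, 7777, 8888, 9999, 14444}

def is_suspicious_remote(port: int) -> bool:
    """True if a remote port matches a common mining-pool port."""
    return port in MINING_POOL_PORTS

# Stratum miners commonly connect on 3333; HTTPS traffic should not match.
assert is_suspicious_remote(3333)
assert not is_suspicious_remote(443)
```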
## Installation

### 1. Deploy to NODE1

```bash
# Copy service to NODE1
scp -r services/ai-security-agent root@144.76.224.179:/opt/microdao-daarion/services/

# SSH to NODE1
ssh root@144.76.224.179

# Navigate to service directory
cd /opt/microdao-daarion/services/ai-security-agent

# Build and start
docker compose up -d --build
```

### 2. Verify Deployment

```bash
# Check container status
docker ps | grep ai-security-agent

# View logs
docker logs -f ai-security-agent

# Check log file
tail -f logs/ai-security-agent.log
```

## Configuration

Environment variables (in `docker-compose.yml`):

| Variable | Default | Description |
|----------|---------|-------------|
| `OLLAMA_BASE_URL` | `http://host.docker.internal:11434` | Ollama API endpoint |
| `OLLAMA_MODEL` | `qwen3:8b` | LLM model for analysis |
| `CHECK_INTERVAL` | `300` | Scan interval in seconds (5 min) |
| `ALERT_THRESHOLD` | `0.7` | Confidence threshold for auto-mitigation |

## How It Works

### 1. Data Collection
Every 5 minutes, the agent collects:
- System load average and CPU usage
- Processes using >50% CPU
- Known miner process names
- Executable files in `/tmp` (created in the last 24h)
- Network connections to suspicious ports

### 2. Quick Check
If the system is clean (load <5, no suspicious activity):
- ✅ Skip LLM analysis
- Log "System clean"
- Wait for the next interval
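The quick check corresponds to this predicate (field names mirror `collect_system_metrics()` in `security_agent.py`):

```python
def system_clean(metrics: dict) -> bool:
    """Skip the LLM call when no signal is suspicious and load is low."""
    return (
        not metrics["high_cpu_processes"]
        and not metrics["suspicious_processes"]
        and not metrics["tmp_executables"]
        and not metrics["network_connections"]
        and metrics["cpu"]["load_avg"][0] < 5
    )

# Example: a clean snapshot skips the LLM entirely.
clean_metrics = {
    "high_cpu_processes": [], "suspicious_processes": [],
    "tmp_executables": [], "network_connections": [],
    "cpu": {"load_avg": (0.42, 0.35, 0.30)},
}
assert system_clean(clean_metrics)
```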
### 3. LLM Analysis
If suspicious activity is detected:
- 🧠 Send metrics to Ollama qwen3:8b
- The LLM analyzes with cybersecurity expertise
- Returns JSON with:
  - `threat_detected`: boolean
  - `confidence`: 0.0-1.0
  - `threat_type`: crypto_miner | suspicious_activity | false_positive
  - `indicators`: List of specific findings
  - `recommended_actions`: What to do
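For illustration, a well-formed response matching this schema might look like the following (the values here are hypothetical, not from a real incident); the agent extracts and parses it with `json.loads`:

```python
import json

# Hypothetical LLM output matching the schema described above.
raw = """{
  "threat_detected": true,
  "confidence": 0.95,
  "threat_type": "crypto_miner",
  "indicators": ["Known miner signature: bzip2egrep", "High CPU usage: 1694%"],
  "recommended_actions": ["Kill suspicious processes", "Remove /tmp executables"],
  "summary": "Known miner signature with sustained high CPU"
}"""

analysis = json.loads(raw)
assert analysis["threat_detected"] is True
assert analysis["confidence"] >= 0.7  # would cross the auto-mitigation threshold
```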
### 4. Auto-Mitigation
If confidence >= 70%:
- ⚡ Kill high CPU processes
- ⚡ Kill known miner processes
- ⚡ Remove suspicious /tmp files
- ⚡ Clean /tmp/.perf.c/
- 📝 Log all actions

If confidence < 70%:
- ⚠️ Log for manual review
- No automatic action

### 5. Fallback Mode
If the LLM fails:
- Use rule-based detection
- Check: load average, high CPU, known signatures, /tmp files, network
- Calculate confidence based on multiple indicators
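The additive scoring behind the fallback confidence can be sketched as follows (weights taken from `_fallback_analysis()` in `security_agent.py`; the function name here is illustrative):

```python
# Additive confidence scoring used by the rule-based fallback.
def fallback_confidence(high_load: bool, high_cpu: bool, known_sig: bool,
                        tmp_files: bool, net_conns: bool) -> float:
    confidence = 0.0
    if high_load:
        confidence += 0.3  # load average > 10
    if high_cpu:
        confidence += 0.3  # processes above 50% CPU
    if known_sig:
        confidence += 0.4  # known miner signature match
    if tmp_files:
        confidence += 0.2  # fresh executables in /tmp
    if net_conns:
        confidence += 0.3  # connections to mining-pool ports
    return min(confidence, 1.0)  # capped at 1.0

# High CPU plus a known signature already reaches the 0.7 auto-mitigation threshold.
assert abs(fallback_confidence(False, True, True, False, False) - 0.7) < 1e-9
```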
## Example Logs

### Clean System
```
[2026-01-10 10:00:00] [INFO] 🔍 Starting security scan...
[2026-01-10 10:00:01] [INFO] ✅ System clean (quick check)
```

### Threat Detected (Low Confidence)
```
[2026-01-10 10:05:00] [INFO] 🔍 Starting security scan...
[2026-01-10 10:05:01] [INFO] 🧠 Analyzing with AI (suspicious activity detected)...
[2026-01-10 10:05:05] [INFO] Analysis complete: threat=True, confidence=45%
[2026-01-10 10:05:05] [ALERT] 🚨 THREAT DETECTED (Incident #1)
[2026-01-10 10:05:05] [ALERT] Confidence: 45%
[2026-01-10 10:05:05] [ALERT] Type: suspicious_activity
[2026-01-10 10:05:05] [ALERT] Summary: High CPU process detected but no known signatures
[2026-01-10 10:05:05] [ALERT] ⚠️ Confidence 45% below threshold 70%, manual review recommended
```

### Threat Detected (High Confidence - Auto-Mitigation)
```
[2026-01-10 10:10:00] [INFO] 🔍 Starting security scan...
[2026-01-10 10:10:01] [INFO] 🧠 Analyzing with AI (suspicious activity detected)...
[2026-01-10 10:10:08] [INFO] Analysis complete: threat=True, confidence=95%
[2026-01-10 10:10:08] [ALERT] 🚨 THREAT DETECTED (Incident #2)
[2026-01-10 10:10:08] [ALERT] Confidence: 95%
[2026-01-10 10:10:08] [ALERT] Type: crypto_miner
[2026-01-10 10:10:08] [ALERT] Summary: Known miner signature 'bzip2egrep' detected with high CPU
[2026-01-10 10:10:08] [ALERT] 📍 Known miner signature: bzip2egrep (PID 123456)
[2026-01-10 10:10:08] [ALERT] 📍 Suspicious executable: /tmp/.perf.c/bzip2egrep
[2026-01-10 10:10:08] [ALERT] 📍 High CPU usage: 1694%
[2026-01-10 10:10:08] [ALERT] ⚡ EXECUTING AUTO-MITIGATION
[2026-01-10 10:10:08] [ACTION] Killing known miner PID 123456 (bzip2egrep)
[2026-01-10 10:10:08] [ACTION] Removing /tmp/.perf.c/bzip2egrep
[2026-01-10 10:10:08] [ACTION] Cleaning /tmp/.perf.c/
[2026-01-10 10:10:09] [ALERT] ✅ AUTO-MITIGATION COMPLETED
```
## Advantages Over Bash Script

### Old Script (`/root/monitor_scanning.sh`)
- ✅ Simple and fast
- ✅ No dependencies
- ❌ Rule-based only (can miss new variants)
- ❌ No contextual analysis
- ❌ Manual threshold tuning
- ❌ No learning capability

### New AI Agent
- ✅ **Contextual understanding**: LLM analyzes patterns holistically
- ✅ **Adaptive**: Can detect new miner variants by behavior
- ✅ **Confidence scoring**: Nuanced threat assessment
- ✅ **Detailed explanations**: Understands WHY something is suspicious
- ✅ **Future-proof**: Can be updated with new threat intelligence
- ✅ **Fallback safety**: Works even if the LLM fails
## Architecture

```
┌─────────────────────────────────────────┐
│            NODE1 Host System            │
│                                         │
│  ┌──────────────────────────────────┐   │
│  │  AI Security Agent (Container)   │   │
│  │                                  │   │
│  │  ┌────────────────────────────┐  │   │
│  │  │  1. Metric Collector       │  │   │
│  │  │  - psutil (CPU, procs)     │  │   │
│  │  │  - find (/tmp scan)        │  │   │
│  │  │  - network connections     │  │   │
│  │  └────────────────────────────┘  │   │
│  │                ↓                 │   │
│  │  ┌────────────────────────────┐  │   │
│  │  │  2. Quick Filter           │  │   │
│  │  │  - Skip if clean           │  │   │
│  │  └────────────────────────────┘  │   │
│  │                ↓                 │   │
│  │  ┌────────────────────────────┐  │   │
│  │  │  3. LLM Analyzer           │←─┼───┼─┐
│  │  │  - Ollama qwen3:8b         │  │   │ │
│  │  │  - Contextual AI           │  │   │ │
│  │  └────────────────────────────┘  │   │ │
│  │                ↓                 │   │ │
│  │  ┌────────────────────────────┐  │   │ │
│  │  │  4. Decision Engine        │  │   │ │
│  │  │  - Confidence threshold    │  │   │ │
│  │  └────────────────────────────┘  │   │ │
│  │                ↓                 │   │ │
│  │  ┌────────────────────────────┐  │   │ │
│  │  │  5. Auto-Mitigation        │  │   │ │
│  │  │  - Kill processes          │  │   │ │
│  │  │  - Clean files             │  │   │ │
│  │  └────────────────────────────┘  │   │ │
│  └──────────────────────────────────┘   │ │
│                                         │ │
│  ┌──────────────────────────────────┐   │ │
│  │  Ollama Service                  │   │ │
│  │  localhost:11434                 │◄──┼─┘
│  │  qwen3:8b (8B params)            │   │
│  └──────────────────────────────────┘   │
└─────────────────────────────────────────┘
```
## Monitoring Agent Health

```bash
# Check agent status
docker ps | grep ai-security-agent

# View real-time logs
docker logs -f ai-security-agent

# Check log file
tail -f /opt/microdao-daarion/services/ai-security-agent/logs/ai-security-agent.log

# Check resource usage
docker stats ai-security-agent

# Restart if needed
cd /opt/microdao-daarion/services/ai-security-agent
docker compose restart
```
## Troubleshooting

### Agent not detecting processes
**Issue**: Can't see host processes
**Fix**: Ensure `pid: host` is set in docker-compose.yml

### Can't kill processes
**Issue**: Permission denied
**Fix**: Ensure `privileged: true` is set in docker-compose.yml

### LLM connection failed
**Issue**: Can't reach Ollama
**Fix**: Check `OLLAMA_BASE_URL` and make sure Ollama is running:
```bash
curl http://localhost:11434/api/tags
```

### High memory usage
**Issue**: Agent using >512MB
**Fix**: Increase `CHECK_INTERVAL` (scan less often) or lower `num_predict` in the LLM call
## Security Considerations

### Privileges
- Agent runs with `privileged: true` to kill processes
- Has access to the host PID namespace
- Can modify the host /tmp directory

**Mitigation**: Agent runs in a Docker container with resource limits

### False Positives
- Agent requires 70% confidence for auto-kill
- Lower-confidence threats are logged for manual review
- Legitimate high-CPU processes might be flagged

**Mitigation**: Adjust `ALERT_THRESHOLD`, add a process whitelist if needed
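A process whitelist is still on the roadmap; one possible shape for such a filter, run before the kill loop, is sketched below (the whitelist entries and helper name are hypothetical, not part of the current agent):

```python
# Hypothetical whitelist filter applied before auto-mitigation.
# The names below are illustrative examples of known-good processes.
PROCESS_WHITELIST = {"ollama", "postgres", "python3"}

def filter_kill_candidates(procs: list) -> list:
    """Drop whitelisted process names from the list of kill candidates."""
    return [p for p in procs if p["name"] not in PROCESS_WHITELIST]

candidates = [{"name": "bzip2egrep", "pid": 123}, {"name": "ollama", "pid": 456}]
# Only the miner survives the filter; the whitelisted process is spared.
assert [p["name"] for p in filter_kill_candidates(candidates)] == ["bzip2egrep"]
```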
## Future Improvements

- [ ] **Telegram alerts**: Send notifications on threat detection
- [ ] **Prometheus metrics**: Expose threat count, confidence scores
- [ ] **Process whitelist**: Exclude known-good high-CPU processes
- [ ] **Network blocking**: Block mining pool IPs via iptables
- [ ] **Image scanning**: Scan Docker images before they run
- [ ] **Historical analysis**: Track patterns over time
- [ ] **Multi-node**: Extend to NODE2 and NODE3

## Contributing

To update threat signatures:

1. Edit `KNOWN_MINER_SIGNATURES` in `security_agent.py`
2. Rebuild the container: `docker compose up -d --build`

To adjust detection logic:

1. Modify `_fallback_analysis()` for rule-based detection
2. Update the LLM prompt in `analyze_with_llm()` for AI analysis

---

**Version**: 1.0.0
**Created**: 2026-01-10
**Maintained by**: DAARION Security Team
**Status**: ✅ Production Ready
@@ -1,56 +0,0 @@
version: '3.9'

services:
  ai-security-agent:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: ai-security-agent
    restart: unless-stopped

    # CRITICAL: Need host PID namespace to see all processes
    pid: host

    # Need elevated privileges to kill processes
    privileged: true

    environment:
      - OLLAMA_BASE_URL=http://172.17.0.1:11434
      - OLLAMA_MODEL=qwen3:8b
      - CHECK_INTERVAL=300  # 5 minutes
      - ALERT_THRESHOLD=0.7  # 70% confidence for auto-mitigation
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
      - TELEGRAM_CHAT_ID=${TELEGRAM_CHAT_ID}

    volumes:
      # Mount host /tmp to scan for malware
      - /tmp:/tmp
      # Mount host /proc for process information
      - /proc:/host/proc:ro
      # Persistent logs
      - ./logs:/var/log

    networks:
      - dagi-network

    # Resource limits (agent should be lightweight)
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.25'
          memory: 128M

    healthcheck:
      test: ["CMD", "pgrep", "-f", "security_agent.py"]
      interval: 60s
      timeout: 10s
      retries: 3
      start_period: 30s

networks:
  dagi-network:
    external: true
    name: dagi-network
@@ -1,2 +0,0 @@
psutil==5.9.8
requests==2.31.0
@@ -1,404 +0,0 @@
#!/usr/bin/env python3
"""
AI Security Agent - NODE1 Crypto Miner Detection
Uses local LLM (Ollama qwen3:8b) for intelligent threat detection
"""

import os
import json
import time
import subprocess
import psutil
import requests
from datetime import datetime
from pathlib import Path
from typing import Dict, List, Any

# Configuration
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "qwen3:8b")
CHECK_INTERVAL = int(os.getenv("CHECK_INTERVAL", "300"))  # 5 minutes
LOG_FILE = "/var/log/ai-security-agent.log"
ALERT_THRESHOLD = float(os.getenv("ALERT_THRESHOLD", "0.7"))  # 70% confidence

# Telegram Configuration
TELEGRAM_BOT_TOKEN = os.getenv("TELEGRAM_BOT_TOKEN", "")
TELEGRAM_CHAT_ID = os.getenv("TELEGRAM_CHAT_ID", "")  # Admin chat ID
TELEGRAM_ENABLED = bool(TELEGRAM_BOT_TOKEN and TELEGRAM_CHAT_ID)

# Known miner signatures from previous incidents
KNOWN_MINER_SIGNATURES = [
    "cpioshuf", "ipcalcpg_recvlogical", "mysql", "softirq", "vrarhpb",
    "bzip2egrep", "flockresize", "catcal", "G4NQXBp"
]

SUSPICIOUS_PATHS = [
    "/tmp/.perf.c/", "/tmp/*perf*", "/tmp/.*/"
]

class AISecurityAgent:
    def __init__(self):
        self.log(f"🤖 AI Security Agent started (model: {OLLAMA_MODEL})")
        self.incident_count = 0
        if TELEGRAM_ENABLED:
            self.log(f"📱 Telegram alerts enabled (chat_id: {TELEGRAM_CHAT_ID})")
        else:
            self.log("⚠️ Telegram alerts disabled (no token/chat_id)")

    def log(self, message: str, level: str = "INFO"):
        """Log message to file and stdout"""
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        log_entry = f"[{timestamp}] [{level}] {message}"
        print(log_entry)

        try:
            with open(LOG_FILE, "a") as f:
                f.write(log_entry + "\n")
        except Exception as e:
            print(f"Failed to write to log file: {e}")

    def send_telegram_alert(self, message: str):
        """Send alert to Telegram"""
        if not TELEGRAM_ENABLED:
            return

        try:
            url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
            data = {
                "chat_id": TELEGRAM_CHAT_ID,
                "text": f"🚨 *AI Security Agent Alert*\n\n{message}",
                "parse_mode": "Markdown"
            }
            response = requests.post(url, json=data, timeout=10)
            if response.status_code != 200:
                self.log(f"Failed to send Telegram alert: {response.text}", "WARNING")
        except Exception as e:
            self.log(f"Failed to send Telegram alert: {e}", "WARNING")

    def collect_system_metrics(self) -> Dict[str, Any]:
        """Collect system metrics for analysis"""
        metrics = {
            "timestamp": datetime.now().isoformat(),
            "cpu": {
                "load_avg": os.getloadavg(),
                "percent": psutil.cpu_percent(interval=1),
                "count": psutil.cpu_count()
            },
            "memory": {
                "percent": psutil.virtual_memory().percent,
                "available_gb": round(psutil.virtual_memory().available / (1024**3), 2)
            },
            "high_cpu_processes": [],
            "suspicious_processes": [],
            "tmp_executables": [],
            "network_connections": []
        }

        # Find high CPU processes
        for proc in psutil.process_iter(['pid', 'name', 'username', 'cpu_percent', 'cmdline']):
            try:
                info = proc.info
                if info['cpu_percent'] and info['cpu_percent'] > 50:
                    metrics["high_cpu_processes"].append({
                        "pid": info['pid'],
                        "name": info['name'],
                        "user": info['username'],
                        "cpu": info['cpu_percent'],
                        "cmdline": ' '.join(info['cmdline'] or [])[:200]
                    })

                # Check for known miner signatures
                if info['name'] in KNOWN_MINER_SIGNATURES:
                    metrics["suspicious_processes"].append({
                        "pid": info['pid'],
                        "name": info['name'],
                        "reason": "Known miner signature",
                        "cmdline": ' '.join(info['cmdline'] or [])[:200]
                    })

            except (psutil.NoSuchProcess, psutil.AccessDenied):
                continue

        # Check /tmp for suspicious executables
        try:
            result = subprocess.run(
                ["find", "/tmp", "-type", "f", "-executable", "-mtime", "-1"],
                capture_output=True, text=True, timeout=10
            )
            if result.returncode == 0:
                tmp_files = result.stdout.strip().split('\n')
                metrics["tmp_executables"] = [f for f in tmp_files if f and f != "/tmp/fix_healthcheck.sh"]
        except Exception as e:
            self.log(f"Failed to scan /tmp: {e}", "WARNING")

        # Check for suspicious network connections
        try:
            for conn in psutil.net_connections(kind='inet'):
                if conn.status == 'ESTABLISHED' and conn.raddr:
                    # Check for connections to mining pools (common ports)
                    if conn.raddr.port in [3333, 4444, 5555, 7777, 8888, 9999, 14444]:
                        try:
                            proc = psutil.Process(conn.pid)
                            metrics["network_connections"].append({
                                "pid": conn.pid,
                                "process": proc.name(),
                                "remote": f"{conn.raddr.ip}:{conn.raddr.port}",
                                "reason": "Suspicious port (common mining pool)"
                            })
                        except (psutil.NoSuchProcess, psutil.AccessDenied):
                            pass
        except Exception as e:
            self.log(f"Failed to check network connections: {e}", "WARNING")

        return metrics

    def analyze_with_llm(self, metrics: Dict[str, Any]) -> Dict[str, Any]:
        """Use LLM to analyze metrics and detect threats"""

        # Prepare prompt for LLM
        prompt = f"""You are a cybersecurity expert analyzing a Linux server for cryptocurrency mining malware.

SYSTEM METRICS:
- Load Average: {metrics['cpu']['load_avg']}
- CPU Usage: {metrics['cpu']['percent']}%
- Memory Usage: {metrics['memory']['percent']}%

HIGH CPU PROCESSES ({len(metrics['high_cpu_processes'])}):
{json.dumps(metrics['high_cpu_processes'], indent=2)}

SUSPICIOUS PROCESSES ({len(metrics['suspicious_processes'])}):
{json.dumps(metrics['suspicious_processes'], indent=2)}

SUSPICIOUS FILES IN /tmp ({len(metrics['tmp_executables'])}):
{json.dumps(metrics['tmp_executables'], indent=2)}

SUSPICIOUS NETWORK CONNECTIONS ({len(metrics['network_connections'])}):
{json.dumps(metrics['network_connections'], indent=2)}

KNOWN MINER PATTERNS:
- Process names: {', '.join(KNOWN_MINER_SIGNATURES)}
- Common paths: /tmp/.perf.c/, /tmp/.*/ (hidden dirs)
- Behavior: High CPU (>1000%), disguised as system processes (postgres, mysql, etc.)

ANALYZE:
1. Is there evidence of cryptocurrency mining?
2. What is the confidence level (0.0-1.0)?
3. What specific indicators support your conclusion?
4. What immediate actions should be taken?

Respond in JSON format:
{{
  "threat_detected": true/false,
  "confidence": 0.0-1.0,
  "threat_type": "crypto_miner|suspicious_activity|false_positive|unknown",
  "indicators": ["list", "of", "specific", "findings"],
  "recommended_actions": ["action1", "action2"],
  "summary": "brief explanation"
}}

Respond ONLY with valid JSON, no additional text."""

        try:
            response = requests.post(
                f"{OLLAMA_BASE_URL}/api/generate",
                json={
                    "model": OLLAMA_MODEL,
                    "prompt": prompt,
                    "stream": False,
                    "options": {
                        "temperature": 0.3,  # Lower temperature for more deterministic analysis
                        "num_predict": 512
                    }
                },
                timeout=60
            )

            if response.status_code == 200:
                result = response.json()
                llm_response = result.get("response", "")

                # Try to parse JSON from response
                try:
                    # Find JSON in response (might have extra text)
                    start = llm_response.find('{')
                    end = llm_response.rfind('}') + 1
                    if start >= 0 and end > start:
                        json_str = llm_response[start:end]
                        analysis = json.loads(json_str)
                        return analysis
                    else:
                        self.log(f"No JSON found in LLM response: {llm_response[:200]}", "WARNING")
                        return self._fallback_analysis(metrics)
                except json.JSONDecodeError as e:
                    self.log(f"Failed to parse LLM JSON: {e}\nResponse: {llm_response[:200]}", "WARNING")
                    return self._fallback_analysis(metrics)
            else:
                self.log(f"Ollama API error: {response.status_code}", "ERROR")
                return self._fallback_analysis(metrics)

        except requests.exceptions.RequestException as e:
            self.log(f"Failed to connect to Ollama: {e}", "ERROR")
            return self._fallback_analysis(metrics)

    def _fallback_analysis(self, metrics: Dict[str, Any]) -> Dict[str, Any]:
        """Fallback analysis using simple rules if LLM fails"""
        threat_detected = False
        confidence = 0.0
        indicators = []

        # Check load average
        if metrics['cpu']['load_avg'][0] > 10:
            threat_detected = True
            confidence += 0.3
            indicators.append(f"High load average: {metrics['cpu']['load_avg'][0]}")

        # Check high CPU processes
        if metrics['high_cpu_processes']:
            threat_detected = True
            confidence += 0.3
            for proc in metrics['high_cpu_processes']:
                indicators.append(f"High CPU process: {proc['name']} (PID {proc['pid']}, {proc['cpu']}%)")

        # Check suspicious processes
        if metrics['suspicious_processes']:
            threat_detected = True
            confidence += 0.4
            for proc in metrics['suspicious_processes']:
                indicators.append(f"Known miner signature: {proc['name']} (PID {proc['pid']})")

        # Check /tmp executables
        if metrics['tmp_executables']:
            threat_detected = True
            confidence += 0.2
            indicators.append(f"Suspicious executables in /tmp: {len(metrics['tmp_executables'])}")

        # Check network connections
        if metrics['network_connections']:
            threat_detected = True
            confidence += 0.3
            indicators.append(f"Suspicious network connections: {len(metrics['network_connections'])}")

        confidence = min(confidence, 1.0)

        return {
            "threat_detected": threat_detected,
            "confidence": confidence,
            "threat_type": "crypto_miner" if confidence > 0.6 else "suspicious_activity",
            "indicators": indicators,
            "recommended_actions": [
                "Kill suspicious processes",
                "Remove /tmp executables",
                "Block network connections"
            ] if threat_detected else [],
            "summary": f"Fallback analysis: {len(indicators)} indicators detected" if threat_detected else "No threats detected"
        }

    def execute_mitigation(self, analysis: Dict[str, Any], metrics: Dict[str, Any]):
        """Execute mitigation actions for detected threats"""
        if not analysis.get("threat_detected"):
            return

        self.incident_count += 1
        self.log(f"🚨 THREAT DETECTED (Incident #{self.incident_count})", "ALERT")
        self.log(f"   Confidence: {analysis['confidence']:.2%}", "ALERT")
        self.log(f"   Type: {analysis['threat_type']}", "ALERT")
        self.log(f"   Summary: {analysis['summary']}", "ALERT")

        # Prepare Telegram message
        telegram_msg = f"*NODE1 Security Incident #{self.incident_count}*\n\n"
        telegram_msg += f"⚠️ *Confidence:* {analysis['confidence']:.0%}\n"
        telegram_msg += f"🔍 *Type:* {analysis['threat_type']}\n"
        telegram_msg += f"📝 *Summary:* {analysis['summary']}\n\n"
        telegram_msg += "*Indicators:*\n"

        for indicator in analysis['indicators']:
            self.log(f"   📍 {indicator}", "ALERT")
            telegram_msg += f"• {indicator}\n"

        # AUTO-MITIGATION (only if high confidence)
        if analysis['confidence'] >= ALERT_THRESHOLD:
            self.log("⚡ EXECUTING AUTO-MITIGATION", "ALERT")

            # Kill high CPU processes
            for proc in metrics['high_cpu_processes']:
                try:
                    self.log(f"   Killing PID {proc['pid']} ({proc['name']})", "ACTION")
                    subprocess.run(["kill", "-9", str(proc['pid'])], check=False)
                except Exception as e:
                    self.log(f"   Failed to kill PID {proc['pid']}: {e}", "ERROR")

            # Kill known miner processes
            for proc in metrics['suspicious_processes']:
                try:
                    self.log(f"   Killing known miner PID {proc['pid']} ({proc['name']})", "ACTION")
                    subprocess.run(["kill", "-9", str(proc['pid'])], check=False)
                except Exception as e:
                    self.log(f"   Failed to kill PID {proc['pid']}: {e}", "ERROR")

            # Remove /tmp executables
            for filepath in metrics['tmp_executables']:
                try:
                    self.log(f"   Removing {filepath}", "ACTION")
                    subprocess.run(["rm", "-rf", filepath], check=False)
                except Exception as e:
                    self.log(f"   Failed to remove {filepath}: {e}", "ERROR")

            # Clean /tmp/.perf.c/
            try:
                self.log("   Cleaning /tmp/.perf.c/", "ACTION")
                subprocess.run(["rm", "-rf", "/tmp/.perf.c"], check=False)
            except Exception as e:
                self.log(f"   Failed to clean /tmp/.perf.c: {e}", "ERROR")

            self.log("✅ AUTO-MITIGATION COMPLETED", "ALERT")
            telegram_msg += "\n✅ *Auto-mitigation executed*"
        else:
            self.log(f"⚠️ Confidence {analysis['confidence']:.2%} below threshold {ALERT_THRESHOLD:.2%}, manual review recommended", "ALERT")
            telegram_msg += f"\n⚠️ Manual review recommended (below {ALERT_THRESHOLD:.0%} threshold)"

        # Send Telegram alert
        self.send_telegram_alert(telegram_msg)

    def run(self):
        """Main monitoring loop"""
        self.log(f"Starting monitoring loop (interval: {CHECK_INTERVAL}s)")

        while True:
            try:
                self.log("🔍 Starting security scan...")

                # Collect metrics
                metrics = self.collect_system_metrics()

                # Quick check: if nothing suspicious, skip LLM analysis
                if (not metrics['high_cpu_processes'] and
                        not metrics['suspicious_processes'] and
                        not metrics['tmp_executables'] and
                        not metrics['network_connections'] and
                        metrics['cpu']['load_avg'][0] < 5):
                    self.log("✅ System clean (quick check)")
                else:
                    self.log("🧠 Analyzing with AI (suspicious activity detected)...")
                    analysis = self.analyze_with_llm(metrics)

                    self.log(f"   Analysis complete: threat={analysis['threat_detected']}, confidence={analysis.get('confidence', 0):.2%}")

                    if analysis['threat_detected']:
                        self.execute_mitigation(analysis, metrics)
                    else:
                        self.log("✅ No threats detected")

                time.sleep(CHECK_INTERVAL)

            except KeyboardInterrupt:
                self.log("Received shutdown signal", "INFO")
                break
            except Exception as e:
                self.log(f"Error in monitoring loop: {e}", "ERROR")
                time.sleep(60)  # Wait before retry


if __name__ == "__main__":
    agent = AISecurityAgent()
    agent.run()