Built an offline AI assistant for security work in air-gapped environments (SCIFs, classified networks, etc.). It runs entirely locally: no API calls, no telemetry. Technical approach:

  • RAG with 360k embedded chunks (sentence-transformers: all-MiniLM-L6-v2)
  • FAISS for vector similarity search
  • Local LLM inference via Ollama (Llama 3.1 8B quantized)
  • Three-tier retrieval: dictionary → SQLite FTS5 → FAISS semantic search
  • Parses security tool output (Nmap XML, Volatility, Metasploit, etc.)
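The embed-and-retrieve flow behind the stack above can be sketched without the heavy dependencies. In the real system the embedder is all-MiniLM-L6-v2 and the index is FAISS; here a toy trigram-hashing "embedder" and brute-force cosine search stand in, purely to show the top-k retrieval shape:

```python
import hashlib
import math

def toy_embed(text, dim=64):
    """Stand-in for all-MiniLM-L6-v2: hash character trigrams into a
    fixed-size, L2-normalized vector. Illustrative only -- no semantics."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].lower().encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def top_k(query, chunks, k=3):
    """Brute-force cosine search -- the same contract a FAISS flat
    inner-product index fulfils at 360k-chunk scale."""
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, toy_embed(c))), c) for c in chunks]
    return [c for _, c in sorted(scored, reverse=True)[:k]]

chunks = ["nmap port scanning basics", "volatility memory forensics",
          "metasploit payload generation", "bloodhound AD enumeration"]
print(top_k("scan ports with nmap", chunks, k=2))
```

Swapping the toy pieces for `SentenceTransformer.encode` and `faiss.IndexFlatIP` keeps the same structure; only the vector quality and search speed change.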

Architecture:

  • Embed user query (384-dim vector)
  • FAISS search across 360k chunks, retrieve top 8
  • Build prompt: context + query
  • Local LLM generation (no external calls)
  • Response with tool-specific recommendations
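The "build prompt: context + query" step above is the glue between retrieval and generation. A minimal sketch (the template wording and character budget are my assumptions, not the project's actual prompt):

```python
def build_prompt(query, chunks, max_chars=4000):
    """Assemble retrieved chunks into a grounded prompt for the local LLM.
    Chunks are numbered and truncated to a character budget so the prompt
    stays inside the model's context window."""
    context, used = [], 0
    for i, chunk in enumerate(chunks, 1):
        entry = f"[{i}] {chunk}"
        if used + len(entry) > max_chars:
            break
        context.append(entry)
        used += len(entry)
    return (
        "Answer using ONLY the context below.\n\n"
        "Context:\n" + "\n".join(context) +
        f"\n\nQuestion: {query}\nAnswer:"
    )

prompt = build_prompt(
    "How do I detect process hollowing?",
    ["Volatility's malfind plugin flags injected code regions...",
     "Process hollowing replaces a legitimate process image in memory..."])
print(prompt)
```

The resulting string would then go to the local model, e.g. via the Ollama Python client's `ollama.generate(model=..., prompt=prompt)`, so no bytes leave the machine.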

Knowledge sources indexed:

  • CVE database (2014-2025, SQLite + FAISS)
  • ExploitDB (~50k exploits)
  • Security tool documentation (Volatility, Metasploit, BloodHound)
  • HackTricks, GTFOBins, LOLBAS, PayloadsAllTheThings
  • Custom tool integration guides

Interesting challenges solved:

  • Preventing RAG noise from high-frequency findings (tiered indexing)
  • Fast CVE lookup (dict → FTS5 → vector search cascade)
  • Tool output parsing without rigid schemas (regex + context awareness)
  • Keeping vector DB under 2GB while indexing 360k chunks
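The dict → FTS5 → vector cascade can be sketched as a fall-through of progressively more expensive tiers. The tier internals below are mocked with lambdas; in the real system they are an in-memory dict, SQLite FTS5, and FAISS respectively:

```python
def cascade_lookup(query, exact_index, fts_search, semantic_search):
    """Try the cheapest tier first; fall through only on a miss.
    exact_index:     dict for O(1) ID hits (e.g. 'CVE-2021-44228')
    fts_search:      callable -> list, keyword match (SQLite FTS5 in reality)
    semantic_search: callable -> list, embedding search (FAISS in reality)"""
    key = query.strip().upper()
    if key in exact_index:                 # tier 1: exact identifier
        return [exact_index[key]]
    hits = fts_search(query)               # tier 2: full-text keywords
    if hits:
        return hits
    return semantic_search(query)          # tier 3: semantic fallback

# Mock tiers for illustration only.
exact = {"CVE-2021-44228": "Log4Shell: Log4j2 JNDI RCE"}
fts = lambda q: ["EternalBlue: SMBv1 RCE"] if "smb" in q.lower() else []
semantic = lambda q: ["closest match by embedding similarity"]

print(cascade_lookup("CVE-2021-44228", exact, fts, semantic))
print(cascade_lookup("smb exploit", exact, fts, semantic))
print(cascade_lookup("lateral movement techniques", exact, fts, semantic))
```

Most CVE queries resolve in tier 1 or 2, so the 360k-chunk vector search only runs when cheaper lookups genuinely miss; that is also what keeps high-frequency findings from drowning the semantic tier in noise.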

Current limitations:

  • Windows-focused (Linux experimental)
  • ~8GB RAM requirement
  • Tool parsers are brittle (working on this)
  • Alpha quality - learning project by self-taught dev

Code: <a href="https://gitlab.com/sydsec1/Syd" rel="ugc">https://gitlab.com/sydsec1/Syd</a> (MIT)
Docs: <a href="https://www.sydsec.co.uk" rel="ugc">https://www.sydsec.co.uk</a>

Interested in feedback on:

  • RAG architecture choices (FAISS vs alternatives for this use case)
  • Noise reduction strategies for continuously-indexed findings
  • Tool output parsing approaches (current method: regex, considering AST/structured)
  • Offline model selection (currently Llama 3.1 8B Q4, open to alternatives)

Happy to discuss implementation details.