What I Build
Every solution below runs in production, backed by a real project you can explore on this site.
RAG & Knowledge Systems
Semantic search over your own data with citation grounding. Hybrid retrieval (keyword + vector), namespace scoping, multi-tenant isolation.
Built into Catalyst (powers 4 products), π.Law (legal case data), Forensic AI Studio (100K+ documents), and Let There Be RAG (SaaS).
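As a rough illustration of hybrid retrieval with namespace scoping, the sketch below fuses a keyword ranking and a vector ranking using reciprocal rank fusion (RRF). RRF is one common fusion method and is an assumption here, not necessarily what the projects above use; `hybrid_search`, the `(doc_id, namespace)` hit shape, and the constant `k=60` are hypothetical choices made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in; k=60 is a
    conventional default, assumed here for illustration.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


def hybrid_search(keyword_hits, vector_hits, namespace, top_k=3):
    """Hypothetical helper: filter both result lists to one namespace, then fuse.

    Hits are (doc_id, namespace) tuples, already ordered best-first by their
    respective retrievers. Filtering before fusion keeps other tenants'
    documents out of the result entirely.
    """
    keyword_ids = [doc_id for doc_id, ns in keyword_hits if ns == namespace]
    vector_ids = [doc_id for doc_id, ns in vector_hits if ns == namespace]
    return reciprocal_rank_fusion([keyword_ids, vector_ids])[:top_k]


if __name__ == "__main__":
    keyword_hits = [("a", "tenant1"), ("b", "tenant1"), ("x", "tenant2")]
    vector_hits = [("b", "tenant1"), ("c", "tenant1"), ("a", "tenant1")]
    print(hybrid_search(keyword_hits, vector_hits, namespace="tenant1"))
```

Documents that rank well in both lists rise to the top, and anything outside the requested namespace never enters the fusion step.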
Voice AI & Real-Time Audio
Full-duplex voice agents with sub-300ms latency, turn-taking, and interruption handling. Self-hosted STT/TTS or cloud APIs.
Silicon Smackdown runs live multi-personality voice debates. Catalyst's voice layer handles real-time conversations with tool calling.
AI Agent Systems
Function-calling agents with tool orchestration, multi-step planning, and human-in-the-loop gates. Model Context Protocol (MCP) for IDE-native agent tooling.
Forensic AI Studio exposes 39 MCP tools to Copilot. GTO Poker Coach uses function calling with Monte Carlo simulation. Catalyst orchestrates tools across tenants.
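A human-in-the-loop gate on tool calls can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code behind any project above: `Tool`, `ToolRegistry`, and the `approve` callback are hypothetical names, and a real agent would receive the tool name and arguments from the model's function-call output.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    name: str
    func: Callable[..., Any]
    needs_approval: bool = False  # True for actions a human must confirm


class ToolRegistry:
    """Hypothetical registry that dispatches tool calls and enforces approval."""

    def __init__(self, approve: Callable[[str, Dict[str, Any]], bool]):
        self._tools: Dict[str, Tool] = {}
        self._approve = approve  # callback that asks a human reviewer

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, **kwargs: Any) -> Dict[str, Any]:
        tool = self._tools.get(name)
        if tool is None:
            return {"ok": False, "error": f"unknown tool: {name}"}
        # Gate: sensitive tools run only if the reviewer approves this call.
        if tool.needs_approval and not self._approve(name, kwargs):
            return {"ok": False, "error": "rejected by reviewer"}
        return {"ok": True, "result": tool.func(**kwargs)}


if __name__ == "__main__":
    # The reviewer here rejects everything; read-only tools still run.
    registry = ToolRegistry(approve=lambda name, args: False)
    registry.register(Tool("add", lambda a, b: a + b))
    registry.register(Tool("delete_record", lambda record_id: record_id, needs_approval=True))
    print(registry.call("add", a=2, b=3))
    print(registry.call("delete_record", record_id=7))
```

Returning a structured error instead of raising lets the agent feed the rejection back to the model as an ordinary tool result and plan around it.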
Document Intelligence
Automated parsing, semantic chunking, and structured extraction from PDFs, DOCX, and web content. Entity mapping across large document sets.
Forensic AI Studio processes legal evidence at scale. π.Law handles case documents with zero-leakage proxy architecture.
Self-Hosted AI Infrastructure
Deploy LLM inference, STT, and TTS on your own hardware. No cloud vendor lock-in. VRAM optimization, quantization, multi-service orchestration.
I run vLLM, llama.cpp, Whisper, and Kokoro on bare-metal RTX GPUs, orchestrated with Supervisor and Nginx.
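To show why quantization matters for VRAM planning, here is a back-of-the-envelope estimate of model weight size at different precisions. It counts weights only (parameters × bits ÷ 8); KV cache, activations, and runtime overhead are lumped into a `headroom_gb` figure that is purely an assumption, as are the 24 GB card and the helper names.

```python
def weight_vram_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of model weights in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def fits_on_gpu(n_params: float, bits_per_weight: float,
                vram_gb: float, headroom_gb: float = 4.0) -> bool:
    """Rough check: weights plus an assumed headroom must fit in VRAM.

    headroom_gb stands in for KV cache, activations, and runtime overhead;
    the 4 GB default is an illustrative guess, not a measured value.
    """
    return weight_vram_gb(n_params, bits_per_weight) + headroom_gb <= vram_gb


if __name__ == "__main__":
    for label, params in [("7B", 7e9), ("70B", 70e9)]:
        for bits in (16, 8, 4):
            size = weight_vram_gb(params, bits)
            ok = fits_on_gpu(params, bits, vram_gb=24.0)  # assumed 24 GB card
            print(f"{label} @ {bits}-bit: {size:.1f} GB weights, fits on 24 GB: {ok}")
```

Under these assumptions a 7B model fits at any of the three precisions, while a 70B model does not fit on a single 24 GB card even at 4-bit, which is where multi-GPU setups or smaller models come in.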
Have a different problem?
If it involves LLMs, real-time data, or production infrastructure, I can probably help.