What I Build

Every solution below runs in production, backed by a real project you can explore on this site.

RAG & Knowledge Systems

What I Build

Semantic search over your own data with citation grounding. Hybrid retrieval (keyword + vector), namespace scoping, multi-tenant isolation.
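As a rough illustration of hybrid retrieval, here is a minimal sketch that merges a keyword ranking and a vector ranking with reciprocal rank fusion (RRF). RRF is an assumption chosen for the example; the projects listed here may combine results differently. The function name, the constant `k=60`, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a keyword index and a vector index.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

RRF uses only rank positions, so keyword scores and vector distances never need to be normalized onto a common scale.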

Where It Runs

Built into Catalyst (which powers 4 products), π.Law (legal case data), Forensic AI Studio (100K+ documents), and Let There Be RAG (SaaS).

Stack
PostgreSQL + pgvector, Hybrid Search, Firecrawl, Retrieve-then-Rerank

Voice AI & Real-Time Audio

What I Build

Full-duplex voice agents with sub-300ms latency, turn-taking, and interruption handling. Self-hosted STT/TTS or cloud APIs.

Where It Runs

Silicon Smackdown runs live multi-personality voice debates. Catalyst's voice layer handles real-time conversations with tool calling.

Stack
Gemini Live API, Whisper STT, Kokoro TTS, WebSocket Streaming

AI Agent Systems

What I Build

Function-calling agents with tool orchestration, multi-step planning, and human-in-the-loop gates. Model Context Protocol (MCP) servers for IDE-native agent tooling.
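A human-in-the-loop gate can be sketched as a dispatcher that refuses to run sensitive tools until a reviewer approves the call. This is an illustrative sketch only: the `Tool` class, the `dispatch` function, the tool names, and the call format are hypothetical and are not taken from Catalyst, Forensic AI Studio, or the MCP SDK.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    func: Callable[..., Any]
    requires_approval: bool = False  # gate sensitive tools behind a human


def dispatch(call: Dict[str, Any],
             registry: Dict[str, Tool],
             approve: Callable[[Dict[str, Any]], bool]) -> Dict[str, Any]:
    """Run one model-requested tool call, asking a human first if needed."""
    name = call["name"]
    tool = registry.get(name)
    if tool is None:
        return {"status": "error", "detail": f"unknown tool: {name}"}
    if tool.requires_approval and not approve(call):
        return {"status": "rejected", "detail": "human reviewer declined"}
    return {"status": "ok", "result": tool.func(**call.get("arguments", {}))}


# Hypothetical tools: one harmless, one gated.
registry = {
    "search_docs": Tool(lambda query: f"results for {query}"),
    "delete_record": Tool(lambda record_id: f"deleted {record_id}",
                          requires_approval=True),
}

# The approval callback stands in for a real review UI; here it always declines.
print(dispatch({"name": "search_docs", "arguments": {"query": "contracts"}},
               registry, approve=lambda call: False))
print(dispatch({"name": "delete_record", "arguments": {"record_id": 7}},
               registry, approve=lambda call: False))
```

Keeping the approval check inside the dispatcher means the model cannot bypass it by choosing a different planning path.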

Where It Runs

Forensic AI Studio exposes 39 MCP tools to Copilot. GTO Poker Coach uses function-calling with Monte Carlo simulation. Catalyst orchestrates tools across tenants.

Stack
Function Calling, MCP SDK, Tool Orchestration, Multi-Agent Delegation

Document Intelligence

What I Build

Automated parsing, semantic chunking, and structured extraction from PDFs, DOCX, and web content. Entity mapping across large document sets.
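To give a feel for the chunking step, here is a minimal sketch that splits text on paragraph boundaries and packs paragraphs into chunks under a character budget. This is a simpler stand-in for semantic chunking, which typically also uses embeddings or document structure to pick boundaries. The function name and the `max_chars` default are assumptions made for the example.

```python
def chunk_paragraphs(text, max_chars=500):
    """Pack whole paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept intact as its own
    chunk rather than being cut mid-sentence.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            # Adding this paragraph would exceed the budget: close the chunk.
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks


# Hypothetical document with three paragraphs of different lengths.
sample = "a" * 300 + "\n\n" + "b" * 300 + "\n\n" + "c" * 100
for chunk in chunk_paragraphs(sample):
    print(len(chunk))
```

Splitting on paragraph boundaries keeps each chunk readable on its own, which matters when a chunk is later shown as a citation.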

Where It Runs

Forensic AI Studio processes legal evidence at scale. π.Law handles case documents with zero-leakage proxy architecture.

Stack
PDF/DOCX Extraction, Entity Graphs, Vector Embeddings, Structured Output

Self-Hosted AI Infrastructure

What I Build

LLM inference, STT, and TTS deployed on your own hardware, with no cloud vendor lock-in. VRAM optimization, quantization, and multi-service orchestration.
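The effect of quantization on VRAM can be estimated with simple arithmetic: weight memory is roughly the parameter count times bits per weight, divided by 8 to get bytes. The sketch below assumes a hypothetical 7-billion-parameter model and counts weights only. It ignores the KV cache, activations, and the small overhead that quantization formats add for scales and zero points, so real usage will be higher.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for model weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical 7B-parameter model at three precisions.
params = 7e9
for label, bits in [("FP16", 16), ("INT8", 8), ("4-bit", 4)]:
    print(f"{label}: {weight_memory_gb(params, bits):.1f} GB")
```

An estimate like this gives a quick lower bound when deciding whether a model can fit on a given GPU before any download or benchmark.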

Where It Runs

Running vLLM, llama.cpp, Whisper, and Kokoro on bare-metal RTX GPUs with Supervisor and Nginx orchestration.

Stack
vLLM, llama.cpp, GPTQ/AWQ, Supervisor + systemd

Have a different problem?

If it involves LLMs, real-time data, or production infrastructure, I can probably help.