VeriStream
What I actually built
Imagine pointing an AI compiler at a Twitch stream and getting live verdicts on what's real, what's fake, and why. VeriStream is that pipeline.
TL;DR
- Deepfake scores + fact-check verdicts while the stream is still playing
- Chained compiler: CV → Whisper → LLM → Knowledge Graph
- Built the whole stack: FastAPI, React, Apache Kafka, Apache Spark, Neo4j
| Role | What I shipped | Stack |
|---|---|---|
| Product + ML Engineer | Real-time misinformation scanner + dashboard | FastAPI, PyTorch, Whisper, Groq, Neo4j |
| Distributed Systems | Kafka + Spark streaming backbone | Apache Kafka, ZooKeeper, PySpark |
| Frontend | Analyst-facing console | React, Chart.js, Leaflet |
Why I built this
Election season + AI-generated video chaos = nobody knows what’s legit. Journalists told me their current workflow is “download clip → manually scrub it → Google the claims”. VeriStream short-circuits that by acting like a compiler for live media—tokenizing frames, running inference passes, then linking evidence into a knowledge graph that can be queried instantly.
System Highlights
Dual-path stream compiler
- Path A (FastAPI): low-latency direct mode (chunked FFmpeg capture → async inference → WebSocket push, ~20s end-to-end delay)
- Path B (Spark + Kafka): high-throughput mode that chews through ~1,800 frames/min with micro-batches and writes verdicts back to Kafka.
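Path A's chunk-and-push loop can be sketched with plain asyncio. This is a simplified stand-in, not the production code: the FFmpeg capture and the inference model are stubbed, and `push` plays the role of the WebSocket send.

```python
import asyncio
import json

# Hypothetical sketch of Path A: a capture task feeds fixed-length media
# chunks into a bounded queue, an inference task scores each chunk, and
# results are pushed to subscribers (a WebSocket in the real app).

async def capture_chunks(queue, n_chunks=3):
    """Stand-in for chunked FFmpeg capture of a live stream."""
    for i in range(n_chunks):
        await queue.put({"chunk_id": i, "frames": f"<frames-{i}>"})
    await queue.put(None)  # sentinel: stream ended

async def run_inference(chunk):
    """Stand-in for the async deepfake/claim inference pass."""
    return {"chunk_id": chunk["chunk_id"], "deepfake_score": 0.12}

async def pipeline(push):
    queue = asyncio.Queue(maxsize=8)  # bounded so capture can't outrun inference
    producer = asyncio.create_task(capture_chunks(queue))
    while (chunk := await queue.get()) is not None:
        verdict = await run_inference(chunk)
        await push(json.dumps(verdict))  # in the app: websocket.send_text(...)
    await producer

# Collect pushed verdicts in a list instead of a WebSocket client.
sent = []
async def fake_push(msg):
    sent.append(msg)

asyncio.run(pipeline(fake_push))
```

The bounded queue is the key design choice: if inference falls behind, capture blocks instead of ballooning memory.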
Attention-driven deepfake radar
Fine-tuned DINOv2 ViT produces frame-level probabilities and heatmaps so an analyst can literally see which facial regions look synthetic.
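The overlay step reduces to normalizing per-patch attention scores into a grid. A minimal sketch, assuming the scores are already extracted from the ViT (in the real pipeline they come from DINOv2 attention tensors; here they are just floats):

```python
# Turn flat per-patch attention scores into an overlay-ready heatmap grid.

def attention_heatmap(scores, grid_size):
    """Reshape flat per-patch scores to a grid and min-max normalize to [0, 1]."""
    lo, hi = min(scores), max(scores)
    span = (hi - lo) or 1.0  # avoid division by zero on flat attention
    norm = [(s - lo) / span for s in scores]
    return [norm[r * grid_size:(r + 1) * grid_size] for r in range(grid_size)]

# 2x2 patch grid: the highest-attention patch maps to 1.0, the lowest to 0.0.
heat = attention_heatmap([0.1, 0.9, 0.3, 0.5], grid_size=2)
```

The normalized grid is what gets alpha-blended over the frame so the analyst can see which regions drove the score.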
Multilingual narrative watchdog
Whisper → Groq translation pipeline keeps both the source language and the English transcript so fact-checkers don’t lose nuance.
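Keeping both languages mostly comes down to the record shape. A sketch of that dual-language segment, with the Whisper and Groq calls stubbed (the field names here are illustrative, not the exact schema):

```python
from dataclasses import dataclass

# Sketch of the dual-language transcript record the pipeline keeps.
# In the real system `source_text` comes from Whisper and `english_text`
# from a Groq translation call; both are stubbed here.

@dataclass
class TranscriptSegment:
    start: float          # seconds into the stream
    end: float
    language: str         # detected source language code
    source_text: str      # Whisper output, untranslated
    english_text: str     # translation the fact-checker consumes

def translate_stub(text, language):
    """Stand-in for the Groq Llama3 translation call."""
    return text if language == "en" else f"[en] {text}"

def make_segment(start, end, language, source_text):
    return TranscriptSegment(start, end, language, source_text,
                             translate_stub(source_text, language))

seg = make_segment(12.0, 15.5, "es", "el video es falso")
```

Storing both texts means a fact-checker can always fall back to the source wording when a translation flattens nuance.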
Fact-checking compiler
- Lexical pass: spaCy + regex identify claims worth verifying.
- Evidence pass: Google Fact-Check API + FAISS RAG + Neo4j knowledge graph.
- Synthesis pass: LLM writes a verdict + justification with confidence scores.
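The lexical pass can be illustrated with the regex half alone (the real version also runs spaCy NER). This simplified stand-in flags a sentence as check-worthy if it contains a number, percentage, or reporting verb:

```python
import re

# Simplified lexical pass: keep only sentences likely to contain a
# verifiable claim. Patterns here are illustrative, not the full set.

CLAIM_PATTERNS = [
    re.compile(r"\b\d[\d,.]*\s*%?"),                      # numbers / percentages
    re.compile(r"\b(said|claims?|reported|according to)\b", re.I),
]

def checkworthy_sentences(text):
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences
            if any(p.search(s) for p in CLAIM_PATTERNS)]

claims = checkworthy_sentences(
    "The crowd was huge. Officials reported 40% turnout. Nice weather today."
)
```

Filtering early like this keeps the expensive evidence and synthesis passes off small talk and opinion.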
Knowledge graph memory
Every processed clip plots entities, claims, and verdicts inside Neo4j so repeat misinformation gets flagged faster next time.
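The "flagged faster next time" behavior hinges on deduplicating claims at write time. A sketch of the parameterized Cypher upsert, with hypothetical labels and properties (`Claim`, `Entity`, `seen_count` are illustrative, not the exact schema):

```python
# Build a parameterized Cypher MERGE so a repeated claim increments a
# counter instead of creating a duplicate node. In the app this query
# would run through the neo4j driver.

def claim_upsert_query(claim, entity, verdict, stream_id):
    query = (
        "MERGE (c:Claim {text: $claim}) "
        "ON CREATE SET c.seen_count = 1 "
        "ON MATCH SET c.seen_count = c.seen_count + 1 "
        "MERGE (e:Entity {name: $entity}) "
        "MERGE (e)-[:MENTIONED_IN]->(c) "
        "SET c.last_verdict = $verdict, c.last_stream = $stream_id"
    )
    params = {"claim": claim, "entity": entity,
              "verdict": verdict, "stream_id": stream_id}
    return query, params

query, params = claim_upsert_query(
    "Video shows real event", "CandidateX", "false", "stream-42")
```

A `seen_count` greater than 1 is exactly what a "seen before" badge can key off.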
Tech I Actually Used
| Layer | Notes |
|---|---|
| Data plane | Apache Kafka (5 MB max message size) + Spark 3.5.3 micro-batches (2s trigger, checkpointed) |
| Inference | DINOv2 Vision Transformer, Whisper base, custom BERT political bias classifier |
| LLM | Groq Llama3-8B (translations + verdict synthesis) |
| Storage | Neo4j (graph), FAISS (vector search), temp media store, JSON caches |
| UI | React + WebSockets + Chart.js/Plotly + Leaflet heatmaps |
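The data-plane row above translates roughly into this Structured Streaming setup. It is a configuration sketch, not the exact job: topic names, the broker address, the checkpoint path, and `score_frames` are all placeholders.

```python
from pyspark.sql import SparkSession

# Sketch of the Kafka-in / Kafka-out streaming job: 2-second micro-batch
# trigger, checkpointed sink, as described in the table above.

spark = SparkSession.builder.appName("veristream").getOrCreate()

frames = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "frames")            # placeholder topic
          .load())

verdicts = score_frames(frames)  # placeholder for the inference stage

(verdicts.writeStream
 .format("kafka")
 .option("kafka.bootstrap.servers", "localhost:9092")
 .option("topic", "verdicts")                       # placeholder topic
 .option("checkpointLocation", "/tmp/veristream-ckpt")
 .trigger(processingTime="2 seconds")
 .start())
```

The checkpoint location is what lets the job resume from the last committed Kafka offsets after a restart.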
Impact / Wins
- 92.4% deepfake accuracy with explainable heatmaps (attention overlays saved straight from PyTorch tensors).
- 15–30s stream latency in direct mode, 2–5s frame-to-verdict in Spark mode.
- Detects 116 emotional trigger patterns + 150+ stereotype templates to score manipulation risk.
- Builds a Neo4j knowledge graph per stream, so repeat misinformation gets a “seen before” badge automatically.
What I Learned (and Shipped)
- Getting Apache Kafka + Spark to play nice with OpenCV frames meant inventing a base64 frame codec and dedupe keys.
- Ran all heavy models as singletons inside FastAPI so I don’t nuke RAM on every request.
- Built a background fact-check buffer: accumulate 30 seconds of transcription, then kick off Groq verdicts without blocking the stream.
- Designing for analysts meant obsessing over small UX touches (heatmap gallery, political bias gauges, manipulation score pills).
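The base64 frame codec from the first bullet can be sketched with the standard library alone. This is a minimal version assuming JPEG-encoded frame bytes (in the real pipeline `cv2.imencode` produces them); the message fields are illustrative.

```python
import base64
import hashlib
import json

# Sketch of the frame codec: OpenCV frames travel through Kafka as
# base64 payloads, with a content-hash dedupe key so identical frames
# can be dropped downstream.

def encode_frame(frame_bytes, stream_id, ts):
    """Wrap JPEG bytes in a JSON-safe Kafka message with a dedupe key."""
    payload = base64.b64encode(frame_bytes).decode("ascii")
    dedupe_key = hashlib.sha256(frame_bytes).hexdigest()[:16]
    return json.dumps({"stream": stream_id, "ts": ts,
                       "key": dedupe_key, "frame": payload})

def decode_frame(message):
    msg = json.loads(message)
    return base64.b64decode(msg["frame"]), msg["key"]

msg = encode_frame(b"\xff\xd8fake-jpeg", "stream-42", 1.25)
frame, key = decode_frame(msg)
```

Hashing the raw bytes (rather than the base64 string) means two identical frames always share a key, regardless of how they were wrapped.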
🔗 Links: Watch the walkthrough | GitHub Repo