2025-08-04 · 7 min read

How I built my AI clone that talks like me

A local-first AI clone built with Ollama and ChromaDB for real-time conversations shaped by my writing, projects, and personal style.

GenAI · Chatbots · Ollama · ChromaDB · Vector Embeddings

Goal and constraints

I wanted a chatbot that felt personal, not generic. It had to answer with my tone, my project context, and my preferred problem-solving style.

I also wanted local control for speed and privacy, so I used a local model runtime instead of relying fully on cloud APIs.

I converted my notes, blog drafts, and project documentation into chunks and embedded them into a vector database.
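As a rough illustration of the chunking step, here is a minimal sketch that splits a document into overlapping word windows and packages them as records. It is not the exact code from this project: the function names, the chunk size, and the overlap are assumptions chosen for the example. The returned dictionary uses the keys `ids`, `documents`, and `metadatas` so it could be passed to a ChromaDB collection's `add` method, which then embeds the documents with whatever embedding function the collection is configured with.

```python
def chunk_words(text: str, size: int = 120, overlap: int = 20) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a window has reached the end of the document.
        if start + size >= len(words):
            break
    return chunks


def make_records(source: str, text: str, size: int = 120, overlap: int = 20) -> dict:
    """Package chunks with stable ids and metadata (hypothetical helper)."""
    chunks = chunk_words(text, size, overlap)
    return {
        "ids": [f"{source}-{i}" for i in range(len(chunks))],
        "documents": chunks,
        "metadatas": [{"source": source, "chunk": i} for i in range(len(chunks))],
    }
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing a few words twice.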

The retrieval layer fetches the most relevant chunks before generation, so responses stay grounded in real source material.
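The idea behind retrieval can be sketched without a database: embed the query, score every stored chunk by cosine similarity, and keep the top few. The example below is a toy, assuming a word-count "embedding" as a stand-in for a real embedding model. The names `toy_embed`, `retrieve`, and the `min_score` cutoff are hypothetical; in the real system the vector database does this search.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (a real model would go here)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 3,
             min_score: float = 0.1) -> list[tuple[float, str]]:
    """Return up to k (score, chunk) pairs, best first, dropping weak matches."""
    q = toy_embed(query)
    scored = [(cosine(q, toy_embed(c)), c) for c in chunks]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Dropping low-scoring chunks matters as much as ranking them: an irrelevant memory in the prompt tends to pull the answer off topic.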

Ollama handled local inference while the application orchestrated retrieval, prompt assembly, and response formatting.
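To make the orchestration concrete, here is a minimal sketch of prompt assembly followed by a non-streaming call to Ollama's local HTTP API. It assumes Ollama is running on its default port (11434). The persona text, the model name `llama3`, the character budget, and the function names are placeholders for illustration, not the values used in this project.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address

# Hypothetical persona instructions; the real ones would describe the author's tone.
PERSONA = "Answer in first person, in a direct and concise tone. Use only the notes."


def build_prompt(question: str, memories: list[str], max_chars: int = 2000) -> str:
    """Assemble persona, retrieved notes, and the question into one prompt."""
    picked, used = [], 0
    for memory in memories:  # memories are assumed to be sorted best-first
        if used + len(memory) > max_chars:
            break  # keep the context small to protect latency
        picked.append(memory)
        used += len(memory)
    context = "\n".join(f"- {m}" for m in picked)
    return f"{PERSONA}\n\nNotes:\n{context}\n\nQuestion: {question}\nAnswer:"


def generate(prompt: str, model: str = "llama3") -> str:
    """Send the prompt to a local Ollama server and return the full reply."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

With a server running and the model pulled, `generate(build_prompt("What stack do you use?", memories))` returns the reply as a single string.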

I tuned prompts to preserve personality cues without making responses overly verbose or robotic.

To keep interactions fluid, I prioritized low latency over prompt complexity.

Streaming output and concise context selection made the chatbot feel responsive enough for daily use.
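Streaming can be sketched as follows. When `stream` is enabled, Ollama's `/api/generate` endpoint returns one JSON object per line, each carrying a fragment of text in `response` and a `done` flag on the last one. The helper below parses such lines and yields the fragments as they arrive. The function names and the model name `llama3` are assumptions for this example.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_tokens(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield text fragments from newline-delimited JSON events."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # skip blank keep-alive lines
        event = json.loads(raw)
        if event.get("response"):
            yield event["response"]
        if event.get("done"):
            break  # the final event marks the end of the reply


def stream_answer(prompt: str, model: str = "llama3") -> None:
    """Print a reply fragment by fragment (requires a running Ollama server)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": True}).encode()
    request = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        for token in iter_tokens(response):  # the response iterates line by line
            print(token, end="", flush=True)
    print()
```

Printing each fragment immediately means the first words show up while the rest of the answer is still being generated, which is most of what makes a local model feel fast.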

Retrieval quality was the biggest factor in making the clone feel authentic.

The final system showed that a personal assistant can be useful and practical without a heavy backend or an expensive serving stack.