RAG Systems · Intermediate · 2026-02-08 · 10 min read

PostgreSQL pgvector for Production RAG: Indexing, Hybrid Search & Scale

When pgvector beats a dedicated vector database. Index choices (HNSW vs IVFFlat), tuning for 10M+ rows, hybrid search with full-text, and the moment you should reach for Qdrant instead.

Tags: PostgreSQL, pgvector, RAG, Vector Search, Hybrid Search, Database

The "Just Use Postgres" Argument

You probably already run Postgres. Adding pgvector means one extension and zero new services, and you keep full SQL and the operational know-how you already have. For most teams starting with RAG, pgvector is the right call.

Setup

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE chunks (
    id         BIGSERIAL PRIMARY KEY,
    tenant_id  UUID NOT NULL,
    doc_id     UUID NOT NULL,
    chunk_text TEXT NOT NULL,
    embedding  VECTOR(768) NOT NULL,
    metadata   JSONB NOT NULL DEFAULT '{}',
    created_at TIMESTAMPTZ DEFAULT NOW()
);
```
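
Because the `embedding` column is declared as `VECTOR(768)`, a vector of any other length is rejected at insert time, so it can be worth checking on the application side first. The following is a minimal Python sketch, assuming embeddings arrive as a plain list of floats. The helper name `to_pgvector_literal` and the `EMBEDDING_DIM` constant are hypothetical; the bracketed string it produces is pgvector's text input format.

```python
import math

EMBEDDING_DIM = 768  # assumed to match VECTOR(768) in the chunks table


def to_pgvector_literal(embedding, dim=EMBEDDING_DIM):
    """Serialize a sequence of numbers as a pgvector text literal, e.g. '[0.1,0.2]'."""
    if len(embedding) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(embedding)}")
    values = [float(x) for x in embedding]
    # Guard against NaN/inf before they ever reach the database.
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinite values")
    return "[" + ",".join(repr(v) for v in values) + "]"
```

The resulting string can be bound as a query parameter and cast with `::vector`, which avoids depending on a driver-specific vector adapter.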

HNSW vs IVFFlat

| Index   | Build time | Query speed | Recall  | Memory |
|---------|------------|-------------|---------|--------|
| HNSW    | Slow       | Very fast   | High    | Higher |
| IVFFlat | Fast       | Fast        | Tunable | Lower  |
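
The "Tunable" recall entry for IVFFlat comes from two knobs: `lists` at index build time and `ivfflat.probes` at query time. As a rough sketch, the starting points suggested in the pgvector README (rows / 1000 lists up to 1M rows, sqrt(rows) above that, and sqrt(lists) probes) can be computed like this. The function name is hypothetical, and the values are starting points to benchmark, not tuned settings.

```python
import math


def ivfflat_starting_params(row_count):
    """Return (lists, probes) starting points per the pgvector README heuristics."""
    if row_count <= 0:
        raise ValueError("row_count must be positive")
    if row_count <= 1_000_000:
        lists = max(1, row_count // 1000)   # rows / 1000 for up to 1M rows
    else:
        lists = math.isqrt(row_count)       # sqrt(rows) beyond 1M rows
    probes = max(1, math.isqrt(lists))      # sqrt(lists) as a first guess
    return lists, probes
```

Raising `probes` trades query speed for recall, which is why IVFFlat recall has to be measured against your own data rather than assumed.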

For < 10M vectors, HNSW is almost always the right choice:

```sql
CREATE INDEX idx_chunks_embedding ON chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 64);

-- Tenant-first composite (so RLS doesn't blow up performance)
CREATE INDEX idx_chunks_tenant ON chunks (tenant_id, created_at DESC);
```

Query with Tenant Isolation

```sql
SELECT id, chunk_text, 1 - (embedding <=> $1) AS similarity
FROM chunks
WHERE tenant_id = $2
  AND (embedding <=> $1) < 0.4  -- distance < 0.4 = similarity > 0.6
ORDER BY embedding <=> $1
LIMIT 10;
```
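
The threshold comment in the query relies on `<=>` being cosine distance, that is, 1 minus cosine similarity, so values range from 0 (same direction) to 2 (opposite direction). A small pure-Python sketch of that arithmetic, with hypothetical function names, makes the conversion explicit:

```python
import math


def cosine_distance(a, b):
    """Cosine distance as the <=> operator defines it: 1 - cosine similarity."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Zero-length vectors are not handled here; they would divide by zero.
    return 1.0 - dot / (norm_a * norm_b)


def max_distance_for(min_similarity):
    """Translate a minimum similarity into the distance cutoff for the WHERE clause."""
    return 1.0 - min_similarity
```

Keeping the cutoff in similarity terms in application code and converting it once, as `max_distance_for(0.6)` does, tends to be less error-prone than hand-writing distance constants in SQL.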

Tune `ef_search` Per Query

A higher `hnsw.ef_search` gives better recall but slower queries. Tune it per request:

```sql
-- SET LOCAL only lasts until the end of the current transaction
SET LOCAL hnsw.ef_search = 100;  -- default is 40
```
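
One practical wrinkle when doing this from application code: `SET` does not accept bind parameters, so a per-request value either has to be validated and inlined, or passed through PostgreSQL's `set_config(name, value, is_local)` function. The sketch below takes the second route. The function name, the `%s` placeholder style (as used by psycopg), and the 1 to 1000 bounds are assumptions, not values taken from the article.

```python
def ef_search_setting(ef_search, lo=1, hi=1000):
    """Build a parameterized statement that sets hnsw.ef_search for the current transaction.

    The lo/hi bounds are an assumption about pgvector's accepted range;
    adjust them to match the installed version.
    """
    if isinstance(ef_search, bool) or not isinstance(ef_search, int):
        raise TypeError("ef_search must be an int")
    if not lo <= ef_search <= hi:
        raise ValueError(f"ef_search must be between {lo} and {hi}")
    # set_config(..., true) behaves like SET LOCAL but accepts bind parameters.
    # Setting values are passed as text.
    return "SELECT set_config('hnsw.ef_search', %s, true)", (str(ef_search),)
```

The returned statement needs to run inside the same transaction as the search query, otherwise the setting is gone before the search executes.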

Hybrid Search: Dense + Full-Text

```sql
ALTER TABLE chunks ADD COLUMN tsv tsvector
    GENERATED ALWAYS AS (to_tsvector('english', chunk_text)) STORED;
CREATE INDEX idx_chunks_tsv ON chunks USING gin(tsv);

-- Combined ranking with Reciprocal Rank Fusion
WITH dense AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY embedding <=> $1) AS rank
    FROM chunks
    WHERE tenant_id = $2
    ORDER BY embedding <=> $1
    LIMIT 50
),
sparse AS (
    SELECT id, ROW_NUMBER() OVER (ORDER BY ts_rank(tsv, query) DESC) AS rank
    FROM chunks, plainto_tsquery('english', $3) AS query
    WHERE tenant_id = $2 AND tsv @@ query
    ORDER BY ts_rank(tsv, query) DESC
    LIMIT 50
)
SELECT c.id, c.chunk_text,
       COALESCE(1.0/(60 + d.rank), 0) + COALESCE(1.0/(60 + s.rank), 0) AS rrf
FROM chunks c
LEFT JOIN dense d USING (id)
LEFT JOIN sparse s USING (id)
WHERE d.id IS NOT NULL OR s.id IS NOT NULL
ORDER BY rrf DESC
LIMIT 10;
```
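
The fusion step in the query is plain Reciprocal Rank Fusion: each result scores 1 / (k + rank) in every list it appears in, and the scores are summed, with k = 60 matching the constant in the SQL. For teams that prefer to run the dense and sparse searches as two separate queries and merge in application code, a minimal Python sketch could look like this (the function name and the tie-breaking rule are assumptions):

```python
def reciprocal_rank_fusion(rankings, k=60, limit=10):
    """Fuse ranked ID lists with RRF: score(id) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranked_ids in rankings:
        # Ranks are 1-based, like ROW_NUMBER() in the SQL version.
        for rank, doc_id in enumerate(ranked_ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:limit]
```

For example, `reciprocal_rank_fusion([[1, 2, 3], [3, 1, 4]])` ranks ID 1 first, because it sits near the top of both lists, followed by 3, 2, and 4. An ID that appears in only one list still gets a score, which mirrors the `COALESCE(..., 0)` terms in the query.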

When to Outgrow pgvector

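- You have more than 50M vectors and need sub-50ms p95 latency
- HNSW build times start exceeding your maintenance windows
- You need quantization beyond pgvector's `halfvec` and binary options (for example, int8 scalar or product quantization) for memory efficiency
- You need to search multiple vector spaces in a single query (rare, but it happens)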

Until then, the operational simplicity of one database wins.
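
Much of the pressure behind those thresholds is memory. As a back-of-the-envelope sketch, pgvector documents each `vector` as taking 4 bytes per dimension plus 8 bytes, so the raw embedding column alone can be estimated as below. The function names are hypothetical, and the estimate ignores the HNSW index, row overhead, and the other columns.

```python
def raw_vector_bytes(rows, dims, bytes_per_dim=4, header_bytes=8):
    """Approximate storage for the embedding column alone (no index, no row overhead)."""
    return rows * (dims * bytes_per_dim + header_bytes)


def to_gib(n_bytes):
    """Convert bytes to GiB for readability."""
    return n_bytes / 2**30


# 768-dimensional embeddings, as in the chunks table above
ten_million = raw_vector_bytes(10_000_000, 768)
fifty_million = raw_vector_bytes(50_000_000, 768)
# Half-precision storage (2 bytes per dimension) roughly halves the footprint
fifty_million_half = raw_vector_bytes(50_000_000, 768, bytes_per_dim=2)
```

At 768 dimensions this works out to about 30.8 GB of raw vectors for 10M rows and 154 GB for 50M rows, before any index is built, which is roughly where keeping the working set in memory on a single Postgres node gets expensive.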

Dilip Singh

Lead Software Architect · Hureka Technologies

14+ years building enterprise software and AI systems. Architecting multi-agent AI platforms, RAG pipelines, voice AI, and high-performance SaaS for global clients.
