Intelligent Data Dictionary: AI-Powered Multi-Database Discovery Platform
Open-source repository: https://github.com/narang25/intelligent-data-dictionary-agent
Intelligent Data Dictionary (IDD) is an AI-powered metadata intelligence platform that helps data teams stop flying blind across their databases. It connects to multiple heterogeneous data sources (PostgreSQL, MySQL, Snowflake, MongoDB, and BigQuery) and transforms raw schema information into a rich, searchable, and conversational knowledge base.
The Problem We Solve: Data teams waste hours hunting for what tables exist, what columns mean, where data comes from, and whether it can be trusted. This tribal knowledge lives in Slack messages, outdated wikis, and people's heads, not where engineers need it.
What IDD Does:
- Multi-Database Connectivity: Connect to PostgreSQL, MySQL, Snowflake, MongoDB, and BigQuery from a single unified dashboard using a secure, plug-and-play `BaseConnector` architecture with Fernet-encrypted credentials stored at rest.
- AI-Powered Chat Interface: Ask questions in plain English. IDD detects whether to run a RAG (documentation lookup) or a SQL query, generates dialect-aware SQL, executes it safely, and returns a summarized result with a confidence score.
- Automated Schema Syncing: One-click or background Celery sync extracts tables, columns, relationships (foreign keys), and data types directly from source databases.
- Data Quality Profiling: Built-in profiling surfaces null percentages, distinct counts, and min/max/mean values, and flags anomalous columns automatically.
- Data Lineage Tracking: A visual lineage graph traces how data flows through your systems, enabling impact analysis for schema changes.
- Auto-Documentation: LLM-generated descriptions for every table and column, stored as a searchable vector knowledge base (pgvector) namespaced per connection.
- Secure by Default: All credentials encrypted at rest; role-based access patterns built into the API layer.
- Fully Dockerized: Six-container stack (React frontend, FastAPI backend, Celery worker, Celery beat scheduler, Redis, PostgreSQL) deployable with a single `docker-compose up` command.
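To make the plug-and-play connector idea from the feature list more concrete, here is a minimal sketch of how a `BaseConnector` hierarchy with a registry might look. Only the name `BaseConnector` comes from the project description; the method `list_columns`, the `registry` attribute, `ColumnInfo`, `InMemoryConnector`, and `create_connector` are hypothetical names chosen for illustration, and the real interface in the repository may differ. Credential encryption is left out to keep the sketch dependency-free.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnInfo:
    """One column of extracted schema metadata (hypothetical shape)."""
    table: str
    name: str
    data_type: str


class BaseConnector(ABC):
    """Illustrative base class; not the project's actual interface."""

    # Maps a source-type string (e.g. "memory") to its connector class.
    registry: dict = {}
    source_type: str = ""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Subclasses that declare a source_type register themselves,
        # which is what makes new connectors "plug-and-play" here.
        if cls.source_type:
            BaseConnector.registry[cls.source_type] = cls

    def __init__(self, config: dict):
        self.config = config

    @abstractmethod
    def list_columns(self) -> list:
        """Return a list of ColumnInfo for every column in the source."""


class InMemoryConnector(BaseConnector):
    """Toy connector that reads a schema from its config dict."""

    source_type = "memory"

    def list_columns(self) -> list:
        return [
            ColumnInfo(table=table, name=name, data_type=data_type)
            for table, columns in self.config["tables"].items()
            for name, data_type in columns.items()
        ]


def create_connector(source_type: str, config: dict) -> BaseConnector:
    """Look up the connector class for a source type and instantiate it."""
    try:
        connector_cls = BaseConnector.registry[source_type]
    except KeyError:
        raise ValueError(f"Unsupported source type: {source_type}") from None
    return connector_cls(config)
```

Under this design, supporting another database would mean adding one subclass with its own `source_type`; the calling code that syncs schemas would not need to change.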
Tech Stack: FastAPI · React + Vite · PostgreSQL + pgvector · Celery + Redis · SQLAlchemy · Nginx · Docker
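The data quality profiling listed among the features (null percentages, distinct counts, min/max/mean) can be illustrated with a small pure-Python sketch. This is not the project's profiler: the function name `profile_column`, the `max_null_pct` parameter, and the rule that a column is flagged when its null percentage exceeds that threshold are all assumptions made for the example, since the description does not say how anomalous columns are detected.

```python
from statistics import mean


def profile_column(values, max_null_pct=50.0):
    """Compute basic quality metrics for one column of values.

    `values` is a list in which None represents NULL. Non-null values
    must be hashable so that distinct values can be counted.
    The `max_null_pct` threshold is a hypothetical anomaly rule.
    """
    total = len(values)
    non_null = [v for v in values if v is not None]
    null_pct = 100.0 * (total - len(non_null)) / total if total else 0.0

    # Only real numbers take part in min/max/mean; bools are excluded
    # because Python treats them as a subclass of int.
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]

    return {
        "row_count": total,
        "null_pct": null_pct,
        "distinct_count": len(set(non_null)),
        "min": min(numeric) if numeric else None,
        "max": max(numeric) if numeric else None,
        "mean": mean(numeric) if numeric else None,
        "flagged": null_pct > max_null_pct,
    }
```

For example, `profile_column([1, 2, 2, None])` reports a null percentage of 25.0 and two distinct values, and does not flag the column under the default threshold. A production profiler would more likely push these aggregates down into SQL rather than pull rows into Python.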
IDD is built for modern data teams who need to spend less time hunting metadata and more time building.
This build was uploaded as a hackathon project.







