DataLensAI — Understand your data instantly with agentic, deterministic intelligence.
Link to open source: https://github.com/Shourya523/datalens-hackDays
Link to Live Project: https://datalens-hack-days.vercel.app/
The Intelligent Data Dictionary Agent by Team LocalHost
The Intelligent Data Dictionary Agent is an AI-powered platform that transforms complex enterprise schemas into a business-friendly, continuously updated knowledge layer. It automates documentation, governance, and development workflows to improve data trust and accessibility across organizations.
Core Goals
- Automated Enrichment: Uses AI to generate clear, user-friendly descriptions and summaries for technical metadata.
- Intelligent Governance: Performs real-time quality analysis (completeness/freshness) and automatically flags sensitive PII for compliance.
- Natural Language Accessibility: Democratizes data through a conversational chat interface, allowing users to query data meaning without writing SQL.
- Automated Lineage Mapping: Constructs dynamic lineage graphs to visualize how data flows between tables and systems, identifying dependencies and performing impact analysis.
- Agentic API Builder: An autonomous agent that reads your schema and generates production-ready REST API endpoints directly, with no manual spec writing and no back-and-forth chat.
- SQL Query Agent: Translates natural language questions into optimized SQL queries and executes them against your connected database instantly.
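To make the PII-flagging goal concrete, here is a minimal sketch of a name-based heuristic. It is not the project's actual detector (the AI enrichment step handles that); the `ColumnMeta` shape, the pattern list, and the function names are all assumptions made for illustration.

```typescript
// Hypothetical shape of a scanned column; not taken from the project's code.
interface ColumnMeta {
  table: string;
  name: string;
  dataType: string;
}

// Illustrative name patterns only. A real detector would likely combine
// these with value sampling or an LLM pass.
const PII_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  { label: "email", pattern: /e[-_]?mail/i },
  { label: "phone", pattern: /phone|mobile/i },
  { label: "national_id", pattern: /ssn|passport|national[-_]?id/i },
  { label: "date_of_birth", pattern: /dob|birth/i },
  { label: "address", pattern: /address|zip|postal/i },
];

// Returns the PII labels whose pattern matches the column name.
function detectPii(column: ColumnMeta): string[] {
  return PII_PATTERNS
    .filter(({ pattern }) => pattern.test(column.name))
    .map(({ label }) => label);
}

// Maps "table.column" to its PII labels, skipping columns with no match.
function flagSensitiveColumns(columns: ColumnMeta[]): Map<string, string[]> {
  const flagged = new Map<string, string[]>();
  for (const column of columns) {
    const labels = detectPii(column);
    if (labels.length > 0) {
      flagged.set(`${column.table}.${column.name}`, labels);
    }
  }
  return flagged;
}
```

A name-only check is cheap enough to run on every scan, but it will miss sensitive columns with opaque names, which is one reason to pair it with a model-based pass.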
How It Works
- Ingestion: Connects securely to source databases (PostgreSQL, MySQL, Snowflake, Neo4j) using read-only connectors to extract schema metadata.
- AI Enrichment: An LLM (Gemini or OpenAI) generates business context, detects sensitive data, and provides impact analysis.
- Graph Construction: Parses foreign keys, query logs, and join patterns to build a relationship graph representing data lineage.
- Storage: Metadata and AI summaries are stored in Neon (PostgreSQL), utilizing pgvector for semantic retrieval.
- Discovery: Users interact via natural language chat that retrieves context from the vector store to answer data questions.
- Sync: Applies incremental updates so that documentation and lineage stay in step with schema changes.
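The graph construction and impact analysis steps can be sketched roughly as follows. This version uses foreign keys only (query logs and join patterns are left out), and the `ForeignKey` type and both function names are hypothetical rather than taken from the repository.

```typescript
// Hypothetical foreign-key record extracted during ingestion.
interface ForeignKey {
  fromTable: string; // table that holds the foreign-key column
  toTable: string;   // table being referenced
}

// Builds an adjacency map from a referenced table to the tables that depend on it.
function buildLineageGraph(fks: ForeignKey[]): Map<string, Set<string>> {
  const graph = new Map<string, Set<string>>();
  for (const { fromTable, toTable } of fks) {
    if (!graph.has(toTable)) {
      graph.set(toTable, new Set<string>());
    }
    graph.get(toTable)!.add(fromTable);
  }
  return graph;
}

// Breadth-first walk over dependents: every table that could be affected,
// directly or transitively, if `changed` is altered.
function impactedTables(graph: Map<string, Set<string>>, changed: string): string[] {
  const seen = new Set<string>([changed]);
  const queue: string[] = [changed];
  const impacted: string[] = [];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const dependent of Array.from(graph.get(current) ?? [])) {
      if (!seen.has(dependent)) {
        seen.add(dependent);
        impacted.push(dependent);
        queue.push(dependent);
      }
    }
  }
  return impacted;
}

// Example: orders references customers; order_items and payments reference orders.
const graph = buildLineageGraph([
  { fromTable: "orders", toTable: "customers" },
  { fromTable: "order_items", toTable: "orders" },
  { fromTable: "payments", toTable: "orders" },
]);
console.log(impactedTables(graph, "customers"));
// prints [ 'orders', 'order_items', 'payments' ]
```

The `seen` set keeps the traversal safe on schemas with circular references, which are not unusual in real databases.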
Key Features
- Intelligent Schema Scanner: Automatically extracts tables, columns, data types, primary/foreign keys, constraints, row counts, sample data, and indexes.
- Interactive ER Diagrams: Clickable, filterable relationship maps with drill-down inspection.
- Graph Database Support (Neo4j): Visualizes node labels, edge types, property structures, and relationship distributions.
- Data Quality Diagnostics & Health Score: Dynamic radial health scoring, data type profiling, and an audit issues log with remediation recommendations.
- Widescreen Reference Manual Portal: Full-screen documentation view with an interactive entity network graph, live-search sidebar, and book-style markdown layout.
- Premium PDF & Multi-Format Exports: Dark-themed print engine with page-break safeguards; exports to JSON, Markdown, and more.
- Executive Business Report: AI-generated governance insights, key findings, and schema health assessments.
- IDE Integration via MCP: A standalone MCP server for Cursor, VS Code, and Antigravity that brings schema scanning, documentation, query execution, data quality checks, and AI chat directly into your editor.
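As an illustration of how a health score might be derived from completeness and freshness, here is a small sketch. The formula, the 70/30 weights, and the linear freshness decay are assumptions chosen for the example; the project's actual scoring may differ.

```typescript
// Hypothetical per-column statistics gathered by a schema scan.
interface ColumnStats {
  rowCount: number;
  nullCount: number;
}

// Fraction of non-null values. An empty table is treated as complete,
// which is an assumption made for this sketch.
function completeness(stats: ColumnStats): number {
  if (stats.rowCount === 0) return 1;
  return (stats.rowCount - stats.nullCount) / stats.rowCount;
}

// Linear decay from 1 (just updated) to 0 (maxAgeDays old or older).
function freshness(lastUpdated: Date, now: Date, maxAgeDays: number): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  const ageDays = (now.getTime() - lastUpdated.getTime()) / msPerDay;
  return Math.min(1, Math.max(0, 1 - ageDays / maxAgeDays));
}

// Weighted blend scaled to 0..100. The weights are illustrative defaults.
function healthScore(
  completenessValue: number,
  freshnessValue: number,
  weights = { completeness: 0.7, freshness: 0.3 },
): number {
  const blended =
    weights.completeness * completenessValue + weights.freshness * freshnessValue;
  return Math.round(100 * blended);
}
```

For example, a fully populated table last updated 15 days ago, scored against a 30-day freshness window, gets a freshness of 0.5 and an overall score of 85 under these weights.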
IDE / MCP Setup
cd vscode-extension/mcp-server
npm install && npm run build
| IDE | Config file |
|---|---|
| Cursor | .cursor/mcp.json.example |
| VS Code | .vscode/mcp.json.example |
| Antigravity | antigravity-mcp.example.json |
Tech Stack
| Layer | Technology |
|---|---|
| Frontend | Next.js (App Router + Turbopack) |
| UI | Tailwind CSS + shadcn/ui + Lucide Icons |
| Backend | Next.js Server Actions & API Routes |
| ORM | Drizzle ORM |
| Auth | Clerk / Neon Auth (Google OAuth & Credentials) |
| Databases | PostgreSQL, MySQL, Snowflake, Neo4j |
| Embeddings | Mixedbread AI |
| Vector DB | Qdrant Cloud + pgvector (Neon) |
| LLM | Gemini 2.5 Flash + OpenAI |
| Language | TypeScript |
| Deployment | Vercel + Edge Functions |
Outcome
This solution reduces operational costs through automated processing, accelerates decision-making with instant data discovery, and supports proactive compliance via automatic PII detection and visual lineage tracking. It also gives developers agentic tools that go from schema to working API endpoints in seconds.
This build was submitted as a hackathon project.










