M87
AI Knowledge Platform
MVP Development

Built a full AI knowledge platform with RAG, lead collection, scheduled web crawling, audio transcription, and an embeddable widget - now serving 10+ enterprise clients across law, insurance, automotive, and healthcare.

Client

Enterprise Client

Niche

Law, Insurance & Healthcare

Timeline

6 months

Key Result

10+ enterprise clients on the platform

The Brief

The founder was a VP of Technologies & Business Development at a major local firm with deep relationships across the local market - law firms, insurance companies, car dealerships, content sites, health providers. He had the connections and the market access, but no AI engineering talent.

He didn't just want a chatbot. He wanted a platform that could turn his clients' documents and websites into AI-powered customer support - and turn those conversations into qualified leads.

What We Built

A full B2B AI platform with 22 database models, 45 API routes, and 34,000 lines of TypeScript. Not a wrapper around ChatGPT - a production SaaS with multi-tenancy, billing, and a reseller model built in from day one.

Professional-Grade Content Ingestion

The platform ingests documents, websites, and audio into a searchable knowledge base. PDFs are processed with Gemini-powered OCR - parallel batch processing that handles scanned documents, extracts image descriptions, and preserves page structure. Excel files are parsed sheet-by-sheet with table structure intact. Word documents, CSV files, and raw text each have format-specific processing with semantic chunking.
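The chunking step could look roughly like the sketch below. It assumes page text has already been extracted (for example by the OCR step) and uses paragraph boundaries as a simple stand-in for semantic chunking. The `Chunk` shape, the `chunkPages` name, and the character budget are illustrative assumptions, not the platform's actual code.

```typescript
// Hypothetical chunk shape; the platform's real schema is not described here.
interface Chunk {
  page: number; // 1-based page number, kept so answers can cite pages
  text: string;
}

// Pack whole paragraphs into chunks of at most maxChars characters.
// Chunks never span two pages. A single paragraph longer than maxChars
// is kept whole as one oversized chunk rather than being cut mid-sentence.
function chunkPages(pages: string[], maxChars: number): Chunk[] {
  const chunks: Chunk[] = [];
  pages.forEach((pageText, index) => {
    const paragraphs = pageText
      .split(/\n\s*\n/)
      .map((p) => p.trim())
      .filter((p) => p.length > 0);
    let current = "";
    for (const paragraph of paragraphs) {
      const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
      if (current && candidate.length > maxChars) {
        // Adding this paragraph would exceed the budget: close the chunk.
        chunks.push({ page: index + 1, text: current });
        current = paragraph;
      } else {
        current = candidate;
      }
    }
    if (current) chunks.push({ page: index + 1, text: current });
  });
  return chunks;
}
```

Keeping the page number on every chunk is what makes page-level citations possible later in the pipeline.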

Beyond file uploads, the platform crawls websites on a schedule - daily, weekly, or monthly - via Google Cloud Scheduler. Knowledge bases stay fresh automatically without manual re-uploads. Each crawl job tracks errors per URL and persists scraper configuration.
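As an illustration of the scheduling and error-tracking ideas above, the sketch below maps the three frequencies to unix-cron strings (the format Cloud Scheduler accepts) and folds per-page results into a job record. The 03:00 run time, the type names, and `summarizeCrawl` are assumptions made for this example only.

```typescript
type CrawlFrequency = "daily" | "weekly" | "monthly";

// Unix-cron expressions. The 03:00 run time is an arbitrary choice for this sketch.
const CRON_BY_FREQUENCY: Record<CrawlFrequency, string> = {
  daily: "0 3 * * *", // every day at 03:00
  weekly: "0 3 * * 1", // every Monday at 03:00
  monthly: "0 3 1 * *", // first day of each month at 03:00
};

// Hypothetical result of fetching one page.
interface PageResult {
  url: string;
  ok: boolean;
  error?: string;
}

// Hypothetical persisted job record with errors tracked per URL.
interface CrawlJobRecord {
  schedule: string;
  pagesCrawled: number;
  errorsByUrl: Record<string, string>;
}

function summarizeCrawl(
  frequency: CrawlFrequency,
  results: PageResult[]
): CrawlJobRecord {
  const errorsByUrl: Record<string, string> = {};
  let pagesCrawled = 0;
  for (const result of results) {
    if (result.ok) {
      pagesCrawled += 1;
    } else {
      errorsByUrl[result.url] = result.error ?? "unknown error";
    }
  }
  return { schedule: CRON_BY_FREQUENCY[frequency], pagesCrawled, errorsByUrl };
}
```

Recording failures against the URL rather than the whole job means one broken page does not hide the pages that crawled successfully.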

Multi-LLM Chat with Hybrid Search

The chat interface supports 10+ models across three providers, including Google Gemini (1.5 Pro, 1.5 Flash, 2.0 Flash, 2.5 Flash, 2.5 Pro), OpenAI (GPT-4o, GPT-4.1), and Anthropic (Claude 3.7 Sonnet and Claude Haiku). Responses stream in real time with citation tracking - users see which documents and pages the answer came from.
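One way citation tracking like this can work is to carry document and page metadata on every retrieved chunk and collapse it into a citation list once the answer is produced. The sketch below assumes hypothetical `RetrievedChunk` and `Citation` shapes; the platform's real data model is not described in this case study.

```typescript
// Hypothetical shapes for this example.
interface RetrievedChunk {
  documentName: string;
  page: number;
  text: string;
}

interface Citation {
  documentName: string;
  pages: number[]; // unique pages, ascending
}

// Group the chunks used for an answer into one citation per document,
// preserving the order in which documents first appeared.
function buildCitations(chunks: RetrievedChunk[]): Citation[] {
  // Null-prototype object so names like "constructor" are safe keys.
  const pagesByDocument: Record<string, number[]> = Object.create(null);
  const order: string[] = [];
  for (const chunk of chunks) {
    if (!pagesByDocument[chunk.documentName]) {
      pagesByDocument[chunk.documentName] = [];
      order.push(chunk.documentName);
    }
    const pages = pagesByDocument[chunk.documentName];
    if (pages.indexOf(chunk.page) === -1) pages.push(chunk.page);
  }
  return order.map((documentName) => ({
    documentName,
    pages: pagesByDocument[documentName].slice().sort((a, b) => a - b),
  }));
}
```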

Retrieval uses a hybrid approach: vector similarity search via pgvector combined with BM25 ranking for precision. Admins can tune the retrieval depth per agent to balance context quality against response speed.
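The case study does not say how the two result lists are merged. A common choice is reciprocal rank fusion (RRF), sketched below as an assumption rather than the platform's confirmed method. The function takes ranked lists of chunk IDs (one from the vector search, one from BM25) and returns the top `topK` IDs, where `topK` plays the role of the per-agent retrieval depth. The constant `k = 60` is the value conventionally used with RRF.

```typescript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per item,
// with rank starting at 1. Items ranked highly in several lists win.
function reciprocalRankFusion(
  rankings: string[][],
  topK: number,
  k: number = 60
): string[] {
  // Null-prototype object so any string is a safe key.
  const scores: Record<string, number> = Object.create(null);
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      scores[id] = (scores[id] ?? 0) + 1 / (k + index + 1);
    });
  }
  return Object.keys(scores)
    // Highest fused score first; ties broken by ID for a stable order.
    .sort((a, b) => scores[b] - scores[a] || a.localeCompare(b))
    .slice(0, topK);
}

// Usage: IDs are hypothetical chunk identifiers.
const vectorHits = ["d1", "d2", "d3"];
const bm25Hits = ["d3", "d1", "d4"];
const fused = reciprocalRankFusion([vectorHits, bm25Hits], 3);
// fused is ["d1", "d3", "d2"]
```

RRF only needs ranks, not raw scores, which sidesteps the problem that cosine similarity and BM25 scores live on different scales. A larger `topK` gives the model more context at the cost of a longer prompt and a slower response.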

The platform also supports audio: users upload voice recordings (WAV, MP3, M4A, OGG, FLAC) that are transcribed via OpenAI and processed as queries.

AI-Powered Lead Collection

This is what makes the platform a business tool, not just a chat tool. The platform monitors conversations in real time, classifies when a visitor is showing lead intent, and extracts structured parameters from the conversation - name, email, phone, interest, or whatever else the client configures.
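To make the idea of client-configured parameters concrete, the sketch below checks whether a conversation is ready to be submitted as a lead. The intent classification and the extraction itself (presumably done by a model) are treated as inputs here; `LeadField`, `assessLead`, and the field keys are hypothetical names for this example.

```typescript
// A field the client wants captured, e.g. "email" or "phone".
interface LeadField {
  key: string;
  required: boolean;
}

interface LeadAssessment {
  ready: boolean; // true when intent was detected and nothing required is missing
  missing: string[]; // required fields not yet extracted
}

function assessLead(
  fields: LeadField[],
  extracted: Record<string, string | undefined>,
  hasIntent: boolean
): LeadAssessment {
  const missing = fields
    .filter((field) => field.required)
    .filter((field) => (extracted[field.key] ?? "").trim().length === 0)
    .map((field) => field.key);
  return { ready: hasIntent && missing.length === 0, missing };
}
```

The `missing` list could be used to let the assistant ask for the outstanding details before a lead is submitted.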

Leads are submitted to external CRMs via webhooks with full authentication support (OAuth 2.0, Basic Auth, API Key). For a car dealership, this means a website visitor asks about a model, the AI answers from the dealership's inventory docs, and the lead lands in their CRM - automatically, with the full conversation attached.
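The three authentication modes map naturally onto a tagged union. The sketch below builds the HTTP headers for a webhook call under each mode; it assumes the OAuth 2.0 access token has already been obtained, and the `WebhookAuth` type and `buildAuthHeaders` name are illustrative rather than taken from the platform.

```typescript
// Hypothetical per-client webhook authentication settings.
type WebhookAuth =
  | { kind: "oauth2"; accessToken: string }
  | { kind: "basic"; username: string; password: string }
  | { kind: "apiKey"; headerName: string; key: string };

function buildAuthHeaders(auth: WebhookAuth): Record<string, string> {
  switch (auth.kind) {
    case "oauth2":
      // Bearer token usage; fetching and refreshing the token is out of scope here.
      return { Authorization: `Bearer ${auth.accessToken}` };
    case "basic":
      // btoa is a global in browsers and recent Node versions; it only
      // accepts Latin-1 input, so non-Latin-1 credentials need encoding first.
      return {
        Authorization: `Basic ${btoa(`${auth.username}:${auth.password}`)}`,
      };
    case "apiKey":
      // The header name varies by CRM, so it is part of the configuration.
      return { [auth.headerName]: auth.key };
  }
}
```

The resulting headers would be merged with `Content-Type: application/json` and sent alongside the lead payload and conversation transcript.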

Embeddable Widget

A script-tag widget that clients drop onto their websites. The widget supports RTL languages (Hebrew, Arabic), light/dark themes, mobile-specific positioning, and Google Tag Manager integration. Conversation digests are emailed to subscribers on a schedule.

Two widget versions (v1, v2) with customizable positioning, consent banners, and introduction text give clients control over the experience without touching code.
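A rough sketch of how such widget settings might be resolved at load time follows. Every name here (`WidgetOptions`, `resolveWidget`, the corner values) is hypothetical, and the rule that RTL languages default to the bottom-left corner is an assumption for the example, not documented behavior.

```typescript
type Theme = "light" | "dark";
type Corner = "bottom-right" | "bottom-left";

// Hypothetical options a client could set without touching code.
interface WidgetOptions {
  language: string; // language tag such as "he", "ar", or "en-US"
  theme?: Theme;
  position?: Corner;
  mobilePosition?: Corner; // overrides position on mobile screens
}

interface ResolvedWidget {
  dir: "rtl" | "ltr";
  theme: Theme;
  position: Corner;
}

const RTL_LANGUAGES = ["he", "ar"];

function resolveWidget(
  options: WidgetOptions,
  isMobile: boolean
): ResolvedWidget {
  // Compare only the primary subtag, so "he-IL" is treated as Hebrew.
  const primary = options.language.toLowerCase().split("-")[0];
  const dir = RTL_LANGUAGES.indexOf(primary) !== -1 ? "rtl" : "ltr";
  const desktop: Corner =
    options.position ?? (dir === "rtl" ? "bottom-left" : "bottom-right");
  const position = isMobile ? (options.mobilePosition ?? desktop) : desktop;
  return { dir, theme: options.theme ?? "light", position };
}
```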

Multi-Tenant Platform

Every client gets their own organization with isolated data, agents, and knowledge bases. A granular permission system controls who can add users, manage data, configure agents, and edit settings. Token-level cost analytics break down usage by model and provider, with tiered pricing calculations.
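The cost analytics can be pictured as two small calculations: a breakdown of token usage by provider and model, and a graduated (tiered) price applied to the total. The sketch below is one possible reading of "tiered pricing"; the tier boundaries and prices in any usage are placeholders, and `UsageRecord`, `Tier`, `tokensByModel`, and `tieredCost` are names invented for this example.

```typescript
// Hypothetical usage log entry.
interface UsageRecord {
  provider: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

// A pricing tier: applies to tokens up to the cumulative cap `upTo`.
// Tiers must be sorted by `upTo`; the last one should use Infinity.
interface Tier {
  upTo: number;
  pricePerMillion: number;
}

// Total tokens per "provider/model" key.
function tokensByModel(records: UsageRecord[]): Record<string, number> {
  const totals: Record<string, number> = Object.create(null);
  for (const record of records) {
    const key = `${record.provider}/${record.model}`;
    totals[key] = (totals[key] ?? 0) + record.inputTokens + record.outputTokens;
  }
  return totals;
}

// Graduated pricing: each slice of usage is billed at its own tier's rate.
function tieredCost(totalTokens: number, tiers: Tier[]): number {
  let remaining = totalTokens;
  let previousCap = 0;
  let cost = 0;
  for (const tier of tiers) {
    if (remaining <= 0) break;
    const slice = Math.min(remaining, tier.upTo - previousCap);
    cost += (slice / 1e6) * tier.pricePerMillion;
    remaining -= slice;
    previousCap = tier.upTo;
  }
  return cost;
}
```

For example, with a first tier of 1 million tokens at 10 per million and everything beyond that at 5 per million, 3 million tokens cost 10 + 2 × 5 = 20 in whatever currency the prices are expressed.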

The Result

The platform is now serving 10+ enterprise clients across law, insurance, automotive, and healthcare. It handles real production traffic with real business data - ingesting documents, answering customer questions, and generating leads around the clock.

The founder's market access combined with M87's engineering created a platform that scales through relationships, not marketing spend. Each new client is a new organization, configured and live in hours.

Technologies Used

Next.js, TypeScript, PostgreSQL, Prisma, Google Gemini API, OpenAI API, Anthropic API, Pinecone, GCP, AWS, NextAuth, Docker
