
Sound Suite AI
Sound Suite is a self-hosted legal document intelligence platform built for litigation case management. It ingests court filings and other case PDFs, applies OCR and semantic embedding locally, and surfaces document intelligence through a suite of AI analysis tools accessible via the Model Context Protocol (MCP).
Key features and functions include:
Automated Document Ingestion Pipeline
The platform monitors designated case folders for new PDFs in real time. Each incoming document passes through an automated sequence: text extraction, OCR (for scanned pages), text chunking, and vector embedding generation. Files are deduplicated by SHA-256 hash, so documents that have already been ingested are not reprocessed.
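The deduplication and chunking steps described above can be illustrated with a minimal Python sketch. This is not Sound Suite's actual code: the names `IngestLog`, `sha256_of`, and `chunk_text`, the in-memory set of hashes, and the character-based chunk sizes are all assumptions chosen for illustration.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical sizes)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous window.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


class IngestLog:
    """Hypothetical in-memory record of hashes that were already ingested."""

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def should_process(self, data: bytes) -> bool:
        """True the first time a given byte content is seen, False afterwards."""
        digest = sha256_of(data)
        if digest in self._seen:
            return False
        self._seen.add(digest)
        return True
```

A real pipeline would presumably persist the hashes between runs rather than keep them in memory; the sketch only shows why identical bytes are skipped regardless of file name.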
Fourteen MCP Analysis Tools
Sound Suite exposes fourteen AI-powered analysis tools through the MCP standard, enabling integration with compatible AI assistants such as Claude and ChatGPT. Analysis capabilities include contradiction detection, timeline extraction, entity recognition, citation analysis, privilege screening, obligation extraction, claim evolution tracking, argument structure mapping, exhibit analysis, pattern scanning, semantic search, tone analysis, workflow automation, and knowledge query.
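For readers unfamiliar with MCP, a client invokes a tool by sending a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a request in Python. The tool name `detect_contradictions` and its arguments are hypothetical placeholders; the actual tool names and parameters exposed by Sound Suite are not given here.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(1, "detect_contradictions", {"case_id": "demo-case"})
payload = json.dumps(request)
```

In practice an MCP-compatible assistant constructs these messages itself; the example only shows the shape of what travels between the assistant and the server.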
Hybrid Semantic and Keyword Search
Documents are indexed in a LanceDB vector store, enabling natural language semantic search across the full case corpus. Keyword and regex-based pattern matching are also supported. A cross-encoder reranker refines the ordering of results, which are returned at the passage level with relevance scores.
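The retrieve, filter, and rerank flow described above can be sketched in plain Python. This is an assumption-laden illustration, not the platform's implementation: it uses toy in-memory vectors instead of LanceDB, and the `rerank` callable is a hypothetical stand-in for a cross-encoder that would normally score each query and passage pair. The names `Passage` and `hybrid_search` are invented for the example.

```python
import math
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Passage:
    doc_id: str
    text: str
    vector: list[float]  # toy embedding; a real system would use a model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(
    query_vector: list[float],
    passages: list[Passage],
    pattern: Optional[str] = None,
    rerank: Optional[Callable[[Passage], float]] = None,
    top_k: int = 3,
) -> list[tuple[float, Passage]]:
    """Return up to top_k (score, passage) pairs, best first."""
    candidates = passages
    if pattern is not None:
        # Keyword/regex stage: keep only passages matching the pattern.
        rx = re.compile(pattern, re.IGNORECASE)
        candidates = [p for p in candidates if rx.search(p.text)]
    # Semantic stage: rank the remaining passages by vector similarity.
    scored = [(cosine(query_vector, p.vector), p) for p in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    shortlist = scored[: top_k * 3]
    if rerank is not None:
        # Rerank stage: rescore the shortlist with the stand-in reranker.
        shortlist = [(rerank(p), p) for _, p in shortlist]
        shortlist.sort(key=lambda pair: pair[0], reverse=True)
    return shortlist[:top_k]
```

Reranking only a shortlist rather than the whole corpus reflects the usual trade-off: cross-encoders tend to be more accurate than vector similarity but far more expensive per passage.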
Legal Document Draft Editor
A multi-panel rich text editor supports the drafting of briefs, motions, and memos. Features include Word (.docx) import and export, PDF export with headers and footers, track changes with accept/reject functionality, named version history with side-by-side comparison, and an AI chat panel that references indexed case documents. Auto-suggest surfaces relevant citations, facts, and phrasing drawn from the case library as the user types.
Exhibit Extraction and Search
Sound Suite automatically detects, extracts, and indexes exhibit images embedded in court filings. Extracted exhibits are processed through an image enhancement and OCR pipeline and stored locally. Exhibits are searchable by natural language description.
Structured Case and Entity Tagging
During ingestion, documents are parsed for typed legal entities including parties, motions, exhibits, judges, and hearing dates. Structured search filters allow users to narrow results by entity type and traverse the case graph.
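One way to picture typed entity tags and structured filters is the following Python sketch. The data model is an assumption made for illustration: the `Entity` and `TaggedDocument` classes, the type labels, and the idea that two documents are linked in the case graph when they share an entity are not taken from the product itself.

```python
from dataclasses import dataclass, field
from typing import Optional

# Entity types named in the description; the string labels are assumed.
ENTITY_TYPES = {"party", "motion", "exhibit", "judge", "hearing_date"}


@dataclass(frozen=True)
class Entity:
    type: str
    value: str


@dataclass
class TaggedDocument:
    doc_id: str
    entities: set[Entity] = field(default_factory=set)


def filter_by_entity(
    docs: list[TaggedDocument], entity_type: str, value: Optional[str] = None
) -> list[TaggedDocument]:
    """Keep documents tagged with the given entity type (and value, if given)."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type}")
    return [
        d for d in docs
        if any(e.type == entity_type and (value is None or e.value == value)
               for e in d.entities)
    ]


def related_documents(docs: list[TaggedDocument], doc_id: str) -> list[str]:
    """One hop in the assumed graph: ids of documents sharing any entity."""
    source = next(d for d in docs if d.doc_id == doc_id)
    return sorted(
        d.doc_id for d in docs
        if d.doc_id != doc_id and d.entities & source.entities
    )
```

For example, filtering on `"judge"` with a specific name would narrow a result set to filings tagged with that judge, and `related_documents` would then surface other filings that mention the same parties or exhibits.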
GPU Sidecar for Local AI Acceleration
An optional GPU Sidecar component allows users to orchestrate a local network of GPU machines (macOS, Windows, and Linux) to run AI models for embedding, reranking, OCR, and completions entirely on premises. Role assignment per machine is managed through an administrative dashboard.
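Per-machine role assignment can be thought of as a mapping from machine names to the workloads each one serves. The Python sketch below is a hypothetical model of that idea, not the dashboard's actual data format: the role labels, host names, and helper functions are assumptions.

```python
# The four workloads named in the description; the labels are assumed.
ROLES = ("embedding", "reranking", "ocr", "completions")


def unassigned_roles(assignments: dict[str, set[str]]) -> list[str]:
    """Return roles that no machine currently serves, in ROLES order."""
    covered = set().union(*assignments.values())
    return [role for role in ROLES if role not in covered]


def machines_for(role: str, assignments: dict[str, set[str]]) -> list[str]:
    """Return the sorted names of machines assigned to a role."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    return sorted(name for name, roles in assignments.items() if role in roles)


# Hypothetical three-machine network.
assignments = {
    "mac-studio": {"embedding", "reranking"},
    "linux-rig": {"completions"},
    "win-laptop": {"embedding"},
}
missing = unassigned_roles(assignments)  # roles still needing a machine
```

A check like `unassigned_roles` would let an administrator see at a glance which workloads would have no local machine to run on.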
Local-First Architecture
All document processing, vector storage, and AI inference occur on the user's own hardware by default. No case documents, search queries, or analysis results are transmitted to external servers unless the user elects to configure a cloud embedding provider (OpenAI or Anthropic). The software is source-available under the Polyform Noncommercial License 1.0.0, with the full source code publicly accessible on GitHub.