Product Building · March 20, 2026 · 9 min read

How I Built ArguLens: AI-Powered Legal Document Analysis

The story behind ArguLens — from a friend's divorce paperwork to an AI product that finds contradictions, reconstructs timelines, and grounds every claim in source documents.

ArguLens · LLMs · Claude API · RAG · Next.js · Product

It Started with a Stack of Paperwork

A friend was going through a divorce. Financial disclosures, custody agreements, years of communication records — hundreds of pages, and her lawyer billed $400/hour to read through them. She asked me a simple question: "Can't AI do this?"

I couldn't stop thinking about it. Legal document review is a reading comprehension problem — the kind LLMs are genuinely good at. But the stakes are brutal. Miss a contradiction in a witness statement and someone loses their case. I wanted to build something that could actually help people in that situation.

What ArguLens Actually Does

You upload legal documents — PDFs, Word files, plain text — and get back structured analysis:

  • Strengths and weaknesses for each side, grounded in specific passages from the documents
  • Contradiction detection across documents — inconsistent dates, conflicting claims, impossible timelines
  • Chronological timeline of events, with gaps flagged
  • Evidence mapping that links every conclusion back to exact source text
  • Prenup draft generation for family law cases, with customizable templates

The key word there is "grounded." Nothing the system says is unsupported. Every claim has a citation.

Why Claude?

I tested GPT-4, Claude, and Gemini extensively. Claude won on three dimensions that matter most for legal work:

  1. Context window. Legal filings run 50+ pages. Claude's 200K token window meant I could feed in entire documents without aggressive chunking.
  2. Hallucination resistance. In legal analysis, fabricating a fact is catastrophic — worse than missing one entirely. Claude's tendency toward cautious, hedge-when-uncertain responses was exactly what I needed.
  3. Instruction following on complex tasks. The analysis prompt is a multi-step workflow: extract claims, locate supporting evidence, cross-reference across documents, identify contradictions, score argument strength. Claude handles these structured chains more reliably than the alternatives.
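The multi-step workflow above can be sketched as a single structured prompt. The step wording, tag format, and function name here are illustrative assumptions, not the production ArguLens template:

```python
# Illustrative sketch of the multi-step analysis prompt.
# Step wording and the <document> tag convention are assumptions for this sketch.
ANALYSIS_STEPS = [
    "Extract every factual claim each party makes, quoting the source text.",
    "For each claim, locate supporting evidence in the provided documents.",
    "Cross-reference claims across documents for consistency.",
    "Identify contradictions: inconsistent dates, conflicting claims, impossible timelines.",
    "Score the strength of each side's arguments, citing the evidence found above.",
]

def build_analysis_prompt(documents: list[tuple[str, str]]) -> str:
    """Assemble (name, text) document pairs and the numbered workflow into one prompt."""
    doc_blocks = "\n\n".join(
        f'<document name="{name}">\n{text}\n</document>' for name, text in documents
    )
    steps = "\n".join(f"{i}. {step}" for i, step in enumerate(ANALYSIS_STEPS, 1))
    return (
        f"{doc_blocks}\n\n"
        "Work through the following steps in order. "
        "Every conclusion must cite a document by name; if you cannot cite a source, say so.\n"
        f"{steps}"
    )
```

Keeping the whole workflow in one prompt (rather than five separate calls) is what makes the 200K context window load-bearing: every step can see every document.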

The RAG Pipeline

The core technical challenge was evidence grounding. Every insight the system produces must trace back to specific text in the uploaded documents. No vague summaries.

The pipeline works in four stages:

  1. Chunking by structure — paragraphs and section headers, not fixed-size windows. Legal documents have meaningful structure, and breaking mid-paragraph destroys context.
  2. Embedding with metadata — each chunk carries its document name, section heading, and page number.
  3. Retrieval per analytical question — the system asks targeted questions ("What claims does Party A make about asset division?") and retrieves relevant chunks.
  4. Forced citation — the prompt template requires Claude to include document references for every conclusion. If it can't cite a source, it must say so.
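Stages 1 and 2 can be sketched in a few lines. Splitting on blank lines approximates paragraph boundaries; the short-all-caps heading heuristic is an assumption for this sketch, not the production rule:

```python
import re

def chunk_by_structure(text: str, doc_name: str, page: int = 1) -> list[dict]:
    """Split on blank lines (paragraph boundaries) instead of fixed-size
    windows, and attach retrieval metadata to every chunk (stages 1-2)."""
    chunks, current_section = [], None
    for para in re.split(r"\n\s*\n", text.strip()):
        para = para.strip()
        if not para:
            continue
        # Treat short all-caps lines as section headings, e.g. "FINANCIAL DISCLOSURES".
        if len(para) < 60 and para.isupper():
            current_section = para
            continue
        chunks.append({
            "text": para,
            "document": doc_name,
            "section": current_section,
            "page": page,
        })
    return chunks
```

Because every chunk carries its document name, section, and page, stage 4's forced citations can point at something concrete rather than "somewhere in the filing."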

PDF Parsing Was the Hardest Part

I expected the AI layer to be the hard engineering problem. I was wrong — it was PDF parsing.

Legal documents arrive in every format imaginable: scanned images requiring OCR, financial tables with complex multi-column layouts, court filings with headers and footers on every page, documents with handwritten margin notes. I ended up building a multi-stage pipeline — structural extraction first, OCR fallback for scanned pages, then dedicated layout analysis for tables.

Getting this pipeline reliable across the variety of real-world legal documents took longer than building the entire AI analysis layer. If you're building a document-processing product, budget twice as much time for parsing as you think you need.
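The fallback logic at the heart of that pipeline is simple to state. In this sketch the extractors are passed in as callables (in practice they'd wrap something like pdfplumber and Tesseract), and the character threshold for "this page is probably a scan" is an assumed value:

```python
def extract_pages(pages, structural, ocr, min_chars=40):
    """Run structural text extraction first; when a page yields too little
    text (likely a scanned image), fall back to OCR. `structural` and `ocr`
    are injected extractor callables so the fallback logic stays testable."""
    results = []
    for page in pages:
        text = (structural(page) or "").strip()
        if len(text) < min_chars:
            text = (ocr(page) or "").strip()  # OCR fallback for scanned pages
            source = "ocr"
        else:
            source = "structural"
        results.append({"page": page, "text": text, "source": source})
    return results
```

Recording which path produced each page's text also pays off later: OCR output is noisier, and downstream analysis can weight it accordingly.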

Finding Contradictions Across Documents

This is the feature that makes ArguLens genuinely useful, and it was the hardest to build well. Contradiction detection isn't keyword matching — it's semantic reasoning about consistency.

Consider this pair of statements from different documents:

  • Document A: "The meeting occurred on March 15th at the downtown office."
  • Document B: "I was traveling in Europe from March 10-20."

Zero overlapping keywords. But these statements are mutually exclusive — you can't attend a meeting at a downtown office while traveling in Europe.
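Once the temporal claims are normalized to actual dates (the year here is assumed for illustration), the conflict above reduces to an interval check:

```python
from datetime import date

# Temporal claims normalized to dates; the year is assumed for illustration.
meeting_date = date(2024, 3, 15)  # Document A: meeting at the downtown office, March 15th
travel_start, travel_end = date(2024, 3, 10), date(2024, 3, 20)  # Document B: Europe, March 10-20

# The statements conflict iff the meeting date falls inside the travel window.
conflict = travel_start <= meeting_date <= travel_end
print(conflict)  # True: both claims cannot hold simultaneously
```

The hard part, of course, is the normalization: getting from free-text assertions to comparable dates and places is exactly where the LLM earns its keep.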

My approach: extract factual claims from each document, categorize them (temporal, spatial, causal), then use the LLM to evaluate pairs of claims for logical consistency. The model isn't looking for textual similarity; it's reasoning about whether two claims can both be true simultaneously.
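The pairing logic around that LLM call can be sketched as follows. The claim schema is an assumption, and `judge` stands in for the model call (returning True when a pair of claims can both be true):

```python
from itertools import combinations

def find_contradictions(claims, judge):
    """Pair up extracted claims across documents, restricted to the same
    category (temporal, spatial, causal), and ask an LLM judge whether
    each pair is logically consistent. `judge` is a stand-in for the
    model call; the claim dict schema is an assumption for this sketch."""
    contradictions = []
    for a, b in combinations(claims, 2):
        if a["doc"] == b["doc"] or a["category"] != b["category"]:
            continue  # only compare same-category claims across different documents
        if not judge(a["text"], b["text"]):
            contradictions.append((a, b))
    return contradictions
```

Restricting comparisons to same-category cross-document pairs keeps the number of LLM calls tractable: you're not asking the model whether a spatial claim contradicts a causal one.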

Four Things I Learned

  • Domain expertise comes first. I spent weeks reading about legal reasoning, evidence standards, and case analysis methodology before writing a line of code. The prompt engineering is only as good as your understanding of the domain. You can't shortcut this.
  • Trust requires traceability. Every claim links back to source text. This wasn't a feature request — it's the reason the product works at all. Lawyers won't act on analysis they can't verify, and they shouldn't.
  • Narrow beats broad. My first attempt was a general-purpose "legal AI assistant." Too vague, too shallow. Focusing specifically on case strength analysis and contradiction detection gave me a product sharp enough to be useful.
  • UI is half the product. The first version dumped a wall of text. Nobody used it. The current version uses structured cards, color-coded strength indicators, and interactive timelines. Same AI underneath — dramatically better experience.

What's Next

I'm building multi-party analysis (cases with 3+ parties), jurisdiction-aware reasoning (incorporating state-specific precedents), and collaborative workspaces so legal teams can annotate the AI's analysis together.

ArguLens is live at argulens.com.

Venkata Subramanian Srinivasan
Senior Data Scientist at Asurion | Georgia Tech Alumni