Symplr · Healthcare GRC SaaS · Product Owner · 2026 · Ongoing

Quote automation, built for the humans who still sign off.

An LLM + RAG pipeline cut data entry per quote by 88% and scaled analyst capacity 1.65×, without taking the human out of compliance.

53%
Faster turnaround
3.2 → 1.5 days
88%
Less data entry
8 min → 1 min / quote
1.65×
Analyst throughput
190 → 314 quotes/mo
$5–7M
Savings identified
enterprise hospital POC
0
SLA breaches
post-launch
0
New headcount
needed

Analysts were the bottleneck. So was compliance review.

Enterprise-tier quotes averaged 3.2 days to turn around, not because analysts were slow, but because every quote demanded reconciling unstructured vendor PDFs against a structured pricing matrix, then routing for compliance sign-off. Doubling throughput meant doubling headcount. Neither the unit economics nor the hiring pipeline supported that.

Leadership wanted "AI to fix this." The instinct most PMs would follow: ship a model that generates the quote end-to-end. In a GRC product, that instinct is wrong.

Shadowed 12 analysts. Read the SOPs. Asked what breaks.

I spent two weeks embedded with the quote team across three offices. The pattern was consistent: analysts weren't slow on decisions. They were slow on extraction: manually pulling SKUs, pricing tiers, and contract terms out of 40-page PDFs so a human could apply judgment.

"The AI doesn't need to price the quote. It needs to hand me a clean table so I can price it in two minutes instead of twenty."

That reframe changed the scope. We weren't automating pricing — we were automating data preparation, with the analyst still making the call and signing the quote.

LLM + RAG for extraction. Humans for judgment.

The architecture: a document-ingest pipeline parsed incoming PDFs, RAG retrieved the relevant pricing-matrix rows from our product catalog, and an LLM structured the extracted data into a validated table. The analyst saw the table pre-filled, with confidence scores, field-level provenance, and low-confidence rows flagged for manual review.
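A minimal sketch of what that pre-filled table could look like as data, not the production schema. The class names, the field names, and the 0.85 threshold are all assumptions made for illustration; the real threshold was a PRD decision and is not stated here.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical cut-off: any field below this confidence sends the row to manual review.
REVIEW_THRESHOLD = 0.85


@dataclass(frozen=True)
class ExtractedField:
    name: str          # e.g. "sku", "pricing_tier", "contract_term"
    value: str         # the value the LLM structured out of the PDF
    confidence: float  # 0.0 to 1.0, as reported by the extraction step
    source_page: int   # field-level provenance: page of the vendor PDF


@dataclass
class QuoteRow:
    fields: List[ExtractedField]

    def needs_review(self, threshold: float = REVIEW_THRESHOLD) -> bool:
        """A row is flagged if any single field falls below the threshold."""
        return any(f.confidence < threshold for f in self.fields)


def flag_rows(rows: List[QuoteRow], threshold: float = REVIEW_THRESHOLD) -> List[int]:
    """Return the indices of rows the analyst should check by hand."""
    return [i for i, row in enumerate(rows) if row.needs_review(threshold)]


if __name__ == "__main__":
    rows = [
        QuoteRow([ExtractedField("sku", "A-100", 0.97, 3)]),
        QuoteRow([
            ExtractedField("sku", "B-200", 0.95, 7),
            ExtractedField("pricing_tier", "Enterprise", 0.62, 12),
        ]),
    ]
    print(flag_rows(rows))  # indices of rows needing manual review
```

Flagging at the row level rather than the field level is itself a design choice in this sketch: one doubtful field is enough to put the whole row in front of the analyst.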

The non-negotiable constraint: every quote goes through human validation before it leaves the building. Compliance needed signatures, not model outputs. Human-in-the-loop (HITL) wasn't a feature; it was the contract.
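One way to picture that constraint is as a hard gate in code: a quote with no analyst signature cannot be released, whatever the model produced. This is an illustrative sketch only; `Quote`, `release`, and `UnsignedQuoteError` are hypothetical names, not the product's API.

```python
from dataclasses import dataclass
from typing import Optional


class UnsignedQuoteError(Exception):
    """Raised when a quote is released without a human signature."""


@dataclass
class Quote:
    quote_id: str
    signed_by: Optional[str] = None  # stays None until an analyst signs

    def sign(self, analyst: str) -> None:
        """Record the analyst who validated and signed the quote."""
        if not analyst.strip():
            raise ValueError("analyst name must not be empty")
        self.signed_by = analyst


def release(quote: Quote) -> str:
    """Release a quote only if a human has signed it."""
    if quote.signed_by is None:
        raise UnsignedQuoteError(f"{quote.quote_id} has no analyst signature")
    return f"{quote.quote_id} released, signed by {quote.signed_by}"
```

The point of the gate is that there is no code path from model output to customer that skips `sign`.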

Engineering wanted to start with a fine-tuned model. I pushed back — we had zero labeled data and a 6-week pilot window. RAG first, fine-tune later if the baseline needed it. It didn't.

Before → after, in the only unit that matters.

Avg turnaround: 3.2 days before → 1.5 days after (6 wks)
Data entry / quote: 8 min before → 1 min review after
Quotes / analyst / mo: 190 before → 314 after

What I owned on this build.

Discovery
12 analyst shadowing sessions, SOP audit, workflow mapping with ops lead.
Scoping
Reframed ambiguous "AI automation" ask into extraction-first HITL architecture. Wrote the PRD.
Prioritization
WSJF (Weighted Shortest Job First) scoring against 5 competing initiatives; pitched and defended sequencing to leadership.
Delivery
Cross-functional execution with eng, QA, CS, compliance. Zero SLA breaches on shared OKRs.
Measurement
Pendo funnel telemetry + analyst throughput instrumentation. Weekly review cadence.
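For readers unfamiliar with WSJF, the prioritization step above follows the standard formula: cost of delay (business value + time criticality + risk reduction) divided by job size, highest score first. The sketch below uses made-up initiative names and scores purely for illustration; they are not the real backlog or the real scoring.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Initiative:
    name: str
    business_value: int    # relative score
    time_criticality: int  # relative score
    risk_reduction: int    # risk reduction / opportunity enablement
    job_size: int          # relative effort; must be > 0

    @property
    def wsjf(self) -> float:
        """Cost of delay divided by job size."""
        cost_of_delay = self.business_value + self.time_criticality + self.risk_reduction
        return cost_of_delay / self.job_size


def rank(initiatives: List[Initiative]) -> List[Initiative]:
    """Sort initiatives by WSJF score, highest first."""
    return sorted(initiatives, key=lambda i: i.wsjf, reverse=True)


if __name__ == "__main__":
    # Hypothetical backlog, scored on a relative scale.
    backlog = [
        Initiative("Initiative A", 8, 5, 3, 4),
        Initiative("Initiative B", 13, 8, 5, 13),
        Initiative("Initiative C", 3, 2, 1, 1),
    ]
    for item in rank(backlog):
        print(f"{item.name}: {item.wsjf:.1f}")
```

The formula rewards small jobs with a high cost of delay, which is why a narrowly scoped extraction pilot can outrank larger, flashier initiatives.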

Three things I'd tell the next PM on this project.

The obvious AI build is usually wrong. End-to-end automation demos well. In a compliance-bound domain, it fails adoption. Find the unglamorous middle of the workflow and automate that.

Confidence scores are a UX decision, not a model one. What threshold flags a row for review? Who owns overrides? How do analysts report a model mistake? Those were PRD questions, not eng questions.

The best metric was analyst-reported. "I can breathe now" from one of the leads mattered more than the throughput number. Culture-of-adoption is the leading indicator; throughput is the lag.