Building HeatSync Part 6: Getting Accuracy Right

Jan 29, 2026

With caching in place, HeatSync was finally fast. But fast is worthless if it’s wrong. And the AI was getting things wrong.

Two accuracy problems kept showing up:

  1. Name confusion: mixing up swimmers with similar names
  2. Missing events: returning incomplete results

Both are unacceptable. Missing your kid’s race because the tool missed an event defeats the entire purpose.

Problem 1: Name Confusion

Heat sheets often have multiple swimmers with the same last name. In my kid’s club alone, we have:

  • “Liu, Elsa”
  • “Liu, Elly”

Different kids, same last name. And the AI would sometimes return events for the wrong one.

Even worse: phonetically similar names. “Li, Elsie” sounds like “Liu, Elly” but is a completely different swimmer. The AI occasionally matched these too.

Solution: Two-Layer Defense

I implemented defense at two levels:

Layer 1: Prompt Engineering

Explicitly tell the AI not to match similar names:

IMPORTANT: There may be MULTIPLE swimmers with the same LAST NAME.
Do NOT include events for swimmers with similar names
(e.g., 'Liu, Elsa' is NOT 'Liu, Elly').
Do NOT match phonetically similar names
(e.g., 'Li, Elsie' is NOT 'Liu, Elly').

This catches most cases. But AI prompts are probabilistic, not deterministic, so a bad match still slips through sometimes.

Layer 2: Post-Processing Validation

After getting results from the AI, the backend validates each event by normalizing both names and comparing them. If the AI returns events for “Liu, Elsa” when the user searched for “Liu, Elly”, those events get filtered out.
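A minimal sketch of that validation layer in Python (function and field names here are illustrative, not from the HeatSync codebase): normalize both names by lowercasing, stripping stray punctuation, and collapsing whitespace, then require an exact match.

```python
import re

def normalize_name(name: str) -> str:
    """Normalize a 'Last, First' name: lowercase, drop stray punctuation, collapse whitespace."""
    name = name.lower().strip()
    name = re.sub(r"[^\w\s,]", "", name)   # drop punctuation other than the comma
    name = re.sub(r"\s+", " ", name)       # collapse runs of whitespace
    return name

def filter_events(events: list[dict], searched_name: str) -> list[dict]:
    """Keep only events whose swimmer exactly matches the searched name."""
    target = normalize_name(searched_name)
    return [e for e in events if normalize_name(e["swimmer"]) == target]

events = [
    {"swimmer": "Liu, Elly", "event": "Girls 10&U 50 Free"},
    {"swimmer": "Liu, Elsa", "event": "Girls 12&U 100 Back"},  # wrong swimmer, gets dropped
]
print(filter_events(events, "liu,  elly"))  # only the "Liu, Elly" event survives
```

Exact matching after normalization is deliberately strict: fuzzy matching is what got the AI into trouble in the first place.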

Belt and suspenders. The prompt layer catches most errors; the post-processing layer catches the rest.

Problem 2: Missing Events

Some heat sheets are 20+ pages. The AI would sometimes return early, missing events on later pages.

This is a known LLM behavior: when the input is long, models sometimes “give up” before processing everything. They’re trained to be helpful, and returning something feels more helpful than processing forever.

Solution: Preprocessing + Expected Count

Before calling the AI, I extract the PDF’s plain text and count how many times the swimmer’s name appears. This count becomes the “expected event count.” Then I tell the AI:

You MUST find at least {count} events for this swimmer.
If you find fewer, re-scan ALL pages before returning.

Now the AI knows how many events to look for. If it only finds 3 but the expected count is 5, it knows something is wrong and re-checks.
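The counting step itself is simple. A sketch (assuming the heat sheet's plain text has already been extracted, e.g. with a library like pypdf): count case-insensitive occurrences of the swimmer's name, since each appearance on a heat sheet roughly corresponds to one event entry.

```python
def expected_event_count(pdf_text: str, swimmer_name: str) -> int:
    """Count case-insensitive occurrences of the swimmer's name in the
    extracted heat-sheet text. Each appearance roughly maps to one event."""
    return pdf_text.lower().count(swimmer_name.lower())

sheet_text = """
Event 12  Girls 10&U 50 Free
  Heat 2 Lane 4  Liu, Elly  11  Club A
Event 20  Girls 10&U 100 Back
  Heat 1 Lane 3  Liu, Elly  11  Club A
"""
print(expected_event_count(sheet_text, "Liu, Elly"))  # 2
```

Note the count is a floor, not an exact figure: names can also appear in relay listings or results sections, which is why the prompt says "at least {count}".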

Edge Case: Scanned PDFs

Some heat sheets are scanned images, not searchable text. The text extraction returns nothing, so we can’t count occurrences.

For these cases, the preprocessing gracefully fails, and we fall back to extraction without the expected count hint. Not ideal, but better than crashing.
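The fallback can be as simple as building the prompt conditionally (a sketch; the function name is mine): include the expected-count instruction only when text extraction actually produced a count.

```python
def build_count_hint(expected_count: int) -> str:
    """Return the expected-count instruction, or nothing for scanned PDFs.

    Scanned (image-only) PDFs yield no extractable text, so the count
    comes back as 0; in that case we omit the hint rather than give the
    model a misleading floor of zero events."""
    if expected_count <= 0:
        return ""
    return (
        f"You MUST find at least {expected_count} events for this swimmer.\n"
        "If you find fewer, re-scan ALL pages before returning."
    )

print(build_count_hint(5))
print(repr(build_count_hint(0)))  # '' — scanned PDF, no hint appended
```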

Swimmer Disambiguation UI

Sometimes the AI correctly identifies multiple swimmers with the same name but different teams or ages. “John Smith” might be:

  • John Smith, Team A, Age 11
  • John Smith, Team B, Age 13

Instead of guessing, I show a disambiguation dropdown:

Multiple swimmers found. Please select:
┌─────────────────────────────────────┐
│ John Smith (Team A, 11)             │
│ John Smith (Team B, 13)             │
└─────────────────────────────────────┘

The user picks their swimmer, and the events filter accordingly.
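Detecting when to show the dropdown can be sketched as grouping extracted events by distinct swimmer identity, i.e. (name, team, age); if more than one group comes back, we ask instead of guessing. Field names here are illustrative.

```python
from collections import defaultdict

def group_swimmers(events: list[dict]) -> dict[tuple, list[dict]]:
    """Group extracted events by distinct swimmer identity: (name, team, age)."""
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for e in events:
        groups[(e["swimmer"], e["team"], e["age"])].append(e)
    return dict(groups)

events = [
    {"swimmer": "John Smith", "team": "Team A", "age": 11, "event": "50 Free"},
    {"swimmer": "John Smith", "team": "Team B", "age": 13, "event": "100 Fly"},
]
groups = group_swimmers(events)
if len(groups) > 1:
    # More than one distinct swimmer: render the disambiguation dropdown
    for name, team, age in groups:
        print(f"{name} ({team}, {age})")
```

Once the user picks an identity, filtering is just a lookup of that group's events.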

The Result

After implementing these accuracy improvements:

Metric                  Before                After
Name confusion errors   Occasional            Rare
Missing events          ~10% of extractions   under 1%
Overall accuracy        ~90%                  over 98%

Good enough to trust. Which is the whole point.


With accuracy and performance solved, one feature remained that transformed HeatSync from a personal tool into something genuinely useful for a team.


This is Part 6 of a series on building HeatSync. ← Part 5: The Performance Problem | Part 7: The Sharing Moment →
