DIFF-4 · VALIDATED ACCURACY

We publish the numbers most sleep apps hide.

A single rounded accuracy percentage sounds great until you ask: accurate at what, on what population, against what ground truth? SomniSense publishes the numbers that actually matter (per-event snore sensitivity/precision range, breathing-irregularity detection accuracy, on-device model footprint, per-night BRI vs PSG event-rate agreement), plus the methodology you can read end to end.

Join the waitlist

Launching soon. First 7 days free at launch · then $7.99/mo or $49.99/yr.

Here's the short, less-flattering version most apps won't give you: there isn't one accuracy number, there are four. Against in-lab polysomnography (PSG) over n=80 paired nights, snoring detection runs 91–93% sensitivity and 89–94% precision; breathing-event detection is 88.5% accurate, and the per-night BRI lands within ±5 of the PSG-scored per-hour event rate on 87% of nights. Audio was scored by AASM-trained technicians blinded to SomniSense's output. It's a wellness monitor, not a diagnosis, and the methodology is open enough that you can argue with it.

89-94%

SRI precision

5-seed bootstrap range · snoring detection

88.5%

BRI accuracy

production on-device model · breathing irregularity

87%

BRI±5 agreement

system-level · n=80 · preliminary

n=80

Paired PSG nights / 40 participants

blinded scoring by AASM techs

The single-number accuracy problem

Most consumer sleep apps quote a single rounded "accurate" percentage. I'll be honest — that kind of number used to mean nothing to me, and I built one of these things.

"Accurate" at what? Detecting that there was a sound? Distinguishing snores from background noise? Counting the right number of breathing pauses in a night? Different questions, different answers. Collapsing them into one rounded number is what you do when you don't want people to look closely.

The four numbers we actually have

The question	The number	What that means
Of the snores you make, how many do we catch?	91-93%	Roughly 7 to 9 out of every 100 snores get missed.
When we say something is a snore, how often are we right?	89-94%	False positives are uncommon: of every 100 sounds we flag as snores, about 6 to 11 are something else.
Of breathing pauses that happen, how many do we flag?	~87%	BRI (Breathing Irregularity Index) sensitivity. Production on-device model. We tune precision-first — borderline events get missed on purpose, rather than waking you with false alarms.
When we say a pause happened, how often did one really happen?	~87%	BRI precision. Combined apnea + hypopnea events. Measured on the Coordinate-Attention 1D baseline; the 50% L1-pruned production model preserves this.
How does our per-hour event rate compare to PSG event scoring?	87%	Per-night BRI within ±5 of the PSG-scored per-hour event rate for 87% of nights. Preliminary system-level result; full by-severity analysis in a forthcoming preprint. BRI is not an OSA diagnosis.
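If you want to see the arithmetic behind the first four rows, here is a minimal sketch of how per-event sensitivity and precision fall out of matched event counts. The function names and the counts are made up for illustration; they are not study data and this is not the production scoring code.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of real events that were detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def precision(true_pos: int, false_pos: int) -> float:
    """Share of detections that were real events: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only (NOT study data):
# 1000 technician-scored snores, 920 of them detected, plus 90 false detections.
tp, fn, fp = 920, 80, 90

sens = sensitivity(tp, fn)  # 920 / 1000
prec = precision(tp, fp)    # 920 / 1010

print(f"sensitivity={sens:.3f} precision={prec:.3f}")
# prints: sensitivity=0.920 precision=0.911
```

With these invented counts, 8 of every 100 real snores are missed and about 9 of every 100 flagged sounds are not snores, which is the same reading the table gives for the published ranges.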

These come from n=80 paired PSG nights — meaning we ran SomniSense on a phone next to the same person, on the same night, in a sleep lab where they were also being recorded with a real polysomnography setup. Audio was scored by AASM-trained sleep technicians who didn't know what SomniSense had said.

That last part — "didn't know what we said" — is what blinded scoring means. We don't get to pre-train our scorers on our own answers. Otherwise the test would be circular.

Why I'm publishing this before the paper

The honest reason: the paper is still in active preparation and hasn't been submitted for peer review yet. Once it is submitted, peer review typically takes another 3–6 months, and the patent application timeline comes on top of that.

If I waited, you'd have nothing to compare other apps against in the meantime. So I'm publishing the numbers and the methodology now, with the explicit caveat: they're from our internal study and haven't been peer-reviewed yet. When the paper publishes, this page gets the citation. If peer review changes any number meaningfully, this page changes — and I'll explain what and why.

That's the deal. I'd rather tell you something that might be slightly off and let you push back than say nothing for six months.

What you should know about the methodology

Full methodology at /accuracy; the full study and preprint portfolio live at apneasense.com/research. The short version:

Sample size: 80 paired PSG nights / 40 participants, adults with and without diagnosed sleep breathing issues. We're specific about who's underrepresented — mostly under-18, severe BMI extremes, certain ethnic groups — rather than hide it behind an aggregate "nights" number.
Recording setup: a smartphone on the bedside table, 50–90 cm from the participant's head. iPhones and mid-range Androids from 2018 onward.
Ground truth: in-lab polysomnography with synchronized audio, scored by AASM-trained sleep technicians blinded to what SomniSense said.
Analysis: per-event sensitivity and precision (paper-reportable); per-night Bland-Altman agreement of BRI vs PSG-scored per-hour event rate (system-level validation; BRI is an acoustic estimator of an AHI-shaped metric, not a clinical OSA diagnosis).
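For the per-night analysis, a rough sketch of what a Bland-Altman comparison and a "within ±5" agreement rate look like in code. The nightly values below are invented, the function names are hypothetical, and treating the ±5 band as inclusive is an assumption; the study's own analysis may differ in its details.

```python
from statistics import mean, stdev


def bland_altman(app_rates, psg_rates):
    """Return (bias, lower, upper): mean difference and the
    conventional 95% limits of agreement (bias +/- 1.96 * SD)."""
    diffs = [a - p for a, p in zip(app_rates, psg_rates)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def fraction_within(app_rates, psg_rates, tolerance=5.0):
    """Share of nights where the two rates differ by at most `tolerance`."""
    hits = sum(abs(a - p) <= tolerance for a, p in zip(app_rates, psg_rates))
    return hits / len(app_rates)


# Hypothetical per-night event rates in events/hour (NOT study data).
bri = [4.0, 12.0, 22.0, 7.0, 31.0]   # app estimate
psg = [6.0, 10.0, 30.0, 7.0, 28.0]   # technician-scored reference

bias, lower, upper = bland_altman(bri, psg)
agreement = fraction_within(bri, psg)

print(f"bias={bias:.2f} limits=({lower:.2f}, {upper:.2f}) within5={agreement:.0%}")
```

The bias tells you whether the app reads systematically high or low, the limits of agreement show how far a single night can stray, and the within-±5 share is the kind of figure the 87% headline number summarizes.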

The algorithm builds on years of sleep apnea research. The current version was retrained from scratch and rebuilt for SomniAI LLC to handle smartphone audio specifically — different microphone, different distance, different acoustic environment than clinical hardware. Peer-reviewed publication is in active preparation; a U.S. provisional patent application has been filed and is pending.

What this isn't

Not a diagnostic claim. Even at these numbers, SomniSense isn't a medical device, doesn't diagnose obstructive sleep apnea (OSA), and isn't validated for users under 18.
Not a personal guarantee. Your bedroom might be acoustically unusual. Your partner might snore louder than you. The model might catch fewer of your events. The methodology paper documents the conditions we tested under — read it if you want to know whether your scenario is in or out of distribution.
Not a replacement for a sleep study. If your BRI runs above 15 consistently, that's a clinic conversation. We give you data to bring. The clinic gives you the diagnosis.

See the data on your own nights

First 7 days of Pro are free · Cancel through the App Store or Google Play before day 7 to avoid the renewal charge.
