Offline Medical Dictation for Mac: No Cloud Audio, No BAA Required
Most medical dictation tools (Dragon Medical One, AWS Transcribe Medical, Google Speech-to-Text Medical) stream your audio to a server, creating a HIPAA Business Associate relationship. Sapience Med runs the entire speech model on your Mac — Apple Silicon's Neural Engine processes the audio buffer locally and discards it after transcription. No audio leaves the device, no BAA needed for the voice path.
Why most medical dictation tools are cloud-based
Until recently, accurate speech recognition required compute power not available on a laptop. Cloud services (AWS Transcribe Medical, Google Speech-to-Text Medical, Nuance's Dragon Medical One, AssemblyAI, Deepgram) host large speech models on GPU servers and stream user audio to those servers for transcription. Latency over a fast connection is roughly 150-300ms, accuracy on medical vocabulary is high, and the vendor handles the ongoing model maintenance.
The architecture made sense for hospital deployments and large medical groups where a dedicated IT team can negotiate BAAs, monitor audit logs, manage network configurations, and absorb the operational overhead. For solo therapists and small mental-health practices, the same architecture imposes a much heavier relative compliance burden: the same BAA negotiation, vendor risk review, and ongoing compliance cost, with only one or two clinicians to amortize it across.
Why privacy matters more for mental-health dictation
Therapy and psychiatric notes contain particularly sensitive PHI. A progress note dictated by a therapist might include presenting problems (sexual, substance use, suicidal ideation), trauma history, family-of-origin details, and identifying information about the client. The same content protected under HIPAA also tends to be the kind of content a clinician would be most uncomfortable sending to a third-party server, even with a BAA in place.
Practical concern: even with the strongest contracts, audio sent to a cloud service exists on that vendor's servers for some retention window. Breach is rare but possible. For mental-health practitioners, "technically compliant but architecturally avoidable" is not the bar — the bar is "audio of session-adjacent notes does not leave my computer."
On-device speech recognition on Apple Silicon
Apple Silicon (M1, M2, M3, M4) makes local speech recognition practical on a laptop. The Neural Engine in these chips runs quantized speech models — Whisper-large-v3-turbo, OpenAI's state-of-the-art open-source ASR — at sub-second latency on a typical clinical dictation.
Concretely, on an M1 MacBook Pro, a 15-second progress note dictation finishes transcription roughly 0.5 seconds after you release the hotkey. The text appears in your EHR or Notes field, the audio buffer is discarded, and the next dictation starts fresh. CPU usage is minimal (a brief spike during inference, idle otherwise). Battery impact is negligible, comparable to other intermittent CPU tasks.
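As a rough worked example of what those numbers imply, the figures can be turned into a real-time factor (processing time divided by audio duration). This is only a back-of-the-envelope sketch: it assumes, pessimistically, that all inference happens after the hotkey is released, which the description above does not actually state. The function name is illustrative.

```python
def real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Processing time per second of audio; below 1.0 means faster than real time."""
    return processing_seconds / audio_seconds


# Figures quoted above: 15 s of dictation, about 0.5 s of wait after release.
# Assumption: the whole 0.5 s is inference time for the whole 15 s clip.
rtf = real_time_factor(audio_seconds=15.0, processing_seconds=0.5)
speedup = 1.0 / rtf

print(f"real-time factor: {rtf:.3f}")
print(f"speed relative to real time: about {speedup:.0f}x")
```

If decoding in fact starts while you are still speaking, the true real-time factor would be higher than this estimate, and the 0.5 seconds would reflect only the tail of the work.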
Accuracy is comparable to cloud APIs on general English, and on medical vocabulary when the model is biased with a medical dictionary (which Sapience Med ships by default). There is no quality trade-off for running locally, only an architectural simplification.
What "offline" actually means in Sapience Med
Sapience Med uses the word "offline" specifically about the audio path: your audio buffer is never transmitted off-device. The path is microphone capture → on-device transcription → text injection into your EHR or notes field. The audio buffer is discarded after transcription; the transcript itself is in your clipboard and the target text field, where you control it.
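The shape of that audio path can be sketched in a few lines. This is an illustrative model only, not Sapience Med's actual code: `dictate_once`, `transcribe`, and `inject` are hypothetical names, and overwriting a Python `bytearray` stands in for whatever buffer handling the real app performs.

```python
from typing import Callable


def dictate_once(
    audio: bytearray,
    transcribe: Callable[[bytearray], str],  # hypothetical on-device model call
    inject: Callable[[str], None],           # hypothetical text-field injection
) -> None:
    """Run one capture -> transcribe -> inject cycle, then discard the audio."""
    try:
        text = transcribe(audio)
    finally:
        # Discard the audio whether or not transcription succeeded:
        # overwrite the buffer in place so no samples remain in it.
        audio[:] = bytes(len(audio))
    inject(text)


# Usage with stand-in callables (no real microphone or model involved).
injected: list[str] = []
buffer = bytearray(b"\x01\x02\x03\x04")
dictate_once(buffer, lambda a: f"{len(a)} bytes heard", injected.append)
```

The point of the `finally` block is that the buffer is cleared on every path, including a failed transcription, and that no step in the cycle takes a network handle as input.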
A few non-audio network calls do happen for legitimate operational reasons:
- License verification against license.sapience.systems: sends license_id + machine ID hash; no audio, no notes, no PHI. Happens every few hours in the background.
- Update check for new app versions: sends just the current version number. Updates are signed with our Ed25519 key and verified locally before applying.
- Crash reports (opt-in): stack trace only, no audio or notes content.
None of these touch your microphone, your transcripts, or your clinical notes. Dictation works fully offline: transcription itself needs no internet connection. License verification tolerates a network outage gracefully (the cached JWT remains valid until its expiry).
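To make the license behaviour concrete, here is a minimal sketch of what a license check payload and an offline expiry check could look like. Everything specific in it is an assumption for illustration: the field names, the use of SHA-256 for the machine ID hash, and the helper names are not taken from Sapience Med's implementation. It also skips JWT signature verification, which a real client must perform before trusting any claim.

```python
import base64
import hashlib
import json
import time
from typing import Optional


def build_license_request(license_id: str, machine_id: str) -> dict:
    """Payload with only a license ID and a hashed machine ID (assumed SHA-256)."""
    digest = hashlib.sha256(machine_id.encode("utf-8")).hexdigest()
    return {"license_id": license_id, "machine_id_hash": digest}


def jwt_expiry(token: str) -> int:
    """Read the standard `exp` claim from a JWT payload.

    NOTE: this does not verify the signature; it only decodes the claims.
    """
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore base64 padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    return int(claims["exp"])


def license_valid_offline(cached_token: str, now: Optional[float] = None) -> bool:
    """With no network, accept the cached token until its expiry time."""
    current = time.time() if now is None else now
    return current < jwt_expiry(cached_token)


def make_unsigned_demo_token(exp: int) -> str:
    """Build an unsigned token for this demo only; never accept these in practice."""
    def b64url(raw: bytes) -> str:
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

    header = b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode("utf-8"))
    payload = b64url(json.dumps({"exp": exp}).encode("utf-8"))
    return f"{header}.{payload}."


request = build_license_request("lic_demo", "hypothetical-machine-id")
token = make_unsigned_demo_token(exp=1_000)
```

In this sketch the request carries two fields and nothing else, and a device that cannot reach the license server keeps working for as long as `license_valid_offline` returns true for its cached token.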
What this means for HIPAA / BAA
A Business Associate Agreement is required when a vendor creates, receives, maintains, or transmits PHI on behalf of a covered entity. The triggering action is the vendor having access to PHI — audio of medical dictation typically counts as PHI.
Sapience Med's architecture is designed so that we never have access to your audio or your notes. We don't receive them, we don't transmit them, we don't store them. The license server (which we do operate) handles license_id and machine ID hash for the payment/activation flow — these are not PHI. So the BAA question for the voice path is functionally moot.
For full architectural detail and the legal reasoning, see our HIPAA architecture brief. The short version: Sapience Med is not your Business Associate because we never touch the protected health information.
Frequently asked questions
Is Sapience Med truly offline, or does it phone home?
Can I use Sapience Med without an internet connection?
What chip do I need for offline Mac dictation to be fast?
How does this compare to dictating into Apple Notes with macOS Dictation?
Is the model that runs on my Mac the same as the cloud one?
Can I verify the audio actually doesn't leave my Mac?
Try Sapience Med free for 14 days.
$45/month or $399/year (save 26%) after the trial. No card required to start.