Why we built cloud transcription for the bits on-device can't handle

Three months after Scribr shipped, a consultant in Manchester sent us a voice note. Forty-two minutes long. She'd recorded a client call, and our on-device transcription choked halfway through. The app didn't crash, but it gave her no explanation, and it should never have been silent about the limit. That email changed how we thought about transcription.

The moment we realised on-device wasn't the whole story

When we first built Scribr, we were obsessed with privacy. Still are. On-device transcription via Whisper and Apple Speech means audio never leaves your phone, your notes stay encrypted, and you own the entire transcript. It's a genuine advantage over web-based tools.

But real work doesn't fit into tidy boundaries. A therapist records a 90-minute session. A researcher interviews someone for two hours. A sales call runs long. And suddenly the on-device option becomes a liability, not a feature.

The Manchester consultant's message came on a Tuesday. She'd tried the app, hit the wall, and moved on. We lost her not because Scribr was wrong, but because we'd shipped an incomplete solution and hadn't been honest about it.

Choosing a partner for the work on-device can't do

We needed cloud transcription. Not as a replacement for on-device, but as a complement. The decision came down to speed, reliability, and cost efficiency. Deepgram won because they process long audio without the usual latency tax.

A call that would tie up a phone's processor for twenty minutes transcribes in seconds on Deepgram's infrastructure. That matters when you're working on mobile. It matters when you're a student between lectures trying to review notes, or a consultant who needs to log a call and move to the next one.

We built it so you choose the transcription path that fits the moment. A quick voice memo about next week's meeting? On-device, instant, stays private. A two-hour research interview? Cloud transcription, and you get summaries and action items extracted alongside it.

What changed when we shipped it

In the first week after we shipped cloud transcription for Pro users, we saw 340 uploads of audio longer than 15 minutes. Ninety per cent of them would have failed or struggled on-device. We also learned something we didn't expect: people were more willing to record conversations once they knew the app could handle them reliably.

Consent matters here. Cloud transcription is not the default. Free users never see it. Pro and Team users opt in per transcript, and we're explicit about what happens to the audio: it's processed and then deleted from Deepgram's servers, never stored, never trained on. Vault Mode, our AES-GCM encrypted storage, means that even if you choose cloud transcription, your notes stay yours alone.
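
A minimal sketch of that consent rule, for readers who think in code. It is an illustration under stated assumptions, not Scribr's actual implementation: the names Plan, Path, TranscriptRequest, cloud_offered and choose_path are all hypothetical. It shows only the two rules described above: Free users are never offered cloud transcription, and Pro and Team users get it only when they opt in for that specific transcript.

```python
from dataclasses import dataclass
from enum import Enum


class Plan(Enum):
    FREE = "free"
    PRO = "pro"
    TEAM = "team"


class Path(Enum):
    ON_DEVICE = "on_device"
    CLOUD = "cloud"


@dataclass(frozen=True)
class TranscriptRequest:
    """Hypothetical request shape; one instance per transcript."""
    plan: Plan
    cloud_opt_in: bool = False  # consent applies to this transcript only


def cloud_offered(plan: Plan) -> bool:
    """Free users never see the cloud option; Pro and Team do."""
    return plan in (Plan.PRO, Plan.TEAM)


def choose_path(request: TranscriptRequest) -> Path:
    """On-device is the default; cloud needs an eligible plan AND explicit opt-in."""
    if cloud_offered(request.plan) and request.cloud_opt_in:
        return Path.CLOUD
    return Path.ON_DEVICE
```

The design point is that the opt-in flag lives on the request rather than on the account, so consent given for one recording never carries over to the next.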

The speed changed the workflow. A Team leader in Bristol told us she could now record a performance review, have a transcript and summary ready within ninety seconds, and share specific action items with the person before they left her office. That's not possible on-device. That's the gap we wanted to close.

The limit you can see is better than the one you can't

Free users get 5 audio uploads per month with on-device transcription only. Pro gives you 500 cloud-transcribed calls monthly. Team gives you 1,500. These numbers are real constraints, not marketing theatre. We publish them because you should know what you're signing up for.

What we learned from the Manchester consultant's email is that hidden limits are worse than announced ones. If the app had said, 'On-device transcription works best for audio under 15 minutes', she would have understood. Instead, she just experienced silence.
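
An announced limit can be as small as one check before transcription starts. The sketch below is hypothetical: the function name on_device_notice is invented for illustration, and the 15-minute threshold is simply the figure from the example message above, not a measured on-device limit.

```python
from typing import Optional

# Assumed threshold, taken from the example message in the text.
ON_DEVICE_COMFORT_MINUTES = 15


def on_device_notice(duration_minutes: float) -> Optional[str]:
    """Return a message to show the user, or None if the recording is short enough."""
    if duration_minutes < ON_DEVICE_COMFORT_MINUTES:
        return None
    return (
        "On-device transcription works best for audio under "
        f"{ON_DEVICE_COMFORT_MINUTES} minutes. "
        f"This recording is {duration_minutes:g} minutes long."
    )
```

For a 42-minute recording this returns a message instead of failing quietly, which is the whole point.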

Cloud transcription via Deepgram lets us be honest. You have a defined monthly allowance. You can see what you've used. If you need more, you upgrade or talk to us about Enterprise. No surprises, no artificial gatekeeping dressed up as a feature.
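
To make "a defined monthly allowance you can see" concrete, here is a small sketch. The monthly figures are the ones published in this post; everything else (the Allowance class, record_upload, summary) is a hypothetical shape for illustration, not the app's real accounting code. Note that the Free figure counts on-device uploads while the Pro and Team figures count cloud-transcribed calls.

```python
from dataclasses import dataclass

# Monthly allowances as stated in this post.
# "free" counts on-device uploads; "pro" and "team" count cloud-transcribed calls.
MONTHLY_ALLOWANCE = {"free": 5, "pro": 500, "team": 1500}


@dataclass
class Allowance:
    plan: str
    used: int = 0

    @property
    def limit(self) -> int:
        return MONTHLY_ALLOWANCE[self.plan]

    @property
    def remaining(self) -> int:
        return max(self.limit - self.used, 0)

    def record_upload(self) -> bool:
        """Count one upload. Returns False and changes nothing if the allowance is spent."""
        if self.remaining == 0:
            return False
        self.used += 1
        return True

    def summary(self) -> str:
        """The line a user would see, so the limit is never a surprise."""
        return f"{self.used} of {self.limit} used this month"
```

A refused upload returns False rather than raising, so the caller can show the summary and an upgrade prompt instead of an error.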

Why this matters beyond the numbers

The real win isn't speed or capacity. It's that we could finally say yes to the work Scribr is actually built for. A researcher with a 90-minute interview. A legal professional with a long consultation. A student recording a lecture. A therapist capturing a session. These are the people Scribr exists for, and on-device transcription alone left them short.

Cloud transcription also unlocked the other half of what makes Pro useful: AI summaries and action-item extraction. You can't extract action items from audio reliably on a phone. You can on a server with the compute to run inference. So when you choose cloud transcription, you get both the transcript and the distilled version that actually changes your day.

There's still a boundary. Team and Enterprise users also get Contact Intelligence and Note Sharing, because once audio becomes actionable intelligence, you often need to pass it on. Free users don't get that. Privacy is the trade-off, not the hiding place.

The Manchester consultant never came back, but her message stayed. We fixed the silence. Now the question isn't whether Scribr can handle your longest recordings. It's whether you're ready to trust your phone with the meetings that actually shape your work.
