Why we built Scribr around on-device transcription

Three months before launch, a therapist emailed asking one question: 'Will my client sessions ever leave my phone?' That single message crystallised something we'd been circling. We had to make a choice about where transcription would happen. And that choice would define everything.

The privacy question arrived before we'd finished the app

Early user testing showed us something we hadn't fully anticipated. People weren't worried about features. They were worried about what happened to their audio after they hit record. A consultant told us his firm had legal obligations around client confidentiality. A student mentioned FERPA compliance. A therapist, obviously, couldn't use a tool that sent session recordings anywhere they couldn't control.

We could have shrugged. Most note-taking apps send your audio to a cloud service immediately, transcribe it there, and delete it after a few days. That's a perfectly reasonable architecture. It's also a perfectly reasonable reason for a lot of people not to use your app.

So we asked ourselves: what if transcription never left your phone at all?

Whisper and Apple Speech weren't the obvious picks

When we started looking at on-device options, the landscape was actually pretty sparse. You had Whisper, OpenAI's speech recognition model, which had been released as open source and was getting quietly integrated into a bunch of apps. And you had Apple's native speech framework, which had been sitting there in iOS for years but hadn't seen much innovation.

Both had limitations. Whisper was more accurate but hungrier for processing power. Apple Speech was lighter but less flexible. We could have built on one or the other. We could have built some hybrid. What we actually did was let the user decide.

Free users get on-device transcription. That's Whisper on iOS. Your audio lives on your phone. It never gets sent anywhere. It's transcribed locally, then you're done. That's the core promise. For users who need something different - longer recordings, cloud sync, AI summaries - we offer cloud transcription via Deepgram at the Pro tier and above. But that's an explicit choice, and it requires a paid subscription. You're not sliding into it without knowing.

The architecture has to match the promise

Building this way meant saying no to things we could have easily done. We couldn't offer cloud transcription on the free tier. We couldn't use a single transcription pipeline for everyone. We couldn't store audio anywhere 'just in case' we needed it later for quality improvements. Every architectural decision had to reinforce the same message: if you're on Scribr free, your audio doesn't leave your device.
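
A minimal sketch of that rule, written in Python for brevity. The names here (`Tier`, `Backend`, `User`, `cloud_opt_in`, `choose_backend`) are hypothetical and are not taken from the Scribr codebase. The sketch only illustrates the shape of the constraint: the free tier has no code path to the cloud, and paid tiers reach it only through an explicit choice.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    FREE = "free"
    PRO = "pro"
    TEAM = "team"


class Backend(Enum):
    ON_DEVICE = "on_device"  # audio is transcribed locally and never uploaded
    CLOUD = "cloud"          # audio is sent to a cloud transcription service


@dataclass(frozen=True)
class User:
    tier: Tier
    # Hypothetical flag: True only if the user explicitly turned on cloud transcription.
    cloud_opt_in: bool = False


def choose_backend(user: User) -> Backend:
    """Decide where transcription runs for this user."""
    # Free tier: always local, even if the opt-in flag is somehow set.
    if user.tier is Tier.FREE:
        return Backend.ON_DEVICE
    # Paid tiers: cloud only when the user has explicitly asked for it.
    if user.cloud_opt_in:
        return Backend.CLOUD
    return Backend.ON_DEVICE
```

Putting the free-tier check first means no combination of settings can route a free user's audio off the device, which is the property the promise depends on.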

That constraint made us better at the core product. We had to make on-device transcription work well. We couldn't rely on the crutch of better cloud processing. We built the Quick Record widget. We built Siri Shortcuts for starting recordings and retrieving action items. We made sure the app itself was snappy enough that you didn't feel like you needed cloud sync for basic note-taking.

It also meant building for different use cases. Some people want transcription and nothing else. Some want summaries and action items extracted. Some need encrypted vaults. Some are legal teams with audit logs and compliance modes. Rather than cramming everything into one tier, we let the product grow with what you actually need.

A real conversation changed how we think about this

A few weeks before we shipped, someone from a legal practice reached out through our testing community. She was frustrated with her current tools - they were fast, feature-rich, cloud-native, and absolutely not an option for her firm's work. She told us she'd been taking notes by hand during calls because she couldn't trust anything else. That felt like the problem we should be solving.

It made the decision simpler. We weren't building Scribr as a general-purpose transcription service with privacy as a nice-to-have. We were building it for people whose work depends on confidentiality. The free tier - with on-device transcription, biometric lock, no cloud sync - is a complete, functional product for those people. Paid tiers add extras, not essentials.

What this means for how you use Scribr

On the free tier, you can record meetings, phone calls, and voice notes, and upload audio files. You get five uploads a month. You can lock the app with your face or fingerprint. Your audio gets transcribed on your phone and stays there. That's the whole product, and it's genuinely private.

If you want cloud transcription for longer files, or summaries, or action items automatically extracted, or a way to sync your notes across devices, that's the Pro tier. If you're a sales team that needs contact intelligence or shared notes, that's Team. None of it is hidden behind dark patterns or surprise paywalls. The trade-off between privacy and features is transparent from the moment you open the app.

We chose Whisper and Apple Speech because they let us keep that promise. Other transcription tools could be faster or smarter. But they couldn't be local. And for Scribr, local is the point.

When you're choosing where to trust your recordings - your conversations, your thinking, your work - which matters more: a company's promise of privacy, or technology that keeps your audio on your device in the first place?

Ready to try Scribr by MRVL?

One tap to download. No sign-up wall.
