The two-tap action item problem (and how we solved it)

Last October, a sales manager using Scribr sent me a message that stuck with me: 'I finish a call, the transcript appears, I see three things I need to do. But by the time I've switched apps to write them down, I've already forgotten one.' I read that message on a Wednesday. By Friday, we'd shipped the first version of the Action Items Widget.

Why your action items disappear the moment the call ends

The problem isn't that action items don't exist in your transcript. They do. But transcripts are linear. You have to read through the whole thing, spot the commitments yourself, then jump to your to-do list. That's friction. Real friction. In a typical fifteen-minute call, you might miss one or two tasks because your working memory was full, or you were already thinking about the next meeting.

We watched customers export transcripts, print them, annotate them by hand. Smart people, busy people, doing manual work that should be automatic. So we asked ourselves: what if the action items were waiting for you before you even asked? Not buried in a note. Not tucked behind a search. Just there, in a widget on your home screen, ready the moment the transcription finishes.

That's where the Action Items Widget lives. Pro users see it without having to open Scribr. It's small, persistent, and it updates as soon as your audio finishes transcribing and the extraction runs.

Building around the way you actually work

Early on, we debated whether action items should be extracted automatically or only when you asked for them. Some people worried about false positives: a casual 'I should probably think about that' might surface as a real commitment. We tested both approaches. Automatic extraction won because it solved the real problem: time. You don't have the bandwidth to manually request extraction after every call. You need it there, waiting.

But we built in control. You can review, edit, mark items complete, or delete the ones that weren't actually commitments. The widget shows you a summary of open items across all your meetings, so you see at a glance what's pending.

The Siri integration came from the same place. Some users asked if they could voice-query their action items while driving, or between meetings. So we added a Siri shortcut called 'Get Action Items.' You trigger it by voice, and Siri reads back everything you're waiting on. No app open. No screen unlock needed. Just your voice and a list.

The difference between extracting text and extracting intent

Extraction isn't just searching for the words 'I will' or 'we need to.' Intent is harder. Someone might say, 'Let's circle back on budget next week, Sarah will pull the numbers and I'll have a look at the structure.' That's two action items, not one. Sarah has a task. You have a task. A dumb keyword search misses the nuance.
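To make the gap concrete, here is a minimal Python sketch of the kind of keyword search described above. The trigger phrases and the `keyword_hits` helper are hypothetical illustrations, not Scribr's extraction code.

```python
import re

# Hypothetical trigger phrases a naive keyword extractor might look for.
TRIGGER_PATTERN = re.compile(r"\b(I will|we need to)\b", re.IGNORECASE)


def keyword_hits(sentence: str) -> list[str]:
    """Return every trigger phrase found in the sentence."""
    return [match.group(0) for match in TRIGGER_PATTERN.finditer(sentence)]


sentence = (
    "Let's circle back on budget next week, Sarah will pull the numbers "
    "and I'll have a look at the structure."
)

# Neither "Sarah will" nor "I'll" matches the trigger phrases,
# so both commitments in the sentence go undetected.
hits = keyword_hits(sentence)
print(hits)  # []
```

The sentence holds two commitments with two different owners, and the pattern finds neither. Adding more phrases helps only until someone words a commitment differently.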

Our extraction uses the full transcript context, so it understands who said what, who's responsible for what, and what depends on what. It's more accurate than regex, and it gets better as it processes more of your meetings.

This is a Pro feature, and that matters. It requires cloud processing via Deepgram; it's not something we can do on your device. Your audio stays on your device until you decide to send it for transcription and processing. If you're on the Free tier, you get on-device transcription only, so nothing leaves your phone. Different security model, different capabilities.

The test that changed how we thought about speed

In beta, we tracked how long it took from 'call ends' to 'action item appears in the widget.' Our first version took about ninety seconds on a five-minute recording. Not terrible. But we watched beta users, and they were checking the widget before it finished. So they'd swipe away, come back thirty seconds later, and check again. They wanted to see the tasks now, not in a minute.

We optimised the pipeline. Transcription started streaming in chunks instead of waiting for the whole file. Extraction began on partial transcripts. A rough action item list appeared within twenty seconds and was refined as the full transcript came in. That's the version you use now.
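The shape of that change can be sketched in a few lines of Python. This is an illustration under stated assumptions, not our production pipeline: `stream_action_items` and `toy_extract` are hypothetical names, and the toy extractor simply treats any sentence containing 'will' as a commitment.

```python
from typing import Callable, Iterable, Iterator


def stream_action_items(
    chunks: Iterable[str],
    extract: Callable[[str], list[str]],
) -> Iterator[list[str]]:
    """Yield a refreshed action item list after each transcript chunk arrives."""
    transcript = ""
    for chunk in chunks:
        transcript = f"{transcript} {chunk}".strip()
        # Re-run extraction on everything heard so far; each later pass
        # replaces the earlier, rougher result.
        yield extract(transcript)


def toy_extract(transcript: str) -> list[str]:
    """Stand-in extractor (assumption): a sentence with 'will' is a commitment."""
    sentences = [s.strip() for s in transcript.split(".") if s.strip()]
    return [s for s in sentences if " will " in f" {s} "]


chunks = [
    "Sarah will pull the numbers.",
    "Thanks everyone.",
    "I will review the structure.",
]

snapshots = list(stream_action_items(chunks, toy_extract))
print(snapshots[0])   # ['Sarah will pull the numbers']
print(snapshots[-1])  # ['Sarah will pull the numbers', 'I will review the structure']
```

The first snapshot is available as soon as the first chunk is transcribed, which is what lets a rough list appear early. The last snapshot is the refined list built from the full transcript.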

I mention it because speed sounds like a nice-to-have. In practice, it's the difference between capturing your own memory and racing it. Hit that window, and the widget becomes reflexive. You finish a call, you glance at your home screen, you see the three things you agreed to do, and you're done. No context-switching. No review-it-later pile that grows and grows.

Why we didn't add a calendar sync or meeting bot integration

Early feedback asked for Zoom, Teams, and Google Calendar integration. On the surface, it makes sense. If Scribr could see your calendar and auto-record meetings, why wouldn't you want that? We said no, and I want to explain why, because it reveals something about how we think about privacy and utility.

A meeting bot means Scribr joins your call and captures the whole room. That's powerful for teams. But it's also a governance challenge, a compliance question, and a reason some organisations can't use it. We built Scribr for individuals first. You control what gets recorded and when. You own the decision to transcribe a call, a therapy session, a sensitive client conversation. That clarity matters.

So instead of a bot, we built the widget and the Siri shortcut as capture points. Quick Record on your home screen. A shortcut on your lock screen. Siri voice commands. We made it as fast as possible to start recording yourself. You hold the phone. You own the moment you decide to document.

The Action Items Widget works inside this design. It's yours. It lives in your workspace. It doesn't require team settings or IT approval. Bring your own calls; we'll help you track what you committed to.

What actually happens when you say 'Get Action Items' to Siri

The shortcut queries your Scribr data locally first. If you have Pro with Scribr Cloud sync enabled, it fetches any items from your cloud vault as well. Siri reads back the list. You can ask it to filter by date, by meeting, or by person if you've used Contact Intelligence to tag who was involved. It's conversational. You're not staring at a screen.
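Here is a rough Python sketch of that lookup order, assuming a simple in-memory model. The `ActionItem` fields, the `get_action_items` function, and the rule that a local copy wins over a cloud copy with the same id are all assumptions made for illustration; the real shortcut runs on iOS and its internals may differ.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ActionItem:
    # Hypothetical shape of a stored action item.
    id: str
    text: str
    meeting: str
    person: str
    created: date


def get_action_items(
    local: list[ActionItem],
    cloud: Optional[list[ActionItem]] = None,
    *,
    on_date: Optional[date] = None,
    meeting: Optional[str] = None,
    person: Optional[str] = None,
) -> list[ActionItem]:
    """Read local items first, add cloud-only items, then apply optional filters."""
    merged = {item.id: item for item in local}
    for item in cloud or []:
        # Assumption: when both stores hold the same id, keep the local copy.
        merged.setdefault(item.id, item)

    results = [
        item
        for item in merged.values()
        if (on_date is None or item.created == on_date)
        and (meeting is None or item.meeting == meeting)
        and (person is None or item.person == person)
    ]
    return sorted(results, key=lambda item: item.created)


local_items = [
    ActionItem("a1", "Review the budget structure", "Budget sync", "Me", date(2024, 3, 4)),
]
cloud_items = [
    ActionItem("a1", "Review the budget structure (stale copy)", "Budget sync", "Me", date(2024, 3, 4)),
    ActionItem("a2", "Pull the numbers", "Budget sync", "Sarah", date(2024, 3, 5)),
]

everything = get_action_items(local_items, cloud_items)
for_sarah = get_action_items(local_items, cloud_items, person="Sarah")
print([item.text for item in everything])
print([item.text for item in for_sarah])
```

Passing `cloud=None` models a device without cloud sync: the function returns only what is stored locally, with the same filters available.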

The technical magic here is minimal. The real win is attention. You're driving, you're between meetings, your hands are full. You ask Siri once. You get an answer. You move on.

We kept it simple on purpose. No ability to create action items through Siri, no voice-to-text editing. Just retrieval. That's the constraint that makes it fast and reliable. You capture in Scribr, review in the widget or the app, manage in your task system if you want to. Siri is the read-only window when you don't have time for anything else.

The widget and the shortcut solve a problem that doesn't sound like a technology problem until you live it. You've already done the hard work of the meeting. Why should remembering what you agreed to do feel like a second job? What would it change in your week if the next steps were just waiting for you when you needed them?