Voice over session time has almost nothing to do with how long the script is. I've recorded 30-second spots that took two hours and 10-minute corporate videos that were done in 25 minutes. The difference comes down to how prepared everyone is, how clear the direction is, and whether the script was actually ready to be recorded.
Last month I had two sessions on the same day. The first was a 60-second radio spot for a financial services company — we finished in 18 minutes including small talk. The second was a 15-second digital ad for a tech brand. We were in the booth for nearly two hours. Same voice, same studio, same level of professionalism. Completely different outcomes.
The script showed up broken
When a Spanish script arrives translated directly from English without adaptation, the session will take longer. According to research by the Globalization and Localization Association, Spanish text runs approximately 25-30% longer than English source material. That means a perfectly timed 30-second English script becomes a Spanish script of roughly 38 seconds, and now we're either cutting on the fly or doing take after take trying to squeeze words into impossible spaces.
A session that should take 20 minutes turns into an hour because we're essentially rewriting the script in the booth. Every cut requires approval. Every approval requires a phone call to someone who wasn't on the original session. And the voice over artist has to keep delivering fresh, natural reads of material that keeps changing underneath them.
The fix is simple: adapt the Spanish script before the session. But most clients don't know this is an option until they've already experienced the problem.
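For anyone who wants to check a script before booking, the timing arithmetic above can be sketched in a few lines. This is only a rough estimate: it assumes read time grows in direct proportion to text length, and the function names are my own illustration, not part of any tool.

```python
def expanded_duration(english_seconds: float, expansion: float) -> float:
    """Estimate the read time of an unadapted Spanish translation.

    Assumption: read time scales linearly with text length, so a script
    that is 25% longer takes 25% longer to read at the same pace.
    """
    return english_seconds * (1 + expansion)


def overrun(english_seconds: float, slot_seconds: float, expansion: float) -> float:
    """Seconds of copy that will not fit the slot (0.0 if it fits)."""
    return max(0.0, expanded_duration(english_seconds, expansion) - slot_seconds)


# A 30-second English script at the 25% and 30% ends of the expansion range.
low = expanded_duration(30, 0.25)
high = expanded_duration(30, 0.30)
print(f"Estimated Spanish read: {low:.1f} to {high:.1f} seconds")
print(f"Copy to cut for a 30-second slot: {overrun(30, 30, 0.25):.1f} seconds or more")
```

If the overrun is above zero, the script needs adapting before anyone steps into the booth.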
Nobody knows what they want
I constantly get direction requests that contradict themselves. "Make it warm but authoritative." "Conversational but professional." "Natural but polished." These aren't directions; they're wish lists. And when the client can't articulate what they want, the session becomes an expensive exploration where we try 15 different approaches until something accidentally clicks.
Here's what adds real time: committees. When there are six people on the call and each one has veto power, the session will take three times as long. A 2023 survey by Voices.com found that 67% of voice over professionals report that unclear or changing direction is the primary cause of session delays. One person says faster, another says slower, a third wants more smile in the voice, and by take 40 everyone has forgotten what take 3 sounded like (which was probably the one they should have kept).
Have you ever sat through a meeting where nobody could agree on lunch, let alone a creative decision? That's what some voice over sessions feel like.
The first take phenomenon
The first take is usually the best. I've said this for 20 years and the data supports it. A study published in the Journal of Voice found that listeners consistently rate initial vocal performances as more authentic and engaging than subsequent takes of the same material. The voice over artist reads the script for the first time with fresh eyes, interprets it naturally, and delivers something genuine before overthinking kicks in.
But many clients don't trust this. They paid for the session, they're going to use the session. So we do 30 takes. Then 40. Then someone suggests going back to something "like take 7 but with the energy of take 23." By hour two, the voice is fatigued, the interpretation is mechanical, and the final selection is often — you guessed it — take 1 or 2 with minor edits.
Sessions that trust the professional take 20 minutes. Sessions that don't trust the professional take 3 hours.
Technical problems nobody anticipated
Source Connect drops. The client's internet hiccups. Someone's on mute for five minutes. The reference video won't play on someone's computer. These things add 15 minutes here, 20 minutes there. A 2024 report from ISDN Solutions noted that remote session technical issues account for an average of 12% of total session time across the industry.
I have backup systems for my backup systems — redundant internet, multiple connection options, phone patch as a last resort. But I can't control what's happening on the client's end. When someone is directing from a coffee shop on public WiFi, the session will be longer. When the agency has seven people trying to connect from different offices across three time zones, the session will be longer.
The clients who book studio-to-studio sessions with Source Connect and a dedicated line? Those sessions run like clockwork.
Music makes everything faster
When I have the music bed before the session, I can match the energy and pacing instantly. The rhythm is there. The mood is clear. I'm not guessing at the tempo or trying to imagine what the final product will feel like — I'm recording against the actual thing.
Sessions without reference material take longer because everything becomes a discussion. "Can you try it a bit more... upbeat?" Upbeat compared to what? With the track playing, there's no ambiguity. The production tells me what the copy needs. (This is why I always ask for the music in advance, and why clients who provide it are usually the ones who've done this before.)
The approval chain disaster
A 30-second spot for a car brand shouldn't require legal review between takes. But sometimes it does. And when the legal team is in a different time zone, or the brand manager stepped into another meeting, or the creative director is waiting for feedback from someone who wasn't supposed to be involved, the session stops. We sit. We wait. We make small talk about the weather while the meter runs.
Sessions with a single decision-maker in the room take 20 minutes. Sessions where every choice has to be ratified by an invisible committee take hours. Nielsen's advertising effectiveness research has shown that streamlined approval processes correlate with faster campaign deployment and higher creative satisfaction scores — but most organizations haven't internalized this for voice over specifically.
What the professional brings to the table
An experienced voice over artist who knows the material, the style, and the audience can deliver usable takes immediately. We've done this thousands of times. We know what "warm but not sleepy" actually means. We can self-direct when the direction is vague and suggest alternatives when something isn't working.
But that expertise only helps if the client lets it help. When a client hires a professional and then micromanages every breath, the session takes longer than it would with an amateur who was given complete freedom. The paradox is real: the more skilled the voice over artist, the faster the session should be — unless the client doesn't trust the skill they paid for.
Session length is a choice
A 20-minute session happens when the script is ready, the direction is clear, one person has authority, and the professional is trusted to do their job. A 3-hour session happens when none of those things are true. Script length is almost irrelevant. I've recorded 5-minute scripts in 15 minutes and 15-second scripts in 90 minutes. The variables that matter are preparation, clarity, and trust.
The brands that work with me regularly know this. They send adapted scripts, clear references, and one point of contact. Their sessions are fast and efficient, and the results are better because nobody is fatigued or frustrated by hour three. The brands that are doing this for the first time often learn the hard way, and then they do it differently next time.
Need a Spanish voice over for your next project? Get in touch and I'll get back to you within the hour.



