Audio Description (AD) is a narrated accessibility track that describes key visual elements of a programme (action, scene changes, on-screen text, and non-verbal communication) for viewers who are blind or have low vision. While the creative writing and voice performance are often handled by specialist teams, the technical mixing and delivery of the AD track fall squarely on the post-production audio engineer. Getting it wrong means your deliverable will be rejected. Getting it right means making content genuinely accessible to millions of people.
This guide covers the complete technical workflow — from session setup through to platform-compliant delivery.
1. What Is an Audio Description Track?
An Audio Description track is an additional audio programme that runs alongside the main programme mix. It is mixed at a level that allows the narrator's voice to be clearly intelligible while the original programme audio — dialogue, music, effects — continues underneath at a reduced level.
There are two primary delivery formats:
- AD Stereo Mix: A dedicated stereo file containing the full programme mix with the AD narration embedded. This is the most common format for broadcast and streaming.
- AD Narration Stem Only: A clean narration-only track (no programme audio) that the platform or broadcaster mixes itself in real time. Required by some OTT platforms and broadcasters for maximum flexibility.
Key principle: The AD narrator must always be clearly intelligible — even when speaking over music, action sound effects, or the programme's original dialogue. Intelligibility takes precedence over everything else in AD mixing.
2. Session Setup
The starting point for any AD session is the approved, locked final programme mix — your M&E (Music & Effects) stems, dialogue stems, and the full stereo or 5.1 mix. You will need:
- The fully approved final programme mix (stereo and/or 5.1 printmaster)
- The AD narration recordings, usually delivered as a clean WAV file synced to timecode
- The M&E stems (for ducking the programme underneath the narration)
- A loudness meter capable of reading integrated LUFS and true peak (e.g. Nugen Audio VisLM, iZotope Insight, or Waves WLM Plus)
Set your session to match the programme's sample rate and bit depth — typically 48 kHz / 24-bit for broadcast and streaming. Your project timecode must match the locked picture cut.
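As a rough illustration of checking incoming files against the session format, the sketch below reads a WAV header with Python's standard `wave` module. The function name `check_wav_format` is hypothetical, and the sketch assumes plain PCM WAV files; other variants (for example extensible-format headers on older Python versions) may need a different reader.

```python
import wave


def check_wav_format(path, expected_rate=48000, expected_bits=24):
    """Return a list of mismatches between a PCM WAV file and the session format."""
    problems = []
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8  # bytes per sample -> bits per sample
        if rate != expected_rate:
            problems.append(f"sample rate {rate} Hz, expected {expected_rate} Hz")
        if bits != expected_bits:
            problems.append(f"bit depth {bits}-bit, expected {expected_bits}-bit")
    return problems
```

An empty list means the file matches; anything else is worth resolving before importing the narration into the session.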
3. Processing the AD Narration Voice
The AD narration voice recording should ideally arrive clean, with minimal room noise and consistent gain. However, you will still need to process it for the final mix.
EQ
Apply a high-pass filter (HPF) at around 80–100 Hz to remove any low-frequency handling noise or room rumble. Gentle presence boosts in the 2–5 kHz region can aid intelligibility without making the voice sound harsh. Cut any harshness in the 3–6 kHz range if the recording is bright. The goal is a clear, warm, and effortless voice — not an EQ'd radio voice.
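To make the high-pass step concrete, here is a minimal sketch of a second-order high-pass filter using the widely published "Audio EQ Cookbook" biquad formulas. It is not how any particular EQ plug-in works internally; the function names, the Q of 0.7071 (a Butterworth-style response), and the 80 Hz default are assumptions chosen to match the range mentioned above.

```python
import math


def highpass_coeffs(cutoff_hz=80.0, sample_rate=48000, q=0.7071):
    """Normalised biquad high-pass coefficients (b, a) with a[0] == 1."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def biquad(samples, b, a):
    """Apply the filter to a sequence of floats (direct form I)."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

With these settings, a constant (DC) offset decays to zero while content around 1 kHz passes essentially unchanged, which is the behaviour you want from a rumble filter on a voice track.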
Compression
A gentle compressor with a ratio of 2:1 to 3:1 and a medium attack (10–20 ms) helps even out dynamic variations in the narrator's performance. Avoid heavy compression, which can introduce pumping artefacts and make the voice sound unnatural against the programme audio. A de-esser is recommended if the recording has pronounced sibilance — particularly important because the AD voice is often placed in the centre of the stereo field where sibilance is most audible.
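The effect of a gentle ratio is easier to judge with numbers. The sketch below computes the static gain change of a simple hard-knee compressor; the function name and the −18 dBFS threshold are illustrative assumptions, and attack, release, and knee behaviour are deliberately left out.

```python
def compressor_gain_db(level_db, threshold_db=-18.0, ratio=2.0):
    """Static gain change in dB for a hard-knee compressor (no time constants)."""
    if level_db <= threshold_db:
        return 0.0  # below threshold: signal is left untouched
    overshoot = level_db - threshold_db
    # Above threshold, every `ratio` dB of input yields 1 dB of output.
    return overshoot / ratio - overshoot
```

Under these assumptions, a peak 12 dB over the threshold is pulled down by 6 dB at 2:1 and by 8 dB at 3:1, which is enough to even out a performance without flattening it.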
Noise Reduction
If the narration recording has perceptible background noise, use a spectral repair tool (iZotope RX, Cedar) to apply transparent noise reduction. Be conservative — heavy processing will introduce artefacts that are immediately noticeable in the mix.
4. Mixing the AD Narration Against the Programme
This is the core skill of AD mixing. The narration must sit clearly above the programme audio, but the programme audio must still be audible — it should not feel completely silenced or absent when the narrator speaks.
Ducking the Programme Mix
The standard approach is to automate a level reduction on the programme mix bus whenever the AD narration is present. Typical ducking amounts range from −6 dB to −12 dB on the programme mix, depending on the complexity of the programme audio at that moment. Action sequences with loud SFX may require more ducking; quiet dialogue scenes may require very little.
Automation is essential here — do not use a sidechain compressor triggered by the AD voice as a set-and-forget solution. Each cue needs to be assessed individually. Your ducking automation should use smoothed ramps (not hard cuts) with attack times of around 100–200 ms and release times of 300–500 ms to avoid jarring transitions.
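The shape of that automation can be sketched as a gain curve. The example below is a simplified model, not a DAW feature: the function names, the linear-in-dB ramps, and the choice to finish the fade-down exactly at the cue start are all assumptions. Each cue carries its own depth, reflecting the per-cue assessment described above.

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def duck_gain_db(t, cues, attack=0.15, release=0.4):
    """Programme-bus gain in dB at time t (seconds).

    cues: list of (start, end, depth_db) tuples, one per narration cue.
    The fade-down ends at the cue start; the fade-up begins at the cue end.
    """
    gain = 0.0
    for start, end, depth_db in cues:
        if start - attack <= t < start:
            g = depth_db * (t - (start - attack)) / attack  # ramping down
        elif start <= t <= end:
            g = depth_db  # fully ducked while the narrator speaks
        elif end < t <= end + release:
            g = depth_db * (1 - (t - end) / release)  # ramping back up
        else:
            g = 0.0
        gain = min(gain, g)  # overlapping ramps: keep the deeper duck
    return gain
```

For reference, `db_to_gain(-6.0)` is roughly 0.5, so a 6 dB duck halves the programme's amplitude, and a 12 dB duck brings it to about a quarter.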
Placement in the Stereo Field
The AD narration voice should be placed in the centre of the stereo field. Do not pan it left or right. This ensures it is clearly differentiated from programme dialogue and is equally audible on both stereo and mono playback systems (important for broadcast compliance).
Fitting Narration to Available Windows
AD narration is scripted to fit into natural windows in the programme's original audio — pauses in dialogue, gaps between scenes. However, real-world recordings rarely fit perfectly. As the AD mixer, you may need to work with the AD writer to trim narration, or creatively adjust ducking to make longer cues work without losing intelligibility. The narration must never talk over critical programme dialogue.
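A quick way to spot problem cues before mixing is to compare narration times against dialogue times. The sketch below assumes you already have both as lists of (start, end) pairs in seconds, for example from an edit list or region export; the function name and that input format are hypothetical.

```python
def find_dialogue_clashes(cues, dialogue):
    """Return (cue, line) pairs whose time ranges overlap.

    cues and dialogue are lists of (start, end) tuples in seconds.
    """
    clashes = []
    for cue in cues:
        for line in dialogue:
            # Two ranges overlap when each one starts before the other ends.
            if cue[0] < line[1] and line[0] < cue[1]:
                clashes.append((cue, line))
    return clashes
```

Any pair it returns is a cue to raise with the AD writer or to rework by trimming the narration.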
5. Loudness Standards for AD Delivery
Loudness compliance is mandatory for broadcast and OTT delivery. The AD mix must meet the same loudness targets as the main programme mix.
| Platform / Standard | Integrated Loudness | True Peak | LRA |
|---|---|---|---|
| EBU R128 (European Broadcast) | −23 LUFS (±1 LU) | −1 dBTP | ≤ 20 LU |
| ATSC A/85 (US Broadcast) | −24 LKFS (±2 LU) | −2 dBTP | — |
| Netflix | −27 LUFS (±1 LU) | −2 dBTP | ≤ 18 LU |
| Apple TV+ | −27 LUFS (±1 LU) | −1 dBTP | ≤ 18 LU |
| Amazon Prime Video | −24 LUFS (±1 LU) | −2 dBTP | ≤ 20 LU |
Measure the integrated loudness of the entire AD mix — not just the narration-only segments. The programme audio that continues underneath the narration is part of the mix and will affect your overall LUFS reading. Always do a full-pass loudness measurement before final export.
Common mistake: Mixing the AD narration too loud to ensure intelligibility, which then causes the overall integrated loudness of the AD mix to exceed the target. Use your loudness meter continuously while mixing — not just at the QC stage.
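Once your meter has given you an integrated loudness and true-peak reading, checking them against a target is simple arithmetic. The sketch below copies three rows from the table above into a dictionary; the names `SPECS` and `loudness_problems` are hypothetical, and the numbers should be replaced with whatever the current delivery spec states.

```python
# Targets copied from the table in this guide; confirm against the live spec.
SPECS = {
    "EBU R128": {"target": -23.0, "tolerance": 1.0, "max_true_peak": -1.0},
    "ATSC A/85": {"target": -24.0, "tolerance": 2.0, "max_true_peak": -2.0},
    "Netflix": {"target": -27.0, "tolerance": 1.0, "max_true_peak": -2.0},
}


def loudness_problems(integrated_lufs, true_peak_dbtp, spec_name):
    """Return a list of failures for meter readings taken over the full AD mix."""
    spec = SPECS[spec_name]
    problems = []
    if abs(integrated_lufs - spec["target"]) > spec["tolerance"]:
        problems.append(
            f"integrated {integrated_lufs} LUFS is outside "
            f"{spec['target']} ±{spec['tolerance']} LU"
        )
    if true_peak_dbtp > spec["max_true_peak"]:
        problems.append(
            f"true peak {true_peak_dbtp} dBTP exceeds {spec['max_true_peak']} dBTP"
        )
    return problems
```

A mix reading −23.4 LUFS with a −1.5 dBTP peak passes the EBU R128 row, while one reading −21.5 LUFS with a −0.5 dBTP peak fails on both counts, which is the typical result of pushing the narration up for intelligibility.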
6. Delivery Formats
Always confirm the exact delivery specification with your client or broadcaster before starting. Common formats include:
- Stereo WAV (48 kHz / 24-bit): A complete stereo AD mix (programme + narration). The most common broadcast delivery format.
- Narration Only WAV (48 kHz / 24-bit): Clean narration track, no programme audio, timecode-locked. Submitted separately alongside the main programme mix stems.
- Embedded in MXF or ADM BWF: Some broadcasters and OTT platforms require the AD track delivered as an additional audio programme within an MXF container or an ADM BWF (Audio Definition Model Broadcast Wave Format) file, alongside the main mix. This is particularly common for Dolby Atmos deliverables.
File naming conventions vary by broadcaster and platform — always follow the tech spec document exactly. Incorrect file naming is one of the most frequent causes of deliverable rejection.
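Because naming errors are easy to miss by eye, a small pattern check before upload can help. The convention in this sketch (`Title_S01E03_AD_ST_en.wav`) is entirely made up for illustration; replace the pattern with one derived from your client's tech spec.

```python
import re

# Hypothetical convention: <Title>_S##E##_AD_<ST or NARR>_<two-letter language>.wav
PATTERN = re.compile(r"[A-Za-z0-9]+_S\d{2}E\d{2}_AD_(ST|NARR)_[a-z]{2}\.wav")


def name_matches_spec(filename, pattern=PATTERN):
    """True if the whole filename matches the delivery naming pattern."""
    return pattern.fullmatch(filename) is not None
```

Using `fullmatch` rather than `search` means stray suffixes such as `_v2` or `_FINAL` are caught instead of silently accepted.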
7. Quality Control Checklist
Before submitting, run through the following QC checks:
- Full playback of the AD mix from start to finish — listen for any sync errors, dropped cues, or narration that overlaps with programme dialogue.
- Integrated loudness measurement passes the target, within the tolerance stated in the delivery spec (typically ±1 or ±2 LU).
- True peak does not exceed the specified ceiling.
- Narration is clearly intelligible on all cues — test on consumer-grade speakers and headphones, not just studio monitors.
- No audible ducking artefacts — transitions into and out of ducked sections sound smooth and natural.
- File format, sample rate, bit depth, and naming convention match the delivery spec.
- Timecode and sync reference confirmed against the locked picture.
8. Workflow Summary
Audio Description mixing is a specialist but learnable skill. The core principles are consistent: prioritise intelligibility, automate ducking carefully with smooth ramps, mix to loudness targets throughout the session rather than correcting at the end, and always confirm the delivery specification before you start. A well-executed AD mix is invisible to sighted viewers and genuinely life-changing for those who depend on it — that combination of technical precision and real-world impact is what makes AD work one of the most rewarding specialisations in audio post-production.