What is latency and how do I reduce latency in VoIP?

Calls feel awkward when speech arrives late.
Delays stack up across devices, Wi-Fi, and WAN paths.
Fix the path and settings; the conversation feels natural again.

Latency is one-way voice delay from mouth to ear. Keep it ≤150 ms one-way (≤300 ms RTT). Reduce network hops, queue time, Wi-Fi contention, and oversized buffers to keep speech snappy.

Figure: VoIP Call Latency (the mouth, encode, jitter buffer, decode path on an IP phone call)

Latency comes from encode time, network transit, jitter buffers, and decode time. Jitter adds variation on top. Packet loss makes it worse. You tackle delay at each stage: wiring, Wi-Fi, WAN, devices, and PBX/SBC. Start with measurement, then apply small, proven fixes.
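As a rough illustration of how those stages add up, the sketch below sums a one-way delay budget and reports the largest contributor. Every per-stage number is an illustrative assumption, not a measurement; substitute values from your own phones and network.

```python
# One-way (mouth-to-ear) delay budget. Every value below is an assumed example.
BUDGET_MS = {
    "encode_and_packetize": 25,  # assumed: 20 ms ptime plus codec processing
    "network_transit": 45,       # assumed: one-way LAN + WAN transit
    "jitter_buffer": 40,         # assumed: current adaptive buffer depth
    "decode_and_playout": 5,     # assumed: decoder plus audio hardware
}

TARGET_MS = 150  # one-way target stated in this article


def total_delay_ms(budget: dict) -> float:
    """Sum the per-stage delays into one mouth-to-ear figure."""
    return sum(budget.values())


def largest_stage(budget: dict) -> str:
    """Name the stage that contributes the most delay."""
    return max(budget, key=budget.get)


total = total_delay_ms(BUDGET_MS)
print(f"one-way delay: {total} ms (target <= {TARGET_MS} ms)")
print(f"largest contributor: {largest_stage(BUDGET_MS)}")
```

Working from a budget like this keeps the focus on the stage that dominates rather than on whichever setting is easiest to change.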


What causes high latency on my SIP calls—routing, Wi-Fi, firewalls, or ISP?

Delays hide in plain sight.
A single bad hop or busy uplink can ruin all calls.
Find the bottleneck, then make one change at a time.

Top culprits are long WAN routes, bufferbloat on uplinks, noisy or distant Wi-Fi, busy firewalls, VPN hairpins, and oversized packetization or jitter buffers. Wired first. Short paths. Small queues.

Figure: SIP Hairpin Routing (world map of a SIP-to-SBC call route stretched by a hairpin path)

1) Routing and ISP choices

Long AS paths, poor peering, or international hairpins add tens of milliseconds per leg. If your SIP trunk anchors in a far region, RTP trombones through a distant data center. Use providers with regional SBCs/PoPs close to users. Ask your ISP where it peers with your voice carrier. Avoid VPN hairpins that force traffic through HQ for no reason. Place SBCs or media relays near callers to keep audio local. When troubleshooting, separate signaling from media: Session Initiation Protocol (SIP) [1] setup issues can look like “lag,” while Real-time Transport Protocol (RTP) [2] path stretch causes true mouth-to-ear delay.

2) Bufferbloat and busy uplinks

Large uploads fill queues in consumer routers and some enterprise edges. Voice waits behind big TCP bursts, which increases one-way delay and jitter. Fix this by enabling Smart Queue Management such as the Controlled Delay (CoDel) AQM [3] or the FQ-CoDel packet scheduler [4], and by shaping egress to ~90–95% of link speed so the queue forms on a device you control. Put RTP in a strict-priority queue. Cap bulk transfers. This single change often drops mouth-to-ear delay by dozens of milliseconds during busy hours.
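To see why a full uplink buffer hurts, the arithmetic below converts queued bytes into waiting time and computes a shaper rate just below link speed. The buffer size, link rate, and function names are hypothetical examples; real modem buffers vary widely.

```python
def queue_delay_ms(queued_bytes: int, link_mbps: float) -> float:
    """Time a voice packet waits behind `queued_bytes` on a link of `link_mbps`."""
    bits = queued_bytes * 8
    return bits / (link_mbps * 1_000_000) * 1000


def shaper_rate_mbps(link_mbps: float, fraction: float = 0.92) -> float:
    """Egress shaper rate set a little below link speed (90-95% per the text)."""
    return link_mbps * fraction


# Hypothetical case: 256 KiB already queued on a 10 Mbps uplink.
print(round(queue_delay_ms(256 * 1024, 10), 1), "ms of added one-way delay")
print(round(shaper_rate_mbps(10), 2), "Mbps shaper setting")
```

The same buffer on a faster link drains sooner, which is why bufferbloat tends to be worst on slow upload paths.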

3) Wi-Fi friction

Wi-Fi adds contention, retries, and power-save wake delays. Prefer Ethernet. If you must use Wi-Fi, pick 5 GHz (or 6 GHz), design coverage for a strong signal (RSSI of -65 dBm or better), limit sticky roaming, and enable WMM Voice. Avoid crowded channels. Disable client power-saving features that park the radio between packets. Separate voice SSIDs are fine only if you also shape data SSIDs.

4) Firewalls and deep inspection

Heavy DPI, TLS inspection, or SIP ALG can add processing delay or break mid-call updates. Disable SIP ALG unless your carrier requires it. Exempt RTP from deep inspection. For VPNs, ensure hardware offload is active and DSCP survives the tunnel. Keep stateful rules but avoid per-packet scanning on EF traffic.

5) Packetization and device load

Low-power phones or loaded PCs add encode/decode delay. Large packetization intervals (ptime 40–60 ms) add delay by design. Use 20 ms ptime (or 10 ms for Opus where supported). Keep devices cool, update firmware/DSP, and close heavy background apps.

| Cause | Symptom | Quick Proof | Fix |
| --- | --- | --- | --- |
| Long routes | High base RTT | tracert/mtr shows many hops | Pick nearer SBC/PoP or better-peered ISP |
| Bufferbloat | Latency spikes during uploads | ping rises when speedtest runs | CoDel/FQ-CoDel, egress shaping, EF queue |
| Wi-Fi | Variable delay, robot voice | Wired test is clean | Wire it, or 5 GHz + WMM Voice |
| Firewall/DPI | Consistent added delay | Bypass shows drop | Exempt RTP, disable ALG/DPI for voice |
| Big ptime | Always laggy but clean audio | Phone shows ptime 40 ms+ | Use 20 ms (or 10 ms Opus) |

How do I measure latency with MOS, RTT, jitter, and packet loss tools?

You cannot fix what you cannot see.
Measure one-way and round-trip, then compare to user reports.
Keep a simple kit and run it daily.

Use ping/trace for RTT and path. Use RTCP stats and MOS from phones or SBC. Use synthetic call probes. Log jitter, loss, and one-way delay if your gear supports it. Correlate with time of day.

Figure: Call Quality NOC (engineer monitoring jitter, packet loss, and MOS on wall dashboards)

1) Round-trip vs one-way

RTT (ping) is easy. One-way needs time sync or dual probes. If you cannot do one-way, use RTT as a proxy and divide by two with caution. Sync clocks via NTP on phones, PBX, SBC, and test hosts. Some SBCs and endpoints expose RTP one-way delay in RTCP XR; use it when available.
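A minimal sketch of both approaches follows, assuming the two hosts' clocks are synchronized closely enough (for example via NTP) for their timestamps to be comparable. The function names and sample values are hypothetical.

```python
def one_way_delay_ms(sent_at_s: float, received_at_s: float) -> float:
    """One-way delay from sender and receiver wall-clock timestamps.

    Only meaningful when both clocks are synchronized; any clock offset
    between the two hosts lands directly in the result.
    """
    return (received_at_s - sent_at_s) * 1000


def one_way_from_rtt_ms(rtt_ms: float) -> float:
    """RTT/2 proxy. Assumes a symmetric path, which is often not true."""
    return rtt_ms / 2


# Hypothetical asymmetric path: 70 ms out, 30 ms back, so RTT is 100 ms.
forward = one_way_delay_ms(1000.000, 1000.070)
proxy = one_way_from_rtt_ms(100)
print(f"measured forward delay: {forward:.0f} ms, RTT/2 proxy: {proxy:.0f} ms")
```

The gap between the two figures in the example is the reason to treat RTT/2 as a proxy only.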

2) What numbers to capture

  • RTT: target <100 ms within a region; <200 ms cross-continent.
  • One-way delay: for interactive speech, align targets with the ITU-T Recommendation G.114 one-way delay guidance [5].
  • Jitter (RTP): keep <20–30 ms; big swings are worse than a steady 40 ms.
  • Packet loss: aim <0.2% sustained; bursts are more harmful.
  • MOS (listening): 4.0–4.5 is good; below about 3.6, users notice.

Log per call: codec, ptime, jitter buffer size, loss, and MOS. Break down by ISP, site, SSID, and time.
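For the jitter figure, the sketch below follows the running interarrival-jitter estimator described in the RTP specification (RFC 3550), using milliseconds instead of RTP timestamp units for readability. The function name and sample timestamps are illustrative.

```python
def interarrival_jitter_ms(send_ms: list, recv_ms: list) -> float:
    """Running jitter estimate in the style of RFC 3550.

    For each consecutive packet pair, D is the change in transit time;
    the estimate moves 1/16 of the way toward |D| on every packet.
    """
    jitter = 0.0
    for i in range(1, len(send_ms)):
        d = (recv_ms[i] - recv_ms[i - 1]) - (send_ms[i] - send_ms[i - 1])
        jitter += (abs(d) - jitter) / 16
    return jitter


# Packets sent every 20 ms; the third one arrives 16 ms late.
sent = [0, 20, 40, 60]
received = [50, 70, 106, 110]
print(round(interarrival_jitter_ms(sent, received), 2), "ms")
```

Because the estimate is smoothed, a single late packet moves it only slightly; sustained variation is what pushes it past the 20–30 ms range.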

3) Field kit examples

  • ICMP: ping -n 50 sip.example.com on Windows (ping -c 50 on Linux/macOS) for a baseline.
  • Path: tracert (Windows), traceroute, or mtr to the RTP relay or SBC.
  • Synthetic RTP: many SBCs can generate a test stream to a handset; read RTCP back for jitter/loss.
  • Phone screens: most SIP phones show live RTP stats (jitter, pkt loss, MOS). Take photos during bad calls.
  • Wi-Fi: netsh wlan show interfaces for RSSI/PHY rate; use a spectrum view if possible.
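Where ICMP is filtered, a small UDP probe can stand in for ping. The sketch below is a hedged example: it sends a few sequence-numbered datagrams and times the echoes. The loopback echo thread is only a stand-in for a real reflector near your SBC, and the function names are hypothetical.

```python
import socket
import threading
import time


def _echo_server(sock: socket.socket, count: int) -> None:
    """Echo `count` datagrams back to the sender, then exit."""
    for _ in range(count):
        data, addr = sock.recvfrom(2048)
        sock.sendto(data, addr)


def probe_rtt_ms(host: str, port: int, count: int = 5, timeout_s: float = 1.0) -> list:
    """Send `count` small UDP datagrams and return the RTT of each matching reply."""
    rtts = []
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as client:
        client.settimeout(timeout_s)
        for seq in range(count):
            payload = seq.to_bytes(4, "big")
            start = time.perf_counter()
            client.sendto(payload, (host, port))
            try:
                reply, _ = client.recvfrom(2048)
            except socket.timeout:
                continue  # treat a timeout as a lost probe
            if reply == payload:  # ignore late replies to earlier probes
                rtts.append((time.perf_counter() - start) * 1000)
    return rtts


def demo_on_loopback(count: int = 5) -> list:
    """Run the probe against a local echo thread on 127.0.0.1."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    port = server.getsockname()[1]
    thread = threading.Thread(target=_echo_server, args=(server, count), daemon=True)
    thread.start()
    try:
        return probe_rtt_ms("127.0.0.1", port, count)
    finally:
        thread.join(timeout=2)
        server.close()


if __name__ == "__main__":
    print([round(r, 3) for r in demo_on_loopback()])
```

Comparing the number of replies with the number of probes sent gives a crude loss figure alongside RTT.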

4) Interpreting MOS

MOS depends on codec, loss, concealment, and delay. Do not compare Opus MOS to G.711 blindly. Watch MOS vs time and vs load. A MOS valley at lunch hours points to uplink contention. MOS drops after a VPN change point to MTU or encryption overhead.
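To make the delay dependence concrete, the sketch below uses a commonly cited simplification of the ITU-T G.107 E-model: a delay impairment term is subtracted from a base R-factor, and R is then mapped to MOS. Treat the constants as assumptions to verify against the standard. The equipment impairment (codec and loss effects) is left as a caller-supplied input rather than guessed.

```python
def delay_impairment(one_way_ms: float) -> float:
    """Simplified delay impairment; the penalty steepens past about 177 ms."""
    extra = max(0.0, one_way_ms - 177.3)
    return 0.024 * one_way_ms + 0.11 * extra


def r_factor(one_way_ms: float, equipment_impairment: float = 0.0) -> float:
    """R = base - delay impairment - equipment impairment.

    The 93.2 base is the commonly quoted default and is an assumption here.
    """
    return 93.2 - delay_impairment(one_way_ms) - equipment_impairment


def r_to_mos(r: float) -> float:
    """Map an R-factor to an estimated MOS, clamped to the 1.0-4.5 range."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6


def estimate_mos(one_way_ms: float, equipment_impairment: float = 0.0) -> float:
    return r_to_mos(r_factor(one_way_ms, equipment_impairment))


for delay in (50, 150, 300):
    print(delay, "ms ->", round(estimate_mos(delay), 2))
```

Because the equipment impairment differs by codec, two codecs on the same path will produce different estimates, which is one reason not to compare their MOS values blindly.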

| Metric | Good | Caution | Bad |
| --- | --- | --- | --- |
| RTT (regional) | <50 ms | 50–100 ms | >100 ms |
| One-way delay | <100 ms | 100–150 ms | >150 ms |
| Jitter (RTP) | <20 ms | 20–30 ms | >30 ms |
| Loss (avg) | <0.2% | 0.2–1% | >1% |
| MOS (G.711) | ≥4.1 | 3.7–4.0 | <3.7 |
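The thresholds in the table above can be turned into a simple triage helper for logged call records. This is a sketch; the metric keys and function names are hypothetical, and the numbers are copied from the table.

```python
# (good_below, bad_above) per metric, copied from the table above.
THRESHOLDS = {
    "rtt_ms": (50, 100),
    "one_way_ms": (100, 150),
    "jitter_ms": (20, 30),
    "loss_pct": (0.2, 1.0),
}


def rate(metric: str, value: float) -> str:
    """Classify a lower-is-better metric as good, caution, or bad."""
    good_below, bad_above = THRESHOLDS[metric]
    if value < good_below:
        return "good"
    if value <= bad_above:
        return "caution"
    return "bad"


def rate_mos_g711(mos: float) -> str:
    """Classify G.711 MOS (higher is better); 4.0-4.1 is treated as caution."""
    if mos >= 4.1:
        return "good"
    if mos >= 3.7:
        return "caution"
    return "bad"


call = {"rtt_ms": 72, "one_way_ms": 95, "jitter_ms": 34, "loss_pct": 0.1}
print({name: rate(name, value) for name, value in call.items()})
print("mos:", rate_mos_g711(3.9))
```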

Which QoS, codecs, and jitter buffers reduce one-way delays on my network?

Packets do not care about job titles.
Give voice the fast lane, keep frames small, and set sane buffers.
Small, steady improvements stack up.

Mark RTP EF (46), queue it with strict priority, and enable smart queueing on WAN. Use Opus or G.711 at 20 ms ptime. Keep jitter buffers small and adaptive. Avoid big frames and heavy VAD/PLC tuning unless tested.

Figure: VoIP Router Topology (wireless router and network devices, with VoIP bandwidth usage per connected endpoint)

1) QoS end-to-end

  • Marking: RTP DSCP is carried in the Differentiated Services (DS) field [6]. Mark RTP EF (46), SIP CS5/AF31. Trust at access ports for phones. Remark at WAN edge.
  • Queuing: Use strict-priority/LLQ aligned to the Expedited Forwarding (EF) PHB [7], but cap it (e.g., 20–30% of link) so voice stays fast without starving data. Shape bulk data below link rate so priority always has headroom.
  • Preservation: Make sure VPN/SD-WAN keeps DSCP. Many tunnels zero markings by default. Map EF to the best class on provider edge.
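For software endpoints or test tools, marking can be requested at the socket. The sketch below shows the DSCP-to-TOS-byte conversion and a standard `IP_TOS` socket option on an IPv4 UDP socket. The helper names are hypothetical, and some operating systems and networks ignore or rewrite the marking, so confirm the result with a packet capture.

```python
import socket

DSCP_EF = 46    # Expedited Forwarding, used for RTP
DSCP_AF31 = 26  # one of the signaling markings mentioned above


def dscp_to_tos(dscp: int) -> int:
    """DSCP occupies the upper six bits of the DS (former TOS) byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_ef_udp_socket() -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets request DSCP EF."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(DSCP_EF))
    return sock


print(hex(dscp_to_tos(DSCP_EF)))
```

A capture filter on the DS byte at the WAN edge is a quick way to check whether a tunnel has zeroed the marking.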

2) Codec and packetization

  • Opus (narrowband through super-wideband): resilient to loss, supports 10–20 ms ptime, good quality at lower bitrates. Check handset and trunk support.
  • G.711: simple, universal, works well at 20 ms. Plan ~80–90 kbps each way including overhead.
  • G.729: compresses well but adds codec delay and is fragile under burst loss. Use only if bandwidth is tight and consistent.
  • ptime: Favor 20 ms. 10 ms can reduce jitter sensitivity but raises packets per second. Avoid 40–60 ms frames; they add delay by design.
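The trade-off between ptime and overhead is simple arithmetic. The sketch below computes packets per second and per-call bandwidth for a constant-bitrate codec, assuming IPv4 + UDP + RTP headers of 40 bytes with no header extensions or SRTP tag. The optional layer-2 overhead is a caller-supplied assumption.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # IPv4 + UDP + RTP, no extensions


def packets_per_second(ptime_ms: float) -> float:
    return 1000 / ptime_ms


def per_call_kbps(codec_kbps: float, ptime_ms: float, l2_overhead_bytes: int = 0) -> float:
    """One-direction bandwidth for a constant-bitrate codec at a given ptime."""
    payload_bytes = codec_kbps * 1000 / 8 * ptime_ms / 1000
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + l2_overhead_bytes
    return packet_bytes * 8 * packets_per_second(ptime_ms) / 1000


for ptime in (10, 20, 40):
    print(ptime, "ms:", packets_per_second(ptime), "pps,",
          round(per_call_kbps(64, ptime), 1), "kbps at layer 3")

# Assumed 18 bytes of Ethernet header + FCS per packet.
print("G.711 at 20 ms on Ethernet:", round(per_call_kbps(64, 20, 18), 1), "kbps")
```

G.711 at 20 ms works out to 80 kbps at layer 3 and a little more with Ethernet framing, consistent with the ~80–90 kbps planning figure above.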

3) Jitter buffer tuning

Start with adaptive jitter buffers with minimum 20 ms and maximum 60–80 ms. If jitter is low and stable, reduce the max to cut mouth-to-ear delay. If users hear “robot” or gaps, increase the min by 10 ms and test again. Do not set huge buffers to mask a bad path; fix the path.
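The tuning loop above can be written down as a small decision rule. This is a hypothetical heuristic for illustration only, not how any particular phone or SBC adapts its buffer; the "low jitter" threshold and the floor for the maximum are assumed values.

```python
from dataclasses import dataclass


@dataclass
class JitterBufferConfig:
    min_ms: int = 20
    max_ms: int = 80


def next_config(
    cfg: JitterBufferConfig,
    observed_jitter_ms: float,
    users_hear_gaps: bool,
    low_jitter_ms: float = 10,  # assumed threshold for "low and stable"
    floor_max_ms: int = 40,     # assumed lowest maximum worth trying
) -> JitterBufferConfig:
    """Suggest the next setting to test, changing one thing at a time."""
    if users_hear_gaps:
        # Robot voice or gaps: raise the minimum by 10 ms and retest.
        new_min = cfg.min_ms + 10
        return JitterBufferConfig(new_min, max(cfg.max_ms, new_min))
    if observed_jitter_ms < low_jitter_ms and cfg.max_ms > floor_max_ms:
        # Quiet path: trim the maximum to cut mouth-to-ear delay.
        new_max = max(floor_max_ms, cfg.max_ms - 10, cfg.min_ms)
        return JitterBufferConfig(cfg.min_ms, new_max)
    return cfg


print(next_config(JitterBufferConfig(), observed_jitter_ms=5, users_hear_gaps=False))
print(next_config(JitterBufferConfig(), observed_jitter_ms=25, users_hear_gaps=True))
```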

4) WAN shaping and queue math

Apply FQ-CoDel or PIE on egress. Set shaper to ~90–95% of ISP rate to prevent queue buildup at the modem. Reserve bandwidth for EF equal to ConcurrentCalls × per-call kbps + 25% headroom.

Example

20 calls on G.711 at ~90 kbps each → ~1.8 Mbps of RTP each way. Adding 25% headroom gives ~2.25 Mbps; round up and reserve ~2.5 Mbps for EF. Shape the link to 90% and cap bulk queues.
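The reservation formula can be checked with a few lines. Using the ~90 kbps per-call figure, it gives about 2.25 Mbps for 20 calls, which the example rounds up to ~2.5 Mbps. The function names and the 10 Mbps and 5 Mbps uplinks are hypothetical; the 30% cap is the upper end of the priority-queue cap suggested earlier.

```python
def ef_reservation_kbps(concurrent_calls: int, per_call_kbps: float, headroom: float = 0.25) -> float:
    """ConcurrentCalls x per-call kbps, plus headroom."""
    return concurrent_calls * per_call_kbps * (1 + headroom)


def fits_priority_cap(reservation_kbps: float, link_kbps: float, cap: float = 0.30) -> bool:
    """True if the EF reservation stays within the priority-queue cap."""
    return reservation_kbps <= link_kbps * cap


reserve = ef_reservation_kbps(20, 90)
print(reserve, "kbps reserved for EF")
print("fits a 10 Mbps uplink:", fits_priority_cap(reserve, 10_000))
print("fits a 5 Mbps uplink:", fits_priority_cap(reserve, 5_000))
```

If the reservation does not fit under the cap, the choices are a faster uplink, fewer concurrent calls on that link, or a lower-bitrate codec.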

Tuning Area Setting Why
DSCP EF (46) on RTP Short queue time
Queue Strict priority w/ cap No starvation, low delay
Codec Opus (10–20 ms) or G.711 (20 ms) Low encode + network delay
Jitter buffer 20–60 ms adaptive Smooths variance, avoids extra lag
WAN SQM FQ-CoDel/PIE @ 90–95% Kills bufferbloat

Should I use VLANs, SD-WAN, or dual ISPs to lower VoIP latency?

Structure beats hope.
Separate voice, steer around congestion, and keep a spare road.
Do it once, then sleep better.

Yes—use a voice VLAN to control QoS, consider SD-WAN for path steering, and add dual ISPs for redundancy. These reduce latency variation and prevent long outages from killing calls.

Figure: Voice Data VLAN QoS (data VLAN and voice VLAN separated, with LLDP-MED and DSCP QoS)

1) Voice VLANs

Create a Voice VLAN with its own gateway, DHCP, and ACLs. Trust DSCP at the access switch for that VLAN. Keep phones off chatty data broadcasts. LLDP-MED can place phones on the right VLAN automatically. This isolation makes QoS honest, keeps multicast paging clean, and reduces contention with large data flows.

2) SD-WAN and path control

SD-WAN measures loss, jitter, and delay per path and can move new RTP flows to the cleaner link in seconds. It can also bond links or fail over when a link degrades. Ensure your SD-WAN preserves DSCP and does not reorder voice packets. Use policies: voice prefers low-latency, high-availability paths; data prefers cheap bandwidth.
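A voice-first path policy can be sketched as "filter by limits, then pick the best score". The scoring weights below are hypothetical and not taken from any SD-WAN product; the limits reuse the "bad" thresholds from the metrics table in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PathStats:
    name: str
    one_way_ms: float
    jitter_ms: float
    loss_pct: float


def voice_score(path: PathStats) -> float:
    """Lower is better. Weights are illustrative assumptions."""
    return path.one_way_ms + 2 * path.jitter_ms + 50 * path.loss_pct


def pick_voice_path(paths, max_one_way_ms=150, max_jitter_ms=30, max_loss_pct=1.0) -> PathStats:
    """Prefer paths within limits; fall back to the least-bad path if none qualify."""
    eligible = [
        p for p in paths
        if p.one_way_ms <= max_one_way_ms
        and p.jitter_ms <= max_jitter_ms
        and p.loss_pct <= max_loss_pct
    ]
    return min(eligible or list(paths), key=voice_score)


paths = [
    PathStats("fiber", one_way_ms=30, jitter_ms=5, loss_pct=0.1),
    PathStats("cable", one_way_ms=20, jitter_ms=35, loss_pct=0.5),
]
print(pick_voice_path(paths).name)
```

In the example, the cable path has the lower delay but is excluded for jitter, which mirrors the point that steady paths beat fast but erratic ones for voice.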

3) Dual ISPs and diversity

Two links beat one, but only if they are diverse. Use different media (fiber + cable, fiber + fixed wireless) and different last-mile paths. Add automatic failover (SD-WAN or BGP). Anchor calls on an SBC so a WAN flip does not drop handset legs mid-call. Test failover quarterly with real calls.

4) Avoid hairpins

If you use VPN for security, avoid routing RTP through HQ by default. Allow direct media to the nearest SBC or cloud POP. Set split tunneling with care so SIP/TLS and RTP take the shortest, monitored path.

5) MTU and tunnels

Tunnels reduce the effective MTU. If call setup succeeds but audio drops or stalls, or large SIP messages go missing, check MSS clamping (which affects TCP only, such as SIP over TCP/TLS) and IP fragmentation. Align MTU across sites. A single bad clamp adds delay through retries and fragmentation.
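The MTU arithmetic is easy to check by hand or in code. In the sketch below, the 60-byte tunnel overhead is a hypothetical figure; use the value documented for your tunnel type. Header sizes assume IPv4 without options.

```python
IPV4_HEADER = 20
TCP_HEADER = 20
UDP_HEADER = 8


def effective_mtu(link_mtu: int, tunnel_overhead_bytes: int) -> int:
    """MTU left for inner packets after tunnel encapsulation."""
    return link_mtu - tunnel_overhead_bytes


def tcp_mss(mtu: int) -> int:
    """Largest TCP segment that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def udp_fragments(udp_payload_bytes: int, mtu: int) -> bool:
    """True if a UDP datagram of this payload size exceeds the MTU."""
    return udp_payload_bytes + UDP_HEADER + IPV4_HEADER > mtu


mtu = effective_mtu(1500, 60)  # assumed 60 bytes of tunnel overhead
print("inner MTU:", mtu, "clamp MSS to:", tcp_mss(mtu))
print("G.711 20 ms RTP (172 bytes) fragments:", udp_fragments(172, mtu))
print("1500-byte SIP message over UDP fragments:", udp_fragments(1500, mtu))
```

Typical RTP packets are far below any tunnel MTU, so fragmentation trouble usually shows up on large signaling messages or on other traffic sharing the tunnel.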

| Design Choice | Latency Impact | Complexity | Note |
| --- | --- | --- | --- |
| Voice VLAN | Lowers jitter/delay | Low | Easiest win |
| SD-WAN | Steers away from bad paths | Medium | Needs tuning |
| Dual ISPs | Prevents outages, lowers variation | Medium | Use diverse media |
| SBC anchoring | Saves mid-call during failover | Medium | Keep public reachability |
| MTU alignment | Removes hidden stalls | Low | Set once, verify often |

Conclusion

Lower latency comes from short paths, small queues, sane packet sizes, and right-sized buffers. Wire what you can, shape the WAN, trust DSCP, and keep a clean, local media route.


Footnotes

  1. SIP specification (RFC 3261): signaling timers, retransmits, and why setup “lag” differs from media delay.
  2. RTP specification (RFC 3550): real-time media transport, timing sensitivity, and why delay and loss directly affect voice.
  3. CoDel (RFC 8289): how AQM controls the excess delay that bufferbloat creates in router queues.
  4. FQ-CoDel (RFC 8290): fair queuing plus AQM to reduce latency under load on busy uplinks.
  5. ITU-T Recommendation G.114: guidance on one-way delay thresholds for interactive speech and when user experience degrades.
  6. DS field definition (RFC 2474): the IP header field that carries DSCP markings used for end-to-end QoS classification.
  7. Expedited Forwarding PHB (RFC 3246): forwarding behavior for low-delay, low-jitter, low-loss traffic like voice.

About The Author
DJSLink R&D Team

DJSLink is China's top manufacturer and factory for SIP audio and video communication solutions.
Over the past 15 years, we have not only provided reliable, secure, clear, high-quality audio and video products and services, but we also take care of the delivery of your projects, ensuring your success in the local market and helping you to build a strong reputation.
