Calls break when the network is busy. Voices clip. Customers get angry. Quality of Service (QoS) [1] stops this. It gives voice first place when links fight for bandwidth.
QoS (Quality of Service) is a set of rules that marks, queues, and shapes packets so SIP and RTP get low delay, low loss, and stable jitter—even when the network is congested. It protects call quality.

Now I will show how QoS works end to end. I will explain how to mark SIP and RTP. I will compare DSCP, WMM, and VLANs. I will list practical router, switch, and Wi-Fi settings. I will also show how to test with packet captures and MOS so you can prove the result.
How does QoS prioritize my SIP and RTP traffic?
Voice fails first under stress. Data can wait. Speech cannot. So we treat SIP and RTP as VIPs from the first hop to the last.
QoS prioritizes RTP media with strict priority and gives SIP signaling high (but lower) priority. It does this with Differentiated Services Code Point (DSCP) [2] marks, class maps, and queues on every device along the path.

The logic: media over signaling
RTP carries the audio. SIP sets up and tears down the call. When links clog, late media sounds bad immediately. Late signaling hurts setup, but not ongoing talk. So I typically:
- Mark RTP (and often RTCP) as Expedited Forwarding (EF) per-hop behavior [3].
- Mark SIP as CS3 (24) or Assured Forwarding (AF) per-hop behavior [4] (commonly AF31).
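To make "marking" concrete, here is a minimal Python sketch of an endpoint setting DSCP on its own UDP socket. It assumes an IPv4 socket on an OS that honors `IP_TOS` (Linux does; some platforms ignore or restrict it). The helper names are illustrative and not taken from any phone or PBX API. The DSCP occupies the upper six bits of the ToS/DS byte, so EF (46) appears as 184 in that byte.

```python
import socket

# Code points named in the text: EF for RTP, CS3 or AF31 for SIP.
DSCP_EF = 46
DSCP_CS3 = 24
DSCP_AF31 = 26


def dscp_to_tos(dscp: int) -> int:
    """Shift the 6-bit DSCP into the upper bits of the 8-bit ToS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    return dscp << 2


def open_marked_udp_socket(dscp: int) -> socket.socket:
    """Create an IPv4 UDP socket whose outgoing packets request the given DSCP.

    Assumption: the OS applies IP_TOS to outgoing packets.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

A media sender would call `open_marked_udp_socket(DSCP_EF)` for RTP. The network still decides whether to trust that mark, which is the next topic.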
Trust boundary and remarking
I define a “trust boundary” at the access switch (or the first hop router). Only known IP phones, SIP intercoms, PBXs, or SBCs can set DSCP. If anything else tries to claim EF, I remark it down or police it. This prevents queue starvation and stops “fake voice” traffic from abusing priority.
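The trust decision itself is small. The sketch below models it in Python, assuming a hypothetical voice subnet (`10.20.0.0/24`) that stands in for your known phones, PBXs, and SBCs. Real switches do this with trust settings and policy maps, so treat it as a model of the logic, not device configuration.

```python
import ipaddress

# Hypothetical trusted sources; replace with your own voice subnets.
TRUSTED_VOICE_NETS = [ipaddress.ip_network("10.20.0.0/24")]

BEST_EFFORT = 0


def remark(src_ip: str, dscp: int) -> int:
    """Keep the DSCP from trusted voice devices; reset everything else to 0."""
    addr = ipaddress.ip_address(src_ip)
    trusted = any(addr in net for net in TRUSTED_VOICE_NETS)
    return dscp if trusted else BEST_EFFORT
```

With this rule, a laptop that claims EF is treated as best effort, while a phone in the voice subnet keeps its mark.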
Queuing model that actually works
I use Low Latency Queueing (LLQ) [5], a strict-priority queue, for EF voice and weighted fair queueing (WFQ) for everything else:
- EF (RTP) gets a strict low-latency queue (LLQ).
- SIP gets a high WFQ class (important, but not strict priority).
- Best-effort stays default.
- Bulk/backup goes to a scavenger class (lowest weight, higher drop).
I size the EF queue to real voice load (codec + overhead + headroom), then police EF so it cannot starve other queues.
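Policing is usually described as a token bucket. The Python sketch below models that idea so the behavior is easy to see. The rate and burst values in the usage note are illustrative assumptions, and real routers implement this in hardware or in the queuing engine.

```python
class TokenBucketPolicer:
    """Single-rate token bucket: packets that exceed the budget are rejected."""

    def __init__(self, rate_bps: float, burst_bytes: int):
        self.rate_bytes_per_s = rate_bps / 8
        self.burst_bytes = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last_time = 0.0

    def allow(self, now: float, packet_bytes: int) -> bool:
        """Return True if the packet conforms; `now` is a time in seconds."""
        elapsed = max(0.0, now - self.last_time)
        self.last_time = now
        # Refill at the policed rate, never beyond the burst size.
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False
```

For example, `TokenBucketPolicer(rate_bps=80_000, burst_bytes=400)` passes two back-to-back 200-byte packets, rejects a third sent at the same instant, and passes again once the bucket refills.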
Design targets that keep calls clear
Targets are simple and widely used (the 150 ms one-way limit follows ITU-T G.114):
- One-way latency: ≤ 150 ms
- Jitter: ≤ 30 ms
- Packet loss: ≤ 1% (bursty loss is worse than average loss)
The worst leg sets the user experience, so I check both directions.
End-to-end or nothing
QoS only works when every hop honors the marks:
- Phones/intercoms mark
- Access switches trust + queue
- Core carries DSCP unchanged
- WAN edge shapes + queues
- VPN/SD-WAN copies DSCP to the outer header (otherwise the tunnel hides priority)
| Traffic Type | DSCP | Queue Class | Notes |
|---|---|---|---|
| RTP (media) | EF (46) | LLQ / strict | Size to peak voice + overhead |
| RTCP (reports) | EF (46) or CS3 | LLQ or High WFQ | Keep reports timely; vendor choice varies |
| SIP (signaling) | CS3 (24) or AF31 (26) | High WFQ | Do not use strict priority |
| Interactive control | AF21–AF23 | Medium WFQ | Optional |
| Best effort | 0 | Default | Most traffic |
| Bulk/backup | CS1 (8) | Scavenger | Lowest weight / higher drop |
Which QoS methods should I use—DSCP, WMM, or VLANs?
DSCP, WMM, and VLANs solve different layers. Use them together with clear roles.
DSCP marks priority at Layer 3 across routed networks. WMM maps priority on Wi-Fi. VLANs isolate voice and simplify policy—but VLANs alone do not create priority.

DSCP for the routed world
DSCP is the core method across routers and modern switches. EF (46) for RTP is the common standard. SIP often uses CS3 or AF31. I match classes primarily by DSCP, not ports alone: ports change, and SIP over TLS moves signaling to a different port (typically 5061) and encrypts it, so port- or payload-based matching can miss it.
WMM for Wi-Fi airtime
Wi-Fi uses Wi-Fi Multimedia (WMM) access categories [6]:
- AC_VO (voice)
- AC_VI (video)
- AC_BE (best effort)
- AC_BK (background)
I map EF → AC_VO so voice gets faster airtime access. Then I keep voice devices on 5 GHz / 6 GHz when possible, and I tune the WLAN to reduce retries and latency.
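The explicit EF → AC_VO map matters because of how defaults often work. Many devices derive the 802.11 user priority (UP) from the top three bits of the DSCP. Under that assumption, EF (46) becomes UP 5, which is AC_VI, not AC_VO. The Python sketch below shows the default and an override that sends EF to UP 6, in line with RFC 8325. Vendor defaults vary, so treat the default rule here as one common case, not a statement about your controller.

```python
# 802.11 user priority (UP) to WMM access category.
UP_TO_AC = {
    1: "AC_BK", 2: "AC_BK",
    0: "AC_BE", 3: "AC_BE",
    4: "AC_VI", 5: "AC_VI",
    6: "AC_VO", 7: "AC_VO",
}

# Explicit override so EF lands in the voice category (UP 6).
EF_OVERRIDE = {46: 6}


def access_category(dscp: int, overrides=None) -> str:
    """Map a DSCP to a WMM access category.

    Assumed default: UP is the top three bits of the DSCP (dscp >> 3),
    unless an explicit DSCP-to-UP override is supplied.
    """
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in 0..63")
    up = (overrides or {}).get(dscp, dscp >> 3)
    return UP_TO_AC[up]
```

Here `access_category(46)` yields `AC_VI`, while `access_category(46, EF_OVERRIDE)` yields `AC_VO`.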
VLANs for clean policy, not automatic priority
Voice VLANs are valuable for:
- Cleaner broadcasts
- Easier DHCP/provisioning (Option 66/160)
- LLDP-MED, PoE planning, and ACLs
But VLANs don’t “prioritize” by themselves. I still need DSCP trust + queue mapping on uplinks, and (if used) 802.1p/PCP mapping between VLAN CoS and DSCP.
WAN and tunnels
On WAN links, QoS is life support:
- Shape just below the real usable rate so your router is the bottleneck (not the ISP modem’s buffer).
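- On low-rate links, reduce serialization delay: one large data packet can hold the line while small voice packets wait behind it. Link fragmentation and interleaving (LFI) or a smaller MTU are the usual remedies.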
- For IPsec/GRE/SD-WAN, copy inner DSCP to the outer header so upstream devices can prioritize it.
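Two of these points come down to simple arithmetic. The Python sketch below computes serialization delay for a packet on a given link and a shaping rate set just below the usable rate. The function names and the default fraction are my assumptions; pick the fraction from your own measurements.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to clock one packet onto the wire, in milliseconds."""
    if link_bps <= 0:
        raise ValueError("link rate must be positive")
    return packet_bytes * 8 / link_bps * 1000


def shaped_rate_bps(usable_bps: float, fraction: float = 0.95) -> float:
    """Shaper rate set slightly below the measured usable rate."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return usable_bps * fraction
```

A 1500-byte packet on a 1 Mbps uplink takes about 12 ms to serialize, which is a large share of a 30 ms jitter target. The same packet on 10 Mbps takes about 1.2 ms, so this mostly matters on slow links.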
| Method | Layer | What it does | Use it for |
|---|---|---|---|
| DSCP | L3 end-to-end | Marks + queues in routers/switches | RTP EF, SIP CS3/AF31 |
| WMM | Wi-Fi MAC | Faster airtime access | Map EF → AC_VO |
| VLAN | L2 segmentation | Isolation, DHCP/PoE/ACL policy | Voice VLAN design + trust boundary |
What settings improve my call quality—bandwidth, queues, jitter buffers?
Good intent is not enough. You need numbers: queue sizes, rate limits, and buffers set to match codecs and links.
Give RTP a strict priority queue sized to peak voice rate. Police EF. Shape WAN links. Tune jitter buffers. Keep latency ≤150 ms, jitter ≤30 ms, and loss ≤1%.

Bandwidth planning
Start with codecs and overhead (per call, one-way, typical 20 ms ptime):
- G.711 / G.722: ~80–100 kbps
- Opus (wideband): depends on the configured bitrate cap (often 32–70 kbps with overhead)
Multiply by max concurrent calls, add ~20% headroom, then shape WAN egress so queues stay under your control.
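The per-call figure can be reproduced with a short calculation. The Python sketch below assumes IPv4, UDP, and RTP headers (20 + 8 + 12 bytes) and plain Ethernet framing (14-byte header plus 4-byte FCS, no VLAN tag, preamble, or inter-frame gap). Tunnels, SRTP, and VLAN tags add more, so adjust the overhead for your links. Function names are illustrative.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12
ETHERNET_OVERHEAD_BYTES = 14 + 4  # header + FCS; an assumption, see above


def per_call_kbps(codec_kbps: float, ptime_ms: float = 20,
                  l2_overhead_bytes: int = ETHERNET_OVERHEAD_BYTES) -> float:
    """One-way bandwidth of a single call, including header overhead."""
    packets_per_second = 1000 / ptime_ms
    payload_bytes = codec_kbps * 1000 / 8 * ptime_ms / 1000
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES + l2_overhead_bytes
    return packet_bytes * 8 * packets_per_second / 1000


def voice_budget_kbps(codec_kbps: float, max_calls: int,
                      headroom: float = 0.20) -> float:
    """Peak voice load for max_calls concurrent calls plus headroom."""
    return per_call_kbps(codec_kbps) * max_calls * (1 + headroom)
```

For G.711 (64 kbps codec rate, 20 ms ptime) this gives 80 kbps at Layer 3 and 87.2 kbps on Ethernet. Ten such calls with 20% headroom need roughly 1046 kbps in each direction.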
LLQ sizing and EF policing
- Create an EF class for RTP (and usually RTCP).
- Reserve LLQ bandwidth ≈ peak voice load + 10–20%.
- Police EF at roughly the reserved LLQ rate so it can’t starve the rest.
Jitter buffers and playout
I prefer adaptive jitter buffers with sane limits:
- Start around 20–30 ms
- Allow growth to 60–80 ms on spiky Wi-Fi/WAN
Huge fixed buffers “hide” jitter but add constant mouth-to-ear delay.
Wi-Fi tuning that matters
- Enable WMM
- Map EF to AC_VO
- Prefer 5/6 GHz, keep RSSI strong (≈ -65 dBm or better)
- Avoid overloaded APs; use more APs at sane power instead of one AP at max power
- Use roaming features (802.11k/v/r) where handoff matters—and test them
Targets and alarms
I alert on:
- EF drops (any drop is serious)
- EF queue depth that grows beyond a small steady-state threshold
- SIP retransmissions (signaling class too small or loss upstream)
| Setting | Good Starting Value | Why |
|---|---|---|
| EF queue bandwidth | Peak RTP load + 10–20% | Prevents clipping during bursts |
| EF policing | Match EF bandwidth | Stops starvation/abuse |
| SIP class share | ~2–5% of link | Fast setup without strict priority |
| Jitter buffer | 20–30 ms start, up to 60–80 ms | Smooths bursts with low extra delay |
| WAN shaping | ~90–95% of real rate | Ensures your queues (not ISP buffers) rule |
How do I test QoS with packet captures and MOS?
Do not trust a checkbox. Prove it. Test before go-live, after changes, and during incidents.
Capture packets to confirm DSCP and queue hits. Measure loss, jitter, and delay with RTCP or media probes. Translate them to MOS/R-Factor to show user impact.

Packet captures: what to check
Capture at:
- Phone/intercom access port (or SPAN)
- Access switch uplink
- WAN edge egress
Verify:
- RTP is DSCP 46 (EF)
- SIP is CS3 (24) or AF31 (26)
- Marks survive across each hop (especially tunnels)
Useful Wireshark filters:
- EF RTP: `ip.dsfield.dscp == 46`
- SIP (CS3): `ip.dsfield.dscp == 24`
- SIP (AF31): `ip.dsfield.dscp == 26`
Also check RTP sequence numbers for gaps (loss) and jitter graphs.
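If you export sequence numbers, arrival times, and RTP timestamps from a capture, both checks are easy to script. The Python sketch below counts gaps in the 16-bit sequence space and computes interarrival jitter with the running estimator from RFC 3550. It assumes packets are listed in arrival order, an 8 kHz RTP clock (as with G.711), and no 32-bit timestamp wrap within the sample. Function names are illustrative.

```python
from typing import Sequence, Tuple


def count_lost(seqs: Sequence[int]) -> int:
    """Count packets missing between consecutive RTP sequence numbers."""
    lost = 0
    for prev, cur in zip(seqs, seqs[1:]):
        gap = (cur - prev) % 65536  # handles wrap from 65535 to 0
        # Very large gaps are treated as reordering, not loss.
        if 1 < gap < 32768:
            lost += gap - 1
    return lost


def interarrival_jitter_ms(packets: Sequence[Tuple[float, int]],
                           clock_rate: int = 8000) -> float:
    """RFC 3550 jitter estimate from (arrival_seconds, rtp_timestamp) pairs."""
    jitter = 0.0  # kept in RTP timestamp units
    for (r_prev, s_prev), (r_cur, s_cur) in zip(packets, packets[1:]):
        # Difference in transit time between two consecutive packets.
        d = (r_cur - r_prev) * clock_rate - (s_cur - s_prev)
        jitter += (abs(d) - jitter) / 16
    return jitter / clock_rate * 1000
```

Perfectly paced 20 ms packets give a jitter of 0 ms. A single packet arriving 16 ms late moves the estimate to 1 ms, because the estimator smooths each deviation by a factor of 16.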
Queue/policy verification on network gear
Device counters should show:
- Packets matching EF/SIP classes
- Zero EF drops under design load
- Predictable queue depth (not runaway)
Then add background load (uploads/backups) and confirm voice stays clean.
RTCP, R-Factor, and MOS
RTCP gives you loss/jitter/delay from real calls. I track MOS by site and by hour:
- On clean LAN/WAN, MOS ≥ 4.0 is common for G.711/G.722.
- If MOS dips, I correlate with EF drops, WAN saturation, or Wi-Fi retries.
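Many tools report an R-Factor and convert it to MOS. The Python sketch below uses the conversion formula from the ITU-T G.107 E-model. It assumes the R value is already supplied by your monitoring tool or SBC; it does not derive R from loss, jitter, and delay.

```python
def r_factor_to_mos(r: float) -> float:
    """Convert an E-model R-Factor to an estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    mos = 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
    # The polynomial dips slightly below 1 for very small R; clamp it.
    return max(1.0, mos)
```

An R of about 93 maps to a MOS near 4.4, and an R of 80 maps to a MOS near 4.0, which is why R values in the 80s and above usually indicate calls that sound clean.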
Call Admission Control (CAC)
QoS cannot create bandwidth. If a link can handle 8 calls, I allow only 6–7 with CAC. This keeps EF inside budget and prevents “priority overload.”
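The admission logic is a counter with a ceiling. The Python sketch below models it with a hypothetical `CallAdmissionControl` class. A real PBX or SBC exposes this as a per-trunk or per-site call limit, so this only illustrates the behavior.

```python
class CallAdmissionControl:
    """Admit calls only while the active count is below the limit."""

    def __init__(self, max_calls: int):
        if max_calls < 0:
            raise ValueError("max_calls must not be negative")
        self.max_calls = max_calls
        self.active_calls = 0

    def try_admit(self) -> bool:
        """Return True and count the call, or False if the link is full."""
        if self.active_calls >= self.max_calls:
            return False
        self.active_calls += 1
        return True

    def release(self) -> None:
        """Free one slot when a call ends."""
        if self.active_calls > 0:
            self.active_calls -= 1
```

On a link sized for 8 calls, `CallAdmissionControl(7)` admits seven calls and rejects the eighth until one of them ends.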
Test plan checklist
- Place a test call and run a heavy upload. Confirm audio stays clean.
- Verify DSCP marks at each hop (including tunnel outer headers).
- Confirm EF drops stay zero under expected load.
- Collect RTCP/MOS stats for a week and inspect the worst hour, not just averages.
- Test roaming/failover if wireless or dual WAN exists.
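The per-hop DSCP check can be scripted once you have the IPv4 header bytes from each capture point. The Python sketch below reads the DSCP from a raw IPv4 header and lists the capture points where the mark differs from what you expect. The capture point names are illustrative, and extracting the header bytes from a pcap file is left to your capture tool.

```python
from typing import List, Sequence, Tuple


def dscp_from_ipv4_header(header: bytes) -> int:
    """Read the DSCP from the second byte of an IPv4 header."""
    if len(header) < 20 or header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    return header[1] >> 2  # upper six bits of the ToS/DS byte


def find_remarked_hops(observed: Sequence[Tuple[str, int]],
                       expected_dscp: int) -> List[str]:
    """Return the capture points whose observed DSCP is not the expected one."""
    return [hop for hop, dscp in observed if dscp != expected_dscp]
```

If RTP shows DSCP 46 at the access port and the switch uplink but 0 at the WAN egress, `find_remarked_hops` returns only the WAN egress, which points at the device that rewrote or dropped the mark.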
| Evidence | Tool | Pass Criteria |
|---|---|---|
| DSCP marking | Wireshark/tcpdump | EF (46) on RTP end-to-end |
| Queue behavior | Router/switch counters | EF has no drops under design load |
| Loss/jitter/delay | RTCP/SBC stats | ≤1% loss, ≤30 ms jitter, ≤150 ms one-way |
| MOS/R-Factor | Monitoring/dashboard | MOS ≥3.8 typical business target |
| CAC behavior | PBX/SBC logs | Excess calls blocked before EF overload |
Conclusion
QoS is simple when roles are clear: mark RTP as EF, protect it with LLQ, police it, trust only at the edge, map EF to AC_VO with WMM on Wi-Fi, preserve marks through tunnels, and prove everything with packet captures, RTCP, and MOS.
Footnotes
[1] Background and key concepts for prioritizing time-sensitive traffic when networks are congested.
[2] Explains DSCP marking in IP headers so routers and switches can classify traffic consistently.
[3] Defines EF behavior used for low-loss, low-jitter voice media treatment across QoS-enabled networks.
[4] Describes the Assured Forwarding PHB groups commonly used for high-priority signaling like SIP.
[5] Shows how LLQ/priority queuing is implemented to protect latency-sensitive voice on routers.
[6] Maps voice traffic into Wi-Fi airtime priority categories so VoIP competes better on busy WLANs.








