What is the User Datagram Protocol (UDP)?

When voice calls cut out or sound robotic, the cause is often deeper than the PBX. It usually lives at the transport layer, where UDP quietly moves your packets.

UDP (User Datagram Protocol) is a lightweight, connectionless transport protocol. It sends datagrams without handshakes or retransmissions, trading reliability for low latency. That makes it a good fit for real-time traffic such as VoIP, live streaming, and gaming, and for short request/response exchanges such as DNS.

For the underlying specification, see IETF RFC 768: User Datagram Protocol ¹.

Figure: Conceptual overview of UDP for real-time messaging

In VoIP systems, UDP is the common path for both SIP signaling and RTP media. It skips the heavy “are you there?” overhead of TCP and simply pushes packets out as fast as the network allows. That is great for delay-sensitive voice, but it forces phones, PBXs, and jitter buffers to handle loss and jitter on their own.

How does UDP differ from TCP for VoIP calls?

When people hear “unreliable UDP”, they often wonder why we use it for phone calls instead of the safer TCP. It feels wrong at first.

UDP sends small packets with no connection, no retransmission, and no ordering, while TCP adds three-way handshakes, streams, retransmissions, and flow control. For VoIP, UDP’s low delay usually beats TCP’s reliability.

Figure: Settings matrix for UDP and TCP connection features

TCP vs UDP at a glance for voice

TCP and UDP both sit on top of IP, but they behave very differently:

Feature	UDP	TCP
Connection setup	None (connectionless)	Three-way handshake
Delivery guarantee	Best effort only	Reliable, retransmits lost data
Ordering	Not guaranteed	In-order byte stream
Header size	8 bytes	20+ bytes
Congestion control	None built-in	Built-in (slow start, backoff, etc.)
Typical VoIP use	SIP signaling, RTP / SRTP media	Some SIP trunks, TLS, large SIP messages

For VoIP, the key trade-off is simple:

TCP will retry lost packets and keep them in order, but those retries add delay.
UDP will drop lost packets and move on, keeping latency low but allowing gaps.

In a file transfer, you want every byte. In a live call, you want fresh audio more than you want perfect recovery of old packets.
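To make the "no connection" point concrete, here is a minimal sketch using Python's standard socket module on the loopback interface. The function name udp_roundtrip is made up for this illustration. Note that there is no listen(), accept(), or connect() call anywhere: the first packet on the wire already carries data.

```python
import socket


def udp_roundtrip(payload: bytes) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    # Receiver: bind to an ephemeral loopback port. No listen() or accept().
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))
    # UDP never reports a lost packet, so do not block forever waiting for one.
    receiver.settimeout(1.0)

    # Sender: no connect() and no handshake before the data goes out.
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(payload, receiver.getsockname())
        data, _sender_addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

On a real network the recvfrom() call could simply time out, and nothing in UDP would tell the sender. That is the gap the rest of the VoIP stack has to cover.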

UDP for RTP media and often for SIP

In a normal VoIP call:

The voice travels as Real-time Transport Protocol (RTP) ⁴ or SRTP over UDP. Each packet carries, for example, 20 ms of audio.
If a packet is lost, the receiver does not request a resend. It uses packet loss concealment instead.
If packets arrive a bit late, the jitter buffer can smooth them out.
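The "20 ms of audio" figure also sets the packet rate and the bandwidth of a call. The sketch below does that arithmetic under stated assumptions: a 64 kbit/s codec such as G.711, IPv4 without options (20 bytes), UDP (8 bytes), and a basic RTP header (12 bytes). Ethernet framing and SRTP authentication tags are left out, and the function name is hypothetical.

```python
def rtp_stream_cost(codec_bps: int = 64_000, ptime_ms: int = 20,
                    ip_header: int = 20, udp_header: int = 8,
                    rtp_header: int = 12) -> dict:
    """Estimate packet size, packet rate, and IP-level bitrate of one RTP stream."""
    # Audio bytes per packet: bits per second * seconds per packet / 8.
    payload_bytes = codec_bps * ptime_ms // 8000
    packet_bytes = payload_bytes + ip_header + udp_header + rtp_header
    packets_per_second = 1000 / ptime_ms
    return {
        "payload_bytes": payload_bytes,
        "packet_bytes": packet_bytes,
        "packets_per_second": packets_per_second,
        "ip_bitrate_bps": packet_bytes * 8 * packets_per_second,
    }
```

With the defaults this gives 160 bytes of audio in a 200-byte IP packet, 50 packets per second, and 80 kbit/s in each direction. Headers are a fifth of every packet, which is one reason a small transport header matters for voice.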

For SIP signaling, many phones also use UDP by default:

SIP messages are small.
Calls do not need constant heavy signaling once they are set up.
Timeouts and retries happen at the SIP layer, not in TCP.
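The SIP-layer retries in the last point follow a simple doubling schedule. The sketch below reproduces it for an unanswered INVITE over UDP, assuming the default timers from RFC 3261 (T1 = 500 ms, transaction timeout at 64 × T1). Real phones often expose these timers as settings, so treat the numbers as defaults, not as what your device does. The function name is made up for this example.

```python
def invite_retransmit_offsets(t1: float = 0.5) -> list:
    """Seconds after the first send at which an unanswered INVITE is resent over UDP."""
    timeout = 64 * t1   # Timer B: the client gives up on the transaction
    interval = t1       # Timer A: starts at T1 and doubles after every send
    elapsed = interval
    offsets = []
    while elapsed < timeout:
        offsets.append(elapsed)
        interval *= 2
        elapsed += interval
    return offsets
```

With the default T1 this returns [0.5, 1.5, 3.5, 7.5, 15.5, 31.5], and the transaction fails at 32 seconds if nothing comes back.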

Some providers and PBXs prefer SIP over TCP or TLS, especially when messages are large (for example, many contacts in a single REGISTER) or when they need reliable transport over long-haul links. But even then, the actual voice media almost always stays on UDP.

For real-time voice, UDP is like a fast but unforgiving road. It gets your packets there quickly, but it will not go back if something falls off. That job belongs to jitter buffers, codecs, and smart design.

Why do many SIP phones use UDP by default?

When you open the web configuration page of a SIP phone and see “Transport: UDP / TCP / TLS”, the default is usually UDP. This is not laziness; it is a design choice.

Most SIP phones default to UDP because it has low overhead, simple behavior for small messages, broad carrier support, and predictable performance in large deployments, especially when media already uses UDP.

Figure: Office IP phone with on-screen VoIP control panel ⁵

Reasons vendors like UDP for SIP

Several practical reasons push SIP devices toward UDP by default:

Lower overhead: SIP is text-based, and common messages such as REGISTER, INVITE, 200 OK, and BYE are often small. Sending them over UDP avoids TCP handshakes and connection state.
Scalability on the PBX side: A large IP PBX or SIP proxy must handle thousands of phones. Maintaining thousands of TCP connections costs memory and state. UDP requests are simpler: they arrive, they get processed, they finish.
Fast failover: If a proxy or PBX fails, phones using UDP detect the missing responses through SIP-layer timers and retry against another server. With TCP, failure detection may depend on keepalives or OS-level socket state.
Alignment with RTP: Voice packets already use UDP. Keeping both signaling and media on UDP can simplify firewall and NAT configuration, especially in small networks.

Here is a quick summary:

Aspect	UDP for SIP	TCP/TLS for SIP
Setup time	Minimal	Needs handshake
Server state	Lower (no per-connection socket)	Higher (per-connection state)
Message reliability	SIP handles retries	TCP handles retransmissions
Typical default	Phones and intercoms	Trunks, encrypted signaling, large messages

The downsides you still need to manage

UDP’s simplicity comes with some costs:

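Larger SIP messages can be fragmented at the IP level, which is fragile: if any one fragment is lost, the whole message is lost. RFC 3261 calls for a congestion-controlled transport such as TCP when a request exceeds 1300 bytes and the path MTU is unknown.
Firewalls and NAT devices may close idle UDP mappings quickly, so phones need keepalives.
SIP over UDP is sent in clear text, which is not acceptable in some environments. Encrypting the signaling requires TLS, which runs over TCP.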
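The fragmentation risk in the first point can be checked before a message is sent. The sketch below applies the size rule from RFC 3261: use a reliable transport when a request is larger than 1300 bytes and the path MTU is unknown, or when it comes within 200 bytes of a known path MTU. The function and constant names are hypothetical, and a real SIP stack makes this decision internally.

```python
from typing import Optional

UDP_SIZE_LIMIT = 1300  # bytes; RFC 3261 threshold when the path MTU is unknown
MTU_MARGIN = 200       # bytes of headroom to keep below a known path MTU


def needs_reliable_transport(message: bytes, path_mtu: Optional[int] = None) -> bool:
    """Return True when a SIP request is too large to send safely over UDP."""
    size = len(message)
    if path_mtu is None:
        return size > UDP_SIZE_LIMIT
    return size > path_mtu - MTU_MARGIN
```

A short REGISTER stays on UDP, while an INVITE with a long SDP body and many headers can cross the limit and should go over TCP or TLS.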

That is why many modern deployments mix modes:

Internal devices use UDP for SIP within a protected LAN or VPN.
External trunks and remote workers use TCP or TLS for better reliability and security.
Media remains on RTP/SRTP over UDP everywhere.

So when you see “UDP” as the default transport on SIP phones, it is not because engineers forgot about TCP. It is because for small, frequent control messages in a voice network, UDP is often the cleanest starting point.

How do jitter buffers handle UDP packet loss?

Because UDP does not fix loss or reorder packets, many people assume the phone is helpless when packets disappear. Yet calls often sound fine even with a bit of loss.

Jitter buffers smooth out delay variations by holding UDP voice packets briefly before playback, while packet loss concealment and error-recovery tricks mask missing packets so users hear continuous, natural audio.

Figure: Diagram of packet routing through a multi-port UDP/TCP gateway device ⁶

Jitter vs packet loss: two different problems

First, keep the two issues separate:

Jitter: packets arrive at uneven intervals. Some are early, some are late.
Loss: packets never arrive or arrive so late they are useless.

UDP itself does not care about either. It just delivers whatever it can, as fast as it can.

A jitter buffer sits between the network and the decoder. It collects a small number of audio packets, then plays them out at a steady pace. This turns jittery arrival times into smooth playback.
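Before a device can size a jitter buffer, it needs a number for jitter. A common choice is the running interarrival jitter estimate defined in the RTP specification (RFC 3550), which compares the spacing of arrivals with the spacing of RTP timestamps. The sketch below assumes an 8 kHz RTP clock (typical for narrowband codecs) and ignores timestamp wraparound. The function name and the input format are made up for this example.

```python
def interarrival_jitter_ms(packets, clock_rate: int = 8000) -> float:
    """Running jitter estimate in milliseconds.

    packets: iterable of (arrival_time_seconds, rtp_timestamp) in arrival order.
    """
    jitter = 0.0
    previous = None
    for arrival_s, rtp_ts in packets:
        arrival_units = arrival_s * clock_rate  # arrival time in RTP clock units
        if previous is not None:
            # Difference between arrival spacing and timestamp spacing.
            d = (arrival_units - previous[0]) - (rtp_ts - previous[1])
            # Smooth with gain 1/16, as in RFC 3550.
            jitter += (abs(d) - jitter) / 16
        previous = (arrival_units, rtp_ts)
    return jitter / clock_rate * 1000
```

Perfectly paced packets give 0 ms. A single packet that arrives 10 ms late moves the estimate by only 0.625 ms, because the 1/16 gain keeps one outlier from dominating.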

What jitter buffers actually do

Most VoIP phones, intercoms, and gateways support:

Fixed jitter buffer: holds, for example, 60 ms of audio before playback starts and keeps that delay steady.
Adaptive jitter buffer: grows or shrinks based on network behavior to balance delay and smoothness.

Typical behavior:

The device receives RTP packets over UDP.
It stores them in time order in the jitter buffer.
It starts playback after a small delay (for example 40–80 ms).
If packets arrive slightly late, they may still land in time.
If a packet is too late or missing, the decoder gets a “gap”.
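The steps above can be sketched as a small simulation of a fixed jitter buffer. It is a simplified model, not how any particular phone implements it: packets are (sequence number, arrival time in ms) pairs, each sequence number gets a playout deadline, and anything missing or past its deadline becomes a gap (None). Sequence number wraparound is ignored, and all names are hypothetical.

```python
def play_out(packets, buffer_ms: int = 60, ptime_ms: int = 20) -> list:
    """Return the playout order for a fixed jitter buffer; None marks a gap.

    packets: list of (seq, arrival_ms) in the order they arrived.
    """
    if not packets:
        return []
    base_seq, base_arrival = packets[0]
    arrivals = {}
    for seq, arrival_ms in packets:
        arrivals.setdefault(seq, arrival_ms)  # keep the first copy, drop duplicates

    timeline = []
    for seq in range(min(arrivals), max(arrivals) + 1):
        # Playback starts buffer_ms after the first packet, then one frame per ptime.
        deadline = base_arrival + buffer_ms + (seq - base_seq) * ptime_ms
        arrival_ms = arrivals.get(seq)
        on_time = arrival_ms is not None and arrival_ms <= deadline
        timeline.append(seq if on_time else None)
    return timeline
```

For example, play_out([(1, 0), (2, 25), (4, 70), (3, 75), (5, 200)]) returns [1, 2, 3, 4, None]. Packets 3 and 4 arrived out of order but before their deadlines, so the listener never notices. Packet 5 arrived after its slot had passed, so the decoder sees a gap.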

When there is a gap, packet loss concealment (PLC) steps in:

It can repeat the previous frame.
It can fade or interpolate between known samples.
Modern codecs like Opus have advanced PLC that can guess missing audio.
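The simplest of these strategies, repeating the previous frame, fits in a few lines. The sketch below also fades the repeated frame on each consecutive loss so that a long gap decays to silence instead of buzzing. It is a toy model with hypothetical names and a made-up decay factor; real codec PLC works on the codec's internal state and is far more sophisticated.

```python
def conceal(frames, frame_len: int = 160, decay: float = 0.5) -> list:
    """Fill gaps (None) by repeating the last good frame at reduced volume.

    frames: list where each item is a list of integer samples, or None for a gap.
    """
    output = []
    last_good = [0] * frame_len  # silence until the first real frame arrives
    gain = 1.0
    for frame in frames:
        if frame is None:
            gain *= decay  # each consecutive loss gets quieter
            output.append([int(sample * gain) for sample in last_good])
        else:
            gain = 1.0
            last_good = frame
            output.append(list(frame))
    return output
```

Given a frame of samples at level 100, two lost frames, and then a frame at level 50, the output levels are 100, 50, 25, 50. One missing frame is barely audible, but a longer run fades out.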

Some systems also use FEC (Forward Error Correction) or duplicate packets on key paths to add redundancy. This adds overhead, but it helps in lossy networks.

Limits of what jitter buffers can fix

Jitter buffers are powerful, but not magic:

If jitter grows too large, the buffer must grow too, which increases delay.
If loss is high (for example, >3–5%), audio becomes choppy or robotic.
Very bursty loss (many packets missing in a row) is harder to hide than isolated drops.
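To tell which of these cases you are in, it helps to measure both the loss rate and the longest burst. The sketch below derives both from the RTP sequence numbers that actually arrived. It assumes no sequence number wraparound and cannot see losses before the first or after the last received packet. The function name is hypothetical.

```python
def loss_stats(received_seqs) -> dict:
    """Loss percentage and longest run of consecutive missing packets."""
    seen = sorted(set(received_seqs))  # ignore duplicates and reordering
    if not seen:
        return {"expected": 0, "lost": 0, "loss_pct": 0.0, "longest_burst": 0}
    expected = seen[-1] - seen[0] + 1
    lost = expected - len(seen)
    longest_burst = 0
    for previous, current in zip(seen, seen[1:]):
        # A jump in sequence numbers means that many packets went missing.
        longest_burst = max(longest_burst, current - previous - 1)
    return {
        "expected": expected,
        "lost": lost,
        "loss_pct": 100.0 * lost / expected,
        "longest_burst": longest_burst,
    }
```

Two streams with the same loss percentage can sound very different: isolated single drops are easy for PLC to hide, while a burst of three or more 20 ms packets removes a whole syllable.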

Here is a quick view:

Problem type	Main tool	What users hear
Small jitter	Jitter buffer	Smooth audio, small added delay
Occasional loss	PLC, sometimes FEC	Maybe tiny glitches, often not noticed
Heavy loss burst	PLC and FEC (only partly effective)	Words drop, speech becomes hard to follow
High jitter	Larger jitter buffer	Audio stable, but conversation feels delayed

So jitter buffers do not “repair” UDP packets. They hide network messiness by trading a bit of delay for smooth sound. As long as loss and jitter stay within reasonable limits, users hear a clean conversation even though UDP gives no guarantees.

When should I choose UDP vs TCP/TLS for SIP?

Choosing the wrong transport for SIP can lead to odd failures, strange timeouts, or security gaps. Many teams stay on defaults and hope for the best.

Use UDP for SIP when you want low overhead and you control the network. Use TCP or TLS when messages are large, paths are long or complex, or when you need stronger reliability and encryption across the public internet.

Figure: Service map of secure SIP and data services in an IP network ⁷

Simple decision rules

You can think about it in four questions:

Is this inside a trusted LAN or VPN?
- Yes: UDP is often a good default.
- No: Consider TLS for signaling security.
Are SIP messages small and simple?
- Yes: UDP works well.
- No: TCP handles large messages better (no fragmentation).
Do you need encryption on signaling?
- Yes: Use SIP over TLS (which rides on TCP).
- No: UDP may be enough, if policy allows.
Will this serve many devices or multi-tenant traffic?
- Yes: An SBC with TCP/TLS support helps manage complexity.
- No: Direct UDP between phones and PBX can be fine.
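The first three questions can be written down as a small function. This is only a sketch of the rules above, with hypothetical names, and it gives encryption and trust priority over message size. The fourth question is about whether to put an SBC in the path, not about which transport to pick, so it is left out.

```python
def choose_sip_transport(trusted_network: bool,
                         needs_encryption: bool,
                         large_messages: bool) -> str:
    """Suggest a SIP transport from the decision rules; returns "UDP", "TCP", or "TLS"."""
    if needs_encryption or not trusted_network:
        return "TLS"  # TLS runs over TCP, so large messages are covered too
    if large_messages:
        return "TCP"  # avoids IP fragmentation of large SIP messages
    return "UDP"      # small messages on a network you control
```

A phone on an office LAN with small messages gets "UDP", while a remote worker on the public internet gets "TLS" whatever the message size. The media still travels as RTP or SRTP over UDP in every case.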

Here is a practical mapping:

Scenario	Recommended SIP transport	Why
Small office, local PBX	UDP on LAN	Simple, low overhead
Large enterprise core	Mix of UDP and TCP, often via SBC	Scalability and control
Remote workers over internet	TLS (TCP) to cloud PBX or SBC	Encryption and better NAT handling
SIP trunk to ITSP / carrier	As carrier requires (often UDP or TCP/TLS)	Interop and contract terms
High-security environment	TLS for signaling + SRTP for media	Protects metadata and content

Remember: transport and media are separate

Whatever you choose for SIP transport:

Media (RTP/SRTP) almost always uses UDP.
SIP over TCP/TLS does not mean voice uses TCP.

So the typical secure and robust stack looks like:

Inside LAN: SIP over UDP, RTP or SRTP, QoS-enabled.
Across the internet: SIP over TLS (TCP) to an SBC or hosted PBX, SRTP for media, with firewalls and NAT tuned for these flows.

In other words, you do not need to pick “UDP or TCP forever”. You use UDP where it keeps things simple and fast, and TCP/TLS where the path, size, or security requirements demand more control.

Conclusion

UDP keeps VoIP fast and simple by skipping connections and retries, while SIP, jitter buffers, and codecs handle the messy parts so your SIP PBX, IP phones, and intercoms can deliver clear real-time voice.

Footnotes

¹ The official UDP specification from the RFC Editor.
² Visual summary of UDP’s “no handshake” behavior for real-time traffic.
³ Quick UDP vs TCP feature comparison for VoIP transport decisions.
⁴ The official RTP standard describing media packet timing and sequencing.
⁵ Example SIP phone UI context for transport settings in real deployments.
⁶ Diagram-style view of packet flow concepts through a VoIP gateway endpoint.
⁷ High-level visual aid for choosing SIP transport and security options.

About The Author

DJSLink R&D Team
