When an emergency phone is down, the risk is not only “one device offline.” The risk is an unprotected area and a failed audit trail.
Refineries and terminals commonly target MTTR that fits their safety and permit reality: fast on-site restore for critical points (often within 1–4 hours) and a 24-hour swap target under an SLA for non-critical locations. The best MTTR comes from modular design, strong remote diagnostics, and a spares plan that matches Zone 1/2 work rules.

How to set MTTR for Ex telephones without guessing
MTTR is not only “repair time,” it is “restore time”
For hazardous-area devices, “repair” can be slow because:
- access permits and gas tests are needed
- hot work rules apply in some locations
- opening Ex d enclosures can require special steps
- weather and shift schedules delay work
So the right MTTR metric[^1] is often:
- time to restore service (swap)
- not time to fix the internal failure (bench repair)
This is why many sites set a strong SLA for swap and keep detailed repair work offsite.
MTTR depends on criticality and location
A phone at a loading rack, jetty, or emergency muster route is more critical than a phone in a low-risk corridor. MTTR targets should reflect that.
A practical categorization works:
- Tier 1 (critical): must be restored quickly (same shift)
- Tier 2 (important): restore within 24 hours
- Tier 3 (convenience): restore within 48–72 hours
Write MTTR targets in a way vendors can actually meet
Vendors cannot control your permit process, but they can control design and spares. A good MTTR requirement includes:
- target restore times per tier
- required modular features and tooling needs
- required diagnostics and alerting
- spare kit and RMA[^3] workflow expectations
| MTTR element | What it includes | Who controls it |
|---|---|---|
| Detection time | how fast a fault is noticed | monitoring + alarms |
| Access time | permits, travel, scaffolding | site process |
| Replace time | swap unit or module | device design + training |
| Recommission time | SIP register + call test | network + templates |
| Closure time | logs and documentation | maintenance workflow |
Once you separate these elements, it becomes obvious where to invest: monitoring, modularity, and spares.
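The split above can be sketched as a simple calculation, which makes it obvious which element dominates a given outage. The durations below are illustrative assumptions, not benchmarks:

```python
# Sketch: break restore time into the five MTTR elements from the table above.
# All durations are illustrative assumptions (minutes), not benchmarks.

mttr_elements = {
    "detection": 10,      # monitoring + alarms
    "access": 90,         # permits, travel, scaffolding
    "replace": 20,        # swap unit or module
    "recommission": 10,   # SIP register + call test
    "closure": 15,        # logs and documentation
}

total = sum(mttr_elements.values())
dominant = max(mttr_elements, key=mttr_elements.get)

print(f"restore time: {total} min, dominated by '{dominant}'")
# Here access dominates, which is exactly why swap-based restore
# (one short visit) beats in-field repair (long enclosure-open time).
```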
Next sections answer the benchmark targets, design features, diagnostics, and maintenance planning that meet MTTR in the real world.
What MTTR benchmarks do refineries and terminals expect—under 1 hour on-site or 24-hour swap under SLA?
Some buyers ask for “MTTR under 1 hour” for every device. That is rarely realistic in Zone 1 areas. Still, fast restore targets are achievable when the site plans for swap.
A realistic benchmark is: Tier 1 locations aim for restore within 1–4 hours during staffed hours, while many terminals use a 24-hour swap SLA for most points. “Under 1 hour” is achievable only when the phone is accessible, the spares are on site, and no special permit delays apply.

A practical tier-based MTTR table
| Tier | Typical location | Target restore time | How it is achieved |
|---|---|---|---|
| Tier 1 | loading racks, jetty, muster routes, control room perimeter | 1–4 hours | on-site spare units + trained tech + templates |
| Tier 2 | process units, tank farm roads | ≤24 hours | swap under SLA[^2], next-shift response |
| Tier 3 | offices, low-risk corridors | 48–72 hours | planned maintenance window |
Why “swap MTTR” is the right KPI for Ex telephones
Bench repair of an Ex d unit in the field is slow and often not allowed. A swap approach avoids:
- long enclosure open time
- re-termination work
- repeated sealing steps
The fastest safe restore is often:
- isolate power (PoE)
- remove handset module or full unit
- install spare
- confirm SIP register and test call
- record serial number and update asset log
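The last step is where audit trails usually break down. A minimal sketch of recording the swap, with illustrative field names rather than any specific CMMS schema:

```python
from datetime import datetime, timezone

def log_swap(asset_log: list, location: str, old_serial: str, new_serial: str) -> dict:
    """Record a unit swap so the asset log and the RMA trail stay consistent.
    Field names are illustrative, not a specific CMMS schema."""
    entry = {
        "location": location,
        "removed_serial": old_serial,     # this unit enters the RMA workflow
        "installed_serial": new_serial,   # the spare now in service
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "post_swap_test": "pending",      # set after SIP register + call test
    }
    asset_log.append(entry)
    return entry

log = []
entry = log_swap(log, "jetty-3", "EXP-0142", "EXP-0307")
```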
Add a detection target, not only repair
If the phone stays down for a day before anyone notices, MTTR is meaningless. A good SLA includes:
- detection within minutes (SNMP/syslog or PBX registration alarms)
- restore within hours or one day based on tier
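A two-part SLA like this can be checked mechanically per outage. The restore targets below come from the tier table; the five-minute detection target is an assumed value:

```python
from datetime import datetime, timedelta

# Restore-time targets per tier, taken from the tier table above.
RESTORE_TARGET = {1: timedelta(hours=4), 2: timedelta(hours=24), 3: timedelta(hours=72)}
DETECTION_TARGET = timedelta(minutes=5)  # "detection within minutes" (assumed value)

def sla_met(failed_at, detected_at, restored_at, tier):
    """Return (detection_ok, restore_ok) for one outage."""
    return (detected_at - failed_at <= DETECTION_TARGET,
            restored_at - failed_at <= RESTORE_TARGET[tier])

t0 = datetime(2024, 5, 1, 8, 0)
ok = sla_met(t0, t0 + timedelta(minutes=3), t0 + timedelta(hours=2), tier=1)
print(ok)  # (True, True)
```

Tracking both numbers separately shows whether a missed SLA was a monitoring problem or a site-access problem.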
Now, MTTR targets are only realistic when the phone design supports fast replacement. The next section lists design details that consistently reduce on-site time.
Which designs reduce MTTR—modular handsets/PCBs, plug-in relays, hot-swap PoE, and tool-less seals?
A phone can be “rugged” and still be slow to service. Rugged does not automatically mean maintainable.
Low MTTR designs use modular components: field-swappable handset/cord, plug-in relay or I/O modules, connectorized PCBs, and sealing designs that reduce rework. Hot-swap PoE at the switch level helps restore service quickly without touching AC power systems. Tool-less seals can help, but only if they still preserve Ex and IP[^6] integrity.

Modular handset and cord assemblies
Handset and cord failures are common. A fast swap design includes:
- connectorized handset termination
- strong strain relief that is easy to inspect
- spare handset kits that do not require deep disassembly
This can cut restore time from hours to minutes.
PCB modularity: swap modules, repair offsite
For electronics failures, the best approach is to swap the module or unit in the field and send the faulty board for bench repair offsite.
Design elements that help:
- plug-in boards with keyed connectors
- clear labeling and torque marks
- minimal “free wiring” inside the enclosure
Plug-in relays and I/O modules
I/O faults can be hard to diagnose in the field. Plug-in modules help because:
- the tech swaps the module
- the system returns fast
- deeper troubleshooting can happen offsite
Hot-swap PoE: restore power without waiting on electricians
A PoE design is naturally helpful for MTTR. A tech can move the cable to a spare switch port or swap an injector quickly. Hot-swap happens at the network level:
- redundant PoE switches on UPS
- spare ports preconfigured
- templates for MAC-based provisioning or auto-provisioning
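MAC-based provisioning is what lets a swapped unit register in minutes. A minimal sketch of generating a per-device config file; the `cfg<MAC>.cfg` filename convention and the parameter names are assumptions, since real SIP phones use vendor-specific template formats:

```python
# Sketch: MAC-based provisioning so a swapped unit registers in minutes.
# Filename pattern and parameter names are assumptions; real SIP phones
# use vendor-specific template formats.

TEMPLATE = """\
sip.server = {pbx}
sip.user = {extension}
sip.label = {location}
"""

def provision_file(mac: str, pbx: str, extension: str, location: str):
    """Return the (filename, contents) pair the provisioning server should host."""
    name = f"cfg{mac.lower().replace(':', '')}.cfg"   # common cfg<MAC> convention
    return name, TEMPLATE.format(pbx=pbx, extension=extension, location=location)

name, contents = provision_file("00:1B:44:11:3A:B7", "pbx.site.local", "4201", "loading-rack-2")
print(name)  # cfg001b44113ab7.cfg
```

Because the file is keyed to the location's extension rather than the unit's serial, the spare inherits the failed phone's identity as soon as its MAC is registered.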
Tool-less seals: helpful only when they remain compliant
Tool-less seals can reduce time, but they must not increase risk:
- gaskets must seat consistently
- compression must be controlled
- the method must be approved in the product instructions for the hazardous zone
In many Zone 1 cases, a swap approach is still faster and safer than opening the enclosure repeatedly.
| Design feature | MTTR benefit | What to verify before deployment |
|---|---|---|
| Field-swappable handset/cord | Minutes vs hours | spare kit availability and procedure |
| Modular I/O board | Fast fault isolation | module part numbers and interchangeability |
| Connectorized PCB | Shorter bench repair | revision control and compatibility |
| External test points | Faster diagnostics | clear labeling and safe access |
| PoE design | Fast power recovery | redundant switch/UPS and port templates |
Design matters, but diagnostics decide how fast the team knows what to swap. The next section focuses on alerting and remote tools that reduce wasted trips.
How do diagnostics cut repair time—self-test, SNMP/Syslog alerts, remote reboot, and spare-parts kits with RMA workflow?
The biggest MTTR killer is the “two-trip repair.” First trip to discover the issue. Second trip to bring the right part.
Diagnostics reduce MTTR by cutting detection time and eliminating guesswork. Self-test, SNMP traps, syslog alerts, remote reboot, and clear spare-part kits help technicians arrive with the right replacement and restore service in one visit. A tight RMA workflow keeps spares replenished and traceable.

Self-test and health indicators
A good Ex phone should expose:
-
boot self-test result
-
microphone/speaker loop test where possible
-
I/O test status
-
temperature or internal fault indicators if available
Even a simple “fault LED + syslog message” saves time.
SNMP and syslog: detect failure in minutes
Monitoring should cover:
- SIP registration status (from PBX or endpoint)
- link up/down
- reboot events
- over-temperature warnings if supported
- tamper or cover switch events if used
The goal is to trigger a ticket[^4] fast and include the device ID, location, and likely failure type.
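That enrichment step can be sketched as a simple mapping from monitoring event to ticket payload. The event-to-cause mapping below is an assumption for illustration, not a diagnostic standard:

```python
# Sketch: turn a monitoring event into a ticket payload that already carries
# device ID, location, and a likely failure type. The mapping is an
# illustrative assumption, not a diagnostic standard.

LIKELY_CAUSE = {
    "sip_unregistered": "device offline or network path down",
    "link_down": "cable, gland, or switch port fault",
    "reboot_loop": "power or firmware fault",
}

def make_ticket(event: str, device_id: str, location: str, tier: int) -> dict:
    return {
        "summary": f"{device_id} at {location}: {event}",
        "likely_cause": LIKELY_CAUSE.get(event, "unknown - inspect on site"),
        "priority": "critical" if tier == 1 else "normal",
        "suggested_spare": "none (check cabling first)" if event == "link_down" else "full unit",
    }

ticket = make_ticket("sip_unregistered", "exphone-17", "loading-rack-2", tier=1)
```

A ticket that already names the likely spare is what turns a two-trip repair into one.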
Remote actions: reboot and configuration checks
Remote reboot can be safe and useful when:
- the device is stuck in a software state
- the network path is stable
- the site allows remote reset procedures
Still, remote reboot should not be used as a “fix.” It should be a first-response tool with logging and limits.
Spare parts kits that prevent wrong-part trips
A good spare kit includes:
- handset + cord assembly
- gland seals and washers
- mounting hardware pack
- one full spare phone per defined quantity of installed devices (for Tier 1)
- optional I/O module if used
The kit should be tied to a clear RMA system, so the used spare is replenished quickly.
RMA workflow: keep traceability and speed
A clean RMA workflow includes:
- serial number tracking
- failure symptom code
- photos if needed
- fast replacement shipment
- repair report returned to the customer
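The record behind that workflow can be sketched in a few lines. Status names and symptom codes below are illustrative; real vendors define their own:

```python
from dataclasses import dataclass, field

# Sketch of the RMA record the list above describes. Statuses and symptom
# codes are illustrative; real vendors define their own.

@dataclass
class RMARecord:
    serial: str
    symptom_code: str              # e.g. "IO_FAULT"
    photos: list = field(default_factory=list)
    status: str = "opened"         # opened -> replacement_shipped -> repair_reported

    def ship_replacement(self, replacement_serial: str):
        self.status = "replacement_shipped"
        self.replacement_serial = replacement_serial

    def close_with_report(self, root_cause: str):
        self.status = "repair_reported"
        self.root_cause = root_cause   # returned to the customer

rma = RMARecord(serial="EXP-0142", symptom_code="IO_FAULT")
rma.ship_replacement("EXP-0307")
```

Tying the removed serial to the replacement serial is what keeps hardware batches traceable as sites expand.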
This matters in B2B projects because customers expand sites over time and want consistent hardware batches.
| Diagnostic tool | What it shortens | Practical requirement line |
|---|---|---|
| SNMP traps | detection time | “SNMP traps for link, fault, reboot” |
| Syslog 5 | troubleshooting time | “Syslog with event codes and timestamps” |
| Remote reboot | first response | “Remote reboot with audit logging” |
| Self-test | fault isolation | “Self-test results available in UI/API” |
| Spare kits | travel and wait time | “On-site spares sized per tier” |
| RMA process | replenishment time | “RMA response and repair reporting SLA” |
Now, even great tools fail when the maintenance plan is weak. The final section shows how to build a plan that meets MTTR targets consistently.
How should maintenance plans meet MTTR—spares ratio, technician training, acceptance tests, and vendor on-call support?
MTTR is a system KPI. It is not a product feature. It is the result of planning, training, and documentation.
A maintenance plan meets MTTR by stocking spares based on criticality, training technicians for safe swap procedures, running acceptance tests after every swap, and defining vendor support and escalation paths. The plan should also include templates for provisioning so a swapped unit registers and behaves correctly in minutes.

Spares ratio: match tier and site size
A practical spares model:
-
Tier 1: keep at least one full spare phone per site zone or per defined cluster
-
Tier 2: 2–3% spare units depending on fleet size
-
Tier 3: 1–2% spares
Handsets and cords often need higher spares because they are wear items:
- 3–5% handset/cord spare kits are common in harsh areas
The goal is simple: the tech should never wait for shipping to restore a critical point.
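The percentages above turn into a concrete stocking list with one small function. The upper-bound rates used here are a conservative reading of the ranges in the text:

```python
import math

# Sketch: size on-site spares from the tier percentages above. Tier 1 is
# stocked per cluster rather than per percentage, matching the text; the
# rates below take the conservative (upper) end of each range.

def spares_needed(fleet: dict, clusters_tier1: int) -> dict:
    """fleet: installed units per tier, e.g. {1: 12, 2: 80, 3: 40}."""
    return {
        1: max(clusters_tier1, 1),              # at least one full spare per cluster
        2: math.ceil(fleet.get(2, 0) * 0.03),   # 2-3% -> 3%
        3: math.ceil(fleet.get(3, 0) * 0.02),   # 1-2% -> 2%
        "handset_kits": math.ceil(sum(fleet.values()) * 0.05),  # 3-5% wear items
    }

plan = spares_needed({1: 12, 2: 80, 3: 40}, clusters_tier1=3)
print(plan)  # {1: 3, 2: 3, 3: 1, 'handset_kits': 7}
```

Rounding up with `ceil` matters: a 1.2-unit requirement still means stocking two, because a fractional spare cannot restore a phone.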
Technician training: focus on safe, repeatable steps
Training should cover:
- hazardous area access rules
- correct bonding checks after mounting
- how to run a basic call test and paging test
- how to log the swap and update asset IDs
Short, repeatable procedures reduce mistakes. They also reduce rework.
Acceptance tests: keep a short “post-swap checklist”
A post-swap acceptance test should include:
- SIP registration check
- inbound/outbound call
- paging group test if used
- relay and input check if the site uses I/O
- timestamp check (NTP synced) if logs are audited
This can be done in 5–10 minutes and saves days of later troubleshooting.
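Keeping the checklist as data makes every swap run the same steps in the same order. A sketch, where the lambdas stand in for the real test calls:

```python
# Sketch: the post-swap checklist as data, so every swap runs identical steps.
# Check names mirror the list above; in practice each callable would wrap a
# real test (SIP query, test call, relay toggle, NTP offset check).

def run_acceptance(checks: dict) -> list:
    """checks maps step name -> zero-arg callable returning True/False.
    Returns the failed steps; an empty list means the swap is accepted."""
    return [name for name, check in checks.items() if not check()]

# Illustrative stand-ins for real test calls:
checks = {
    "sip_registration": lambda: True,
    "inbound_outbound_call": lambda: True,
    "paging_group": lambda: True,
    "relay_and_input": lambda: False,   # pretend the I/O check failed
    "ntp_synced": lambda: True,
}

failed = run_acceptance(checks)
print(failed)  # ['relay_and_input']
```

A non-empty result blocks closure of the work order, which is exactly the "prevents silent misconfig" property the maintenance table asks for.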
Vendor on-call support: define escalation and artifacts
For B2B deployments, vendors should provide:
- spare part lists and lead times
- configuration templates and firmware control
- remote troubleshooting support with logs
- clear escalation for urgent failures
A good SLA includes:
- response time for critical tickets
- replacement dispatch window
- documentation of root cause after repair
| Maintenance element | Practical target | Why it supports MTTR |
|---|---|---|
| Spares stocking | Tier-based spares on site | Enables swap restoration |
| Templates | auto-provision or config backup | Cuts commissioning time |
| Training | annual refresh + new tech onboarding | Reduces errors and rework |
| Acceptance test | short checklist per swap | Prevents silent misconfig |
| Vendor support | on-call path + RMA SLA | Keeps spares replenished |
| Documentation | asset tracking[^7] and event logs | Supports audit and analytics |
The biggest MTTR win is consistency. The second biggest win is making replacement easy and safe in Zone 1/2 areas. With a modular design and a serious spares plan, “restore in the same shift” becomes achievable even in harsh terminals.
Conclusion
Refineries and terminals usually aim for 1–4 hour restore for critical points and a 24-hour swap SLA for most areas, achieved through modular design, strong diagnostics, and a tier-based spares and training plan.
Footnotes

[^1]: Mean Time To Repair/Restore, a metric used to measure the average time required to troubleshoot and repair failed equipment.
[^2]: Service Level Agreement, a contract between a service provider and a customer that defines the level of service expected.
[^3]: Return Merchandise Authorization, part of the process of returning a product to receive a refund, replacement, or repair.
[^4]: A record in an issue tracking system that documents a reported problem or request.
[^5]: Standard for message logging that allows separation of the software that generates messages, the system that stores them, and the software that reports and analyzes them.
[^6]: Ingress Protection rating, classifying the degrees of protection provided against the intrusion of solid objects, dust, accidental contact, and water in electrical enclosures.
[^7]: Asset management: the systematic process of developing, operating, maintaining, upgrading, and disposing of assets cost-effectively.