How is Insights data gathered?
Sensors in the Twilio Voice SDKs and on Twilio media gateways gather call metrics and events and send them to the Voice Insights platform for analysis and aggregation.
What are the requirements for Voice Insights data to be gathered on a call?
Why is Voice Insights focused on network metrics instead of audio quality?
In the world of Voice-over-IP, network metrics and audio quality are essentially synonymous. Our analysis of hundreds of billions of calls over more than ten years supports the theory that network transport issues are the number one contributor to reports of audio quality degradation for VoIP calls.
In the old school world of plain-old-telephone-services the only transport metric that mattered was physical continuity of copper wires buried underground, but in a VoIP telecom deployment when people talk about "choppiness" on a call what they are actually talking about is packet loss. When they talk about "noise" or "robotic speech" they are almost certainly talking about jitter. Many reports of "one-way audio" and "dropped calls" are really high post-dial delay.
We monitor and report on these metrics because without visibility into the underlying behavior of the network infrastructure it is not possible to detect, diagnose, or resolve the causes of quality issues.
What types of audio quality issues can Voice Insights not detect?
In-stream audio issues like echo, or noise that is not related to jitter or packet loss, can't be detected by Voice Insights today but should be captured in any recordings of the calls made using Twilio.
How does Twilio's silence detection work?
Twilio's media gateways can detect missing RTP streams or streams which contain only silence. We mark those calls as having contained silence in the Call Summary API response and in the Call Summary properties in Console. Note that silence detection is not the same thing as human speech detection, so a call could conceivably have no one speaking but contain enough background noise that Twilio does not mark the call as silent.
How is "Who Hung Up" determined?
We look at our external signaling edge and the direction of which party sent the SIP BYE.
Why might “Who Hung Up” not be available?
We use the source of the SIP BYE at our signaling edge to determine who hung up. From a SIP perspective, a call needs to have been answered before it can be ended via a SIP BYE, so we do not provide who hung up for calls that are canceled; however, you can infer that any unanswered call was ended by the calling party.
For Voice SDK calls, who hung up might be missing for several reasons: computer crashes, network issues, the browser being closed or crashing while the call was in progress, the client object being destroyed before the event could be sent to Twilio Insights, or any other behavior that unexpectedly prevents Twilio from receiving the final events from the SDK. These will appear as "disconnected by: unknown" when viewing Insights data in Console and aggregate reports.
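The decision described above can be sketched in a few lines. This is only an illustration of the logic, not Twilio's implementation: the function name, the `answered` flag, and the `bye_sender` values are hypothetical, and the "inferred" result for unanswered calls is the inference you can make yourself rather than a value Insights reports.

```python
from typing import Optional


def who_hung_up(answered: bool, bye_sender: Optional[str]) -> str:
    """Illustrative only: derive "who hung up" from the sender of the SIP BYE.

    `bye_sender` is a hypothetical field holding "caller", "callee", or None
    when no BYE (or no final SDK event) was observed.
    """
    if not answered:
        # A call that was never answered cannot be ended with a BYE, so no
        # value is provided; the calling party can be inferred.
        return "caller (inferred)"
    if bye_sender in ("caller", "callee"):
        return bye_sender
    # The final event never arrived, for example after a browser crash.
    return "unknown"


canceled = who_hung_up(answered=False, bye_sender=None)
normal = who_hung_up(answered=True, bye_sender="callee")
crashed = who_hung_up(answered=True, bye_sender=None)
```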
When we say “packet loss > 1% in 3 out of last 5 samples”, what’s a “sample”?
For Voice SDK calls we sample each second; for Carrier and SIP calls we sample the cumulative stats for the previous 10 seconds every 10 seconds.
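To make the "3 out of last 5 samples" rule concrete, here is a minimal sketch of a sliding-window check. It assumes each sample is a single packet loss percentage; the function name and the sample values are hypothetical and are not taken from the SDK.

```python
from collections import deque
from typing import Deque, Iterable, List


def breaches_in_window(samples: Iterable[float], threshold: float,
                       window: int = 5, min_breaches: int = 3) -> List[bool]:
    """For each sample, report whether at least `min_breaches` of the most
    recent `window` samples (including this one) exceeded `threshold`."""
    recent: Deque[bool] = deque(maxlen=window)  # oldest entry drops off automatically
    flags: List[bool] = []
    for value in samples:
        recent.append(value > threshold)
        flags.append(sum(recent) >= min_breaches)
    return flags


# Hypothetical packet loss percentages, one value per sample.
packet_loss_pct = [0.0, 1.5, 0.2, 2.0, 1.2, 0.0, 0.0, 0.0]
warning_flags = breaches_in_window(packet_loss_pct, threshold=1.0)
```

With these values the condition first becomes true on the fifth sample, once three of the five most recent samples exceed 1%, and clears again after enough clean samples enter the window.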
Isn’t 1% packet loss and 30 ms jitter a little too sensitive?
Our quality thresholds are based on ITU-T standards for VoIP quality, and do indeed lean toward sensitivity. The goal behind highlighting issues at-or-below the edges of perceptibility is to allow detection and mitigation to occur before users notice; however, we also provide cumulative metrics in the Call Summary, which allow you to craft your own internal thresholds for performance.
Why are the network and audio warnings emitted by the Voice SDKs more sensitive than the tagging thresholds?
Almost all quality issues for Voice SDK calls are due to local network conditions, which are typically caused by things like misapplied or absent quality of service (QoS), asymmetric bandwidth allocations, or bandwidth limitations. We expose a more sensitive threshold of data for Voice SDK calls so that developers can identify and respond to changing quality conditions before their users notice, by surfacing warnings in their applications and giving prescriptive instructions; e.g. "quality issues detected, try moving to a different location".
Why is Post-Dial Delay tagging based on a percentile and not a numeric threshold?
Different destinations have different expectations for post-dial delay, and accordingly varying thresholds for acceptable PDD. In the USA, for example, PDD higher than 6 seconds is typically something that could be escalated to the carrier; however, in South Africa PDD of 10 seconds is common and considered acceptable, so tagging calls to South Africa with PDD > 6 seconds would result in us tagging basically all the calls in South Africa, reducing the utility of the data to uncover outliers in performance.
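The following sketch shows why a per-destination percentile finds outliers where a fixed threshold does not. The percentile that Voice Insights actually uses is not stated here, so the `pct` parameter, the function names, and the sample data are all assumptions made for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least `pct` percent
    of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def tag_high_pdd(calls: List[Tuple[str, float]],
                 pct: float = 95.0) -> List[Tuple[str, float, bool]]:
    """Tag each (country, pdd_seconds) call whose PDD exceeds the chosen
    percentile for its own destination country."""
    by_country: Dict[str, List[float]] = defaultdict(list)
    for country, pdd in calls:
        by_country[country].append(pdd)
    cutoffs = {country: percentile(pdds, pct) for country, pdds in by_country.items()}
    return [(country, pdd, pdd > cutoffs[country]) for country, pdd in calls]


# Hypothetical post-dial delays in seconds.
calls = [("US", 2.0), ("US", 2.0), ("US", 3.0), ("US", 3.0), ("US", 7.0),
         ("ZA", 9.0), ("ZA", 10.0), ("ZA", 10.0), ("ZA", 11.0), ("ZA", 16.0)]
tagged = [call for call in tag_high_pdd(calls, pct=80.0) if call[2]]
```

In this sample a fixed 6-second threshold would tag every South African call, while the per-destination cutoff tags only the 7-second US call and the 16-second South African call.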
Any limitations of the Insights Dashboard to be aware of?
Due to GDPR, Voice Insights data is only stored for 30 days, so the Dashboard is limited to 30 days' worth of data. Because only 30 days' worth of data is available, we can only show comparisons for up to 15 days. CSV export is limited to 2000 rows.
How can I address the SIP errors?
For SIP Interface and Elastic SIP trunking calls you can view your local SIP infrastructure logs and pcaps, and compare with the Twilio public pcap available in the call resource page in Console.
Voice SDK SIP errors are often due to unexpected application behavior. We recommend enabling debug logging in your apps and reproducing the issue to understand the cause.
SIP errors that originate from the carriers are currently not actionable by you directly. If issues with a specific carrier are impacting your users, you can reach out to Twilio Support with the data to get more help.
Is this data available using REST APIs?
How long is the data retained?
Voice Insights data is retained for 30 days after the call is made.
Why don’t I see Voice Insights Metrics or Events for my calls?
Enable Voice Insights Advanced Features using the Voice Insights Settings page in Console to see call events and metrics.
Why are there missing Insights metrics/events/summary for this Voice SDK call?
Missing metrics/events/summaries can be caused by:
- local network configuration blocking publishing to the Insights events gateway
- expired tokens
- publishing delays on the Voice Insights backend
I just enabled Voice Insights advanced features. Why do I see only one day worth of data?
Advanced features data is available only while the feature is active on the account. Any calls placed before the feature is enabled will not be flagged by the Voice Insights infrastructure, and the per-interval metrics and event stream will not be stored.
How is Twilio RTP Latency calculated?
Twilio RTP latency shows the average and max Twilio-internal media stream traversal time in milliseconds. The Voice Insights platform analyzes the timestamps of when RTP packets are received at the ingress of Twilio's media gateway and compares that timestamp against the timestamp of the same packets at the egress on the other media edge.
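As a rough illustration of that comparison, the sketch below matches packets seen at both edges and reports the average and maximum traversal time. Keying packets by sequence number and using millisecond timestamps on a shared clock are assumptions for the example; the internal data format is not described here.

```python
from statistics import mean
from typing import Dict, Tuple


def traversal_stats(ingress_ms: Dict[int, float],
                    egress_ms: Dict[int, float]) -> Tuple[float, float]:
    """Return (average, max) traversal time in milliseconds for packets that
    were observed at both the ingress and the egress edge."""
    deltas = [egress_ms[seq] - ingress_ms[seq]
              for seq in ingress_ms if seq in egress_ms]
    if not deltas:
        raise ValueError("no packets were observed at both edges")
    return mean(deltas), max(deltas)


# Hypothetical timestamps in milliseconds, keyed by RTP sequence number.
ingress = {1: 0.0, 2: 20.0, 3: 40.0, 4: 60.0}
egress = {1: 35.0, 2: 60.0, 3: 70.0}  # packet 4 never reached the egress edge
avg_ms, max_ms = traversal_stats(ingress, egress)
```

Packets that appear at only one edge are skipped here, since no traversal time can be computed for them.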
What sources contribute to Twilio internal RTP traversal time?
The overwhelming majority of calls marked as having high latency are calls with participants in distant geographic locations; e.g. a Voice SDK user in the Philippines dialing into a conference being mixed in Ireland. Calls placed using a conference call flow pass through a small jitter buffer, which makes conference calls more prone to being marked as having high latency. Optimizing the region selections for Voice SDK instances and conferences is the best way to mitigate the impact of internal RTP traversal time. Ultimately, if you have users separated by large distances, some degree of Twilio-internal latency can’t be avoided.
Do I need to pay for Voice Insights advanced features to utilize the event handlers in the SDKs?
No. The SDK-level events like constant audio warnings and low MOS warnings can be instrumented without needing to pay for advanced features. Availability of the events via Console is an advanced feature.
What are some best practices for implementing these events in applications?
Implement handlers for the network-quality-warning-raised group to warn users that their local network conditions might be impacting call quality. Use the SDK to display warnings and prescriptive actions to users in the application; e.g. “check headset connection” or “move to an area with better WiFi”.
Implement handlers for audio-level-warning-raised events to show visual indication to users that their audio is not being detected by the application.
Create post-call surveys using feedback events that ask users to rate the subjective quality of experience, and correlate responses with other metrics and properties to identify commonalities in call behavior changes.
Why is the “To” information always blank for calls placed from Voice SDK?
The “To” for a Voice SDK call is the Twilio application SID; the TwiML returned from the webhook configured in the application SID will create a child call, and that child call will contain the expected “To” details.
Why do we see jitter on the incoming stream of one side of this call, but no jitter on the outgoing stream of the other call?
Twilio’s conference mixers have a small jitter buffer that can reduce or eliminate jitter received on one call before passing it along to the other calls. The tradeoff for this is a proportional increase in latency.
What’s the difference between high round trip time and high latency?
High round trip time (RTT) indicates that the measured time for packets to arrive at the Voice SDK application sensors from Twilio’s media gateway exceeded 400 ms in 3 out of the last 5 samples.
High latency indicates that the average of the Twilio internal RTP traversal time was greater than 150 ms.
This call looks like there was no jitter or packet loss on the incoming stream to Twilio, but on the outgoing stream of the child call we see some jitter and packet loss, is Twilio introducing these?
It’s possible. RTP media streams are sent via User Datagram Protocol (UDP), which is a fire-and-forget protocol with no error correction. Jitter and packet loss over UDP transport are a fact of life in any network, and calls whose media streams traverse large distances (e.g. a Voice SDK call in Singapore terminating in a conference mixed in Brazil) are more likely to be impacted by inevitable-but-rare backbone-level jitter and packet loss.
Based on the transport metrics this call should sound terrible, but the users didn’t report any problems?
It’s challenging to infer the subjective user experience from transport-level metrics alone. In general, it’s safe to say packet loss >5% is going to result in noticeably choppy audio; average jitter >5 is going to result in robotic audio, and RTT > 1000 ms is going to result in people either talking over each other or long periods of silence between speech.
Transport-level metrics like jitter and packet loss are attempts at monitoring indicators that can contribute to call quality degradation; however, browsers like Chrome have dynamic adaptive jitter buffers that can mask the impact of jitter by introducing latency. Codecs like Opus have packet loss concealment algorithms that can smooth out the impact of missing packets. At the SDK level we don’t have a good sense for the jitter buffer or packet loss concealment activities at any given time, so we report on the underlying transport metrics in the absence of subjective feedback. SIP and PSTN carrier infrastructure may also be implementing jitter buffers or transcoding to reduce quality issues.
Can we detect if a Voice SDK user is connected via VPN?
Not directly; the IP address reported to Insights would be that of the VPN IP, not the local device location. You can sometimes infer this from conversations with users; e.g. a user who is physically located in Germany but whose IP is showing up as being in Spain.
Do we know the mobile signal strength of the called party?
Not at this time. We are working on including signal strength and battery metrics in future versions of Insights for the mobile SDK, but the PLMN conditions are not available today.
How can we check the external network status? What are the tools used to check the external network for audio issues?
The metrics reported at the carrier gateway represent what Twilio received from the PSTN. Twilio’s Super Network is monitoring these connections in real-time and will raise quality degradations to carriers and file incidents as appropriate. Destination carriers often have status pages that can be checked for their local conditions as well, but they tend to be very conservative and slow to update.
Voice Insights is limited to inferring quality issues from proxy metrics like jitter and packet loss or silence detection. Voice Insights has visibility into what was sent from, and received by, our media edges; degradation introduced on the way to the destination due to poor signal strength or a carrier issue is not visible to Voice Insights today.