Twilio Conferences use a jitter buffer to smooth out irregular arrival times of the voice packets delivered to conference participants. This improves audio quality but introduces a fixed delay for each participant.
When a participant's media stream exhibits extremely high jitter, the jitter buffer may grow to compensate, and at sizes of ~250ms participants can perceive the added buffering as audio latency.
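To illustrate the trade-off between buffer size and delay, here is a minimal sketch of a fixed-delay jitter buffer. It is not Twilio's implementation (which, as noted above, can grow adaptively); the function name `late_packets` and the arrival times are hypothetical and chosen only for illustration. A larger buffer delays playout by more milliseconds but lets fewer packets miss their playout deadline.

```python
def late_packets(arrival_ms, interval_ms, buffer_ms):
    """Count packets that miss their playout deadline in a fixed-delay buffer.

    Packet i is assumed to be sent at i * interval_ms. Playout starts
    buffer_ms after the first packet arrives, and each later packet is
    due one interval after the previous one.
    """
    playout_start = arrival_ms[0] + buffer_ms
    late = 0
    for i, arrived in enumerate(arrival_ms):
        deadline = playout_start + i * interval_ms
        if arrived > deadline:
            late += 1  # too late to play; would be heard as a gap or artifact
    return late


# Hypothetical arrival times (ms) for packets sent every 20 ms.
arrivals = [50, 95, 90, 180, 130, 150]

for buffer_ms in (0, 40, 80):
    print(buffer_ms, late_packets(arrivals, 20, buffer_ms))
```

With these made-up arrival times, adding buffer delay reduces the number of late packets, at the cost of every packet being played later.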
We have added a parameter to conferences that allows the jitter buffer to be configured. The buffer size can be set explicitly, or the buffer can be disabled outright, which reduces perceived latency at the expense of potentially introducing audio artifacts. For more information, see our blog post.
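As a rough sketch of how such a setting might be applied, the snippet below builds conference TwiML using only Python's standard library. The text above does not name the parameter; the attribute name `jitterBufferSize` and the value `off` are assumptions based on Twilio's TwiML `<Conference>` documentation and should be checked against the current docs. The helper `conference_twiml` and the room name `MyRoom` are hypothetical.

```python
import xml.etree.ElementTree as ET


def conference_twiml(room_name, jitter_buffer_size=None):
    """Build a <Response><Dial><Conference> TwiML document as a string.

    jitter_buffer_size is written to the (assumed) jitterBufferSize
    attribute; when None, the attribute is omitted and the default applies.
    """
    response = ET.Element("Response")
    dial = ET.SubElement(response, "Dial")
    conference = ET.SubElement(dial, "Conference")
    conference.text = room_name
    if jitter_buffer_size is not None:
        conference.set("jitterBufferSize", jitter_buffer_size)
    return ET.tostring(response, encoding="unicode")


# Disable the jitter buffer for a hypothetical room.
print(conference_twiml("MyRoom", "off"))
```

Disabling the buffer this way would favor low latency; leaving the argument out keeps the default behavior.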