Christoph Egger, Marco Happenhofer, Joachim Fabini, Peter Reichl
May 12, 2012
The Session Initiation Protocol (SIP) relies on timer-based message retransmission to ensure reliable message transfer when deployed on top of the unreliable User Datagram Protocol (UDP) over IP. In this paper we present a detailed analysis of the impact of SIP timers on the functionality of a system and on its ability to recover from overload situations. The results of our event-based SIP simulations demonstrate that, with default SIP timer settings, message retransmissions originating from a minor short-term overload can force a system into a deterministic congestion collapse. Recovery from this severe overload is very difficult or even impossible, even if the system load is reduced substantially afterward. Our performance evaluation shows that increasing the value of timer T1 significantly enhances the stability and robustness of systems and their ability to handle overload, whereas the resulting increase in response times is relatively small, and overall system responsiveness can even improve in some cases. We propose an algorithm for implicit collapse detection as a solution for dynamic optimization of timer T1. Based on monitoring of system load and pending transaction counts, our algorithm enables intermediate SIP proxies to detect congestion at a very early stage, allowing them to counteract in time, i.e., to modify timers and reject new load in order to prevent the system from collapsing.
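To make the role of timer T1 concrete, the following minimal sketch computes when a SIP client transaction retransmits an INVITE over UDP. It assumes the standard behavior from RFC 3261: the retransmission interval starts at T1 (500 ms by default), doubles after every retransmission, and the transaction gives up after 64*T1. The function names `invite_retransmit_times` and `retransmits_within` are hypothetical and chosen for illustration only; this is not the simulation code used in the paper.

```python
def invite_retransmit_times(t1=0.5, timeout_factor=64):
    """Offsets in seconds (after the first send) at which an unanswered
    INVITE is retransmitted over UDP, assuming RFC 3261 behavior."""
    deadline = timeout_factor * t1   # transaction timeout (Timer B)
    interval = t1                    # retransmission interval (Timer A)
    elapsed = interval
    times = []
    while elapsed < deadline:
        times.append(elapsed)
        interval *= 2                # interval doubles after each retransmission
        elapsed += interval
    return times


def retransmits_within(t1, window):
    """Number of retransmissions one transaction sends in the first
    `window` seconds (hypothetical helper for comparing T1 values)."""
    return sum(1 for t in invite_retransmit_times(t1) if t <= window)


if __name__ == "__main__":
    for t1 in (0.5, 1.0, 2.0):
        print(t1, invite_retransmit_times(t1), retransmits_within(t1, 4.0))
```

Under these assumptions, an unanswered INVITE is retransmitted three times within the first four seconds with the default T1 of 0.5 s, but only once with T1 set to 2 s. This illustrates why a larger T1 adds less extra load per pending transaction during a short overload.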