Reliable Network Services

Client-Transparent Fault-Tolerant Web Service

CoRAL: Connection Replication, Application-Level Logging

Most existing techniques for increasing the availability of web services do not provide fault tolerance for requests that are being processed when a server fails. Other schemes require deterministic servers or changes to the web client. These limitations are unacceptable for many current and future applications of the Web. We have developed CoRAL, a fault tolerance scheme for web services that has none of these limitations. CoRAL is based on a hot standby backup server that actively replicates the server's TCP state and maintains logs of HTTP requests and replies. Our implementation consists of modifications to the Linux kernel and to the Apache web server, made through their respective module mechanisms.
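
The role of the request and reply logs can be illustrated with a small sketch. This is not the CoRAL implementation, which lives in Linux kernel and Apache modules. It is a simplified in-memory model under stated assumptions: the class name `BackupLog`, the integer request identifiers, and the `recover` method are all hypothetical. The sketch assumes that after a failure the backup resends a logged reply when one exists and regenerates the reply from the logged request otherwise, which is one way to avoid requiring a deterministic server.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BackupLog:
    """Hypothetical in-memory log kept by the hot standby backup."""

    requests: Dict[int, bytes] = field(default_factory=dict)
    replies: Dict[int, bytes] = field(default_factory=dict)

    def log_request(self, req_id: int, request: bytes) -> None:
        # Record the client request so it survives a primary failure.
        self.requests[req_id] = request

    def log_reply(self, req_id: int, reply: bytes) -> None:
        # Record the reply the primary generated for this request.
        self.replies[req_id] = reply

    def recover(self, req_id: int, handler: Callable[[bytes], bytes]) -> bytes:
        """Produce the reply for a request after the primary has failed.

        Assumption: a logged reply is reused as-is, so a non-deterministic
        handler never yields a second, different reply for the same request.
        """
        if req_id in self.replies:
            return self.replies[req_id]
        # No reply was logged: process the logged request on the backup.
        reply = handler(self.requests[req_id])
        self.replies[req_id] = reply
        return reply
```

In this model, a request whose reply was already logged is answered from the log without invoking the handler again, while a request that was still being processed at the time of failure is processed once on the backup.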


Client-Transparent Fault-Tolerant Video Conferencing

As video conferencing plays an increasingly critical role in many business environments, the conferencing infrastructure must operate with high reliability. We present a scheme for adding fault tolerance to an existing video conferencing server. The scheme is client-transparent, so it can be used by the installed base of clients. Although the scheme is based on replication, the associated overhead is negligible because the backup does not process the media streams. Most previous work on fault-tolerant network services has focused on transaction-oriented services. Video conferencing is an interesting test case for applying fault tolerance to other types of services, since it combines critical conference state that must be protected with media streams in which limited data loss is acceptable. Our implementation combines kernel modules with small changes to the server application to efficiently preserve both the reliable connections used for control messages and the unreliable connections used for media transfer.
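
The split between protected conference state and unprotected media can be sketched as follows. This is a minimal model, not the actual implementation, which relies on kernel modules and changes to the server application. The names `ConferenceState` and `Primary`, and the message fields `type`, `conference`, and `user`, are hypothetical. The sketch assumes that only control messages are replicated to the backup and that media packets bypass it entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class ConferenceState:
    """Hypothetical conference state: the participants of each conference."""

    participants: Dict[str, Set[str]] = field(default_factory=dict)

    def apply(self, msg: dict) -> None:
        # Update membership according to a control message.
        room = self.participants.setdefault(msg["conference"], set())
        if msg["type"] == "join":
            room.add(msg["user"])
        elif msg["type"] == "leave":
            room.discard(msg["user"])


class Primary:
    """Primary server that replicates control messages but not media."""

    def __init__(self, backup_state: ConferenceState) -> None:
        self.state = ConferenceState()
        self.backup_state = backup_state
        self.media_forwarded = 0

    def on_control(self, msg: dict) -> None:
        # Replicate to the backup first, then apply locally, so the backup
        # never lags behind state the primary has already acted on.
        self.backup_state.apply(msg)
        self.state.apply(msg)

    def on_media(self, packet: bytes) -> None:
        # Media is forwarded by the primary only; the backup never sees it,
        # which is why replication adds little overhead in this model.
        self.media_forwarded += 1
```

For example, after a sequence of join and leave messages interleaved with media packets, the backup holds the same membership as the primary even though it handled no media.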