Reliable Network Services
Client-Transparent Fault-Tolerant Web Service
CoRAL: Connection Replication, Application-Level Logging
Most existing techniques for increasing the
availability of web services do not provide
fault tolerance for requests that are being processed
when the server fails.
Other schemes require deterministic servers or changes
to the web client.
These limitations are unacceptable for many current and
future applications of the Web.
We have developed
CoRAL,
a fault tolerance scheme for web services
that avoids these limitations.
CoRAL is based on a hot standby backup server
that actively replicates the server TCP state
and maintains logs of HTTP requests and replies.
Our implementation includes modifications to
the Linux kernel and to the Apache web server, using their respective
module mechanisms.
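The role of the application-level log can be illustrated with a minimal Python sketch. It is not the CoRAL implementation, which lives in Linux kernel and Apache modules. All names here (BackupLog, log_request, log_reply, recover) are hypothetical, and the sketch assumes that each request is logged at the backup before it is processed and that each reply is logged before it is sent to the client. Under those assumptions, a reply that was already generated is re-sent from the log after a failure rather than regenerated, so the server does not have to be deterministic.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class BackupLog:
    """Hypothetical in-memory request/reply log held by the backup server."""
    requests: Dict[int, str] = field(default_factory=dict)
    replies: Dict[int, str] = field(default_factory=dict)

    def log_request(self, req_id: int, request: str) -> None:
        # Assumption: the backup records the request before the primary handles it.
        self.requests[req_id] = request

    def log_reply(self, req_id: int, reply: str) -> None:
        # Assumption: the backup records the reply before the client sees it.
        self.replies[req_id] = reply

    def recover(self, req_id: int, handler: Callable[[str], str]) -> str:
        """Called on the backup after the primary fails."""
        if req_id in self.replies:
            # A reply already exists: re-send it unchanged.
            return self.replies[req_id]
        # No reply was logged: process the logged request on the backup.
        reply = handler(self.requests[req_id])
        self.replies[req_id] = reply
        return reply


# A deliberately non-deterministic handler: every call yields a different reply.
_counter = itertools.count(1)


def handler(request: str) -> str:
    return f"{request} -> reply #{next(_counter)}"


if __name__ == "__main__":
    log = BackupLog()

    # Request 1: the primary fails after the reply was logged.
    log.log_request(1, "GET /a")
    first = handler("GET /a")
    log.log_reply(1, first)
    print(log.recover(1, handler) == first)

    # Request 2: the primary fails before producing any reply.
    log.log_request(2, "GET /b")
    print(log.recover(2, handler))
```

The sketch covers only the logging half of the scheme. Replicating the server TCP state, which lets the backup continue the client's existing connection, is outside its scope.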
Publications
- Navid Aghdaie and Yuval Tamir, Fast Transparent Failover for Reliable Web Service, Proceedings of the 15th International Conference on Parallel and Distributed Computing and Systems (PDCS 2003), Marina del Rey, CA, November 3-5, 2003.
- Navid Aghdaie and Yuval Tamir, Performance Optimizations for Transparent Fault-Tolerant Web Service, Proceedings of the 2003 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing (PACRIM 2003), Victoria, BC, Canada, August 28-30, 2003.
- Navid Aghdaie and Yuval Tamir, Implementation and Evaluation of Transparent Fault-Tolerant Web Service with Kernel-Level Support, Proceedings of the 11th International Conference on Computer Communications and Networks (ICCCN 2002), Miami, Florida, October 14-16, 2002.
- Navid Aghdaie and Yuval Tamir, Client-Transparent Fault-Tolerant Web Service, Proceedings of the 20th IEEE International Performance, Computing, and Communications Conference (IPCCC 2001), Phoenix, Arizona, April 4-6, 2001.
Client-Transparent Fault-Tolerant Video Conferencing
As video conferencing plays an increasingly critical role in many
business environments, there is a need to ensure
highly reliable operation of the conferencing infrastructure.
We present a scheme for adding fault tolerance
to an existing video conferencing server.
The scheme is client-transparent so that
it can be used by the installed base of clients.
While the scheme is based on replication,
the associated overhead is negligible
since the backup does not process
the media streams.
Most previous work on fault-tolerant network services
focused on transaction-oriented services.
Video conferencing is an interesting test case for
applying fault tolerance to other types of services,
since it combines critical
conference state that must be protected with media streams
where limited data loss is acceptable.
Our implementation combines kernel modules with
small changes to the server application
to efficiently preserve both
the reliable connections used for control messages
and the unreliable connections used for media transfer.
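The split between protected conference state and unprotected media can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation, which uses kernel modules and a modified server application. The names ConferenceState, Primary, control, and media are hypothetical, and the "join"/"leave" operations stand in for whatever control messages the real server handles. The sketch shows only why the replication overhead stays small: control operations are mirrored to the backup, while media packets never reach it.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ConferenceState:
    """Hypothetical critical state: which participants are in which conference."""
    participants: Dict[str, Set[str]] = field(default_factory=dict)

    def apply(self, op: str, conference: str, user: str) -> None:
        members = self.participants.setdefault(conference, set())
        if op == "join":
            members.add(user)
        elif op == "leave":
            members.discard(user)
        else:
            raise ValueError(f"unknown control operation: {op}")


class Primary:
    """Hypothetical primary server that mirrors control operations to a backup."""

    def __init__(self, backup: ConferenceState) -> None:
        self.state = ConferenceState()
        self.backup = backup
        self.media_packets_forwarded = 0

    def control(self, op: str, conference: str, user: str) -> None:
        # Control messages change critical state, so the backup sees every one.
        self.backup.apply(op, conference, user)
        self.state.apply(op, conference, user)

    def media(self, conference: str, sender: str, payload: bytes) -> List[str]:
        # Media is forwarded to the other participants and is never copied
        # to the backup; losing a few packets during failover is tolerated.
        self.media_packets_forwarded += 1
        members = self.state.participants.get(conference, set())
        return sorted(members - {sender})


if __name__ == "__main__":
    backup = ConferenceState()
    primary = Primary(backup)
    primary.control("join", "c1", "alice")
    primary.control("join", "c1", "bob")
    print(primary.media("c1", "alice", b"frame"))
    print(backup.participants == primary.state.participants)
```

After a failure, a backup holding this state would know every conference and its participants and could resume forwarding media, with only the packets in flight during the switch lost.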
Publications
- Navid Aghdaie and Yuval Tamir, Efficient Client-Transparent Fault Tolerance for Video Conferencing, submitted for publication.