
An Efficient Minimum-process Checkpointing Scheme for Non-Deterministic Mobile Distributed Systems

Parveen Kumar, Preeti Gupta, Anil Kumar Solanki

Abstract


Mobile distributed systems raise several issues: mobility, low bandwidth of wireless channels, lack of stable storage on mobile nodes, disconnections, limited battery power, and a high failure rate of mobile nodes. These issues make traditional checkpointing techniques designed for distributed systems unsuitable for mobile environments. In this paper, we design a minimum-process algorithm for mobile distributed systems in which no useless checkpoints are taken and an effort is made to optimize the blocking of processes. We propose to delay the processing of selective messages at the receiver end only during the checkpointing period. A process is allowed to perform its normal computations and send messages during its blocking period. In this way, we try to keep the blocking of processes to a bare minimum. To keep the blocking time to a minimum, we collect the dependency vectors and compute the exact minimum set at the beginning of the algorithm. The number of processes that take checkpoints is minimized to 1) avoid awakening mobile hosts (MHs) in doze mode of operation, 2) minimize thrashing of MHs with checkpointing activity, and 3) save the limited battery life of MHs and the low bandwidth of wireless channels. In coordinated checkpointing, if a single process fails to take its checkpoint, the entire checkpointing effort is wasted, because each process has to abort its tentative checkpoint. To take its tentative checkpoint, an MH needs to transfer large checkpoint data to its local mobile support station (MSS) over wireless channels.
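The abstract does not spell out how the exact minimum set is derived from the collected dependency vectors. The following is only a minimal sketch under stated assumptions, not the authors' algorithm: each process is assumed to keep a bit vector whose j-th entry is 1 if it has received a message from process j since its last checkpoint, and the minimum set is assumed to be the transitive closure of those dependencies starting from the initiator. The function name `minimum_set` and the sample vectors are hypothetical.

```python
from typing import Dict, List, Set


def minimum_set(initiator: int, dep: Dict[int, List[int]]) -> Set[int]:
    """Return the processes that must checkpoint along with the initiator.

    Assumption (not from the paper): dep[i][j] == 1 means process i has
    received a message from process j since i's last checkpoint, so i
    depends directly on j.
    """
    result = {initiator}
    pending = [initiator]
    while pending:
        p = pending.pop()
        # Follow every direct dependency of p that has not been visited yet.
        for q, bit in enumerate(dep[p]):
            if bit and q not in result:
                result.add(q)
                pending.append(q)
    return result


# Hypothetical dependency vectors for five processes P0..P4.
dep_vectors = {
    0: [1, 1, 0, 0, 0],  # P0 depends on P1
    1: [0, 1, 0, 1, 0],  # P1 depends on P3
    2: [0, 0, 1, 0, 1],  # P2 depends on P4
    3: [0, 0, 0, 1, 0],  # P3 depends on nobody else
    4: [0, 0, 1, 0, 1],  # P4 depends on P2
}

print(sorted(minimum_set(0, dep_vectors)))  # [0, 1, 3]
```

In this sketch P2 and P4 are left alone when P0 initiates, which illustrates why a minimum-process scheme can avoid waking MHs that are unrelated to the initiator.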

 


Keywords


Fault Tolerance, Consistent Global State, Coordinated Checkpointing, Mobile Systems





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.