Dynamic Checkpoints in Distributed System Based on Data Usage

Bi-Annual Double Blind Peer Reviewed Refereed Journal

Open Access Journal

Category: 
Part1
Author: 
Ms. Sunayna Giroti, ME IT Student, MIT COLLEGE OF ENGINEERING, Pune
Prof. Priya Deshpande, Associate Professor, MIT COLLEGE OF ENGINEERING, Pune
Abstract: 

In a distributed system, data is stored on data nodes, and its availability is a constant concern because the data is vulnerable to failure or loss. To recover from such failures, many systems, such as Hadoop, Amazon, and the Google distributed system, use a checkpoint method [5]. Checkpoints are stored as snapshots at the name node. Data nodes are responsible for sending regular updates in the form of snapshots of the data they store, and these snapshots are used to recover data whenever a data node fails. In the current approach, every data node sends its snapshot to the name node at each time interval. A problem arises when some data nodes have been idle during the last interval: sending snapshots for unused data nodes wastes both bandwidth and memory. Our strategy addresses exactly this point. We send snapshots only for data nodes that were used during the interval, as determined by their read and write operations. This reduces the memory usage and processing time of the name node and the bandwidth consumption of the system, while still maintaining data availability.
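
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `DataNode` class, its counters, and the `nodes_to_checkpoint` function are hypothetical names introduced here, and the sketch assumes that any read or write during the interval marks a node as used.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    """Hypothetical data node that counts operations in the current interval."""
    node_id: str
    reads: int = 0
    writes: int = 0

    def record_read(self) -> None:
        self.reads += 1

    def record_write(self) -> None:
        self.writes += 1

    def was_used(self) -> bool:
        # Assumption: a single read or write is enough to count as "used".
        return self.reads + self.writes > 0

    def reset_counters(self) -> None:
        self.reads = 0
        self.writes = 0


def nodes_to_checkpoint(nodes):
    """Return the IDs of nodes that should send a snapshot for this interval.

    Idle nodes are skipped, so the name node receives no snapshot from them.
    All counters are reset afterwards so the next interval starts clean.
    """
    selected = [node.node_id for node in nodes if node.was_used()]
    for node in nodes:
        node.reset_counters()
    return selected
```

For example, with three nodes where only `dn1` was written to and `dn3` was read from, `nodes_to_checkpoint` would select `dn1` and `dn3`, and `dn2` would send nothing for that interval.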
