Data deduplication is a technique for identifying duplicate data: it stores only one copy of each redundant piece of data and replaces every other occurrence with a reference to that copy, so that user accesses to the data are served through the reference. In today's era, data deduplication has become a necessary and critical component of primary storage. In this paper, we discuss data deduplication for primary storage workloads such as user directories and emails. We merge a scalable inline cluster deduplication framework for Big Data with a primary data deduplication architecture to improve RAM performance, throughput, and in-memory lookup efficiency, thereby reducing metadata overhead.
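As a minimal illustration of the core idea (a sketch only, not the framework evaluated in this paper), the following Python fragment deduplicates fixed-size chunks by SHA-256 fingerprint, storing each unique chunk once and serving reads through per-file reference lists; the chunk size, class name, and in-memory dictionaries are illustrative assumptions.

```python
import hashlib

CHUNK_SIZE = 4096  # illustrative fixed chunk size; real systems often use variable-size chunking


class DedupStore:
    """Minimal inline deduplication: one stored copy per unique chunk,
    plus per-file reference lists (fingerprints) used on reads."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes (the single stored copy)
        self.files = {}   # file name -> ordered list of fingerprints (references)

    def write(self, name: str, data: bytes) -> None:
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if it has not been seen before (inline dedup).
            self.chunks.setdefault(fp, chunk)
            refs.append(fp)
        self.files[name] = refs

    def read(self, name: str) -> bytes:
        # Accesses go through the reference list to the single stored copies.
        return b"".join(self.chunks[fp] for fp in self.files[name])


store = DedupStore()
store.write("a.txt", b"hello world" * 1000)
store.write("b.txt", b"hello world" * 1000)  # duplicate content: chunks stored only once
assert store.read("b.txt") == b"hello world" * 1000
print(len(store.chunks), "unique chunks stored for 2 files")
```

In this sketch the fingerprint index plays the role of the in-memory lookup structure whose efficiency (and metadata footprint) the merged architecture aims to improve.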