
Sensitive Data Leakage Detection using Fuzzy Fingerprint Technique in Host-Assisted Mechanism

Patil Deepali Eknath, Ghatage Trupti Babasaheb, B. Takmare Sachin


Statistics from security firms, government organizations, and research institutions show that the number of data-leak incidents has grown rapidly in recent years. Detecting and preventing data leaks requires a set of complementary solutions, which may include data-leak detection, data confinement, stealthy malware detection, and policy enforcement. This work designs, implements, and evaluates a fuzzy fingerprint technique that enhances data privacy during data-leak detection (DLD) operations. The technique is based on a fast and practical one-way computation on the sensitive data. The DLD provider computes fingerprints from network traffic and identifies potential leaks in them. To prevent the DLD provider from gaining exact knowledge of the sensitive data, the collection of potential leaks is composed of real leaks and noise. The data owner post-processes the potential leaks sent back by the DLD provider and determines whether there is any real data leak. This model supports delegation of the detection operation, so ISPs can offer data-leak detection as an add-on service to their customers.
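The division of labor described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes fingerprints are 32-bit integers, that the data owner has randomized the low `FUZZY_BITS` bits of every released fingerprint, and that the provider therefore matches on the remaining high bits only. The names `provider_candidates` and `owner_postprocess` are hypothetical.

```python
# Assumption: the low FUZZY_BITS bits of each released fingerprint are noise,
# so only the high bits carry information the provider can match on.
FUZZY_BITS = 8


def provider_candidates(traffic_fps, released_fuzzy_fps):
    """DLD provider side: report every traffic fingerprint whose high bits
    match a released fuzzy fingerprint. The result mixes real leaks with
    noise, so the provider cannot tell which candidates are genuine."""
    released_prefixes = {f >> FUZZY_BITS for f in released_fuzzy_fps}
    return [fp for fp in traffic_fps if (fp >> FUZZY_BITS) in released_prefixes]


def owner_postprocess(candidates, exact_fps):
    """Data owner side: keep only candidates that equal an exact (private)
    fingerprint of the sensitive data."""
    return [fp for fp in candidates if fp in exact_fps]


# Illustrative values only.
exact = {0x1234AB01}                 # known only to the data owner
released = {0x1234AB7F}              # fuzzy version given to the provider
traffic = [0x1234AB01, 0x1234AB55, 0x99990000]

candidates = provider_candidates(traffic, released)
real_leaks = owner_postprocess(candidates, exact)
```

In this toy run the provider reports two candidates that share the released prefix, and the owner narrows them down to the single real leak. A larger `FUZZY_BITS` would give the provider less exact knowledge at the cost of more noise for the owner to filter.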

The work also designs a host-assisted mechanism for complete data-leak detection in large-scale organizations. The data owner computes a special set of digests, or fingerprints, from the sensitive data and then discloses only a small number of them to the DLD provider. Fuzzy fingerprints are special sensitive-data digests prepared by the data owner for release to the DLD provider. Data preparation and filtering can take a considerable amount of processing time, but once preprocessing is done the data are more reliable and the results more robust. Experiments were conducted to validate the accuracy and privacy of these solutions; the results indicate that the underlying scheme achieves high accuracy with a very low false positive rate.
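One way the data owner's preparation step might look is sketched below. It is an assumption-laden illustration rather than the scheme evaluated in the paper: a truncated SHA-256 over fixed-length byte shingles stands in for the one-way fingerprint function, the fuzziness is modeled by randomizing the low `FUZZY_BITS` bits, and the shingle length, fingerprint width, and disclosure fraction are arbitrary choices. All function names are hypothetical.

```python
import hashlib
import secrets

FP_BYTES = 4        # assumed fingerprint width: 32 bits
FUZZY_BITS = 8      # assumed number of low-order bits to randomize


def shingles(data: bytes, q: int = 8):
    """Return the set of all q-byte substrings (shingles) of the data."""
    return {data[i:i + q] for i in range(len(data) - q + 1)}


def fingerprint(shingle: bytes) -> int:
    """One-way digest of a shingle (stand-in: truncated SHA-256)."""
    return int.from_bytes(hashlib.sha256(shingle).digest()[:FP_BYTES], "big")


def fuzzify(fp: int) -> int:
    """Perturb only the low FUZZY_BITS bits with random noise."""
    return fp ^ secrets.randbelow(1 << FUZZY_BITS)


def prepare_fingerprints(sensitive: bytes):
    """Return (exact, fuzzy): exact stays with the owner, fuzzy may be released."""
    exact = {fingerprint(s) for s in shingles(sensitive)}
    fuzzy = {fuzzify(fp) for fp in exact}
    return exact, fuzzy


def disclose_subset(fuzzy, fraction: float):
    """Pick a random subset of the fuzzy fingerprints to disclose."""
    pool = sorted(fuzzy)
    k = min(len(pool), max(1, round(len(pool) * fraction)))
    return set(secrets.SystemRandom().sample(pool, k))


exact, fuzzy = prepare_fingerprints(b"confidential-project-plan-2015")
released = disclose_subset(fuzzy, 0.25)
```

Because the perturbation touches only the low bits, every fuzzy fingerprint still shares its high bits with the exact fingerprint it came from, which is what lets a provider find candidate matches without learning the exact digests.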


Data Leak; Network Security; Fuzzy Fingerprint; Data-Leak Detection.





Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 License.