From the thesis abstract: "Data exfiltration over a network poses a threat to confidential information. Due to the possibility of malicious insiders, this threat is especially difficult to mitigate. Our goal is to contribute to the development of a method to detect exfiltration of many targeted files without incurring the full cost of reassembling flows. One strategy for accomplishing this would be to implement an approximate matching scheme that attempts to determine whether a file is being transmitted over the network by analyzing the quantity of payload data that matches fragments of the targeted file. Our work establishes the basic feasibility of such an approach by matching Transmission Control Protocol (TCP) payloads of traffic containing exfiltrated data against a database of MD5 [message-digest algorithm] hashes, each representing a fragment of our target data. We tested against a database of 415 million fragment hashes, where the length of the fragments was chosen to be smaller than the payload size expected for most common Maximum Transmission Units (MTUs), and we simulated exfiltration by sending a sample of our targeted data across the network along with other non-target files representing 'noise.' We demonstrate that under these conditions, we are able to detect the targeted content with a recall of 98.3% and precision of 99.1%."
Naval Postgraduate School, Dudley Knox Library: https://calhoun.nps.edu/
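
The matching idea described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the fragment length of 512 bytes, the non-overlapping fragmentation of the target, the byte-by-byte sliding window over each payload, and the function names build_fragment_db and payload_matches are all assumptions made here for illustration. The abstract states only that MD5 hashes of target fragments were compared against TCP payloads and that fragments were shorter than the expected payload size.

```python
import hashlib

# Assumed value for illustration; the abstract says only that fragments are
# shorter than the payload size expected for common MTUs.
FRAGMENT_LEN = 512


def build_fragment_db(target, fragment_len=FRAGMENT_LEN):
    """Return the set of MD5 digests of each full, non-overlapping fragment."""
    return {
        hashlib.md5(target[i:i + fragment_len]).digest()
        for i in range(0, len(target) - fragment_len + 1, fragment_len)
    }


def payload_matches(payload, db, fragment_len=FRAGMENT_LEN):
    """Return True if any fragment-sized window of the payload hashes into db.

    The window advances one byte at a time because a payload boundary need
    not line up with a fragment boundary in the target file (an assumption
    of this sketch, not a detail taken from the thesis).
    """
    for offset in range(len(payload) - fragment_len + 1):
        window = payload[offset:offset + fragment_len]
        if hashlib.md5(window).digest() in db:
            return True
    return False
```

Under this particular scheme, a payload cut from the target is guaranteed to contain at least one complete fragment only when it is at least 2 * FRAGMENT_LEN - 1 bytes long, which is one reason a fragment length well below the typical payload size is convenient. A payload shorter than FRAGMENT_LEN never matches.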