-
Notifications
You must be signed in to change notification settings - Fork 12
License
sporksmith/polygraph
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
/* * Polygraph (release 0.1) * Signature generation algorithms for polymorphic worms * * Copyright (c) 2004-2005, Intel Corporation * All Rights Reserved * * This software is distributed under the terms of the Eclipse Public * License, Version 1.0 which can be found in the file named LICENSE. * ANY USE, REPRODUCTION OR DISTRIBUTION OF THIS SOFTWARE CONSTITUTES * RECIPIENT'S ACCEPTANCE OF THIS AGREEMENT */ This software is a test harness for the algorithms and techniques described in: ``Polygraph: Automatically generating signatures for polymorphic worms'' Citation: James Newsome, Brad Karp, and Dawn Song. Polygraph: Automatically generating signatures for polymorphic worms. In Proceedings of the IEEE Symposium on Security and Privacy, May 2005. The included software can be used to verify our results, and experiment with variations of our techniques. It is NOT intended to be used as a production system for generating worm signatures. We are aware of several attacks against these signature generation algorithms. Some of these attacks are described in the Polygraph paper itself, others in: James Newsome, Brad Karp, and Dawn Song. Paragraph: Thwarting Signature Learning by Training Maliciously. In Proceedings of the International Symposium on Recent Advances in Intrusion Detection (RAID), September 2006. Other attacks are described in: Roberto Perdisci, David Dagon, Wenke Lee, Prahlad Fogla, and Monirul Sharif. Misleading Worm Signature Generators Using Deliberate Noise Injection. In Proceedings of the IEEE Symposium of the IEEE Symposium on Security and Privacy, May 2006. This software has been tested on Redhat 9.3, using Python 2.3.4. We are unable to provide help on running this software in other environments. This software was written by James Newsome (jnewsome@ece.cmu.edu), in collaboration with Brad Karp (brad.n.karp@intel.com) and Dawn Song (dawnsong@cmu.edu). *************************************************************************** Overview *************************************************************************** This package contains the polygraph code, in the 'polygraph' directory, and scripts to run the experiments in our paper, in the 'experiments' directory. For instructions on installing the polygraph code and running the experiments, see README.install. The source code for polygraph is organized as follows: bin # support scripts pysary # python wrapper to sary suffix array library pysubseq # Smith-Waterman sequence alignment sig_gen # signature generation sigprob # false positive rate estimation sutil # efficient string processing trace_crunching # pcap trace processing util # miscellaneous worm_gen # generators for (synthetic) polymorphic worms *************************************************************************** Known Issues *************************************************************************** The script for reconstructing the network streams from pcap trace files (polygraph/bin/reconstruct_streams) can process multiple traces, but does not reconstruct streams that span over multiple traces. That is, a stream that starts in one pcap file and finishes in the next pcap file will be assembled into two separate streams. In our experiments, noise workloads are generates using samples from the evaluation traces. That is, the streams that are used as noise in the suspicious pool are also used in the evaluation to measure false positives, possibly biasing our results to report slightly higher false positive rates. *************************************************************************** Suggested Extensions *************************************************************************** For token subsequence signatures, we have experimented with keeping track of whether tokens always appear at a particular offset from each-other, and incorporating that information into the signature. This option (use_fixed_gaps in sig_gen.lcseq_tree.LCSeqTree) is currently broken, but would be straight-forward to get working again. This strategy can make the generated signatures more specific (reducing false positives), at the risk of making them *too* specific (increasing false negatives). *************************************************************************** Performance Improvements *************************************************************************** Implement linear-time algorithm for finding what tokens are present in a string. Suffix tree algorithm is implemented in polygraph/sutil/sutilc.c (stree_find_tokens), but it is not currently being used, because naive implementation is faster in practice. See Gusfield for how to achieve linear time bound in the suffix tree implementation. Implement linear-time algorithm for finding common substrings, as described in Gusfield, chapter 9. Current implementation is O(Kn) time, where K is the number of strings, and n is the total length of the strings. This doesn't seem to be a major bottleneck though.
About
No description, website, or topics provided.
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published