init code
luisa authored and luisa committed Nov 24, 2023
1 parent ef2a894 commit ec33d7f
Showing 44 changed files with 12,535 additions and 0 deletions.
Binary file added benchmark_results/simil_indel.png
7 changes: 7 additions & 0 deletions benchmark_results/similarity_indices.csv
@@ -0,0 +1,7 @@
file,ratio
bro217_224,0.2608
dotstar09_300,0.3311
poweren_300,0.1757
protomata_300,0.5637
ranges1_300,0.334
tcp_300,0.3832
102 changes: 102 additions & 0 deletions benchmark_scripts/similarity_benchmark.py
@@ -0,0 +1,102 @@
import Levenshtein as lv
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import rc

palette = ["#f0f9e8",
           "#bae4bc",
           "#7bccc4",
           "#43a2ca",
           "#0868ac"]


inputRE = [
    "../datasets/intrusion_benchmarks/bro217_224.txt",
    "../datasets/intrusion_benchmarks/dotstar09_300.txt",
    "../datasets/PowerEN_cmp/poweren_300.txt",
    "../datasets/protomata/protomata_300.txt",
    "../datasets/intrusion_benchmarks/ranges1_300.txt",
    "../datasets/intrusion_benchmarks/tcp_300.txt",
]

log = "../benchmark_results/similarity_indices.csv"

o = open(log, "w")

# Each row has two columns only: the dataset name and its mean ratio.
o.write("file,ratio\n")

plot_d = []
f_names = []

# Consider the entire dataset: average the similarity over all pairs of rules.
for file in inputRE:
    f = open(file, "r")
    lines = f.readlines()
    i_ratios = []
    for l in lines:
        for m in lines:
            # Skips self-pairs; rules with identical text are skipped as well.
            if l != m:
                i_ratios.append(lv.ratio(l, m))
    plot_d.append(np.average(i_ratios))
    f_names.append(file.split("/")[3].split(".")[0].split("_")[0])
    # print("filen "+f_names[-1]+" ratio "+str(plot_d[-1]))
    o.write(file.split("/")[3].split(".")[0]+","+str(round(np.average(i_ratios), 4))+"\n")
    f.close()
o.close()



# cons. blocks

# print(plot_d)

df = pd.DataFrame(columns=['dataset','ratio'])


df['dataset'] = f_names
df['ratio'] = plot_d

df.sort_values(by=['dataset'], inplace=True)
# print(df)

# Short labels, listed in the same alphabetical order as the sorted rows above.
names = ["BRO","DS9","PEN","PRO","RG1","TCP"]
df['dataset'] = names
# print(df)


# DataFrame.plot.bar returns a matplotlib Axes, so it is named ax here.
ax = df.plot.bar(x='dataset', y=['ratio'],
                 rot=0,
                 color=palette[2],
                 # title="Normalized INDEL similarity values for different datasets",
                 fontsize=14,
                 edgecolor='black',
                 linewidth=0.4,
                 width=0.8,
                 legend=False,
                 )

ax.set_xlabel('Datasets', fontsize=14)
ax.set_ylabel('Normalized INDEL similarity [0,1]', fontsize=14)
ax.set_xticklabels(df['dataset'], fontsize=12, rotation=90)


# plt.tight_layout()

# plt.show()

ax.get_figure().savefig('../benchmark_results/simil_indel.png', bbox_inches='tight')

# Read CSV into pandas

# Figure Size

# Horizontal Bar Plot

# Show Plot
# plt.show()
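
Note on the metric: the script above relies on `lv.ratio` for the per-pair score. As a rough illustration of what a normalized INDEL similarity measures, here is a dependency-free sketch. It assumes the usual definition, 1 - indel_distance / (len(a) + len(b)), where the INDEL distance counts only insertions and deletions. The helper names `lcs_length`, `indel_ratio` and `mean_pairwise_ratio` are hypothetical and are not part of this commit or of the Levenshtein package.

```python
from itertools import combinations


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def indel_ratio(a: str, b: str) -> float:
    """Normalized INDEL similarity in [0, 1]; 1.0 means identical strings."""
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumption: two empty strings count as identical
    # Insertions plus deletions needed to turn a into b.
    indel_distance = total - 2 * lcs_length(a, b)
    return 1.0 - indel_distance / total


def mean_pairwise_ratio(rules):
    """Average indel_ratio over all unordered pairs of distinct positions."""
    pairs = list(combinations(rules, 2))
    if not pairs:
        return 0.0  # assumption: no pairs means no similarity to report
    return sum(indel_ratio(a, b) for a, b in pairs) / len(pairs)
```

Unlike the loop in the script, this sketch pairs rules by position, so two rules with identical text still contribute a pair with ratio 1.0. The two averages should agree only when a dataset contains no duplicate lines.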
21 changes: 21 additions & 0 deletions datasets/PowerEN_cmp/LICENSE_poweren
@@ -0,0 +1,21 @@
This code was not explicitly licensed by the contributor at the time of download. Thus, we (the University of Virginia) is implicitly licensed by IBM's Download of Content Agreement located at the following URL (https://www.ibm.com/developerworks/community/terms/download?lang=en) and is reproduced below.

The following are terms of a legal downloader agreement (the "Agreement") regarding Your download of Content (as defined below) from this Website. IBM may change these terms of use and other requirements and guidelines for use of this Website at its sole discretion. This Website may contain other proprietary notices and copyright information (http://www.ibm.com/developerworks/exchange), the terms of which must be observed and followed. Any use of the Content in violation of this Agreement is strictly prohibited.

"Content" includes, but is not limited to software, text and/or speech files, code, associated materials, media and /or documentation that You download from this Website. The Content may be provided by IBM or third-parties. IBM-Content is owned by IBM and is copyrighted and licensed, not sold. Third-party Content is owned by its respective owner and is licensed by the third party directly to You. IBM's decision to permit posting of third-party Content does not constitute an express or implied license from IBM to You or a recommendation or endorsement by IBM of any particular product, service, company or technology.

The party providing the Content (the "Provider") grants You a nonexclusive, worldwide, irrevocable, royalty-free, copyright license to edit, copy, reproduce, publish, publicly display and/or perform, format, modify and/or make derivative works of, translate, re-arrange, and distribute the Content or any portions thereof and to sublicense any or all such rights and to permit sublicensees to further sublicense such rights, for both commercial and non-commercial use, provided You abide by the terms of this Agreement. You understand that no assurances are provided that the Content does not infringe the intellectual property rights of any other entity. Neither IBM nor the provider of the Content grants a patent license of any kind, whether expressed or implied or by estoppel. As a condition of exercising the rights and licenses granted under this Agreement, You assume sole responsibility to obtain any other intellectual property rights needed.

The Provider of the Content is the party that submitted the Content for Posting and who represents and warrants that they own all of the Content, (or have obtained all written releases, authorizations and licenses from any other owner(s) necessary to grant IBM and downloaders this license with respect to portions of the Content not owned by the Provider). All information provided on or through this Website may be changed or updated without notice. You understand that IBM has no obligation to check information and /or Content on the Website and that the information and/or Content provided on this Web site may contain technical inaccuracies or typographical errors.

IBM may, in its sole discretion, discontinue the Website, any service provided on or through the Website, as well as limit or discontinue access to any Content posted on the Website for any reason without notice. IBM may terminate this Agreement and Your rights to access, use and download Content from the Website at any time, with or without cause, immediately and without notice.

ALL INFORMATION AND CONTENT IS PROVIDED ON AN "AS IS" BASIS. IBM MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR IMPLIED, CONCERNING USE OF THE WEBSITE, THE CONTENT, OR THE COMPLETENESS OR ACCURACY OF THE CONTENT OR INFORMATION OBTAINED FROM THE WEBSITE. IBM SPECIFICALLY DISCLAIMS ALL WARRANTIES WITH REGARD TO THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IBM DOES NOT WARRANT UNINTERRUPTED OR ERROR-FREE OPERATION OF ANY CONTENT. IBM IS NOT RESPONSIBLE FOR THE RESULTS OBTAINED FROM THE USE OF THE CONTENT OR INFORMATION OBTAINED FROM THE WEBSITE.

LIMITATION OF LIABILITY. IN NO EVENT WILL IBM BE LIABLE TO ANY PARTY FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES FOR ANY USE OF THIS WEBSITE, THE USE OF CONTENT FROM THIS WEBSITE, OR ON ANY OTHER HYPER LINKED WEB SITE, INCLUDING, WITHOUT LIMITATION, ANY LOST PROFITS, BUSINESS INTERRUPTION, LOSS OF PROGRAMS OR OTHER DATA ON YOUR INFORMATION HANDLING SYSTEM OR OTHERWISE, EVEN IF IBM IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

The laws of the State of New York, USA govern this Agreement, without reference to conflict of law principles. The "United Nations Convention on International Sale of Goods" does not apply. This Agreement may not be assigned by You. The parties agree to waive their right to a trial by jury.

This Agreement is the complete and exclusive agreement between the parties and supersedes all prior agreements, oral or written, and all other communications relating to the subject matter hereof. For clarification, it is understood and You agree, that any additional agreement or license terms that may accompany the Content is invalid, void, and non-enforceable to any downloader of this Content including IBM.

If any section of this Agreement is found by competent authority to be invalid, illegal or unenforceable in any respect for any reason, the validity, legality and enforceability of any such section in every other respect and the remainder of this Agreement shall continue in effect.