Technology used in backend logic #247

AccessViolation95 · 2023-10-09T10:41:18Z

Hi there!

I'm a huge fan of your work (I'm at least partly responsible for the hundreds of visits from Tor Browser; I love playing around with settings and seeing how they affect the detection), and I'm fascinated by your backend logic, in particular the automatic grouping of similar fingerprints and the ability to tell which properties are and aren't relevant for different fingerprints.

I'm working on a project to rapidly categorize strains of malware and decide whether newly uploaded files are likely to belong to a certain strain. The goal is that I can dump 100 different executable files of the same malware strain or family into the service, and it will parse the files into structured data containing many attributes and create a single fingerprint that represents that strain. New samples of the same malware would match that fingerprint. Ideally the grouping of submitted files would be largely automatic, and I would just need to label certain generated fingerprints as belonging to a certain malware family.
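
A minimal sketch of that idea, assuming each parsed file has already been reduced to a set of attribute strings. The attribute names, the `min_support` threshold, and both function names are hypothetical, chosen only for illustration. This is one simple way to frame the problem, not a description of how any existing backend works.

```python
from collections import Counter


def build_family_fingerprint(samples, min_support=0.8):
    """Keep the attributes that appear in at least min_support of the samples."""
    counts = Counter()
    for attrs in samples:
        counts.update(set(attrs))
    needed = min_support * len(samples)
    return {attr for attr, n in counts.items() if n >= needed}


def match_score(fingerprint, attrs):
    """Fraction of the fingerprint's attributes that a new sample also has."""
    if not fingerprint:
        return 0.0
    return len(fingerprint & set(attrs)) / len(fingerprint)


# Hypothetical parsed attributes for three samples of one family.
family_samples = [
    {"import:CreateRemoteThread", "section:.upx0", "mutex:abc", "size_bucket:64k"},
    {"import:CreateRemoteThread", "section:.upx0", "mutex:abc", "size_bucket:128k"},
    {"import:CreateRemoteThread", "section:.upx0", "mutex:xyz", "size_bucket:64k"},
]
fingerprint = build_family_fingerprint(family_samples, min_support=0.66)

new_sample = {"import:CreateRemoteThread", "section:.upx0", "mutex:abc", "size_bucket:256k"}
unrelated = {"import:MessageBoxA", "section:.text"}

print(match_score(fingerprint, new_sample))  # shares 3 of the 4 fingerprint attributes
print(match_score(fingerprint, unrelated))   # shares none
```

Attributes that vary between samples of the same family fall below the support threshold and drop out of the fingerprint, which is a crude stand-in for deciding which properties are relevant.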

Another side project of mine that these techniques could be useful for is my network location service (an alternative to GPS that uses nearby emitters like Bluetooth and wifi devices), where the task is detecting whether wifi routers are likely mobile or stationary before observing evidence that they've moved, based on previously collected data and the attributes in the announce packets broadcast by routers.

I expect there to be some overlap between what I'm describing and how your backend works. I know it's closed source, but are there things you can point me to? Research papers, talks, blog posts, or maybe a general overview of the technology used for processing the attributes of submitted data?

Thanks!

abrahamjuliot (Maintainer) · 2023-10-21T23:02:47Z

This sounds like a cool and challenging idea. I haven't explored malware detection, but fuzzy hashing and Bloom filters might be useful for this.
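
For readers unfamiliar with Bloom filters, here is a minimal sketch using only the Python standard library. The class, its default sizes, and the attribute strings are assumptions made for illustration, not code from any real backend. Fuzzy hashing normally comes from a dedicated tool, so it is left out of the sketch.

```python
import hashlib


class BloomFilter:
    """Compact set membership test: no false negatives, some false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting a single hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
for attr in ["import:CreateRemoteThread", "section:.upx0"]:
    seen.add(attr)

print("section:.upx0" in seen)  # True: items that were added are always found
print("mutex:abc" in seen)      # most likely False, but false positives can occur
```

The appeal for this kind of workload is that the filter answers "have I possibly seen this attribute before?" in constant space per filter, at the cost of occasional false positives.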

Some bits of what goes on in the backend:
