-
Great questions! Some of this can be answered by reading my whitepaper, but well-explained threats and examples are definitely lacking in my GitHub repo's documentation.
The VM would need to be installed or (more likely) be a component of a larger malware program. If you consider an interpreter such as the Jaws VM being part of a trojan or a C2 agent, it would seem benign from a static analysis standpoint. After all, an interpreter isn't inherently malicious. Initial Jaws code could be written to fetch more instructions from an arbitrary network location. That initial code could be stored just about anywhere.
Even Python ignores arbitrary whitespace when it's on its own line(s). Jaws' headers and footers make it possible to stop and start interpretation of whitespace as much as needed. Languages that aren't whitespace-controlled are trivial to inject into, as long as you replace all the existing whitespace between the Jaws headers and footers with valid Jaws instructions. Consider the following example I just whipped up and added to the tests directory of this project:
#include <stdio.h>
int
main()
{
    printf("What does this do?");
    return 0;
}
In this example, the spaces in the parts I kid, it prints
This is actually part of the beauty of Jaws. You can inject Jaws code into a file after it's gone through a code formatter and distribute it in that state if you want. Let's say Jaws was injected into a Python script. To a malware analyst, it would initially look like it's the Python code that is doing the "bad stuff." When it comes time to do manual analysis on a Python script with a bunch of blank lines added, what would be the first thing you would do? Assume it's a pathetic attempt at obfuscation and delete them? Obviously, with enough time, it would become clear what's going on. But the point of Jaws isn't to try to stump malware analysts, it's to discredit automated static analysis.
As I mention in the root directory's README, behavioral analysis would be the only way to detect something like this unless you had pre-existing signatures for this particular interpreter. But there are plenty of ways to create an interpreter, and infinite ways to design a "hidden" programming language. What are we going to do, flag anything that has some kind of interpreter functionality?
I propose that we move on from static analysis and focus on improving behavior-based detection tools. I'm talking about cutting-edge developments like finite state machines embedded in a blocking EDR, or some hybrid of static and dynamic analysis (identification + validation). At some point, bad stuff will do something bad, whether you could tell it was going to beforehand or not.
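To make the finite state machine idea a bit more concrete, here is a minimal Python sketch of a behavior monitor. Everything in it is an assumption for illustration: the event names (`network_read`, `file_write`, `process_spawn`), the states, and the transition table are hypothetical and not taken from any real EDR product or from the Jaws project.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()        # nothing interesting observed yet
    DOWNLOADED = auto()  # the process read data from the network
    DROPPED = auto()     # the process then wrote a file to disk
    BLOCKED = auto()     # terminal state: the monitor would block here


# Hypothetical event names; a real tool would derive these from telemetry.
TRANSITIONS = {
    (State.IDLE, "network_read"): State.DOWNLOADED,
    (State.DOWNLOADED, "file_write"): State.DROPPED,
    (State.DROPPED, "process_spawn"): State.BLOCKED,
}


def run_monitor(events):
    """Feed observed events through the state machine and return the final state."""
    state = State.IDLE
    for event in events:
        if state is State.BLOCKED:
            break
        # Unknown (state, event) pairs leave the state unchanged.
        state = TRANSITIONS.get((state, event), state)
    return state


def should_block(events):
    """Return True when the event sequence reaches the blocking state."""
    return run_monitor(events) is State.BLOCKED
```

The point of the sketch is that the decision depends only on what the process does and in what order, not on what the file that launched it looks like.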
-
I plan to update the main README to elaborate on some of this. I may even copy some of my explanation here. When you have your work broken up between GitHub, a blog, and a whitepaper, I guess some important parts get lost in translation 😅 Thanks again for your questions, @AnharMiah, I love seeing interest in my work!
-
@doctormay6 thanks for the detailed response! That makes more sense now. I can see that visual inspection would certainly fail, that much is true. On the point regarding Python and adding extra whitespace on the same line: I actually have "render whitespace" enabled in my text editor and, out of habit, remove any extra whitespace at the end of lines! Of course, that doesn't mean other developers are as pedantic!
My only remaining question: since the VM is mostly a map from opcodes to machine I/O, wouldn't virus detection software simply add that as a signature? Even if the interpreter is "streamed" over a network, without the full VM you have a non-functional VM, and once you have the full VM it's back to the above and detectable via some kind of signature?
EDIT: actually, I think I've answered my own question. As you mentioned, one could create multiple interpreters, so it would be a "cat and mouse" game. The only issue would be that each piece of Jaws code would only work with a certain interpreter and would be incompatible with any other. I guess what you would need is a "meta Jaws VM" generator; you could probably use it in conjunction with some kind of seed/cipher that generates a unique VM for each original piece of Jaws code?
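The "render whitespace" habit mentioned above can be approximated in a script. The following is a rough Python sketch, not part of Jaws or of any editor: the function names and the `[S]`/`[T]` markers are made up for illustration, and it only looks at spaces and tabs.

```python
def audit_whitespace(text):
    """Return (line_number, reason) pairs for lines with suspicious whitespace."""
    findings = []
    for number, line in enumerate(text.split("\n"), start=1):
        line = line.rstrip("\r")  # tolerate Windows line endings
        stripped = line.rstrip(" \t")
        if line and not stripped:
            # The line is non-empty but holds nothing except spaces and tabs.
            findings.append((number, "whitespace-only line"))
        elif stripped != line:
            findings.append((number, "trailing whitespace"))
    return findings


def render_whitespace(line):
    """Make spaces and tabs visible, similar to an editor's render mode."""
    return line.replace(" ", "[S]").replace("\t", "[T]")
```

Running `audit_whitespace` over a source file before deleting anything would at least show where the unexpected whitespace sits, which is more than a formatter that silently strips it will tell you.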
-
Yep, there are a few ways to go about making it polymorphic. That's the first one that came to my mind too. Static interpreters are usually kind of a "cat and mouse" game by nature. AI-based static analysis tries to overcome this weakness to detect things that look bad, but there are two problems with that approach for a Jaws-type scenario:
-
Thanks @doctormay6! Regarding training models to detect VMs, I think that in this case it may not be needed. Given that Jaws will always have to parse whitespace characters, that constraint alone could be used to detect potential Jaws VM instances. This might work well, since there aren't too many legitimate programs that parse whitespace, and the ones that do could be whitelisted?
-
You make a good point, but that only works for languages similar to Jaws. The overall weakness of static analysis is interpreted code, not necessarily polyglot code. Let's think about unknown interpreted languages in general. There are multiple ways to hide the code of an interpreted language other than polyglot code. To name a few: it could be stored as a string in the calling executable, it could be appended after the footer inside an image file, or it could be streamed over a network connection. Creativity is the biggest limiter here.
Let's compare an arbitrary interpreted language to shellcode injection. An executable could be written that accepts a string containing shellcode, and the processor will execute it. Similarly, an executable containing an interpreter would accept a string containing an unknown language, and it would also be executed. The difference is that the string containing the shellcode could be identified and its functionality inferred beforehand, whereas nothing could be inferred about the unknown language's string of code. Unless a static analysis tool included some form of the interpreter in its functionality, it wouldn't learn what the code does until it starts executing instructions.
To sum up what I'm trying to say, the weakness remains without the requirement of whitespace characters, because interpreted code doesn't need to look anything like Jaws. To be honest, the whitespace part of the design was partly to make my project more interesting or attention-grabbing. The variation of interpretable context-free languages is nearly infinite, so you really would have to be able to detect any instance of an interpreter. Even without considering interpreted malware, static analysis is already a game of cat and mouse, as you said. But bad programs will always do bad things, and in my opinion we should be focusing on detecting/preventing behaviors, not traits.
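The claim that nothing can be inferred from the code string alone can be illustrated with a harmless toy. This Python sketch is not Jaws and does not reflect its instruction set; the operation names, the two symbol mappings, and the sample program are all invented for the example.

```python
def run(program, opcodes):
    """Execute `program` on a tiny stack machine whose meaning comes from `opcodes`."""
    stack, output = [], []
    for symbol in program:
        op = opcodes.get(symbol)
        if op == "push1":
            stack.append(1)
        elif op == "double":
            stack[-1] *= 2
        elif op == "inc":
            stack[-1] += 1
        elif op == "emit":
            output.append(stack.pop())
        # Symbols missing from the mapping are ignored, like comments.
    return output


# Two made-up mappings: the same string means different things under each.
OPCODES_A = {"q": "push1", "z": "double", "k": "inc", "x": "emit"}
OPCODES_B = {"q": "push1", "k": "double", "z": "inc", "x": "emit"}

PROGRAM = "qzkzx"  # an opaque string; its behavior is defined by the interpreter
```

Here `run(PROGRAM, OPCODES_A)` and `run(PROGRAM, OPCODES_B)` produce different results from the identical string, so a scanner that sees only `PROGRAM` has nothing to reason about until it also has (or emulates) the interpreter.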
-
Hi @doctormay6, thanks for the detailed expansion! I agree, it makes sense that we could have any arbitrary language with a custom VM for it, and that would be basically impossible to detect because we simply don't have enough information to use as a detection trigger per se. Having said that, in terms of "hiding" an instance of such a language:
would restrict that set such that it's (a) not the original language and (b) not easily seen by humans. I think those constraints do add limits to that nearly infinite set. Of course, as you mentioned, this is but one creative way of hiding a language, and there are so many other places it could be hidden, so you're absolutely right. But here is another thought: given an arbitrary string with a known length and encoding, we can compute its information entropy. Based on Shannon's information theory, and given that whatever language instance is devised will have some complexity, I believe we could statistically distinguish between absolutely random data and a language instance. What are your thoughts?
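For reference, the entropy computation being proposed could look roughly like this in Python. This is a sketch under the assumption that the string is treated as raw bytes and that only single-byte frequencies are counted; the function name is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    # H = sum over byte values of p * log2(1 / p), with p = count / total.
    return sum((n / total) * math.log2(total / n) for n in counts.values())
```

Uniformly random bytes approach 8 bits per byte, while anything built from a small alphabet scores much lower. Note that this measure only sees the byte frequency distribution and ignores ordering, so it says whether data is random-looking, not what kind of structured data it is.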
-
I don't have much experience with entropy, to be honest, but I would think there wouldn't be much variation between the entropy of a string containing an unknown programming language and that of a string containing a natural language or structured information used by an application. I see lots of room for false positives with that approach as well. What do you think?
-
One quick note, I opened a discussions tab for the repo and will convert this issue to a discussion. We can continue over there if you'd like. Thanks!
-
First off, thank you, this seems like very interesting research!
Hope these questions don't come off as rude:
but that would require said VM to be actually installed on the target machine in the first place?
Since most languages have code formatters and linters, and some even auto-format on save, can it really survive?
then you have whitespace-sensitive languages such as Python, which cast doubt on this premise?
also note: under Emacs one can use the
M-x fixup-whitespace