Improved xor modifier. #1132

wxsBSD · 2019-09-14T03:43:51Z

Implement an improved xor modifier which allows you to do:

$a = "This program cannot" xor(1-255)

This will generate every single byte xor for the string from 1 to 255 (inclusive).

This commit also switches the default behavior of xor by itself back to the old behavior of xor'ing with 0 to 255 so it will search for the plaintext too.

plusvic

Good job!

libyara/lexer.l

libyara/grammar.y

- No more changes to lexer. - Duplicate modifiers are now errors. - Do not rely on undocumented behavior in bison.

plusvic · 2019-09-17T08:10:47Z

libyara/grammar.y

+      }
+    | string_modifiers string_modifier
+      {
+        set_flag_or_error($$.flags, $2.flags);


This has a problem similar to the one mentioned in my previous comments. The line set_flag_or_error($$.flags, $2.flags) does this:

$$.flags |= $2.flags;

But it turns out that $$ may have any random value initially (at least in theory). In practice it doesn't have a random value because it gets initialized by Bison as $$ = $1, but Bison doesn't guarantee this, it's just an implementation detail that we shouldn't rely on.

Also, my previous comment still holds:

Yes, that's correct, $1 is the accumulation of all the other modifiers. But $1 could have the XOR flag set and some values in xor_min and xor_max, but if $2 doesn't have the flag, the values already $1 are not copied into $$. The values in $$ are the result from this rule, and could be undefined if not explicitly set.

It works because the implementation detail in Bison that forces $$ = $1 before calling our code. But this is not documented, in theory if you don't set the result value for your reduction rule in $$ it is undefined.

I mean, this if statement is not enough:

if ($2.flags & STRING_GFLAGS_XOR) { $$.xor_min = $2.xor_min; $$.xor_max = $2.xor_max; }

If the condition is not true $$.xor_min and $$.xor_max are not initialized, and we could lose any flag or xor limits stored in $1. This is not happening for the same reason explained above.

We should do something like:

$$ = $1; <-- This guarantees that $$ is initialized with whatever we have in $1, even if this already happens in the current implementation of Bison, we must do it explicitly. set_flag_or_error($$.flags, $2.flags); if ($2.flags & STRING_GFLAGS_XOR) { $$.xor_min = $2.xor_min; $$.xor_max = $2.xor_max; }

Thanks for the detailed explanation. I see how it works now, and thank you for pointing it out!

wxsBSD · 2019-09-17T13:11:31Z

I don't know if there is a good answer to this but this string:

$a = "foobar" xor(1-255) wide

will generate generate the xor values from 1 to 255 and the wide string, which is probably not what people will mean when they write this. When I think about those modifiers I think "generate me a wide string and then xor it with 1 to 255", not "generate a wide string and the xor version of that wide string from 1 to 255". A poor way to deal with this is to do:

$a = "f\x00o\x00o\x00b\x00a\x00r\x00" xor(1-255)

I'm not sure how to deal with this. Thoughts?

plusvic · 2019-09-17T13:48:58Z

I don't know if there is a good answer to this but this string:

$a = "foobar" xor(1-255) wide

will generate generate the xor values from 1 to 255 and the wide string, which is probably not what people will mean when they write this. When I think about those modifiers I think "generate me a wide string and then xor it with 1 to 255", not "generate a wide string and the xor version of that wide string from 1 to 255". A poor way to deal with this is to do:

$a = "f\x00o\x00o\x00b\x00a\x00r\x00" xor(1-255)

I'm not sure how to deal with this. Thoughts?

Right now "foobar" xor(1-255) wide extracts the atoms for the wide string (in this case it will be f\x00o\x00) and then apply the xor to those atoms, which results in the interleaved zeroes also xored. That's equivalent to "f\x00o\x00o\x00b\x00a\x00r\x00" xor(1-255), and that's probably what I would expect, right? Would you expect something different?

PS: Now I see that "foobar" xor(1-255) wide is not equivalent to "f\x00o\x00o\x00b\x00a\x00r\x00" xor(1-255) the former will search for the non-xored wide string too. Is that what you meant?

plusvic · 2019-09-17T14:03:11Z

What I would expect is:

"foobar" xor(1-255) wide only matches the xored wide string
"foobar" xor(1-255) ascii only matches the xored ascii string
"foobar" xor(1-255) ascii wide only matches the xored ascii string and the xored wide string

So, once you use the xor modifier you are not matching the original non-xored string anymore, unless you write something like:

"foobar" xor(0-255) wide it matches the original wide string because you are XORing with 0. Does it makes sense?

wxsBSD · 2019-09-17T15:45:15Z

Yes, this makes sense. I'll get it fixed and update this PR.

This commit now makes it so that "xor(1-255) wide" will no longer search for the plaintext wide version of the string, but "xor wide" will because it implies xor(0-255).

wxsBSD · 2019-09-18T12:39:36Z

Updated to address all concerns (I think).

One thing I've realized we can't express is a plaintext string and an xor range that doesn't start at zero. In the current iteration of this PR can you think of a way to do that?

I think the best way to express this would be xor(0,10-20) - that is, the grammar would accept a list of ranges. In this example the 0 is just syntactic sugar for 0-0. Thoughts? Is it worth doing? I'm happy to write the code.

plusvic · 2019-09-18T13:04:26Z

Updated to address all concerns (I think).

One thing I've realized we can't express is a plaintext string and an xor range that doesn't start at zero. In the current iteration of this PR can you think of a way to do that?

I think the best way to express this would be xor(0,10-20) - that is, the grammar would accept a list of ranges. In this example the 0 is just syntactic sugar for 0-0. Thoughts? Is it worth doing? I'm happy to write the code.

Your solution is exactly what I would do, but I would wait and see if this have enough demand and it's worth the effort. The good thing is that this would be backward compatible change. What I would certainly implement now is xor(X) as a syntactic sugar for xor(X-X) because this is easy to implement.

* Implement improved xor modifier. * Fix breakage when xor(...) came before another modifier. * Add documentation for xor improvements. * Fixes from review by Victor. - No more changes to lexer. - Duplicate modifiers are now errors. - Do not rely on undocumented behavior in bison. * Apply fix to deal with undocumented bison feature we shouldn't rely on. * Make xor work properly with other modifiers. This commit now makes it so that "xor(1-255) wide" will no longer search for the plaintext wide version of the string, but "xor wide" will because it implies xor(0-255). * Add support for xor(X) and tests.

wxsBSD added 2 commits September 13, 2019 23:22

Implement improved xor modifier.

fdce972

Fix breakage when xor(...) came before another modifier.

93757f4

wxsBSD mentioned this pull request Sep 14, 2019

xor now work with nocase #1110

Closed

Add documentation for xor improvements.

278283e

plusvic reviewed Sep 16, 2019

View reviewed changes

libyara/lexer.l Outdated Show resolved Hide resolved

libyara/grammar.y Outdated Show resolved Hide resolved

Fixes from review by Victor.

361aa3e

- No more changes to lexer. - Duplicate modifiers are now errors. - Do not rely on undocumented behavior in bison.

plusvic reviewed Sep 17, 2019

View reviewed changes

Apply fix to deal with undocumented bison feature we shouldn't rely on.

859b14c

Make xor work properly with other modifiers.

46f93a6

This commit now makes it so that "xor(1-255) wide" will no longer search for the plaintext wide version of the string, but "xor wide" will because it implies xor(0-255).

Add support for xor(X) and tests.

26affa2

plusvic approved these changes Sep 18, 2019

View reviewed changes

plusvic merged commit ce470b7 into VirusTotal:master Sep 18, 2019

wxsBSD deleted the xor_fixes_2 branch October 22, 2019 16:42

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improved xor modifier. #1132

Improved xor modifier. #1132

wxsBSD commented Sep 14, 2019

plusvic left a comment

plusvic Sep 17, 2019

wxsBSD Sep 17, 2019

wxsBSD commented Sep 17, 2019

plusvic commented Sep 17, 2019 •

edited

Loading

plusvic commented Sep 17, 2019

wxsBSD commented Sep 17, 2019

wxsBSD commented Sep 18, 2019

plusvic commented Sep 18, 2019

Improved xor modifier. #1132

Improved xor modifier. #1132

Conversation

wxsBSD commented Sep 14, 2019

plusvic left a comment

Choose a reason for hiding this comment

plusvic Sep 17, 2019

Choose a reason for hiding this comment

wxsBSD Sep 17, 2019

Choose a reason for hiding this comment

wxsBSD commented Sep 17, 2019

plusvic commented Sep 17, 2019 • edited Loading

plusvic commented Sep 17, 2019

wxsBSD commented Sep 17, 2019

wxsBSD commented Sep 18, 2019

plusvic commented Sep 18, 2019

plusvic commented Sep 17, 2019 •

edited

Loading