Align "reference" field value in JSON transaction log with standard log. #3093

asamuta · 2024-02-23T16:04:42Z

Transaction JSON log of "reference" is too verbose for some rules. There is a limit for "ref" field in "standard" log.

file: rule_message.cc

...
msg.append(" [ref \"" + utils::string::limitTo(200, rm->m_reference) + "\"]");
...

but there is no such limit in file: transaction.cc

...
LOGFY_ADD("reference", a.m_reference.c_str());
...

With this fix JSON out will be truncated as standard log.

Before:

...
"messages": [
      {
        ....
        "details": {
          "match": "Matched \"Operator `Gt' with parameter `10000' against variable `ARGS_COMBINED_SIZE' (Value: `971714.000000' )",
          "reference": "v0,31v32,32v0,31v32,74v0,32v33,166v0,33v34,5v0,31v32,32v0,30v31,165v0,33v34,193v0,33v34,227v0,33v34,41v0,33v34,47v0,33v34,222v0,35v36,9v0,28v29,10v0,25v26,237v0,35v36,62v0,31v32,20v0,31v32,20v0,31v32,20v0,31v32,19v0,31v32,19v0,31v32,20v0,31v32,20v0,31v32,20v0,31v32,19v0,32v33,19v0,25v26,26v0,25v26,39v0,25v26,321v0,25v26,216v0,25v26,147v0,25v26,229v0,25v26,295v0,25v26,147v0,25v26,219v0,26v27,412v0,26v27,357v0,26v27,254v0,26v27,232v0,26v27,309v0,26v27,76v0,26v27,116v0,26v27,32v0,26v27,346v0,26v27,211v0,26v27,467v0,26v27,363v0,26v27,79v0,26v27,190v0,26v27,205v0,26v27,217v0,33v34,17v0,33v34,17v0,33v34,16v0,33v34,17v0,33v34,17v0,33v34,17v0,33v34,16v0,33v34,17v0,33v34,17v0,34v35,17v0,..... 75 kilobytes of text here,
          "ruleId": "1100",
          "file": "/etc/nginx/modsec/owasp-modsecurity-crs/rules/REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf",
          "lineNumber": "209",
         ...
        }
      }
...

after the fix:

...
"reference":"v0,31v32,32v0,31v32,74v0,32v33,166v0,33v34,5v0,31v32,32v0,30v31,165v0,33v34,193v0,33v34,227v0,33v34,41v0,33v34,47v0,33v34,222v0,35v36,9v0,28v29,10v0,25v26,237v0,35v36,62v0,31v32,20v0,31v32,20v0,31v32, (79218 characters omitted)",
...

As long as there is no way to disable the "reference" field in JSON transaction log and the information it contains doesn't look useful in production logs please approve this change. I think the JSON log approach may be reworked later but meanwhile, we need a way to truncate it.

sonarcloud · 2024-02-23T16:05:22Z

Quality Gate passed

Issues
0 New issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

airween · 2024-02-25T21:40:47Z

Hi @asamuta,

first of all: thank your for your contribution!

Could you share with us some example, how can we reproduce this message in audit log? (Eg. share some details (eg. used rule set, special settings) and a curl example).

asamuta · 2024-02-26T10:33:25Z

Hi @airween,

NGINX
MOD_SECURITY v3.0.12
OWASP CRS V4.0 https://github.com/coreruleset/coreruleset/releases/tag/v4.0.0

Additional rule that must be placed in REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf

SecRule ARGS_COMBINED_SIZE "@gt 1000"\
     "id:400001,\
     phase:2,\
     pass,\
     nolog,\
     auditlog,\
     msg:'REQ_SIZE',\
     logdata:'%{ARGS_COMBINED_SIZE}',\
     ctl:ruleRemoveTargetByTag=attack-xss;ARGS"

This is a simplified rule to demonstrate the issue.

Payload - a big json (the example is attached).
example.json

CURL:
curl --location 'localhost:8080/test' --header 'Content-Type: application/json' --data-binary "@./example.json"

NOTE: Nginx in my case configured as a reverse proxy with upstream that targets a mock server that responds with simple valid JSON to all requests.

MirkoDziadzka · 2024-02-26T14:59:15Z

src/transaction.cc

-            LOGFY_ADD("reference", a.m_reference.c_str());
+            LOGFY_ADD("reference", utils::string::limitTo(200, a.m_reference).c_str());


This changes the semantic (which might be ok) but it also might break automated parsing tools. I think when we cut of data, the remaining stuff should still be parsable (e.g. cut of in a way that all the remaining data is valid and follow the expected format)

I asked the example to check the behavior on mod_security2.

Unfortunately I can't find any documentation about the expected behavior, but as I know this is not a bug, but a feature - I mean the audit.log contains the whole log entry.

In fact, the limitTo() method is justified, because Nginx has a limit for the length of error.log line (NB: Apache has it too). But audit.log does not have any limit, so if we truncate the line in audit.log, we can lost relevant information.

In addition: I can imagine this PR can be a useful feature, but only with a configure option: if anyone wants to use this, then it can be optional in build time. Then we can be sure that user knows what he does.

Let me know how mod_security2 works.

asamuta · 2024-02-26T22:34:09Z

Hi @MirkoDziadzka , @airween ,
I have tried to find any documentation about what the reference field value is. Unfortunately without any luck. Could you please guide me to any documentation about what this reference value means?

"reference": "v0,31v32,32v0,31v32,74v0,32v33,166v0 ....

How to interpret these numbers? How do these numbers help a SOC team? Is it worth storing tens of kilobytes?

Ideally of course it should be possible to switch it on/off using "audit log parts" letters. But it will be a much more complicated task... I think the build time option is also a good point.

Thanks for your efforts on this.
Waiting for your decision.

zimmerle · 2024-02-27T00:47:41Z

Hi @asamuta, I developed this feature to pinpoint the exact location (within the request body or any other data) where the operator returned true. It was intended to highlight the relevant fragments related to the match, making visualization easier in the web console for specific use cases. The issue you've encountered seems to be a bug, although I haven't delved into the details. If it's not necessary, you can easily disable this feature with a build-time flag, or even at runtime if needed. Over the years, I've customized this for various consulting clients. If it's not being utilized, it might indeed be superfluous to retain this data. I understand your concern; even a few bytes can add up to significant costs over a billion requests.

As per v2, I don't think this feature is available there.

MirkoDziadzka · 2024-02-27T07:57:51Z

I agree that this feature might not be used. Unfortunately, (if I remember correctly), many regression tests are using the debug output of the variable origin.

My concern with this PR: Assuming somebody is using this and parsing the line and it fails (e.g. simple python scripts would raise an exception) and some workflow is breaking.

I'm not sure if this is a concern for the modec team, but long term ago I worked as a system admin and I would not like such a change. But not my decision. It might ok to just remove it.

@asamuta regarding the format:

it is a list of v{offset},{length} of all the variable locations in the request.
smashed together without separator
"documented" in the code in variable_origin.h

If we care to not break external parsers, we should probably split before "v"

asamuta · 2024-02-28T14:47:59Z

Hi All,
As far as I understand the easiest option here is to create a build-time flag and cut the value before 'v' to keep parsers working. Please confirm that such a change will be accepted.

If so, please point me to such "build-time flag" so that I can implement this based on that example. Also, it would be great if you could advise me the name of the new flag according to the naming convention used in the project.
Thanks in advance.

airween · 2024-02-28T21:51:45Z

Please confirm that such a change will be accepted.

I can agree an optional, not mandatory modification.

If so, please point me to such "build-time flag" so that I can implement this based on that example. Also, it would be great if you could advise me the name of the new flag according to the naming convention used in the project. Thanks in advance.

As I know libmodsecurity3 does not have any kind of configure option which can be used to demonstrate the example. But mod_security2 has. You should check out the branch v2/master (or it would be better if you make a new clone and check out there), run the ./autogen.sh command, and then the ./configure --help. There you can see some similar solution, eg. --enable-pcre-match-limit or --enable-pcre-match-limit-recursion. For the syntax, check the configure.ac. Please find the references in source code for symbol MODSEC_PCRE_MATCH_LIMIT or MODSEC_PCRE_MATCH_LIMIT_RECURSION.

If you don't want to check the source tree and the build system's behavior, just see the configure.ac file relevant parts.

For the name, I can imagine the --enable-truncate-reference-in-auditlog or something similar, with a numeric argument, which indicates how much is the maximum length of that field, eg --enable-truncate-reference-in-auditlog=200 which means the maximum length can be 200 characters (or the last occurrence of v{offset},{length} pattern). Please make an accurate note for that option.

The option name is not the best choice, so please help me to figure out the correct one.

airween · 2024-03-29T10:27:31Z

To everyone: what do you think about to make this feature optional? Do you think we can use the configure flag --enable-truncate-reference-in-auditlog or --enable-truncate-reference-in-auditlog=NNN?

@asamuta could you prepare this patch with this suggestion? May be you should replace the name of the flag, but that's it, I guess.

Align "reference" field value in JSON transaction log with standard log.

8aaa3ec

airween added the 3.x Related to ModSecurity version 3.x label Feb 25, 2024

airween self-assigned this Feb 25, 2024

MirkoDziadzka reviewed Feb 26, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Align "reference" field value in JSON transaction log with standard log. #3093

Align "reference" field value in JSON transaction log with standard log. #3093

asamuta commented Feb 23, 2024 •

edited

Loading

sonarcloud bot commented Feb 23, 2024

airween commented Feb 25, 2024

asamuta commented Feb 26, 2024

MirkoDziadzka Feb 26, 2024

airween Feb 26, 2024

asamuta commented Feb 26, 2024

zimmerle commented Feb 27, 2024

MirkoDziadzka commented Feb 27, 2024 •

edited

Loading

asamuta commented Feb 28, 2024

airween commented Feb 28, 2024

airween commented Mar 29, 2024

		LOGFY_ADD("reference", a.m_reference.c_str());
		LOGFY_ADD("reference", utils::string::limitTo(200, a.m_reference).c_str());

Align "reference" field value in JSON transaction log with standard log. #3093

Are you sure you want to change the base?

Align "reference" field value in JSON transaction log with standard log. #3093

Conversation

asamuta commented Feb 23, 2024 • edited Loading

sonarcloud bot commented Feb 23, 2024

Quality Gate passed

airween commented Feb 25, 2024

asamuta commented Feb 26, 2024

MirkoDziadzka Feb 26, 2024

Choose a reason for hiding this comment

airween Feb 26, 2024

Choose a reason for hiding this comment

asamuta commented Feb 26, 2024

zimmerle commented Feb 27, 2024

MirkoDziadzka commented Feb 27, 2024 • edited Loading

asamuta commented Feb 28, 2024

airween commented Feb 28, 2024

airween commented Mar 29, 2024

asamuta commented Feb 23, 2024 •

edited

Loading

MirkoDziadzka commented Feb 27, 2024 •

edited

Loading