Skip to content

Adding XML and JSON Schema validation via validateSchema operator #1343

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 28 commits into
base: main
Choose a base branch
from

Conversation

cognitivegears
Copy link
Contributor

Thank you for contributing to Coraza WAF, your effort is greatly appreciated
Before submitting check if what you want to add to coraza list meets quality standards before sending pull request. Thanks!

Make sure that you've checked the boxes below before you submit PR:

Thanks for your contribution ❤️

Fixes Issue #1279

This implements the validateSchema operator as per the existing modsecurity documentation, but also adds on JSON support. Since it didn't fit in the current structure of Coraza I didn't include it here, but I also have additional documentation and a test server for validation at: https://github.com/cognitivegears/coraza_validate_schema_extras

@cognitivegears cognitivegears requested a review from a team as a code owner April 5, 2025 04:28
Copy link

codecov bot commented Apr 5, 2025

Codecov Report

Attention: Patch coverage is 73.07692% with 63 lines in your changes missing coverage. Please review.

Project coverage is 83.80%. Comparing base (ad40dcb) to head (7e37be5).

Files with missing lines Patch % Lines
internal/operators/validate_schema_tinygo.go 49.38% 39 Missing and 2 partials ⚠️
internal/bodyprocessors/xml.go 42.10% 22 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1343      +/-   ##
==========================================
- Coverage   84.01%   83.80%   -0.22%     
==========================================
  Files         170      172       +2     
  Lines        9803    10034     +231     
==========================================
+ Hits         8236     8409     +173     
- Misses       1323     1380      +57     
- Partials      244      245       +1     
Flag Coverage Δ
coraza.rule.case_sensitive_args_keys 83.76% <73.07%> (-0.21%) ⬇️
coraza.rule.multiphase_evaluation 83.46% <73.07%> (-0.21%) ⬇️
coraza.rule.no_regex_multiline 83.74% <73.07%> (-0.21%) ⬇️
default 83.80% <73.07%> (-0.22%) ⬇️
examples+ 16.25% <1.30%> (-0.25%) ⬇️
examples+coraza.rule.case_sensitive_args_keys 83.76% <73.07%> (-0.21%) ⬇️
examples+coraza.rule.multiphase_evaluation 83.30% <73.07%> (-0.20%) ⬇️
examples+coraza.rule.no_regex_multiline 83.66% <73.07%> (-0.21%) ⬇️
examples+memoize_builders 84.05% <85.62%> (+0.07%) ⬆️
examples+no_fs_access 81.09% <73.07%> (-0.15%) ⬇️
ftw 83.80% <73.07%> (-0.22%) ⬇️
memoize_builders 84.18% <85.62%> (+0.06%) ⬆️
no_fs_access 83.31% <73.07%> (-0.20%) ⬇️
tinygo 83.77% <73.07%> (-0.21%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@cognitivegears
Copy link
Contributor Author

FYI I'll take a look at the code coverage change, looks like I need to add a few more tests.

I'm not sure what happened with the Tinygo test, I'll have to look into that one further.

In the meantime, would you mind taking a look at the overall approach and let me know if you have any questions or concerns / things you would like changed?

@jptosso
Copy link
Member

jptosso commented Apr 5, 2025

Thank you for your contribution!!
I'm concerned about XXE for XML, will have to look into the lib

rawXml := buf.String()
if txVar := v.TX(); txVar != nil && v.RequestBody() != nil {
// Store the content type and raw body
txVar.Set("xml_request_body", []string{rawXml})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this how modsec handles this?
Otherwise, I think we should use the transaction context, as it becomes a future concern for data masking


"github.com/corazawaf/coraza/v3/experimental/plugins/plugintypes"
"github.com/kaptinlin/jsonschema"
xsdvalidate "github.com/terminalstatic/go-xsd-validate"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This library uses cgo, we can still use CGO, but CGO must always be a build flag, not part of the default package
cc @jcchavezs
Also, it doesn't look reliable we should look for alternatives

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at the same thing, for the build flag, thanks! I wasn't able to find a better package, but I would absolutely be open to suggestions. One of the considerations in using this package though was fast performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe my comment didn't go through. I was saying since XML usage isn't crucial as we don't seem to have users using them I would just focus on JSON first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've removed the XML code - now everything is passing except a bit of code coverage- I should be able to add a little more testing tomorrow

@cognitivegears
Copy link
Contributor Author

Thank you for your contribution!! I'm concerned about XXE for XML, will have to look into the lib

Good callout. The way that libxml2 is initialized it does not look like it should resolve external entity references. However, I will try to add some tests to validate that specifically as well.

@cognitivegears
Copy link
Contributor Author

Thank you for your contribution!! I'm concerned about XXE for XML, will have to look into the lib

Good callout. The way that libxml2 is initialized it does not look like it should resolve external entity references. However, I will try to add some tests to validate that specifically as well.

Note - added some specific XXE validation testing as well.

@cognitivegears
Copy link
Contributor Author

Thanks so much for the feedback so far! I hope to have some more information later today or tomorrow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants