Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: adding support for data uri sanitisation #31721

Merged

Conversation

sahil-seth
Copy link
Contributor

@sahil-seth sahil-seth commented Oct 1, 2024

Changes

This PR aims to address this feature request #31719 where we can now sanitise possible data uris coming in the logs.

Context

Documentation (please check one with an [x])

  • No documentation update is required

How I've tested my work (please select one)

I have verified these changes via:

  • Newly added/modified unit tests, or

lib/logger/utils.ts Outdated Show resolved Hide resolved
@sahil-seth sahil-seth changed the title adding support for data uri sanitisation feat: adding support for data uri sanitisation Oct 1, 2024
@sahil-seth
Copy link
Contributor Author

@rarkins should we go ahead with this change?
The workflows seem to be waiting an approval from a maintainer

rarkins
rarkins previously requested changes Oct 1, 2024
Copy link
Collaborator

@rarkins rarkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These Regex seem too aggressive and will match some unintentional/irrelevant data messages too

@sahil-seth
Copy link
Contributor Author

@rarkins Good point, but do you have any examples that you can think of, which might match to the new regex but should not ideally?
Asking cause it might help me optimise the regex quicker.

@rarkins
Copy link
Collaborator

rarkins commented Oct 1, 2024

No, because there are infinite examples possible, so you should provide some examples of where data uri's need redacting and we make the Regex tighter. For example should it occur only at the start of a field?

@sahil-seth
Copy link
Contributor Author

Here are some data URI examples which I aimed to redact with this change:

data:application/pkcs8;kid=example-key-123;base64,MIIBIjANBgkqhkiG9w0BAp1s1F3lQfZ9c/GbE=
data:application/jwt;kid=jwt-key-789;base64,eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
data:application/x-pem-file;kid=pem-cert-202;base64,LS0tLS1CZWdpbiBDRVJUSUZJQ0FURS0tLS0t
data:audio/wav;kid=audio-key-707;base64,UklGRjIAAABXQVZFZm10IBAAAAABAAEAESsAACJWAAACABAAZGF0YYAAAA

Also, upon a quick check with an AI assistant, the regex doesn't seem to match unnecessary strings apart from URLs or data URI like fields.

@sahil-seth
Copy link
Contributor Author

For example should it occur only at the start of a field?

Yes, it can. It might be safer to redact the whole string after data cause it takes away the knowledge of what kind of key it is, whether it is jwt or pkcs8 or pem-file along with the key itself.

WDYT?

@rarkins
Copy link
Collaborator

rarkins commented Oct 1, 2024

Please provide a full log example from today (you can redact parts of it yourself). Eg where does this appear today?

@sahil-seth
Copy link
Contributor Author

Screenshot 2024-10-01 at 12 28 30 PM

Here is an example

@rarkins
Copy link
Collaborator

rarkins commented Oct 1, 2024

Great, so can we start with the case where we match against ^data: and it's the whole value of the log message? (as in your examples)

@sahil-seth
Copy link
Contributor Author

@rarkins I'm not sure If I got you this time, could you please elaborate on the suggestion?

@sahil-seth
Copy link
Contributor Author

@rarkins just following up on above message,
do you mean we update the regex to include the strings starting with data: ?

@rarkins
Copy link
Collaborator

rarkins commented Oct 3, 2024

Yes, the regex should be strict to match only where the log content starts and ends with a data URI

@viceice
Copy link
Member

viceice commented Oct 3, 2024

we can even more tighten the regex to also require the content-type after data:

^data:[0-9a-z-]+/[0-9a-z-]+;.+

@sahil-seth
Copy link
Contributor Author

sahil-seth commented Oct 4, 2024

@rarkins @viceice
Just to clarify and make sure we are on the same page,

Are we talking about dataUriCredRe OR urlRe when we talk about tightening the regex?

lib/logger/utils.ts Outdated Show resolved Hide resolved
@rarkins
Copy link
Collaborator

rarkins commented Oct 4, 2024

It's better if you revert your changes to urlRe

@sahil-seth
Copy link
Contributor Author

sahil-seth commented Oct 4, 2024

@rarkins I've reverted the change to urlRe and added a "if" case for data URIs using the regex suggested above, LMK if this looks fine?

lib/logger/utils.ts Outdated Show resolved Hide resolved
lib/logger/utils.ts Outdated Show resolved Hide resolved
sahil-seth and others added 3 commits October 4, 2024 19:47
Co-authored-by: Michael Kriese <michael.kriese@visualon.de>
Co-authored-by: Michael Kriese <michael.kriese@visualon.de>
lib/logger/utils.ts Outdated Show resolved Hide resolved
lib/logger/utils.ts Outdated Show resolved Hide resolved
Co-authored-by: Michael Kriese <michael.kriese@visualon.de>
@viceice
Copy link
Member

viceice commented Oct 4, 2024

please run prettier to fix linter

@sahil-seth
Copy link
Contributor Author

@viceice done!

@sahil-seth
Copy link
Contributor Author

@rarkins @viceice Does this PR looks good to merge now?
once the pipelines are triggered...

@rarkins rarkins enabled auto-merge October 7, 2024 13:18
@viceice viceice dismissed rarkins’s stale review October 7, 2024 15:22

automerge enabled

@sahil-seth
Copy link
Contributor Author

sahil-seth commented Oct 8, 2024

@rarkins @viceice thanks for enabling auto-merge but looks like the pipelines need to be run once again?

@viceice
Copy link
Member

viceice commented Oct 8, 2024

@rarkins @viceice thanks for enabling auto-merge but looks like the pipelines need to be run once again?

stop merging main, we're using GitHub merge queue. if you push something then actions need new approval because you're a first time contributer

@rarkins rarkins added this pull request to the merge queue Oct 8, 2024
Merged via the queue into renovatebot:main with commit dbd69e9 Oct 8, 2024
38 checks passed
@renovate-release
Copy link
Collaborator

🎉 This PR is included in version 38.111.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

zemnmez-renovate-bot added a commit to zemn-me/monorepo that referenced this pull request Oct 8, 2024
##### [`v38.111.0](https://github.com/renovatebot/renovate/releases/tag/38.111.0)

##### Features

-   adding support for data uri sanitisation ([#31721](renovatebot/renovate#31721)) ([dbd69e9](renovatebot/renovate@dbd69e9))

##### Miscellaneous Chores

-   **deps:** update linters to v8.8.0 ([#31849](renovatebot/renovate#31849)) ([05accc8](renovatebot/renovate@05accc8))
github-merge-queue bot pushed a commit to zemn-me/monorepo that referenced this pull request Oct 8, 2024
##### [`v38.111.0](https://github.com/renovatebot/renovate/releases/tag/38.111.0)

##### Features

-   adding support for data uri sanitisation ([#31721](renovatebot/renovate#31721)) ([dbd69e9](renovatebot/renovate@dbd69e9))

##### Miscellaneous Chores

-   **deps:** update linters to v8.8.0 ([#31849](renovatebot/renovate#31849)) ([05accc8](renovatebot/renovate@05accc8))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants