Skip to content

refs #10610 / fixed #13994 - compile --rule-file pattern only once / extracted regular expressions code to separate file #6211

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

firewave
Copy link
Collaborator

No description provided.

@firewave
Copy link
Collaborator Author

This is mainly so the issues are reported on start-up instead on each file. It does not improve performance much.

The refactoring and improved testing will also make it easier to add support for PCRE2 since PCRE is EOL since quite a while (see https://trac.cppcheck.net/ticket/10610).

}

rule.regex = std::make_shared<Regex>(rule.pattern);
const std::string regex_err = rule.regex->compile();

This comment was marked as outdated.

@@ -293,6 +299,7 @@ class CPPCHECKLIB WARN_UNUSED Settings {
std::string id = "rule"; // default id
std::string summary;
Severity severity = Severity::style; // default severity
std::shared_ptr<Regex> regex;

This comment was marked as outdated.

lib/regex.h Outdated
std::string mPattern;

class Data;
std::shared_ptr<Data> mData;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be a shared pointer because the it is used in Settings are they are being copied (the copies in production code will hopefully go away soon) and we have dynamic memory.

This means that multiple threads will use this but as the used data is const there should not be any issues. I did not see anything related to thread safety in the PCRE documentation. I will check again though. I hope any potential issues will be caught by the CI.

Thinking about that we a --rule test which actually has multiple input files. Wil add that.

@firewave firewave marked this pull request as draft March 30, 2024 14:41
@firewave
Copy link
Collaborator Author

So it seems the shared pointers do not seem to work with forked processes.

But there is something else:

[processfs_1.cpp :0]: (error) Internal error: Child process exited with 1

I wonder where the whitespace in the filename is coming from.

@firewave

This comment was marked as outdated.

@firewave

This comment was marked as outdated.

@firewave firewave changed the title compile --rule-file pattern only once / extracted regular expressions code to separate file / improved errorhandling of --rule and --rule-file= compile --rule-file pattern only once / extracted regular expressions code to separate file Apr 11, 2024
@firewave firewave force-pushed the regex branch 3 times, most recently from 1fb855b to b300dc5 Compare July 2, 2024 07:44
@firewave firewave force-pushed the regex branch 3 times, most recently from e6c61a6 to 2b3ebf5 Compare July 16, 2024 09:01
@firewave firewave force-pushed the regex branch 2 times, most recently from 1b2c244 to 2ed6ba0 Compare October 19, 2024 12:21
@firewave firewave marked this pull request as ready for review October 19, 2024 12:22
@firewave firewave marked this pull request as draft October 19, 2024 12:23
@firewave firewave marked this pull request as ready for review October 19, 2024 12:46
@firewave
Copy link
Collaborator Author

So it seems the shared pointers do not seem to work with forked processes.

The problem is that the memory behind the shared pointer is leaked in the forked process:

=================================================================
==8818==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 1248 byte(s) in 13 object(s) allocated from:
    #0 0x560b0513391f in malloc (/home/runner/work/cppcheck/cppcheck/cmake.output/bin/testrunner+0x4a791f) (BuildId: 6af36c8d50c9763d4f05b83d43e79ba4537c85c0)
    #1 0x7f7c196469b9  (/lib/x86_64-linux-gnu/libpcre.so.3+0x559b9) (BuildId: 3982f316c887e3ad9598015fa5bae8557320476a)

Indirect leak of 1056 byte(s) in 4 object(s) allocated from:
    #0 0x560b0513391f in malloc (/home/runner/work/cppcheck/cppcheck/cmake.output/bin/testrunner+0x4a791f) (BuildId: 6af36c8d50c9763d4f05b83d43e79ba4537c85c0)
    #1 0x7f7c1961f2a3  (/lib/x86_64-linux-gnu/libpcre.so.3+0x2e2a3) (BuildId: 3982f316c887e3ad9598015fa5bae8557320476a)

SUMMARY: AddressSanitizer: 2304 byte(s) leaked in 17 allocation(s).

@firewave firewave marked this pull request as draft October 19, 2024 16:55
firewave referenced this pull request Apr 2, 2025
CMake 4.0 removed support for projects targeting versions prior to 3.5.
@firewave
Copy link
Collaborator Author

firewave commented Jul 6, 2025

I think we should just allow the leak for now so we can get this refactoring merged and can proceed with this.

@firewave firewave force-pushed the regex branch 2 times, most recently from 0937a56 to 085f2f5 Compare July 6, 2025 21:05
@firewave firewave marked this pull request as ready for review July 6, 2025 21:06
@firewave firewave force-pushed the regex branch 4 times, most recently from 1b28fd1 to e8d3bc1 Compare July 6, 2025 22:28
@firewave firewave changed the title compile --rule-file pattern only once / extracted regular expressions code to separate file refs #10610 - compile --rule-file pattern only once / extracted regular expressions code to separate file Jul 7, 2025
@firewave firewave changed the title refs #10610 - compile --rule-file pattern only once / extracted regular expressions code to separate file refs #10610 / fixed #13994 - compile --rule-file pattern only once / extracted regular expressions code to separate file Jul 7, 2025
@firewave firewave added the merge-after-next-release Wait with merging this PR until after the next Release label Jul 7, 2025
@danmar
Copy link
Owner

danmar commented Jul 8, 2025

I would really like to drop --rule-file completely. It's an inferior format to --addon. However it seems some users hates python completely and refuse to deal with it.
I would rather drop the PCRE dependency completely. Why not use std::regexp. I know that --rule-file will be slower then but this feature has been deprecated for 10-15 years. Also if you have 100 files in the project then the rule regexp is compiled 100 times or so, regexp compilation is not THAT slow.. I assume that cppcheck parser will take way longer time than the regexp..

@firewave
Copy link
Collaborator Author

firewave commented Jul 9, 2025

So let's deprecate the rule (build) options without a target in the next version and switch the code to use std::regex (I have that prepared locally) so we can drop the PCRE dependency with the release after. I think it is fair for a deprecated feature to deteriorate in that way.

I would really like to drop --rule-file completely. It's an inferior format to --addon. However it seems some users hates python completely and refuse to deal with it.

We could re-write rules as addons internally and thus no longer need the actual code. But addons are magnitudes slower because of the process invocation and that would be even worse than using std::regex. Needs some performance data generated.

I would rather drop the PCRE dependency completely. Why not use std::regexp. I know that --rule-file will be slower then but this feature has been deprecated for 10-15 years. Also if you have 100 files in the project then the rule regexp is compiled 100 times or so, regexp compilation is not THAT slow.. I assume that cppcheck parser will take way longer time than the regexp..

It is only compiled once. It is the matching which is performed one the analysis.

@firewave
Copy link
Collaborator Author

firewave commented Jul 9, 2025

Also if it is deprecated, this PR should be safe to be merged for the current release.

@firewave firewave removed the merge-after-next-release Wait with merging this PR until after the next Release label Jul 9, 2025
@danmar
Copy link
Owner

danmar commented Jul 11, 2025

It is only compiled once.

That is what this PR is about isn't it? Otherwise the PR title is misleading. How much time does PR save if I have 100 files? Some rough test is fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants