Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurability of analyzer versions #3997

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

cservakt
Copy link
Collaborator

@cservakt cservakt commented Aug 28, 2023

The modification can be used to validate the version of each analyzer. The user can specify all desired analyzers after the --analyzers flag, as was possible before. Also, the exact version number can be typed for each analyzer, separated by an '==' char. For example: 'CodeChecker analyze compile_commands.json -o reports --analyzers clangsa==14.0.0 cppcheck==2.7'. If the version number was not the same as in the current environment, error message would be logged and the system exit would trigger. The user can enumerate the analyzers with or without versions, for example: 'CodeChecker analyze compile_commands.json -o reports --analyzers clangsa=14.0.0 cppcheck'.

@cservakt cservakt added WIP 💣 Work In Progress CLI 💻 Related to the command-line interface, such as the cmd, store, etc. commands analyzer 📈 Related to the analyze commands (analysis driver) new feature 👍 New feature request labels Aug 28, 2023
@cservakt cservakt added this to the release 6.23.0 milestone Aug 28, 2023
Copy link
Collaborator

@Szelethus Szelethus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a big fan of hard errors in favour of warnings. CodeChecker has a lot of output, and these warnings are hard to miss. Also, if I explicitly request an analyzer on a specific version, and the specified version is not found, I might prefer to fix that before sinking countless hours into an analysis.

This idea might clash with the philosophy we have with clang-tidy not having a single checker enabled (in which case we actually prefer a warning to a hard error).

Also, there migth be several versions of the binary available from config files or PATH, but we we don't try to look for them, is that correct?

for analyzer_name in analyzers:
analyzer_name, analyzer_version = analyzer_name.split('=', 1) \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
analyzer_name, analyzer_version = analyzer_name.split('=', 1) \
analyzer_name, requested_version = analyzer_name.split('=', 1) \

Does this sound any better? Not sure, just an idea.

Copy link
Collaborator Author

@cservakt cservakt Oct 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I've already fixed the variable name.

Copy link
Contributor

@bruntib bruntib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvements, thanks! Also, some test code could check this new feature.

"'config/package_layout.json' file!",
"'PATH' environment variable, the "
"'config/package_layout.json' file "
"and the argument parameters!",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"and the argument parameters!",
"and the --analyzers flag!",

@@ -152,9 +152,10 @@ def __add_plugin_load_flags(cls, analyzer_cmd: List[str]):
analyzer_cmd.extend(["-load", plugin])

@classmethod
def get_version(cls, env=None):
def get_version(cls, analyzer_binary=None, env=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, analyzer_binary shouldn't be a parameter of this function. an analyzer class belongs to a specific analyzer binary which is stored in cls.analyzer_binary().

This comment goes to the get_version() and version_info() functions in all analyzer classes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, it is better to have analyzer_binary parameter, because there is a version_compatible class method in the cppchecker analyzer that also calls the version_info method, and in this case the analyzer_binary parameter must be specified as configured_binary. The version_compatible methods of the analyzers just return true and they do not validate their version, but it is better to have the same syntax for all version_info methods.

Comment on lines 213 to 214
else:
return []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch is not needed. The function could return None by default if the version number can't be fetched.

@@ -347,7 +347,6 @@ def add_arguments_to_parser(parser):
dest='analyzers',
metavar='ANALYZER',
required=False,
choices=analyzer_types.supported_analyzers,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this in analyzer/codechecker_analyzer/cmd/check.py too.

@@ -347,7 +347,6 @@ def add_arguments_to_parser(parser):
dest='analyzers',
metavar='ANALYZER',
required=False,
choices=analyzer_types.supported_analyzers,
default=argparse.SUPPRESS,
help="Run analysis only with the analyzers "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The help message could be extended by some info about the version number. Don't forget to apply the changes in docs/analyzer/user_guide.md too.

for analyzer_name in analyzers:
analyzer_name, analyzer_version = analyzer_name.split('=', 1) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should consider using the same format as other tools are supporting when providing version info. For example, in requirements.txt you can write this: my_dependency==1.2.3. In the future > could be used too, but for now equality check is enough.

@cservakt
Copy link
Collaborator Author

I'm a big fan of hard errors in favour of warnings. CodeChecker has a lot of output, and these warnings are hard to miss. Also, if I explicitly request an analyzer on a specific version, and the specified version is not found, I might prefer to fix that before sinking countless hours into an analysis.

This idea might clash with the philosophy we have with clang-tidy not having a single checker enabled (in which case we actually prefer a warning to a hard error).

Also, there migth be several versions of the binary available from config files or PATH, but we we don't try to look for them, is that correct?

I think the warning message is sufficient in this case. We print similar message when other analyzer problems occur, for example, when the binary of one analyzer is not found, but the others are.

@cservakt cservakt requested a review from dkrupp as a code owner August 30, 2023 11:44
@cservakt cservakt force-pushed the analyzer-version-config branch 5 times, most recently from 4315b5a to 420cda9 Compare August 30, 2023 12:27
The modification can be used to validate the version of each analyzer.
The user can specify all desired analyzers after the --analyzers flag, as was possible before. Also, the exact version number can be typed for each analyzer, separated by an '==' char. For example: 'CodeChecker analyze compile_commands.json -o reports --analyzers clangsa==14.0.0 cppcheck==2.7'.
If the version number was not the same as in the current environment, the analyzer would be disabled. The user can enumerate the analyzers with or without versions, for example: 'CodeChecker analyze compile_commands.json -o reports --analyzers clangsa==14.0.0 cppcheck'.
@cservakt cservakt requested a review from bruntib August 30, 2023 12:46
@Szelethus
Copy link
Collaborator

I don't intend to die on this hill, but I'm not yet convinced that a soft warning is the way to go. Breaking existing functionality is never a step to be taken lightly, but this is a new feature, we actually can be quite strict.

I foresee users setting an explicit version of the analyzer trying to achieve a reproducable analysis across different machines. I don't really see the scenario where the desired behaviour would be to just (almost) silently disable it and go on.

Also, you haven't commented on the other idea, checking whether the analyzer with the requested version is available from the path. I suspect we only get hold of a single binary and look no further. This could be especially cool for people that need to have multiple compiler versions on hand (which is rather common among C++ developers).

The modification can be used to validate the version of each analyzer. The user can specify all desired analyzers after the --analyzers flag, as was possible before. Also, the exact version number can be typed for each analyzer, separated by an '==' char. For example: 'CodeChecker analyze compile_commands.json -o reports --analyzers clangsa==14.0.0 cppcheck==2.7'. If the version number was not the same as in the current environment, error message would be logged and the system exit would trigger. The user can enumerate the analyzers with or without versions, for example: 'CodeChecker analyze compile_commands.json -o reports --analyzers clangsa=14.0.0 cppcheck'.
@cservakt cservakt removed the WIP 💣 Work In Progress label Sep 22, 2023
@cservakt
Copy link
Collaborator Author

cservakt commented Oct 6, 2023

I don't intend to die on this hill, but I'm not yet convinced that a soft warning is the way to go. Breaking existing functionality is never a step to be taken lightly, but this is a new feature, we actually can be quite strict.

I foresee users setting an explicit version of the analyzer trying to achieve a reproducable analysis across different machines. I don't really see the scenario where the desired behaviour would be to just (almost) silently disable it and go on.

Also, you haven't commented on the other idea, checking whether the analyzer with the requested version is available from the path. I suspect we only get hold of a single binary and look no further. This could be especially cool for people that need to have multiple compiler versions on hand (which is rather common among C++ developers).

Now, we give error message if an analyzer version is incorrect.

In the future, it would be great not only to validate the version number, but also to allow the user to select it.

@@ -140,8 +139,7 @@ def perform_analysis(args, skip_handlers, actions, metadata_tool,
analyzers = args.analyzers if 'analyzers' in args \
else analyzer_types.supported_analyzers
analyzers, errored = analyzer_types.check_supported_analyzers(analyzers)
analyzer_types.check_available_analyzers(analyzers, errored)

analyzer_types.check_available_analyzers(analyzers, errored, args)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a rather hard project a while ago, trying to factor args out of a function mostly responsible for CodeChecker cmd diff (#3863). The problem with args is that, in my mind, its supposed to be a very temporary object -- the 'main' file or function receives this object as a representation of the user's program arguments, parses it, and discards it.

How would you unit test a function that requires monstrositty of an object? You'd spend 30 minutes trying to assemble all the small little parts of args until the test finally runs. If you decide to thread args out of a function, and all the functions that it calls that also take args, you'll spend about a week doing it.

I am perfectly aware that I'm speaking of greater evils than what is seen here (pretty much a slippery slope argument), but these, imo, decent reasons against it :^)

def analyzer_env(cls):
"""
A subclass should have a analyzer_env method
to return the env for analyzer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is about about as descriptive, as saying this:

  • return the context of the analyzer
  • return the parameters of the analyzer
  • return the stuff about the analyzer
  • return the details of the analyzer
  • return the analyzer data

You get my point :^) Do you mean to return the environmental variables? Why is this important? Do analyzers run with their own environmental variables?

def analyzer_binary(cls):
"""
A subclass should have a analyzer_binary method
to return the bin of analyzer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this ever be None?

def version_info(cls):
"""
A subclass should have a version_info method
to return with the analyzer's version number.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the expected format? Like this: 0.3.5 or like this: v1.1.? Must it be valid as per StrictVersion's definition?

requested_version = StrictVersion(requested_version)
if requested_version != bin_version:
LOG.error(
f"Given version: {requested_version}, found version "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f"Given version: {requested_version}, found version "
f"Requested version: {requested_version}, found version "

Comment on lines +330 to +332
Run the Cppcheck version command
parse the output
and return the analyzer's version.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Run the Cppcheck version command
parse the output
and return the analyzer's version.
Run the Cppcheck version command parse the output
and return the analyzer's version.

@@ -254,7 +254,8 @@ analyzer arguments:
--analyzers ANALYZER [ANALYZER ...]
Run analysis only with the analyzers specified.
Currently supported analyzers are: clangsa, clang-
tidy.
tidy, cppcheck. Version number can be also specified
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we give an example? Also, did we ever change the actual --help output? I can't see it anywhere.


@classmethod
@abstractmethod
def analyzer_env(cls):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, there is no need to expose the analyzer's environment on this interface. The specific analyzers should know internally what environment to use and this doesn't need to be visible outside.

"""
if analyzers and (not errored
or 'analyzers' not in args
or not any("analyzer" in e[1].lower() for e in errored)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't check the content of the description string for the reasons of the error. If an analyzer is in errored then it's either not executable, or the version doesn't match. In both cases they are in errored.
Actually this whole first branch is not necessary. We could emit an error in the two "else" branches. Even they are so similar that maybe they can be merged together.

@@ -184,7 +202,18 @@ def check_supported_analyzers(analyzers):
# Check version compatibility of the analyzer binary.
if analyzer_bin:
analyzer = supported_analyzers[analyzer_name]
if not analyzer.version_compatible(analyzer_bin, check_env):
if requested_version:
bin_version = StrictVersion(str(analyzer.version_info()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case of ClangSA the analyzer.version_info() returns a ClangVersionInfo object which is a custom object. This doesn't convert to a string that could be fed by StrictVersion. If ClangSA.version_info() would also return a StrictVersion then no conversion would be needed.

Comment on lines +257 to +258
tidy, cppcheck. Version number can be also specified
for analyzers to verify them.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put this extra text to the argparse "help" attribute too. This documentation is supposed to be a copy-pase of the actual CodeChecker analyze --help page.

Copy link
Member

@dkrupp dkrupp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add testcases.

Also some use cases are not working:

  1. specifying multiple required analyzers without version should work

CodeChecker analyze . --analyzers clang-tidy,clangsa -o reports

[ERROR 2023-10-11 20:15] - Analyzer 'clang-tidy,clangsa' is enabled but CodeChecker is failed to execute analysis with it: 'Analyzer unsupported by CodeChecker!'. Please check your 'PATH' environment variable, the 'config/package_layout.json' file and the --analyzers flag!
[ERROR 2023-10-11 20:15] - Failed to run command because one or more given analyzers cannot be found on your machine!

  1. Specifying some analyzers with version should work, but not it throws an exception:

(CodeChecker venv-dev) ednikru@seliiuts02064[20:46][ednikru/test-projects/xerces-c]$ CodeChecker analyze . --analyzers clangsa==13.0.0, clang-tidy==13.0.0 -o reports
Traceback (most recent call last):
File "/local/ednikru/codechecker/build/CodeChecker/lib/python3/codechecker_common/cli.py", line 209, in main
sys.exit(args.func(args))
File "/local/ednikru/codechecker/build/CodeChecker/lib/python3/codechecker_analyzer/cmd/analyze.py", line 1076, in main
analyzer.perform_analysis(args, skip_handlers, actions, metadata_tool,
File "/local/ednikru/codechecker/build/CodeChecker/lib/python3/codechecker_analyzer/analyzer.py", line 141, in perform_analysis
analyzers, errored = analyzer_types.check_supported_analyzers(analyzers)
File "/local/ednikru/codechecker/build/CodeChecker/lib/python3/codechecker_analyzer/analyzers/analyzer_types.py", line 207, in check_supported_analyzers
requested_version = StrictVersion(requested_version)
File "/app/vbuild/UBUNTU18-x86_64/python/3.9.17/lib/python3.9/distutils/version.py", line 40, in init
self.parse(vstring)
File "/app/vbuild/UBUNTU18-x86_64/python/3.9.17/lib/python3.9/distutils/version.py", line 137, in parse
raise ValueError("invalid version number '%s'" % vstring)
ValueError: invalid version number '13.0.0,

See also the small suggestions inline.

@@ -292,7 +292,6 @@ def add_arguments_to_parser(parser):
dest='analyzers',
metavar='ANALYZER',
required=False,
choices=analyzer_types.supported_analyzers,
default=argparse.SUPPRESS,
help="Run analysis only with the analyzers "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I miss the version definition information from the help text.

Run analysis only with the analyzers specified. Currently supported analyzers are: clangsa, clang-tidy, cppcheck.
Optionally, you can specify the analyzer version as follows: clangsa==13.0.0, clang-tidy=12.1.0, cppcheck
The version string is prefix matched. If the version information is omitted, any analyzer version will be accepted, but all the lister analyzers must be available.

"'PATH' environment variable and the "
"'config/package_layout.json' file!",
analyzer_binary, reason)
LOG.error("Analyzer '%s' is enabled but CodeChecker is failed to "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this error message because it does not tell me where it found the analyzer binary.

Consider this error message instead (we dont need the "Please check your PATH..." part) :
CodeChecker analyze . --analyzers clangsa==13.0.1 -o reports
[ERROR 2023-10-11 20:09] - clangsa analyzer with the required version "13.0.1" not found. clangsa analyzer with version "13.0" found instead at /app/vbuild/UBUNTU18-x86_64/clang/13.0.0/bin/clang
[ERROR 2023-10-11 20:09] - Failed to run command because one or more given analyzers could not be found on your machine

@@ -1098,7 +1099,8 @@ analyzer arguments:
--analyzers ANALYZER [ANALYZER ...]
Run analysis only with the analyzers specified.
Currently supported analyzers are: clangsa, clang-
tidy.
tidy, cppcheck. Version number can be also specified
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please describe in this documentation the version matching sytnax.(prefix matching?) in this section of the document https://github.com/Ericsson/codechecker/blob/master/docs/analyzer/user_guide.md#configuring-clang-version

@dkrupp dkrupp modified the milestones: release 6.23.0, release 6.24.0 Nov 9, 2023
@bruntib bruntib marked this pull request as draft March 21, 2024 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
analyzer 📈 Related to the analyze commands (analysis driver) CLI 💻 Related to the command-line interface, such as the cmd, store, etc. commands new feature 👍 New feature request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants