-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add incremental mode to analysis command. #719
Add incremental mode to analysis command. #719
Conversation
Check why test_analyze_and_parse.py test cases in travis fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Fix failing regression test cases.
- Add new test cases for the incremental analysis into tests/functional/analyze_and_parse
libcodechecker/libhandlers/store.py
Outdated
buildaction.analyzer_type = analyzer_types.CLANG_SA | ||
elif os.path.basename(f).startswith("clang-tidy_"): | ||
buildaction.analyzer_type = analyzer_types.CLANG_TIDY | ||
buildaction.analyzer_type = analyzer_types.CLANG_SA |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what do we use the buildaction.analyzer_type field in the strore? If not used, remove.
|
||
out_file_name = str(self.buildaction.analyzer_type) + \ | ||
'_' + analyzed_file_name + '_' + uid + '.plist' | ||
out_file_name = hashlib.md5(file_name).hexdigest() + '.plist' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the hash, keep the source file name also in the plist file name to help the users to identify which plist file belongs to which source file.
So use something like: out_file_name = analyzed_file_name + hashlib.md5(file_name).hexdigest() + '.plist'
@@ -137,6 +137,16 @@ def add_arguments_to_parser(parser): | |||
help="Annotate the ran analysis with a custom name in " | |||
"the created metadata file.") | |||
|
|||
parser.add_argument('-f', '--force', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest to change the parameter name to 'clean' from 'force'. We usually use the force parameter to enforce some potentially dangerous action like (rm -rf) which otherwise would trigger interactive questions from users. If this parameter is omitted, then there is no such question, but the analysis runs in incremental mode.
So the clean mode expresses the fact that we clean the previous results and perform a new analysis.
default=False, | ||
help="Delete analysis reports stored in the output " | ||
"directory. (By default, CodeChecker would keep " | ||
"reports and overwrites only those plist files " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the type of files is a command-line argument, I would not use the word plist
here.
Also, let's move this command between --type
and --name
. The order of arguments matter, and this is more logical to follow the "output" arguments.
7d35c14
to
1724fda
Compare
efa762a
to
d2cf1f1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing semantics:
- If a file failed to analyze earlier but now we can analyze it, the failure file should be deleted (as now exists).
- If a file is changed in the project, and a new log file is created (which contains only one build command for the modified file), and analyze is called, analyze says the compilation command is already analyzed. Thus, a changed file can not be reanalyzed without having to reanalyze the full project.
- If this newer analysis call succeeds, the plist should be rewritten.
- If it fails, the plist should be deleted, and a failure file should be written.
@@ -139,6 +143,12 @@ def check(check_data): | |||
context.severity_map, | |||
skip_handler) | |||
|
|||
rh.analyzed_source_file = source | |||
if os.path.exists(rh.analyzer_result_file): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The culprit the disability on reanalyzing changed files from newer build.json
s.
@@ -75,7 +78,7 @@ def worker_result_handler(results, metadata, output_path): | |||
source_map[f[:-7]] = sfile.read().strip() | |||
os.remove(f) | |||
|
|||
metadata['result_source_files'] = source_map | |||
metadata['result_source_files'].update(source_map) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Be careful here about source files that fail to analyze but successfully analyzed earlier on. They should be removed from the source map.
@@ -6,7 +6,7 @@ | |||
|
|||
from abc import ABCMeta | |||
import os | |||
import uuid | |||
import hashlib |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imports are alphabetically sorted.
help="Delete analysis reports stored in the output " | ||
"directory. (By default, CodeChecker would keep " | ||
"reports and overwrites only those files that " | ||
"were update by the current build command.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closing )
missing.
dest="clean", | ||
required=False, | ||
action='store_true', | ||
default=False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default is explain the help. We use default=argparse.SUPPRESS
so the help formatter doesn't create a (default: False)
to the help output.
NOTE! If default is suppress
, you can't use if args.clean:
as it will be an error if --clean
is not given, you need to use if 'clean' in args
instead!
cf0176f
to
04b18a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @csordasmarton. It works nicely! 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General directions look good, I have some questions though.
@@ -32,17 +32,18 @@ def worker_result_handler(results, metadata, output_path): | |||
successful_analysis = defaultdict(int) | |||
failed_analysis = defaultdict(int) | |||
skipped_num = 0 | |||
existed_num = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not really like the name existed
. Maybe reanalyzed
is better.
@@ -100,6 +109,7 @@ def check(check_data): | |||
output_dir, skip_handler = check_data | |||
|
|||
skipped = False | |||
existed = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above.
out_file_name = str(self.buildaction.analyzer_type) + \ | ||
'_' + analyzed_file_name + '_' + uid + '.plist' | ||
out_file_name = analyzed_file_name + '_' + \ | ||
hashlib.md5(file_name).hexdigest() + '.plist' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it sufficient to only include the filename? What about files that compiled multiple times with different compilation commands? Maybe it is okay to commit it like this for now, but I would include a FIXME note here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was puzzled for a bit as I specifically tested by hand if multiple files are analyzed in with different compilation setups in the buildlog and I properly got the plists duplicated with content appropriate for the build environment.
If you see right above in the previous line, the file_name
variable is created from the original build command. This variable should be renamed, though, it seems to confuse people who don't read diff with enough context. 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, maybe it is worth to add a FIXME that in the future we might want to filter the commands. E.g. do not consider separate warning settings as separate compilation commands.
@@ -289,11 +302,18 @@ def main(args): | |||
with open(args.skipfile, 'r') as skipfile: | |||
metadata['skip_data'] = [l.strip() for l in skipfile.readlines()] | |||
|
|||
# Update metadata dictionary with old values. | |||
metadata_file = os.path.join(args.output_path, 'metadata.json') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the current state of this metadata files? The requirement of metadata makes us incompatible with scan-build. Do we need it? Should we make it optional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no "incompatibility", the store
command works without metadata.json
if we define works
as puts the bugs found in the folder into the database
.
The pre-5.8
plist
command, as update mode was not supported, badly worked. It was expected from the user to always store a directory of plists into an empty run.
Metadata JSON is needed to connect plist
files to source files and source build actions, so that the current storage system (on master
) knows when to eliminate bugs linked to a (sourcefile, buildaction)
pair – because the new bugs will be stored completely replacing the old ones. With the onset of #709, #724 and the possible upcoming schema changes, the proper requirement on having a metadata.json
will be revised.
@@ -21,5 +21,5 @@ tidy_check.cpp:5:5: found assert() with side effect [misc-assert-side-effect] | |||
assert(++i); | |||
^ | |||
|
|||
clang-tidy found 1 defect(s) while analyzing tidy_check.cpp | |||
Found 1 defect(s) while analyzing tidy_check.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be great to have a test case where both sa and tidy finds defects to see that the numbers are added up correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Xazax-hun Mentioned a comment about a file_name
variable, and I agree with making a test on the number adding up.
4b95930
to
81739a5
Compare
I don't think it's a good idea to sum up the defects because the parse processing the plist files one-by-one. |
81739a5
to
b32a107
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Let's roll.
This commit resolves #702 issue.