Add incremental mode to analysis command. #719

csordasmarton · 2017-07-11T09:13:55Z

This commit resolves issue #702.

dkrupp · 2017-07-17T08:26:36Z

Check why the test_analyze_and_parse.py test cases fail on Travis.

dkrupp

Fix failing regression test cases.
Add new test cases for the incremental analysis to tests/functional/analyze_and_parse.

dkrupp · 2017-07-17T08:32:56Z

libcodechecker/libhandlers/store.py

-        buildaction.analyzer_type = analyzer_types.CLANG_SA
-    elif os.path.basename(f).startswith("clang-tidy_"):
-        buildaction.analyzer_type = analyzer_types.CLANG_TIDY
+    buildaction.analyzer_type = analyzer_types.CLANG_SA


What do we use the buildaction.analyzer_type field for in the store? If it is not used, remove it.

dkrupp · 2017-07-17T08:37:35Z

libcodechecker/analyze/analyzers/result_handler_base.py


-            out_file_name = str(self.buildaction.analyzer_type) + \
-                '_' + analyzed_file_name + '_' + uid + '.plist'
+            out_file_name = hashlib.md5(file_name).hexdigest() + '.plist'


Besides the hash, also keep the source file name in the plist file name, so that users can tell which plist file belongs to which source file.
So use something like: out_file_name = analyzed_file_name + hashlib.md5(file_name).hexdigest() + '.plist'
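A minimal sketch of the suggested naming scheme, assuming Python 3 (where hashlib.md5 needs bytes). The function name plist_name and the hashed_value parameter are hypothetical; what exactly gets hashed is whatever the PR's file_name variable holds, which is passed here as a plain string:

```python
import hashlib
import os


def plist_name(source_path, hashed_value):
    """Return '<source file name>_<md5 hex digest>.plist'.

    hashed_value is a hypothetical stand-in for the string the PR hashes.
    """
    analyzed_file_name = os.path.basename(source_path)
    digest = hashlib.md5(hashed_value.encode('utf-8')).hexdigest()
    return analyzed_file_name + '_' + digest + '.plist'


# The readable prefix tells the user which source file the plist belongs to.
print(plist_name('/src/main.cpp', 'g++ -c /src/main.cpp -o main.o'))
```

Because the name is derived only from its inputs, analyzing the same input again yields the same file name, which is what lets a later run overwrite the earlier result instead of adding a new file.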

dkrupp · 2017-07-17T08:41:19Z

libcodechecker/libhandlers/analyze.py

@@ -137,6 +137,16 @@ def add_arguments_to_parser(parser):
                        help="Annotate the ran analysis with a custom name in "
                             "the created metadata file.")

+    parser.add_argument('-f', '--force',


I suggest changing the parameter name from 'force' to 'clean'. We usually use a force parameter to enforce some potentially dangerous action (like rm -rf) that would otherwise trigger interactive questions to the user. If this parameter is omitted, no such question is asked; the analysis simply runs in incremental mode.

So the clean mode expresses the fact that we clean the previous results and perform a new analysis.

whisperity · 2017-07-17T09:09:38Z

libcodechecker/libhandlers/analyze.py

+                        default=False,
+                        help="Delete analysis reports stored in the output "
+                             "directory. (By default, CodeChecker would keep "
+                             "reports and overwrites only those plist files "


Because the type of files is a command-line argument, I would not use the word plist here.

Also, let's move this argument between --type and --name. The order of arguments matters, and it is more logical for this one to follow the "output" arguments.

whisperity

Missing semantics:

- If a file failed to analyze earlier but now we can analyze it, the failure file should be deleted (as now exists).
- If a file is changed in the project, a new log file is created (containing only one build command for the modified file), and analyze is called, analyze says the compilation command is already analyzed. Thus, a changed file cannot be reanalyzed without reanalyzing the full project.
  - If this newer analysis call succeeds, the plist should be rewritten.
  - If it fails, the plist should be deleted, and a failure file should be written.
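The requested behaviour could be sketched roughly as follows. This is only an illustration: record_result, the 'failed' subdirectory and the '.txt' failure file are hypothetical names, not CodeChecker's actual layout. The point is that each build action leaves behind exactly one of the two files after every run:

```python
import os
import tempfile


def record_result(output_dir, result_name, succeeded, content):
    """Write the outcome of one analysis and drop the stale opposite file."""
    plist = os.path.join(output_dir, result_name + '.plist')
    failed_dir = os.path.join(output_dir, 'failed')  # hypothetical location
    failure = os.path.join(failed_dir, result_name + '.txt')
    os.makedirs(failed_dir, exist_ok=True)

    # On success keep the plist and drop the failure file; on failure
    # keep the failure file and drop the (now outdated) plist.
    keep, drop = (plist, failure) if succeeded else (failure, plist)
    if os.path.exists(drop):
        os.remove(drop)
    with open(keep, 'w') as out:
        out.write(content)
    return keep


with tempfile.TemporaryDirectory() as out_dir:
    record_result(out_dir, 'main.cpp_abc', False, 'compile error')
    kept = record_result(out_dir, 'main.cpp_abc', True, '<plist/>')
    print(os.path.basename(kept))                       # main.cpp_abc.plist
    print(os.listdir(os.path.join(out_dir, 'failed')))  # []
```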

whisperity · 2017-07-21T11:57:48Z

libcodechecker/analyze/analysis_manager.py

@@ -139,6 +143,12 @@ def check(check_data):
                                                          context.severity_map,
                                                          skip_handler)

+            rh.analyzed_source_file = source
+            if os.path.exists(rh.analyzer_result_file):


This is the culprit behind the inability to reanalyze changed files from newer build.json files.

whisperity · 2017-07-21T11:59:18Z

libcodechecker/analyze/analysis_manager.py

@@ -75,7 +78,7 @@ def worker_result_handler(results, metadata, output_path):
            source_map[f[:-7]] = sfile.read().strip()
        os.remove(f)

-    metadata['result_source_files'] = source_map
+    metadata['result_source_files'].update(source_map)


Be careful here about source files that now fail to analyze but were successfully analyzed earlier on. They should be removed from the source map.
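A rough sketch of what that could look like, assuming the source map is a plain dict from result file name to source file path. The helper merge_source_map and the failed_results parameter are hypothetical, for illustration only:

```python
def merge_source_map(old_map, new_map, failed_results):
    """Merge this run's result -> source mapping into the stored one.

    failed_results lists result file names whose analysis failed in the
    current run; their stale entries must not survive the merge.
    """
    merged = dict(old_map)      # do not mutate the caller's dict
    merged.update(new_map)      # results of this run override old ones
    for result_file in failed_results:
        merged.pop(result_file, None)
    return merged


old = {'a.cpp_1.plist': '/src/a.cpp', 'b.cpp_2.plist': '/src/b.cpp'}
new = {'c.cpp_3.plist': '/src/c.cpp'}
print(sorted(merge_source_map(old, new, ['b.cpp_2.plist'])))
# ['a.cpp_1.plist', 'c.cpp_3.plist']
```

A bare update() keeps every old key, so an entry for a file that analyzed fine last time but fails now would otherwise linger.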

whisperity · 2017-07-21T11:59:35Z

libcodechecker/analyze/analyzers/result_handler_base.py

@@ -6,7 +6,7 @@

 from abc import ABCMeta
 import os
-import uuid
+import hashlib


Imports are alphabetically sorted.

whisperity · 2017-07-21T12:00:30Z

libcodechecker/libhandlers/analyze.py

+                        help="Delete analysis reports stored in the output "
+                             "directory. (By default, CodeChecker would keep "
+                             "reports and overwrites only those files that "
+                             "were update by the current build command.")


Closing ) missing.

whisperity · 2017-07-21T12:04:44Z

libcodechecker/libhandlers/analyze.py

+                        dest="clean",
+                        required=False,
+                        action='store_true',
+                        default=False,


The default is explained in the help text. We use default=argparse.SUPPRESS so that the help formatter doesn't add a (default: False) to the help output.

NOTE! If the default is suppressed, you can't use if args.clean:, as it will raise an error when --clean is not given; you need to use if 'clean' in args instead!
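A small standalone illustration of that behaviour with the standard argparse module. The parser below is a stripped-down stand-in, not the real CodeChecker parser, and the use of ArgumentDefaultsHelpFormatter is an assumption made to show where the (default: ...) text would come from:

```python
import argparse

parser = argparse.ArgumentParser(
    formatter_class=argparse.ArgumentDefaultsHelpFormatter)
parser.add_argument('--clean',
                    dest="clean",
                    required=False,
                    action='store_true',
                    default=argparse.SUPPRESS,
                    help="Delete analysis reports stored in the output "
                         "directory.")

args_without = parser.parse_args([])
args_with = parser.parse_args(['--clean'])

# With SUPPRESS the attribute is simply absent when the flag is not given,
# so args_without.clean would raise AttributeError.
print('clean' in args_without)  # False
print('clean' in args_with)     # True
```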

whisperity

Thank you, @csordasmarton. It works nicely! 🙂

Xazax-hun

General directions look good, I have some questions though.

Xazax-hun · 2017-07-24T11:33:46Z

libcodechecker/analyze/analysis_manager.py

@@ -32,17 +32,18 @@ def worker_result_handler(results, metadata, output_path):
    successful_analysis = defaultdict(int)
    failed_analysis = defaultdict(int)
    skipped_num = 0
+    existed_num = 0


I do not really like the name existed. Maybe reanalyzed is better.

Xazax-hun · 2017-07-24T11:34:39Z

libcodechecker/analyze/analysis_manager.py

@@ -100,6 +109,7 @@ def check(check_data):
        output_dir, skip_handler = check_data

    skipped = False
+    existed = False


Same as above.

Xazax-hun · 2017-07-24T11:38:20Z

libcodechecker/analyze/analyzers/result_handler_base.py

-            out_file_name = str(self.buildaction.analyzer_type) + \
-                '_' + analyzed_file_name + '_' + uid + '.plist'
+            out_file_name = analyzed_file_name + '_' + \
+                hashlib.md5(file_name).hexdigest() + '.plist'


Is it sufficient to only include the filename? What about files that are compiled multiple times with different compilation commands? Maybe it is okay to commit it like this for now, but I would include a FIXME note here.

I was puzzled for a bit, as I specifically tested by hand what happens when files are analyzed with different compilation setups in the build log, and I properly got the plists duplicated, with content appropriate for each build environment.

If you look right above, in the previous line, the file_name variable is created from the original build command. This variable should be renamed, though; it seems to confuse people who don't read the diff with enough context. 🙂

In this case, maybe it is worth adding a FIXME that in the future we might want to filter the commands, e.g. not consider separate warning settings as separate compilation commands.
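One possible shape for such filtering, purely as a sketch of the FIXME idea and not something this PR implements. The helper command_key is hypothetical, and the rule used here (drop arguments starting with -W, except the -Wl, / -Wa, / -Wp, pass-through options, which are not warning settings) is an assumption about what "warning settings" should cover:

```python
import hashlib
import shlex

# These -W prefixes forward options to the linker, assembler and
# preprocessor; they are not warning flags, so they stay in the key.
PASS_THROUGH = ('-Wl,', '-Wa,', '-Wp,')


def command_key(build_command):
    """Hash a build command while ignoring its warning flags."""
    args = shlex.split(build_command)
    relevant = [a for a in args
                if not a.startswith('-W') or a.startswith(PASS_THROUGH)]
    return hashlib.md5(' '.join(relevant).encode('utf-8')).hexdigest()


same = command_key('g++ -Wall -c a.cpp') == command_key('g++ -Wextra -c a.cpp')
print(same)  # True
```

With a key like this, two build commands that differ only in warning flags would map to the same result file, while a differing -D or -I option would still produce a separate one.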

Xazax-hun · 2017-07-24T11:39:48Z

libcodechecker/libhandlers/analyze.py

@@ -289,11 +302,18 @@ def main(args):
        with open(args.skipfile, 'r') as skipfile:
            metadata['skip_data'] = [l.strip() for l in skipfile.readlines()]

+    # Update metadata dictionary with old values.
+    metadata_file = os.path.join(args.output_path, 'metadata.json')


What is the current state of these metadata files? The requirement of metadata makes us incompatible with scan-build. Do we need it? Should we make it optional?

There is no "incompatibility": the store command works without metadata.json, if we define "works" as "puts the bugs found in the folder into the database".

The pre-5.8 plist command worked poorly, as update mode was not supported. The user was expected to always store a directory of plists into an empty run.

Metadata JSON is needed to connect plist files to source files and source build actions, so that the current storage system (on master) knows when to eliminate bugs linked to a (sourcefile, buildaction) pair – because the new bugs will be stored completely replacing the old ones. With the onset of #709, #724 and the possible upcoming schema changes, the proper requirement on having a metadata.json will be revised.

Xazax-hun · 2017-07-24T11:41:47Z

tests/functional/analyze_and_parse/test_files/tidy_check.output

@@ -21,5 +21,5 @@ tidy_check.cpp:5:5: found assert() with side effect [misc-assert-side-effect]
    assert(++i);
    ^

-clang-tidy found 1 defect(s) while analyzing tidy_check.cpp
+Found 1 defect(s) while analyzing tidy_check.cpp


Maybe it would be great to have a test case where both sa and tidy find defects, to see that the numbers are added up correctly.

whisperity

@Xazax-hun mentioned a comment about a file_name variable, and I agree with making a test on the numbers adding up.

csordasmarton · 2017-07-24T14:36:21Z

I don't think it's a good idea to sum up the defects, because parse processes the plist files one by one.

whisperity

Agreed. Let's roll.

csordasmarton added the enhancement 🌟 label Jul 11, 2017

whisperity added this to the 6.0 pre1 milestone Jul 14, 2017

dkrupp requested changes Jul 17, 2017

dkrupp reviewed Jul 17, 2017

whisperity reviewed Jul 17, 2017

bruntib force-pushed the version6 branch from 502343d to a4e53b2 Compare July 17, 2017 11:53

csordasmarton force-pushed the incremental_analyze branch 5 times, most recently from 7d35c14 to 1724fda Compare July 18, 2017 15:41

whisperity force-pushed the version6 branch from a4e53b2 to ed9b726 Compare July 19, 2017 06:31

csordasmarton force-pushed the incremental_analyze branch 6 times, most recently from efa762a to d2cf1f1 Compare July 21, 2017 09:19

whisperity self-requested a review July 21, 2017 10:02

whisperity requested changes Jul 21, 2017

csordasmarton force-pushed the incremental_analyze branch 2 times, most recently from cf0176f to 04b18a9 Compare July 21, 2017 14:43

whisperity self-requested a review July 21, 2017 15:03

whisperity approved these changes Jul 24, 2017

Xazax-hun reviewed Jul 24, 2017

whisperity requested changes Jul 24, 2017

csordasmarton force-pushed the incremental_analyze branch 2 times, most recently from 4b95930 to 81739a5 Compare July 24, 2017 14:29

Add incremental mode to analysis command.

b32a107

csordasmarton force-pushed the incremental_analyze branch from 81739a5 to b32a107 Compare July 24, 2017 14:45

whisperity approved these changes Jul 24, 2017

whisperity merged commit 760a304 into Ericsson:version6 Jul 24, 2017

csordasmarton deleted the incremental_analyze branch July 31, 2017 07:10
