Commit 7b4721e

Authored and committed by ABaldwinHunter
Tune python duplication remediation points
- Reduce the Python AST mass threshold from 40 to 32 (Classic's is 28).
- Update the point formula to match the Classic computation: change from `remediation_points = x * score` to `remediation_points = x + (score - threshold) * y`.

This change increases parity with Classic and overall increases both the number of duplication issues reported for Python and the penalties assigned to them.

Note on mass difference: the mass of a node corresponds to its size. Specifying a minimum threshold tells Code Climate to ignore duplication in nodes below a certain size (e.g. one-liners). An issue's Flay score is the result of its **mass** * **number of occurrences** (or number of occurrences ^ 2, if the code is identical).

Comparing issue **mass** between the parsers in Platform and Classic:

| Platform | Classic | Platform / Classic |
| -------- | ------- | ------------------ |
| 42       | 39      | 1.07               |
| 45       | 40      | 1.125              |
| 66       | 57      | 1.15789            |
| 123      | 109     | 1.1284             |
| 126      | 93      | 1.3548             |
| 246      | 218     | 1.1284             |

I've estimated the factor of mass difference to be ~1.15. Since the default Python duplication mass threshold on Classic was 28, and `28 * 1.15 = 32.19999`, I've lowered our current default threshold for Python on Platform from 40 to 32.

On Classic, Python duplication issues were penalized in terms of remediation points as follows: `1_500_000 + overage * 50_000`, where overage = **score** - **threshold** (and score = f(mass)). I've kept the base points but lowered the per-overage cost to 30_000 to account for the difference in mass parsing (which gets amplified in the points calculation).
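The threshold and points arithmetic described above can be sketched in Ruby. This is a back-of-envelope check, not the engine's actual code; the example mass of 82 is hypothetical:

```ruby
# Deriving the new Platform defaults from the Classic parity measurements.
CLASSIC_THRESHOLD = 28
MASS_RATIO = 1.15 # estimated Platform mass / Classic mass

# 28 * 1.15 = 32.19999..., which rounds to the new default threshold of 32.
platform_threshold = (CLASSIC_THRESHOLD * MASS_RATIO).round

BASE_POINTS = 1_500_000
POINTS_PER_OVERAGE = 30_000 # lowered from Classic's 50_000 to offset larger masses

# New formula: base + (score - threshold) * per-overage cost.
def calculate_points(mass, threshold)
  BASE_POINTS + (mass - threshold) * POINTS_PER_OVERAGE
end

puts platform_threshold                       # => 32
puts calculate_points(82, platform_threshold) # => 3000000 (hypothetical mass of 82)
```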
1 parent 77f00d6 commit 7b4721e

File tree

2 files changed: +13 −4 lines changed


lib/cc/engine/analyzers/python/main.rb

Lines changed: 11 additions & 2 deletions
```diff
@@ -12,11 +12,20 @@ module Python
       class Main < CC::Engine::Analyzers::Base
         LANGUAGE = "python"
         DEFAULT_PATHS = ["**/*.py"]
-        DEFAULT_MASS_THRESHOLD = 40
-        BASE_POINTS = 1000
+        DEFAULT_MASS_THRESHOLD = 32
+        BASE_POINTS = 1_500_000
+        POINTS_PER_OVERAGE = 30_000
+
+        def calculate_points(issue)
+          BASE_POINTS + (overage(issue) * POINTS_PER_OVERAGE)
+        end

         private

+        def overage(issue)
+          issue.mass - mass_threshold
+        end
+
         def process_file(path)
           Node.new(::CC::Engine::Analyzers::Python::Parser.new(File.binread(path), path).parse.syntax_tree, path).format
         end
```

spec/cc/engine/analyzers/python/main_spec.rb

Lines changed: 2 additions & 2 deletions
```diff
@@ -27,7 +27,7 @@
           "path" => "foo.py",
           "lines" => { "begin" => 1, "end" => 1 },
         })
-        expect(json["remediation_points"]).to eq(54000)
+        expect(json["remediation_points"]).to eq(3_000_000)
         expect(json["other_locations"]).to eq([
           {"path" => "foo.py", "lines" => { "begin" => 2, "end" => 2} },
           {"path" => "foo.py", "lines" => { "begin" => 3, "end" => 3} }
@@ -54,7 +54,7 @@
           "path" => "foo.py",
           "lines" => { "begin" => 1, "end" => 1 },
         })
-        expect(json["remediation_points"]).to eq(18000)
+        expect(json["remediation_points"]).to eq(1_920_000)
         expect(json["other_locations"]).to eq([
           {"path" => "foo.py", "lines" => { "begin" => 2, "end" => 2} },
           {"path" => "foo.py", "lines" => { "begin" => 3, "end" => 3} }
```
