Skip to content

Tune remediation points to match Classic #61

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions lib/cc/engine/analyzers/analyzer_base.rb
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,10 @@ class Base
::RubyParser::SyntaxError,
]

DEFAULT_MASS_THRESHOLD = 28
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to clarify that this is intentional and vetted: previously these were different between PHP & Python/Javascript, and this change makes them the same, but keeps Ruby different.

The previous values were, I believe, picked after a lot of manual testing to get something that felt "right". What the "right" default threshold choice is for a language is dependent on how the parser the engines uses for that language ends up expressing ASTs. I.e. small parser differences can change how verbose a mass threshold ends up seeming.

Based on the classic code, it looks like this value was taken directly from there (and it didn't apply to Python in classic). I'd personally be pretty surprised if the AST results of all the languages was the same between classic & this engine: at the very least I'm pretty sure the JS parser behavior is different.

For that reason, I'm not sure the classic thresholds will make sense here, because they are likely to produce different results here than they did in classic. Do you have a sample corpus you've been using to verify same results for each of these languages between classic/these changes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wfleming Thanks for the tip about the thought behind the thresholds and parsing. I can investigate that.

I reference the classic rubric you mentioned in my commit, along with the Python source.

I've been keeping track of test repos here. (removed link)

In general, it seemed that on classic, duplication and complexity issues had much greater grade impact than they did on Platform.

I tested the current duplication setup on app - but could do some further testing with this exact setup on additional repos.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it's a question of effort vs accuracy, but if the intent is to get results similar to classic, we should probably have a small corpus of files for each language, get the overall mass of each file as calculated both by classic & by this engine, and compare those to see if the thresholds (and to a lesser extent per_point values) need to be scaled to get matching results.

BASE_POINTS = 1_500_000
POINTS_PER_OVERAGE = 50_000

def initialize(engine_config:)
@engine_config = engine_config
end
Expand All @@ -37,6 +41,10 @@ def base_points
self.class::BASE_POINTS
end

def points_per_overage
self.class::POINTS_PER_OVERAGE
end

private

attr_reader :engine_config
Expand Down
2 changes: 0 additions & 2 deletions lib/cc/engine/analyzers/javascript/main.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,8 +15,6 @@ class Main < CC::Engine::Analyzers::Base
"**/*.jsx"
]
LANGUAGE = "javascript"
DEFAULT_MASS_THRESHOLD = 40
BASE_POINTS = 3000

private

Expand Down
2 changes: 0 additions & 2 deletions lib/cc/engine/analyzers/php/main.rb
Original file line number Diff line number Diff line change
Expand Up @@ -14,8 +14,6 @@ class Main < CC::Engine::Analyzers::Base
"**/*.inc",
"**/*.module"
]
DEFAULT_MASS_THRESHOLD = 10
BASE_POINTS = 4_000

private

Expand Down
2 changes: 0 additions & 2 deletions lib/cc/engine/analyzers/python/main.rb
Original file line number Diff line number Diff line change
Expand Up @@ -12,8 +12,6 @@ module Python
class Main < CC::Engine::Analyzers::Base
LANGUAGE = "python"
DEFAULT_PATHS = ["**/*.py"]
DEFAULT_MASS_THRESHOLD = 40
BASE_POINTS = 1000

private

Expand Down
8 changes: 2 additions & 6 deletions lib/cc/engine/analyzers/reporter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -60,19 +60,15 @@ def flay

attr_reader :engine_config, :language_strategy, :io

def mass_threshold
@mass_threshold ||= language_strategy.mass_threshold
end

def new_violation(issue)
hashes = flay.hashes[issue.structural_hash]
Violation.new(language_strategy.base_points, issue, hashes)
Violation.new(language_strategy, issue, hashes)
end

def flay_options
{
diff: false,
mass: mass_threshold,
mass: language_strategy.mass_threshold,
summary: false,
verbose: false,
number: true,
Expand Down
2 changes: 1 addition & 1 deletion lib/cc/engine/analyzers/ruby/main.rb
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ class Main < CC::Engine::Analyzers::Base

]
DEFAULT_MASS_THRESHOLD = 18
BASE_POINTS = 10_000
POINTS_PER_OVERAGE = 100_000
TIMEOUT = 300

private
Expand Down
32 changes: 27 additions & 5 deletions lib/cc/engine/analyzers/violation.rb
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,10 @@ module Analyzers
class Violation
attr_reader :issue

def initialize(base_points, issue, hashes)
@base_points = base_points
DEFAULT_POINTS = 1_500_000

def initialize(language, issue, hashes)
@language = language
@issue = issue
@hashes = hashes
end
Expand All @@ -32,9 +34,29 @@ def report_name
"#{current_sexp.file}-#{current_sexp.line}"
end

def calculate_points
if issue.mass >= threshold
base_points + (overage * points_per_overage)
else
DEFAULT_POINTS
end
end

private

attr_reader :base_points, :hashes
attr_reader :language, :hashes

def base_points
language.base_points
end

def points_per_overage
language.points_per_overage
end

def threshold
language.mass_threshold
end

def current_sexp
@location ||= sorted_hashes.first
Expand All @@ -56,8 +78,8 @@ def name
end
end

def calculate_points
base_points * issue.mass
def overage
issue.mass - threshold
end

def format_location
Expand Down
2 changes: 1 addition & 1 deletion spec/cc/engine/analyzers/javascript/main_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
"path" => "foo.js",
"lines" => { "begin" => 1, "end" => 1 },
})
expect(json["remediation_points"]).to eq(297000)
expect(json["remediation_points"]).to eq(6400000)
expect(json["other_locations"]).to eq([
{"path" => "foo.js", "lines" => { "begin" => 2, "end" => 2} },
{"path" => "foo.js", "lines" => { "begin" => 3, "end" => 3} }
Expand Down
4 changes: 2 additions & 2 deletions spec/cc/engine/analyzers/php/main_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -41,7 +41,7 @@
"path" => "foo.php",
"lines" => { "begin" => 2, "end" => 6 },
})
expect(json["remediation_points"]).to eq(176000)
expect(json["remediation_points"]).to eq(3450000)
expect(json["other_locations"]).to eq([
{"path" => "foo.php", "lines" => { "begin" => 10, "end" => 14} },
])
Expand All @@ -61,7 +61,7 @@
end

def printed_issue
issue = {"type":"issue","check_name":"Identical code","description":"Similar code found in 1 other location","categories":["Duplication"],"location":{"path":"foo.php","lines":{"begin":2,"end":6}},"remediation_points":176000,"other_locations":[{"path":"foo.php","lines":{"begin":10,"end":14}}],"content":{"body": read_up}}
issue = {"type":"issue","check_name":"Identical code","description":"Similar code found in 1 other location","categories":["Duplication"],"location":{"path":"foo.php","lines":{"begin":2,"end":6}},"remediation_points":3450000,"other_locations":[{"path":"foo.php","lines":{"begin":10,"end":14}}],"content":{"body": read_up}}
issue.to_json + "\0\n"
end

Expand Down
2 changes: 1 addition & 1 deletion spec/cc/engine/analyzers/python/main_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
"path" => "foo.py",
"lines" => { "begin" => 1, "end" => 1 },
})
expect(json["remediation_points"]).to eq(54000)
expect(json["remediation_points"]).to eq(4000000)
expect(json["other_locations"]).to eq([
{"path" => "foo.py", "lines" => { "begin" => 2, "end" => 2} },
{"path" => "foo.py", "lines" => { "begin" => 3, "end" => 3} }
Expand Down
2 changes: 1 addition & 1 deletion spec/cc/engine/analyzers/ruby/main_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@
"path" => "foo.rb",
"lines" => { "begin" => 1, "end" => 5 },
})
expect(json["remediation_points"]).to eq(360000)
expect(json["remediation_points"]).to eq(3300000)
expect(json["other_locations"]).to eq([
{"path" => "foo.rb", "lines" => { "begin" => 9, "end" => 13} },
])
Expand Down
50 changes: 50 additions & 0 deletions spec/cc/engine/analyzers/violation_spec.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,50 @@
RSpec.describe CC::Engine::Analyzers::Violation, in_tmpdir: true do
include AnalyzerSpecHelpers

describe "#calculate_points" do
context "when issue mass exceeds threshold" do
it "calculates mass overage points" do
language = double(:language, base_points: 100, points_per_overage: 5, mass_threshold: 10)
issue = double(:issue, mass: 15)
hashes = []

expected_points = 100 + 5 * (issue.mass - 10)

violation = CC::Engine::Analyzers::Violation.new(language, issue, hashes)
points = violation.calculate_points

expect(points).to eq(expected_points)
end
end

context "when issue mass is less than threshold" do
it "uses default" do
language = double(:language, base_points: 100, points_per_overage: 5, mass_threshold: 18)
issue = double(:issue, mass: 15)
hashes = []

expected_points = CC::Engine::Analyzers::Violation::DEFAULT_POINTS

violation = CC::Engine::Analyzers::Violation.new(language, issue, hashes)
points = violation.calculate_points

expect(points).to eq(CC::Engine::Analyzers::Violation::DEFAULT_POINTS)
end
end

context "when issue mass equals threshold" do
it "calculates remediation points" do
language = double(:language, base_points: 100, points_per_overage: 5, mass_threshold: 18)
issue = double(:issue, mass: language.mass_threshold)
hashes = []

expected_points = 100 + 5 * (issue.mass - language.mass_threshold)

violation = CC::Engine::Analyzers::Violation.new(language, issue, hashes)
points = violation.calculate_points

expect(points).to eq(expected_points)
end
end
end
end