Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support CSV format for get_info and allow including any violation types #110

Merged
merged 8 commits into from
Oct 13, 2023
14 changes: 13 additions & 1 deletion lib/packs/cli.rb
Original file line number Diff line number Diff line change
Expand Up @@ -129,9 +129,21 @@ def update
end

desc 'get_info [ packs/my_pack packs/my_other_pack ]', 'Get info about size and violations for packs'
option :include_date, type: :boolean, default: false, aliases: :d, banner: "Include today's date as part of the data (useful to take snapshots over time)"
option :format, type: :string, default: 'detail', aliases: :f, banner: 'Specify the output format (detail, csv)'
option :types, type: :string, default: 'privacy,dependency', aliases: :t, banner: 'List of validation types to include (privacy,dependency,architecture)'
sig { params(pack_names: String).void }
def get_info(*pack_names)
Private.get_info(packs: parse_pack_names(pack_names))
selected_types = options[:types].to_s.downcase.split(',')
invalid_types = selected_types - POSIBLE_TYPES
raise StandardError, "Invalid type(s): #{invalid_types.join(', ')}. Possible types are: #{POSIBLE_TYPES.join(', ')}" unless invalid_types.empty?

Private.get_info(
packs: parse_pack_names(pack_names),
format: options[:format].to_sym,
types: selected_types.map(&:to_sym),
include_date: options[:include_date]
)
exit_successfully
end

Expand Down
110 changes: 82 additions & 28 deletions lib/packs/private.rb
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@
require 'packs/private/pack_relationship_analyzer'
require 'packs/private/interactive_cli'

require 'date'

module Packs
module Private
extend T::Sig
Expand Down Expand Up @@ -479,45 +481,97 @@ def self.packwerk_package_to_pack(package)
Packs.find(package.name)
end

sig { params(packs: T::Array[Packs::Pack]).void }
def self.get_info(packs: Packs.all)
inbound_violations = {}
outbound_violations = {}
sig do
params(
packs: T::Array[Packs::Pack],
format: Symbol,
types: T::Array[Symbol],
include_date: T::Boolean
).void
end
def self.get_info(packs: Packs.all, format: :detail, types: %i[privacy dependency architecture], include_date: false)
require 'csv' if format == :csv

today = Date.today.iso8601
violations = {
inbound: {},
outbound: {}
}

ParsePackwerk.all.each do |p|
p.violations.each do |violation|
outbound_violations[p.name] ||= []
outbound_violations[p.name] << violation
inbound_violations[violation.to_package_name] ||= []
inbound_violations[violation.to_package_name] << violation
violations[:outbound][p.name] ||= []
violations[:outbound][p.name] << violation
violations[:inbound][violation.to_package_name] ||= []
violations[:inbound][violation.to_package_name] << violation
end
end

all_inbound = T.let([], T::Array[ParsePackwerk::Violation])
all_outbound = T.let([], T::Array[ParsePackwerk::Violation])
all = {
inbound: T.let([], T::Array[ParsePackwerk::Violation]),
outbound: T.let([], T::Array[ParsePackwerk::Violation])
}
packs.each do |pack|
all_inbound += inbound_violations[pack.name] || []
all_outbound += outbound_violations[pack.name] || []
all[:inbound] += violations[:inbound][pack.name] || []
all[:outbound] += violations[:outbound][pack.name] || []
end

puts "There are #{all_inbound.select(&:privacy?).sum { |v| v.files.count }} total inbound privacy violations"
puts "There are #{all_inbound.select(&:dependency?).sum { |v| v.files.count }} total inbound dependency violations"
puts "There are #{all_outbound.select(&:privacy?).sum { |v| v.files.count }} total outbound privacy violations"
puts "There are #{all_outbound.select(&:dependency?).sum { |v| v.files.count }} total outbound dependency violations"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the original code with the 4 puts easier to read and understand. I understand that there's duplication. I would prefer a small amount of duplication over an abstraction layer that makes it harder for new-to-the-code-base developer to understand.

If you agree, that would mean that we don't need these above:

      violations = {
        inbound: {},
        outbound: {}
      }
      directions = violations.keys
      dir_x_types = directions.product(types)

And we could simply the printing below.

Otherwise, I'm happy to merge this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review!

The problem with that is that then we can't easily support new violation types. Or even allow the user to select which ones they want from the CLI.

In my case, I'm currently interested in architecture violations and not dependency nor privacy for the time being.

An intermediate solution would be to have duplicate lines one for inbound and one for outbound, but still loop through a list of violation types. Would you be ok with that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about now, @professor? I removed the directions and dir_x_types variables and the looping over directions. I can't really remove the looping over types as that's one of the main purposes of this PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update to the PR.

When I look at the original code, I would find something like [1] easier to read than [2]

# [1]
puts "There are #{all_inbound.select(&:privacy?).sum { |v| v.files.count }} total inbound privacy violations" if types.include?(:privacy)
puts "There are #{all_inbound.select(&:dependency?).sum { |v| v.files.count }} total inbound dependency violations" if types.include?(:dependency)
puts "There are #{all_inbound.select(&: architecture?).sum { |v| v.files.count }} total inbound dependency violations" if types.include?(: architecture)
# [2]
types.each do |type|
  key = ['inbound', type, 'violations'].join('_').to_sym
  row[key] = (violations[:inbound][pack.name] || []).select { _1.type.to_sym == type }.sum { |v| v.files.count }
end
...
types.each do |type|
    key = ['inbound', type, 'violations'].join('_').to_sym
    puts "There are #{row[key]} inbound #{type} violations"
end

[1] will allow the user to select which types they want displayed. Maybe I'm missing something else with the change. Thanks again @oboxodo for improving the code with this feature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback, @professor. Here are my thoughts:

Yes, [1] is more readable than [2]. But [1] is hardcoded to only support privacy, dependency, and architecture. IIRC I've seen visibility checks too in packwerk and then that code would need to be changed in order to support that (or any other future check FWIW). [2] has the benefit of not needing to be updated because it supports any checks that could come in the future (as long as they support the same API as the current ones which seems to be the goal).

On top of that, you're showing [2] as two separate loops but those loops are there for different reasons. The goal of the first loop is to collect the violations regardless of how they're going to be reported. I introduced that row hash as an intermediate format-independent way to collect the data, then there's separate code to interpret that row and convert it into either the csv format or the detail (original) format.

I've actually thought a follow-up PR could make this separation more explicit and then the current method would be left only with the pure data-collection code and then delegate outputting to different methods depending on the selected format.

I'd really like to avoid hardcoding anything related to the violation types. Thanks to this, this was the only required change in order to support architecture violations. Adding visibility or any other future extension should be as simple as that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for the record... all of that reasoning is in line with the work I did on my first PR to this project in #104 where I also removed some hardcoded call to dependency? with a more dynamic check, allowing support for both privacy and architecture.

case format
when :csv
headers = ['Date', 'Pack name', 'Owned by', 'Size', 'Public API']
headers.delete('Date') unless include_date
types.each do |type|
headers << "Inbound #{type} violations"
headers << "Outbound #{type} violations"
end
puts CSV.generate_line(headers)
else # :detail
puts "Date: #{today}" if include_date
types.each do |type|
inbound_count = all[:inbound].select { _1.type.to_sym == type }.sum { |v| v.files.count }
outbound_count = all[:outbound].select { _1.type.to_sym == type }.sum { |v| v.files.count }
puts "There are #{inbound_count} total inbound #{type} violations"
puts "There are #{outbound_count} total outbound #{type} violations"
end
end

packs.sort_by { |p| -p.relative_path.glob('**/*.rb').count }.each do |pack|
puts "\n=========== Info about: #{pack.name}"

owner = CodeOwnership.for_package(pack)
puts "Owned by: #{owner.nil? ? 'No one' : owner.name}"
puts "Size: #{pack.relative_path.glob('**/*.rb').count} ruby files"
puts "Public API: #{pack.relative_path.join('app/public')}"

inbound_for_pack = inbound_violations[pack.name] || []
outbound_for_pack = outbound_violations[pack.name] || []
puts "There are #{inbound_for_pack.select(&:privacy?).sum { |v| v.files.count }} inbound privacy violations"
puts "There are #{inbound_for_pack.flatten.select(&:dependency?).sum { |v| v.files.count }} inbound dependency violations"
puts "There are #{outbound_for_pack.select(&:privacy?).sum { |v| v.files.count }} outbound privacy violations"
puts "There are #{outbound_for_pack.flatten.select(&:dependency?).sum { |v| v.files.count }} outbound dependency violations"

row = {
date: today,
pack_name: pack.name,
owner: owner.nil? ? 'No one' : owner.name,
size: pack.relative_path.glob('**/*.rb').count,
public_api: pack.relative_path.join('app/public')
}

row.delete(:date) unless include_date

types.each do |type|
key = ['inbound', type, 'violations'].join('_').to_sym
row[key] = (violations[:inbound][pack.name] || []).select { _1.type.to_sym == type }.sum { |v| v.files.count }
key = ['outbound', type, 'violations'].join('_').to_sym
row[key] = (violations[:outbound][pack.name] || []).select { _1.type.to_sym == type }.sum { |v| v.files.count }
end

case format
when :csv
puts CSV.generate_line(row.values)
else # :detail
puts "\n=========== Info about: #{row[:pack_name]}"

puts "Date: #{row[:date]}" if include_date
puts "Owned by: #{row[:owner]}"
puts "Size: #{row[:size]} ruby files"
puts "Public API: #{row[:public_api]}"
types.each do |type|
key = ['inbound', type, 'violations'].join('_').to_sym
puts "There are #{row[key]} inbound #{type} violations"
key = ['outbound', type, 'violations'].join('_').to_sym
puts "There are #{row[key]} outbound #{type} violations"
end
end
end
end

Expand Down
16 changes: 15 additions & 1 deletion lib/packs/private/interactive_cli/use_cases/get_info.rb
Original file line number Diff line number Diff line change
Expand Up @@ -26,9 +26,23 @@ def perform!(prompt)
selected_packs = PackSelector.single_or_all_pack_multi_select(prompt, question_text: 'What pack(s) would you like info on?')
end

format = prompt.select('What output format do you want?', %w[Detail CSV])

types = prompt.multi_select(
'What violation types do you want stats for?',
%w[Privacy Dependency Architecture]
)

include_date = !prompt.no?('Should the current date be included in the report?')

puts "You've selected #{selected_packs.count} packs. Wow! Here's all the info."

Private.get_info(packs: selected_packs)
Private.get_info(
packs: selected_packs,
format: format.downcase.to_sym,
types: types.map(&:downcase).map(&:to_sym),
include_date: include_date
)
end
end
end
Expand Down