-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: x/tools/go/analysis: change singlechecker/multichecker to allow customization of package loading #53215
Comments
We can forward information from *packages.Modules to the *Pass when a flag is set on the Analyzer (like RunDespiteErrors). This would still be per
I don't think I know what this means. Can you be more specific about the API?
It is not yet clear to me whether this will fit into the Also the checkers cache on We can potentially expose a different part of
2cents: I am not seeing an advantage of doing Option 2 over Option 1 yet. One could just pass Load as parameter. |
Change https://go.dev/cl/410254 mentions this issue: |
The kind of initialization I want to do should run outside the analysis - for example, accessing the network to load extra info such as vulndb queries. We don't want to do that in the middle of some analyzer step is running in arbitrary time. We want the network access to run in predictable manner. https://go-review.googlesource.com/c/tools/+/410125/1 demonstrates a simple example analyzer that controls package loading and injects custom initialization logic using a proposed api (option 1) At this moment,
Smaller API surface - Instead of two ways ( |
I'm not convinced that either of these options is actually the best solution to the specific problem at hand (that we discussed in person), which is that you have an analysis (similar to https://go-review.googlesource.com/c/tools/+/408715) you wish to apply to all the packages of a program in topological order, but your analysis also needs global information about the program (a set of vulndb entries) that forces you to call packages.Load before doing the analysis. I think this indicates that the analysis simply isn't an instance of the kind of modular analysis that the go/analysis framework was designed to support. It would be relatively straightforward to modify the analysis to perform its own parallel depth-first traversal over the go/packages.Package graph. It wouldn't need to use Facts because it could assume a single address space. (Indeed, I developed the sketch in the CL above using go/packages directly and then converted it to go/analysis once it was working. It wasn't a difficult change and it would be easy to undo.) An alternative approach would be to design an interface to the vulndb that allows each go/analysis Pass to query the database piecemeal as it encounters the need. This would of course make more calls to the database (one per package, or perhaps as few as one per package on the critical path) but they could be cached in the local file system to avoid repeat calls from developer machines. |
Module approach in single address space is still beneficial because it allows incremental updates. For example, when a user is editing a package, we don't need to recompute the whole program again but can use the facts from previous analysis for unchanged packages. We can invent our own wheel to allow such incremental analysis but writing this in a custom way would make it harder to be adopted by existing linters or gopls.
This is what I want to avoid. In gopls, for example, diagnostics and analysis run can occur at arbitrary time while users edits a file. I expect the vulndb query and local copy update to be completely decoupled from the actual analysis (either updated periodically, or only when go.mod change is detected) |
@adonovan and I had a chat and we decided to explore a different approach https://go-review.googlesource.com/c/tools/+/410126 is an attempt to demonstrate this approach. I think we can optimize the local source format further. |
Analyzer-specific flags that specify input files or state to sideload (e.g. Still this doesn't solve the original problem -
I agree that it's not ideal to expose the dependency on go/packages from singlechecker/multichecker's APIs, but given that singlechecker/multichecker didn't work without go/packages API for a while, I wonder if it's time to embrace go/packages in the singlechecker/multichecker (note: this issue is about singlechecker/multichecker APIs, not |
I think it would be reasonable to expose an API for running analyzers on a whole program loaded using go/packages, that allows the client to parse flags, load the program, and analyze it under its control, and to process the results without necessarily printing them. The underlying code would be shared with singlechecker and multichecker. Some care is required to ensure that the program is loaded using a configuration that is acceptable to the analysis that follows. And the Analyzer / Pass API should never depend on go/packages. It would also be ok to expose go/packages configuration options through the single/multichecker CLI, such as the -test flag you added today. But single/multichecker are really not intended to be used as anything other than the main function of a standalone program. |
+1 to everything Alan said (except calling it "whole program"). I think it is reasonable to expose what is effectively a lower level API for |
FWIW a hack you could do shove this into a modular analyzer is to split into 2 processes that:
This should force the modular analysis machinery to discard all of the previous results. The resulting analyzer would be tricky to run from command line but this may not be an issue for you. This does not solve custom loading. |
I think exposing part of #30231 We've seen many lint-like tools came up with their own solutions and copying of checker package internals (which I think is totally fine, but would be nice if we can help people avoiding such copying when possible.) Re: #53215 (comment) That is what https://go-review.googlesource.com/c/tools/+/410126 is already doing - except |
What is the current proposal? Thanks. |
I suspect both of these should be deprecated in favor of a better package. |
Change https://go.dev/cl/411907 mentions this issue: |
The {single,multi}checker packages provide the main function for a complete application, as a black box. Many users want the ability to customize the analyzer behavior with additional logic, as described in the attached issues. This change creates a new package, go/analysis/checker, that exposes an Analyze pure function---one that avoids global flags, os.Exit, logging, profiling, and other side effects---that runs a set of analyzers on a set of packages loaded (by the client) using go/packages, and presents the graph of results in a form that allows postprocessing. This is just a sketch. API feedback welcome. DO NOT SUBMIT Updates golang/go#30231 Updates golang/go#30219 Updates golang/go#31007 Updates golang/go#31897 Updates golang/go#50265 Updates golang/go#53215 Updates golang/go#53336 Change-Id: I745d319a587dca506564a4624b52a7f1eb5f4751
A sketch of a new API of components for building go/packages-based analysis drivers can be found in https://go.dev/cl/411907. This CL contains an Analyze function that takes an extensible Input struct, initially listing packages and analyzers, and returns an extensible Output struct, initially containing the roots of the action graph. An action is a step of work done by the checker, identified by the pair (analyzer, package). The rest of the action graph can be found by chasing pointers, analogous to go/packages. Each action exposes its analyzer, package, error, result, diagnostics, and facts. The API in that CL is still not sufficient to implement all of multichecker. We still need functions for:
(It is possible to test most analyzers today with analysistest, but it too is a monolith that could be broken into reusable chunks of functionality--to load packages, run the checkers, verify the expectations, verify the suggested fixes--while also allowing ad hoc test logic to access all of the result data structures that are currently encapsulated.) I have not sketched the API for these four features yet. My initial instinct was to publish a set of functions similar in form to the existing ones, but @hyangah raised the possibility that several of these operations might usefully be considered instances of a general post-processing interface, perhaps similar to the one in golangci-lint, which is used for a range of tasks including: filtering diagnostics:
transforming diagnostics:
and side effects:
It's not obvious to me that we need the full generality of another extensible plug-in mechanism, but clearly the question warrants more thought. At this stage, the main questions to decide are:
|
FWIW, I recently stumbled on what I believe is a good use case of this: in a server environment, run vet checkers and store the results in a database. I believe this could be accomplished using this driver: use packages.Load to load the packages, run vet analyzers on it, and in the post-processing step save the results to the db. |
That does seem like a totally reasonable program to want to write, but the way I would write it today without the checker API is by fork+execing the multichecker executable and saving its JSON output. The overheads of process startup compared to everything else are negligible, and there are isolation benefits to using multiple processes. |
This is a fair point. But it does impose some complexity on users. Most likely in the form of bundling an extra executable. Not insurmountable, but comparatively inconvenient. |
The {single,multi}checker packages provide the main function for a complete application, as a black box. Many users want the ability to customize the analyzer behavior with additional logic, as described in the attached issues. This change creates a new package, go/analysis/checker, that exposes an Analyze pure function---one that avoids global flags, os.Exit, logging, profiling, and other side effects---that runs a set of analyzers on a set of packages loaded (by the client) using go/packages, and presents the graph of results in a form that allows postprocessing. This is just a sketch. API feedback welcome. DO NOT SUBMIT Updates golang/go#30231 Updates golang/go#30219 Updates golang/go#31007 Updates golang/go#31897 Updates golang/go#50265 Updates golang/go#53215 Updates golang/go#53336 Change-Id: I745d319a587dca506564a4624b52a7f1eb5f4751
The {single,multi}checker packages provide the main function for a complete application, as a black box. Many users want the ability to customize the analyzer behavior with additional logic, as described in the attached issues. This change creates a new package, go/analysis/checker, that exposes an Analyze pure function---one that avoids global flags, os.Exit, logging, profiling, and other side effects---that runs a set of analyzers on a set of packages loaded (by the client) using go/packages, and presents the graph of results in a form that allows postprocessing. This is just a sketch. API feedback welcome. DO NOT SUBMIT Updates golang/go#30231 Updates golang/go#30219 Updates golang/go#31007 Updates golang/go#31897 Updates golang/go#50265 Updates golang/go#53215 Updates golang/go#53336 Change-Id: I745d319a587dca506564a4624b52a7f1eb5f4751
I'm confused as to what happened here. This was a proposal that had not yet been accepted, and it got closed by https://go.dev/cl/411907. Did this not need to be a proposal? Or was the idea just replaced by a different one? Thanks. |
@ianlancetaylor the idea was superseded by #61324. |
Thanks. |
Background
The existing singlechecker.Main and multichecker.Main are simple:
they take package patterns string args (e.g. ./...), loaded them with a hard-coded
go/packages.Config
, and applied analyzers on the loaded packages. They also hand flags common across many analyzers, and handle error parsing, etc.They are convenient, but not flexible enough to support analyzers that need special setup, more involved go/packages.Config (e.g. excluding tests, loading modules, applying overlays). An example analyzer I want to write:
It needs to initialize its state based on the module information loadable with go/packages.NeedModules load mode. Current APIs don't use NeedModules nor expose the loaded packages.
The initialization needs to access the loaded packages - e.g. this loads vulnerability info from a remote DB, and the loaded go/packages.Package and their modules are needed to retrieve only relevant info.
Since
singlechecker.Main
andmultichecker.Main
handle flag parsing, attempting to parse flags or command arguments before them will clobber their flag parsing logic.Potential workaround is either to load the packages twice or to write the analyzer in a way that the very first call of its Run triggers the remaining initialization work. , which isn't ideal.
I propose we introduce a new API that provides hooks and potential customization options. With the change users can
Option 1: A new
Run
that takes aConfig
parameter:Option 2: A package-level global variable for stubbing:
We can consider a package-level global
Load
orConfig
variable that affects the existingMain
's functionality.Since
singlechecker
andmultichecker
are used as the 'main' of an analysis tool and already made use of many process-wide global variables, I expect the introduction of a global, exported variable is not too bad.cc @golang/tools-team @adonovan @timothy-king
The text was updated successfully, but these errors were encountered: