-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: Go 2: parameter tags for static analysis tools #24889
Comments
Would
be legal? What about the below?
A natural generalization of this and struct tags would be to also allow any variable or constant declaration to be tagged. Was this considered? A perhaps more useful generalization is tags on any type, like
Could these coexist with some rule about how to combine the tags from the type tag and the parameter tag? If they cannot coexist it doesn't let you annotate things like "don't copy values of this type" or "the zero value of this type shouldn't be used" without copying the tag to every function accepting values of that type. Sometimes things are just properties of the type and exposing that to static analyzers is useful. |
Yes.
Yes.
Yes. I don't have any use cases for that, so I didn't suggest it. Note that in C a variable with a Tags on any type likely makes sense given this proposal, but we need to consider not just what it looks like for simple types like |
I like the idea of making tags usable in more places, and I concur that this will be great for building static analysis tools. But, I don't think Go should lean further into purely unstructured strings as tags. They are too prone to typos, and are difficult to namespace. Relatedly, they are difficult to version (imagine upgrading from a threadsafety v1 package to a v2 package). Permitting string constants from imported packages is a good step away from this, but I think we should go further. #23637 has a proposal for more structured tags. I'd prefer to see something like that. One could imagine that in addition to allowing structs in the list of tags, we could have an additional rule like "An entry in a tag list may reference a top-level variable declaration either in the current package, or another package. This variable declaration must have an initializer expression, which must be itself valid as a tag expression". This would allow your One could also imagine that we could have a further additional rule like "An entry in a tag list may be a function call, where each argument to the function call must be valid as a tag expression. The body of the called function must consist of only a single return expression, which must be valid as a tag expression, with the exception that it may refer to arguments to the function as though they were constant expressions". This would allow writing a tag like |
Or on second thought, it'd probably be better to just leave out those latter two additional complications. Writing (And if we reeally wanted to fix that former, then perhaps a better solution would be to have the lexer insert the "{}" when it sees a "," or a "]") |
Thanks. I want to clearly separate this proposal, which is about permitting tags on parameter types, from another proposal about changing the syntax of tags. Sure, if we change the syntax of tags on struct field types then this proposal would adopt the new syntax too. But those are two different ideas that should be considered separately. |
Fair enough. |
I didn't mean to imply anything about immutable variables or bindings with the variable case. I was mostly interested in why the line would be drawn there. The only use case for variables that I can think of would be to annotate that a global variable isn't changed after package init so that that property can be linted. There might be some other interesting uses for verifying invariants within a func. It's probably worth skipping just on complexity grounds (for the language and for the reader), though. ———
Either the struct/interface or the tag can be large so whether you choose struct/interface tag def or struct/interface def tag, there are always going to be extreme cases where one or both are large. If it's after, it's going to start on the same line as the
and
are the same in the sense that you could replace S with its definition in that example without juggling. Regularity seems preferable. ——— If types and struct fields/func params can both have tags there needs to be a rule to deal with collisions. I think the simplest would be to treat the tagspace like a map: if the same key is used again, its value wins. That way, in
the parameter Allowing them on both also allows repeated type+tag combos to be named like
which could help documenting some of the more abstruse ——— While I agree that #23637 is the right place to talk about that proposal, I'd note that allowing const strings in addition to literals as proposed here would work better with that one. If you wanted to extend an existing pre-defined tag, you'd need to copy the original definition inline or do
unless you allow
(which isn't going to win any syntax awards). |
While I appreciate the value of tagged declarations (I was even the original proposer of the feature, which was an expansion of a protocol-buffer-supporting feature of Sawzall), I find them very ugly in practice. Putting them on every declaration will make code messy and unreadable. Worse, each new static tool will introduce new annotations, producing an ever growing list of quoted strings on each declaration. I know it's an important feature for structs, and I appreciate the need for them there. In an ideal world, though, I'd find another way. They have become a blight on the page. Please let's not make it worse. |
@robpike currently each new static analysis tool that requires tagging has to introduce a special comment which isn't a lot better and far more ad hoc. Part of the problem is that the tags are strings and part is that they have to be declared each time. With the tweaks I laid out in the post above yours, common patterns could be named via type alias (at least within a package) to reduce repetition and provide a single point for documentation. If this were accepted along with something like #23637 (comment) there's still a change of horrible proliferation but it would be more manageable since the tags would at least be structured, versioned, easily moved around (via type aliases), and they'd autolink to documentation in go doc. There could even (much later) be an x/ repo for collecting tags that many tools agree upon to reduce their number and have a shared single point of truth. It's unpleasant if there are a lot of annotations. Right now there's as much chance for an over proliferation of annotations but when you do need them it's a pain to manage them because they're magic comments and magic strings in struct tags. If there's language support for them, it's at least easier to deal with. |
While I would like to see Go's type system evolve to make Gophers feel safer about their code, I think introducing immutability to Go with src []byte `qual: "const"` instead of with a keyword like src []byte const is worse than having no immutability at all. The former feels like a second-class citizen and immutability should be a first-class citizen. Also, as @robpike said, it doesn't look good on the page. Regardless, I'm glad to see discussion about immutability. |
Annotations are a way to give ad hoc second class citizens uniform first class support. First class immutability would be great but that should be discussed elsewhere. I don't believe @ianlancetaylor's example meant to say "this []byte is immutable": it meant "I promise not to mutate this mutable []byte". They're related but distinct concepts. |
I like the idea of making annotations extensible, and there is certainly precedent for tags intended for external tools. However, in the cases where those have been tried, they tend to converge to standardized annotations anyway. For example, a subset of PEP 3017 annotations in Python were standardized in PEP 484, although to my knowledge they are still only checked by third-party tools such as Generalized syntax may aid in experimentation and extension, but it does not remove the need to standardize the annotations that are found to be useful in practice. |
I figure that a (somewhat dubious) proxy for the potential usage of any extension to struct tags is how many structs have tags currently. To gather data, I wrote a quick counter: https://github.com/jimmyfrasche/tag-struct-ratio It goes through all the .go files in import paths you give it and counts instances of For
That's what I'd expect and roughly the same ratio as my code. For
which was far more than I would have thought. |
@jimmyfrasche That's a nice analysis, but I'm not sure it's applicable. The main use-case for struct tags today is for run-time reflection ( In contrast, the tags Ian is proposing here are explicitly for compile-time analysis: those may or may not be common (depending on what they indicate), but some, such as lock annotations, are likely to be rare. (We do have a few compile-time annotations today as |
Indeed. It's not really an analysis—just a data point. |
I'm withdrawing this proposal. |
A frequent request for Go is a
const
type qualifier along the lines of C (e.g., #20443, #22876). C also has additional type qualifiersvolatile
,restrict
, and_Atomic
, all of which are potentially meaningful for Go.clang supports thread safety analysis for C++ (https://clang.llvm.org/docs/ThreadSafetyAnalysis.html) based on annotations in the source code. This would clearly be applicable to Go code.
The Go language currently has only one type qualifier (channel direction) and has historically resisted adding more. In general, Go encourages programming with code, rather than programming with types. It's not always clear which type qualifiers are useful. It's hard to add new ones.
The main purpose of a type qualifier is to document and enforce function behavior. Adding a
const
qualifier to a function pointer argument informs the reader of the code that the function does not change the pointed-to value using that pointer. The compiler gives an error for any attempt to do so. It may seem that it would also provide an opportunity for compiler optimization, but generally such optimizations are fairly limited. This is because putting aconst
qualifier on one pointer does not mean that the value is not changed in some other way. While there are some opportunities for optimization based on type qualifiers, generally their effect on generated code is minimal.Instead, type qualifiers are best thought of as a kind of compiler-enforced documentation for how the program behaves. The
const
qualifier is useful information for the code reader: this function is not going to change this value.Go is designed to have good support for static analysis tools. Those tools could be used to enforce any set of desired rules. That suggests that rather than adding type qualifiers to the language itself, we instead provide a way to annotate program source code so that separate analysis tools can identify desired type qualifiers and enforce them. Each organization could decide which set of tools it wishes to run.
It follows that we should consider a common mechanism for annotating program source code. While the exact details of each annotation can be left to the tools, there should be a generally agreed convention on how those annotations will be added, so that different tools can cooperate.
We already have a limited set of annotations, implemented by the compiler, in the form of //go: comments serving as compiler directives (see https://golang.org/cmd/compile). Using comments works nicely in that tools that don't care about the particular annotation will quite naturally ignore it. However, it is much less nice when applied to individual function arguments: the natural place to add the comments is within the function signature, but then placement becomes critical and the annotations interfere with reading of the code. Or so it seems to me.
I propose that we instead extend the existing concept of struct tags to permit them to be used in more places. Specifically, permit tags to be placed after the type in a function signature, including the signatures used in interface types.
Thread safety analysis can generally use the existing struct tags:
Function level qualifiers can appear attached to any parameter.
or perhaps we can also permit tags on the signature as a whole.
I propose further that we permit all tags to be string constant expressions, not just string literals. This will permit defining standard tags in packages that may be imported.
These tags will not affect compilation at all, except that they will be placed into reflection information (we'll add a
Tag
method to thereflect.Type
interface) and, naturally, into the export data. Otherwise the new tags will only be used by static analysis tools, possibly including vet.This will provide us with a flexible framework for letting different groups define their own analysis, without, I hope, going too far in the direction of programming by types.
The text was updated successfully, but these errors were encountered: