Skip to content

Conversation

@Rexicon226
Copy link
Contributor

@Rexicon226 Rexicon226 commented Dec 22, 2023

this fixes #2984

this adds a is_autotranslated flag to std.zig.Ast as well as std.zig.Parser. my solution is to create a keyword that only exists in the parser, which when seen updates the flag. the main error logic is handled inside of AstGen, so that if need be, this can be used for other things that might be constrained by an auto generated file. i intentionally do not add the token to the node list for 2 important reasons. first, it has no actual meaning in the code and is simply a "metadata" tag. second, it would mess up a lot of logic that relies on optimizing empty structs, and other checks. it simply makes no sense to spend the time on processing something that gives nothing.

note that the wasm blob needed to be updated because of this syntax affecting stage1

TODO:

  • Fix formatter to correctly ignore the token.
  • Look into doc-gen and why it uses c pointers.
  • Look into some stdlib c pointers to decide if they are truly needed.
outdated, initial message, to give context to mlugg's reply

first,
the solution is a bit strange, but i believe it's the cleanest one. the main problem i faced when implementing this was that making this a "full" token, and appending it to the list would trip a lot of different decls.len == 0 type logic. could i have created special cases for those comparisons, yes. but that would have been a bit unclean and required more code change. i dont see this as a real token, i just think of it as a file descriptor.

second,
you may notice that this error is turned off inside of test blocks. this was not a requirement of the issue, however i see no other way around this. either we could update 500 files with the token, or just let c pointers exist inside of test blocks. there is no other usage for c pointers in my opinion other than for testing.

third,
all of the stdlib changes you see, i simply moved [*c] to [*]. this is possibly not the correct solution, however i just dont know that specific code to tell. please tell me if there is something else that should be used.

i believe this pr is a move forward in the right direction. it's a big breaking change, and is something that should be thought through very carefully, so take your time.

any feedback is very welcome, i only felt confident enough to pr this once the testing suite completed on my machine completed.

(you may notice my shift key is a bit dead)

@mlugg
Copy link
Member

mlugg commented Dec 22, 2023

I haven't looked through the diff properly, but here are some comments at a glance.

  • The error here has been implemented in the parser. That's not ideal, because it blocks later compilation errors. Instead, this check should live in AstGen. This allows other AstGen errors to be caught at the same time, and in future, may allow some Sema errors to be identified too. (The presence of the keyword should simply set a flag on Ast.)

  • Turning this error off within test blocks does not make sense. Tests are completely normal Zig code, and should not be treated any differently - the reasons we wish to avoid C pointers in general Zig code still apply just as well within test blocks.

    either we could update 500 files with the token, or just let c pointers exist inside of test blocks.

    Then we should update the 500 files with the token. Just because a task is tedious doesn't mean we should just implement a strange rule in the language to work around having to do it.

    there is no other usage for c pointers in my opinion other than for testing.

    To reiterate: C pointers have no more use in tests than they do anywhere else in the language. Any usage of them in user code (aside from Zig behavior tests which are testing C pointers) is incorrect.

  • Regarding the following point:

    all of the stdlib changes you see, i simply moved [*c] to [*]. this is possibly not the correct solution [...]

    Please do not make blanket possibly-incorrect standard library changes like this. Every use of [*c] in the standard library must be correctly updated (or otherwise eliminated); having incorrect code in the standard library harms maintainability and reliability, and potentially makes things harder for future contributors. We particularly want to ensure the correctness of code at the ABI boundary, which is also where we are most likely to see current uses of [*c].

    [...] however i just dont know that specific code to tell. please tell me if there is something else that should be used.

    There is no magic shortcut here; nobody is familiar with the entire standard library. For each instance of [*c], you must look at the relevant code and/or documentation, and determine what the correct pointer type is. As above, tedium is no reason to avoid correctness - and believe me, I'm familiar with tedious manual migrations for language changes.

    This will block merge.

@Rexicon226
Copy link
Contributor Author

Thank you for the comments. I have no strong opinions on any of the methods I used, so I would be happy to move this to AstGen (will do shortly).

Then we should update the 500 files with the token. Just because a task is tedious doesn't mean we should just implement a strange rule in the language to work around having to do it.

Understood, will do so. My concern comes from "outsiders" looking at the stdlib as a learning material and seeing this token strewn around everywhere, and make incorrect assumptions, whatever they may be. But again, no strong opinion, if you believe it best to simply update all of the affected files, that is what I'll do.

Please do not make blanket possibly-incorrect standard library changes like this.

Apologies, I just realized that I never marked this PR a draft. I meant for this to be the first stage to get other's opinions on my approach. I plan on making sure it's all correct!

@Rexicon226 Rexicon226 marked this pull request as draft December 22, 2023 23:06
@Rexicon226 Rexicon226 force-pushed the kill_c_pointer branch 2 times, most recently from 3a8bde4 to 50d9903 Compare December 25, 2023 13:52
@Rexicon226 Rexicon226 marked this pull request as ready for review January 2, 2024 04:29
@@ -1,3 +1,5 @@
thisfileisautotranslatedfromc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this has been discussed already, but for the uncommon correct uses of C pointers in std (testing functionality that interacts with them), how about using @Type to programmatically construct C pointer types? Since the check is implemented in AstGen, I believe that should still work, and it would avoid having to infect the entire file with the autotranslated status.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was discussed, and that was the consensus iirc.
If you look at my checklist I still have to look through the stdlib uses of the keyword and see if they are truly needed. I will construct with with @Type if they actually are needed 👍

@@ -1,3 +1,5 @@
thisfileisautotranslatedfromc;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is not auto translated from C.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I know. Please note that I still need to go through the instances of thisfileisautotranslatedfromc; that needed to be added, and determine which ones are actually needed. As I said in Ian's review, I will construct them from @Type if they are truly needed.

@andrewrk
Copy link
Member

andrewrk commented May 9, 2024

I'm sorry, I didn't review this in time, and now it has bitrotted. Furthermore, so many pull requests have stacked up that I can't keep up and I am therefore declaring Pull Request Bankruptcy and closing old PRs that now have conflicts with master branch.

If you want to reroll, you are by all means welcome to revisit this changeset with respect to the current state of master branch, and there's a decent chance your patch will be reviewed the second time around.

Either way, I'm closing this now, otherwise the PR queue will continue to grow indefinitely.

@andrewrk andrewrk closed this May 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

make the C pointer type always a compile error unless the file declares that it is generated code

4 participants