-
Notifications
You must be signed in to change notification settings - Fork 580
Preparation Patches for Refcounted Stack Patch. #20865
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The R modifier to the stack debugging switch -Ds already indicates if an SV's refcount is > 1 or SvTEMP is set (T), or the SV is on the temps stack without SvTEMP set (t), e.g.: => IV(101), <2>IV(102), <T>IV(103) <3t>IV(104) With this commit, it displays SVs with the SvPADTMP flag set as <P>, e.g. => <P>IV(101), <2P>IV(102), <PT>IV(103)
The tryAMAGICftest_MG() macro was doing two checks: 1) seeing whether the filetest operator's arg (*PL_stack_sp) looks to have magic or be a reference, 2) and if the op has children (which will have pushed an arg, unlike (-X _), If both are true, then do full-on magic and/or overload processing. The problem with this is that it checks the arg *before* checking whether there's even an arg. Thus in the case of (-X _), it is actually examining a random SV on the stack (or in the case of nothing on the stack, examining the PL_sv_undef pointer always stored at PL_stack_base[0] as a guard.) It turns out this was harmless - the test for (1) will examine a random (but real) SV and get garbage results, but then the 2nd test will fail anyway, so overloading won't be called. So the fix is to swap the (1) and (2) test order. In addition, I changed the 'has an argument' test from OPf_KIDS to !OPf_REF. These should be mutually exclusive, but the OPf_REF flag formally indicates (-X _), i.e. that no arg has been pushed on the stack. Whether the op has children or not could potentially change in the future, independent of whether it's the (-X _) form. So overall this commit makes no visible functional difference, but may make the code more robust against future changes.
This field is used on DEBUG_LEAKING_SCALARS builds to track the file where the SV was allocated. When freeing the SV, the string was freed but the pointer was left pointing at the freed string. So NULL it out.
A few places were calling pp_pushmark(), when they should have been just directly doing a PUSHMARK()
Perl_leave_adjust_stacks(), which is called by all the scope-exiting ops (pp_leave, pp_leavesub etc), handles scalar context by updating the list of SVs on the stack to be returned to be just the one-item list at the top of the stack (all other items on the stack being discarded). For the special case of scalar context and no items on the return list, it instead puts &PL_sv_undef at the lowest point on the stack and skips most of the rest of the function. The rest of the function includes things like shuffling down any args to be returned, which obliterates any other stuff on that stack that needs discarding. For example in for (qw(a b c)) { ....; return qw(x y z); ... } the stack contains a b c x y z; the a,b,c need discarding and the x,y,z shifting down. This commit removes the 'skip rest' special behaviour, and makes scalar return of an empty list behave the same as a scalar return of a non-empty list. So it pushes &PL_sv_undef at the top of the stack, then goes through the normal "shift and copy the top arg down the stack" code path. This is slightly less efficient, but this is relatively rare condition, and will make converting Perl_leave_adjust_stacks() to handle a reference-counted stack easier.
In something like for $package_var (....) { ... } or more experimentally, for \$lvref (....) { ... } when entering the loop in pp_enteriter, perl would pop the GV/LVREF off the stack, but didn't bump its refcount. Thus it was possible (if unlikely) that it could be freed during the loop. In particular this crashed: $f = "foo"; for ${*$f} (1,2) { delete $main::{$f}; # delete the glob *foo ...; } This will become more serious when the stack becomes refcounted, as popping something off the stack will trigger a refcount decrement on it and thus a possible immediate free of the GV. This commit future-proofs for loops against this by ensuring that the refcount of the SV referred to by cx->blk_loop.itervar_u.svp is appropriately bumped up / down on entering / exiting the loop.
This function is a bit of mess. It gets the label string either from the op, or for OPf_STACKED, from the SV at the top of the stack. This commit reduces the amount of OPf_STACKED tests and repeated stack accesses for the label SV (even after it's notionally been popped off the stack!) By simplifying the code, it will also make it easier to make the changes necessary for a reference-counted stack.
'av' is in scope both for the whole function, and for a small block within that function. Rename the inner variable to av0 to avoid confusion.
This test looks for a 'Use of freed value in iteration' warning, which will soon disappear when this branch makes the stack reference counted. Make the test more modifiable so that it can be made conditional on build options.
These two functions do a slightly odd thing (which has been present since 5.000) when calling out to a CV: they half fake up an OP_ENTERSUB, then call pp_entersub() directly, and only then if it returns a non-null PL_op value, execute the rest of the ops of the sub within a CALLRUNOPS() loop. I can't find any particular reason for this. I guess it might make calling XS subs infinitesimally faster by not have to invoke the runops loop when only a single op is executed (the entersub), but hardly seems worth the effort. Conversely, eliminating this special-case makes the code cleaner, and it will allow the workarounds planned to be added shortly (to the runops loops for reference-counted stacks) to work uniformly. Without this commit, pp_entersub() would get called before runops() has had a chance to fix up the stack if necessary. So this commit *fully* populates the fake OP_ENTERSUB (including type and pp address) then causes pp_entersub to be invoked implicitly from the runops loop rather than explicitly.
Like the previous commit which did it for amagic_call() and call_sv(), this commit makes executing the faked-up OP_ENTEREVAL be executed as part of the runops loop rather than as a separate call. This is to allow shortly fixing up for a reference-counted stack. (CALLRUNOPS() will reify the stack if necessary, while the raw call to pp_entereval() won't fix up the stack unless its part of the runops loop too.) However, this is a bit more complex than call_sv() etc in that there is a good reason for calling pp_entereval() separately. The faked up OP_ENTEREVAL has its op_next set to NULL - this is the op which would normally be returned on failure of the eval compilation. By seeing whether the retuned value from pp_entereval() is NULL or not, eval_sv() can tell whether compilation failed. On the other hand, if pp_entereval() was made to be called as part of the runops loop, then the runops loop *always* finishes with PL_op set to NULL. So we can no lo longer distinguish between compile-failed and compile-succeeded-and-eval-ran-to-completion. This commit moves the entereval into the runops loop, but restores the ability to distinguish in a slightly hacky way. It adds a new private flag for OP_ENTEREVAL - OPpEVAL_EVALSV - which indicates to pp_entereval() that it was called from eval_sv(). And of course eval_sv() sets this flag on the OPpEVAL_EVALSV op it fakes up. If pp_entereval() fails to compile, then if that flag is set, it pushes a null pointer onto the argument stack before returning. Thus by checking whether *PL_stack_sp is NULL or not on return from CALLRUNOPS(), eval_sv() regains the ability to distinguish the two cases.
This XS function is for testing the stack-extending EXTEND() macro. This commit fixes two issues. 1) it uses a nested 'sp' variable declaration in addition to the one declared implicitly, which is confusing. Use a separate variable called new_sp instead. This changes the logic slightly, since the EXTEND() macro messes implicitly with sp, including updating it after a realloc. We have to do that manually now with new_sp. 2) The test function NULLs a couple of items near the top of the (potentially just extended) stack. Where the extend size is zero, it could be NULLing out one or two of the passed arguments on the stack. At the moment these values aren't used any more and are discarded on return; but it will get messy once the stack becomes reference-counted, so only NULL addresses if they're above PL_stack_sp.
A test recently added to check reference count of a stored AV element had to account for extra reference counts from the temporary refs generated by if (\$a[0] == \$j) { ... } Instead, calculate this boolean value in a separate statement so the ref counts are easier to understand.
This function handles perl -Dsv, producing output like STACK 0: MAIN CX 0: BLOCK => CX 1: SUB => UNDEF PV("main"\0) retop=leave STACK 1: MAGIC CX 0: SUB => IV(1) When a CX stack had zero contexts pushed (like can sometimes happen when something has just done a PUSHSTACKi() and no op has pushed a BLOCK or SUB or whatever yet), then the code for determining where the next markstack pointer is (by peeking ahead into the first CX of the next SI) was accessing random garbage at cx[0]. This commit fixes that.
if (!*PL_stack_sp) { | ||
/* In the presence of the OPpEVAL_EVALSV flag, | ||
* pp_entereval() pushes a NULL pointer onto the stack to | ||
* indicate compilation failure */ | ||
PL_stack_sp--; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good change, but I would like to have more than 2.5 hours to between opening the PR and merging it next time so the rest of us also get the chance to review. |
Please don't exaggerate. Its the same code that is in smoke-me/davem/rc_stack2 aka #20858 which has been up for two days, and afaik its the same code as was in #20825 which was pushed last week. So you have had more time to review it than that. It passed tests both locally and in CI, and I approved it[1], so it passed the second set of eyes test, which is what our normal rules require for a merge. Also it is @iabyn code, which I place a high level of trust in. I'm sorry you feel that was too fast for you to have a chance to review, but I don't think this was against the rules in any way. In future however Ill try to give you more time. [1] Implicitly - I volunteered to push it so @iabyn could focus on the main part of the branch, and because he had expressed discomfort and annoyance at being forced to use GitHub for patches. Unfortunately GitHub doesn't deal with "on behalf of" PR's, so I could not actually put my checkmark on it as I was the nominal creator of the PR. But the very act of me pushing it for Dave was an act of approval I feel. |
On Tue, Feb 28, 2023 at 06:02:58AM -0800, Hugo van der Sanden wrote:
@hvds commented on this pull request.
> + if (!*PL_stack_sp) {
+ /* In the presence of the OPpEVAL_EVALSV flag,
+ * pp_entereval() pushes a NULL pointer onto the stack to
+ * indicate compilation failure */
+ PL_stack_sp--;
@iabyn, @demerphq This has already been merged, but I think it would be helpful to comment somewhere what `*PL_stack_sp` would be in the absence of this push: in particular to show that it is safe to read, and that it will not coincidentally be NULL.
I've just pushed this. Is it sufficient?
commit a769d3e
Author: David Mitchell ***@***.***>
AuthorDate: Wed Mar 1 14:39:17 2023 +0000
Commit: David Mitchell ***@***.***>
CommitDate: Wed Mar 1 14:39:17 2023 +0000
eval_sv(): improve code comment
Explain in detail why the return arg on the stack from the eval can't be
NULL on successful compilation.
(Spotted by Hugo)
Affected files ...
M perl.c
Differences ...
diff --git a/perl.c b/perl.c
index ad941061bb..54afc2c6fb 100644
--- a/perl.c
+++ b/perl.c
@@ -3259,7 +3259,14 @@ Perl_eval_sv(pTHX_ SV *sv, I32 flags)
if (!*PL_stack_sp) {
/* In the presence of the OPpEVAL_EVALSV flag,
* pp_entereval() pushes a NULL pointer onto the stack to
- * indicate compilation failure */
+ * indicate compilation failure. Otherwise, the top slot on
+ * the stack will be a non-NULL pointer to whatever scalar or
+ * list value(s) the eval returned. In void context it will
+ * be whatever our caller has at the top of stack at the time,
+ * or the &PL_sv_undef guard at PL_stack_base[0]. Note that
+ * NULLs are not pushed on the stack except in a few very
+ * specific circumstances (such as this) to flag something
+ * special. */
PL_stack_sp--;
goto fail;
}
…--
In my day, we used to edit the inodes by hand. With magnets.
|
That's super, thank you. |
That branch was explicitly marked as "Not for merge yet.", that's a totally different situation. |
This contains the first 16 patches from #20858 (smoke-me/davem/rc_stack2) aka "Smoke me/davem/rc stack2".
These patches are in @iabyn's words:
Since the main refcounted stack patch is quite large it makes sense to get these merged first and get them out of the way while we test, review and refine the important parts of the patch.
My (@demerphq) involvement is purely to shepherd these patches into a PR so they can be merged independently. I did a close review of them and they look fine to me, and include some really useful information on how the internals work.