Skip to content

gh-96461: clarify the meaning of the oparg for CACHE and COPY opcode #96462

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jan 12, 2023

Conversation

MatthieuDartiailh
Copy link
Contributor

@MatthieuDartiailh MatthieuDartiailh commented Aug 31, 2022

@@ -398,13 +401,17 @@ The Python compiler currently generates the following bytecode instructions.
Push the *i*-th item to the top of the stack. The item is not removed from its
original location.

The stack is indexed from 1, so COPY 1 will copy TOS.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead of adding this, we could define TOS(i) at the top along with TOS1, TOS2 etc, and then reword the definition here to use those.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried something along those lines in my last commit let me know if you like it better this way.

@@ -382,6 +382,10 @@ The Python compiler currently generates the following bytecode instructions.

**General instructions**

In the following, TOS(i) refer to the *i*-th item on the stack which is index from 1.
Copy link
Member

@iritkatriel iritkatriel Sep 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
In the following, TOS(i) refer to the *i*-th item on the stack which is index from 1.
In the following, TOS(i) refers to the *i*-th item on the stack which is indexed from 1.

@@ -382,6 +382,10 @@ The Python compiler currently generates the following bytecode instructions.

**General instructions**

In the following, TOS(i) refer to the *i*-th item on the stack which is index from 1.
We also use as shorthand TOS = TOS(1) which is the top-of-stack.
TOS1, TOS2, TOS3 are the second, third and fourth items on the stack, respectively.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit unfortunate that TOS1 = TOS(2), TOS2 = TOS(3), etc.

@markshannon - would you like to change this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was another reason for my previous formulation that avoided ro shed too much light on this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally we count from one for stack items, so I would like to change it. Ideally drop TOS1 etc, all together.
The current explanation is not explicit about what is popped or not, so the shorthand is not helpful.

@@ -395,15 +399,15 @@ The Python compiler currently generates the following bytecode instructions.

.. opcode:: COPY (i)

Push the *i*-th item to the top of the stack. The item is not removed from its
Push TOS(i) to the top of the stack. The item is not removed from its
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "Push a copy of ith item (counting from 1) to the top of the stack ..."

original location.

.. versionadded:: 3.11


.. opcode:: SWAP (i)

Swap TOS with the item at position *i*.
Swap TOS with TOS(i).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "Swap the first and ith item on the stack (counting from 1).

@MatthieuDartiailh
Copy link
Contributor Author

I can rephrase everything to not use TOS(i) and repeat that the stack is indexed from 1 in each relevant place however they are quite a few occurrences (many more than what this PR currently does).

@iritkatriel do you agree with @markshannon on this ?

@iritkatriel
Copy link
Member

I can rephrase everything to not use TOS(i) and repeat that the stack is indexed from 1 in each relevant place however they are quite a few occurrences (many more than what this PR currently does).

@iritkatriel do you agree with @markshannon on this ?

I agree that we don't need to have both TOS(i) and TOSi, I think removing the TOSis is the smaller change, amirite?

@markshannon - were you suggesting to add "(counting from 1)" on each place where TOS(i) is used? Or just once where it's defined or used for the first time?

@MatthieuDartiailh
Copy link
Contributor Author

Removing the TOSi should be a smaller change yes and would not entail any large duplication.

@MatthieuDartiailh
Copy link
Contributor Author

I went ahead and replaced TOS by TOS(1)

I am not sure this a clear win for the reader TBH. Let me know what you think.

@sweeneyde
Copy link
Member

I don't know if this has already been considered, but one approach might be to eliminate "TOS" terminology altogether and instead refer to stack[-1], stack[-2], stack.pop(), etc., borrowing the well-understood semantics of lists.

  • POP_TOP becomes "Remove stack[-1], i.e., do stack.pop()"
  • COPY becomes stack.append(stack[-i]).
  • SWAP becomes stack[-1], stack[-i] = stack[-i], stack[-1]
  • UNARY_POSITIVE becomes stack[-1] = +stack[-1]
  • BINARY_SUBSCR becomes right = stack.pop(); left = stack.pop(); stack.append(left[right]), or similar.

What do you think?

@MatthieuDartiailh
Copy link
Contributor Author

That may lead to clearer description indeed. I must say I first tried to make the changes minimal but I can undertake a larger changes if it is generallyvthough to be useful.

@MatthieuDartiailh
Copy link
Contributor Author

I went ahead with @sweeneyde suggestion and also fixed a bunch of small issues in separate commits (let me know if they should be split into their own PR). I will wait to get some feedback before dealing with the merge conflict.

@iritkatriel
Copy link
Member

I think @sweeneyde 's suggestion is good.

@MatthieuDartiailh
Copy link
Contributor Author

Thanks @iritkatriel. I will incorporate your changes and work on fixing the conflicts.

@ghost
Copy link

ghost commented Dec 22, 2022

All commit authors signed the Contributor License Agreement.
CLA signed

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm honestly not sure why you hate the term "TOS" so much. It's short, and doesn't go into the (irrelevant) issue whether the stack grows up or down.

Comment on lines 404 to 436
Push the i-th item to the top of the stack without removing it from its original
location.::

STACK.append(STACK[-i])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add that i should not be zero.

@MatthieuDartiailh
Copy link
Contributor Author

I have no issue with the term TOS. I am simply looking for the best way to document opcode operating on the i-th element of the stack. Ideally I would like whatever is picked to be easy to remember and ideally consistent across the documentation. I like STACK[-i] because to me at least it makes it very obvious that the indexing starts at 1. But I would happily make whichever changes are believed to be best.

@MatthieuDartiailh
Copy link
Contributor Author

I rebased this and forced pushed.

There are places where using TOS instead of STACK[-1] may be more readable and make this change less disruptive but it would hurt uniformity. @iritkatriel @gvanrossum let me know which version you prefer and I will make the relevant changes

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like describing the stack as a Python list, it is unambiguous and clear to any Python developer.

However, you need to be careful to accounts for pops, pushes and sequencing.

E.g. LIST_APPEND is
list.append(STACK[-i-1], STACK.pop())
not
list.append(STACK[-i], STACK[-1])



.. opcode:: BINARY_OP (op)

Implements the binary and in-place operators (depending on the value of
*op*).
``TOS = TOS1 op TOS``.
``STACK[-1] = op STACK[-1]``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are missing the left operand.
STACK[-1] = STACK[-2] op STACK.pop()
or

rhs = STACK.pop()
lhs = STACK.pop()
STACK.push(lhs op rhs)


.. versionadded:: 3.11


.. opcode:: BINARY_SUBSCR

Implements ``TOS = TOS1[TOS]``.
Implements ``STACK[-1] = STACK[-2][STACK[-1]]``.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is incorrect, although no more than the old version.
See my comments on BINARY_OP

@MatthieuDartiailh
Copy link
Contributor Author

I rebased to address the conflicts and updated the description of CALL_INSTRINSIC_1

ping @markshannon

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent, thanks for this.

I've one comment on BUILD_TUPLE
And there is still @gvanrossum's comment on COPY to address.

resulting tuple onto the stack.
resulting tuple onto the stack.::

STACK.append(tuple(STACK[-count:]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't quite correct. The colon should be before the -count and the values need to be popped from the stack.
Also state that count should not be zero (it can be, but the compiler emits LOAD_CONST in that case, so we might change that it the future.

    assert(count > 0)
    STACK, values = STACK[-count:], STACK[:-count]
    STACK.append(tuple(values))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused. BUILD_TUPLE consume the items on top of the stack so if count is 2 that would be STACK[-2],STACK[-1] == STACK[-2:] so shouldn't we write:

assert count > 0
STACK, values = STACK[:-count], STACK[-count:]
STACK.append(tuple(values))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right.

@bedevere-bot
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@MatthieuDartiailh
Copy link
Contributor Author

I updated BUILD_TUPLE based on my response to your comment which I believe is correct. Correct if I somehow got it wrong still.

I have made the requested changes; please review again

@bedevere-bot
Copy link

Thanks for making the requested changes!

@markshannon: please review the changes made to this pull request.

@markshannon
Copy link
Member

Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir skip news
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants