Pegen improperly memoizes loop rules

# Bug report

In the following part of pegen, the memoized result is only used if `self._should_memoize(node)` is true. Since loop rules are autogenerated, they are not marked with `(memo)`, so this is always false. Despite that, in the end the result is memoized based on the different condition `if node.name`. Because of that, in the generated parser loop rules' results are always stored in the cache, but never accessed later.

https://github.com/python/cpython/blob/8de59c1bb9fdcea69ff6e6357972ef1b75b71721/Tools/peg_generator/pegen/c_generator.py#L608-L647

# Your environment

Does not matter - discovered through manual code analysis


### Linked PRs
* gh-102467
* gh-102473
* gh-102474

	def _handle_loop_rule_body(self, node: Rule, rhs: Rhs) -> None:
	memoize = self._should_memoize(node)
	is_repeat1 = node.name.startswith("_loop1")

	with self.indent():
	self.add_level()
	self._check_for_errors()
	self.print("void *_res = NULL;")
	if memoize:
	self.print(f"if (_PyPegen_is_memoized(p, {node.name}_type, &_res)) {{")
	with self.indent():
	self.add_return("_res")
	self.print("}")
	self.print("int _mark = p->mark;")
	self.print("int _start_mark = p->mark;")
	self.print("void *_children = PyMem_Malloc(sizeof(void ));")
	self.out_of_memory_return(f"!_children")
	self.print("Py_ssize_t _children_capacity = 1;")
	self.print("Py_ssize_t _n = 0;")
	if any(alt.action and "EXTRA" in alt.action for alt in rhs.alts):
	self._set_up_token_start_metadata_extraction()
	self.visit(
	rhs,
	is_loop=True,
	is_gather=node.is_gather(),
	rulename=node.name,
	)
	if is_repeat1:
	self.print("if (_n == 0 \|\| p->error_indicator) {")
	with self.indent():
	self.print("PyMem_Free(_children);")
	self.add_return("NULL")
	self.print("}")
	self.print("asdl_seq _seq = (asdl_seq)_Py_asdl_generic_seq_new(_n, p->arena);")
	self.out_of_memory_return(f"!_seq", cleanup_code="PyMem_Free(_children);")
	self.print("for (int i = 0; i < _n; i++) asdl_seq_SET_UNTYPED(_seq, i, _children[i]);")
	self.print("PyMem_Free(_children);")
	if node.name:
	self.print(f"_PyPegen_insert_memo(p, _start_mark, {node.name}_type, _seq);")
	self.add_return("_seq")

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Pegen improperly memoizes loop rules #102416

Bug report

Your environment

Linked PRs

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Pegen improperly memoizes loop rules #102416

Description

Bug report

Your environment

Linked PRs

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions