Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stray comments #2420

Merged
merged 4 commits into from
Sep 23, 2024
Merged

Stray comments #2420

merged 4 commits into from
Sep 23, 2024

Conversation

brucemiller
Copy link
Owner

When characterizing certain blocks of XML, we need to be careful of unexpected comment nodes, since the exact order of appearance can be affected by other code changes. For example, this can affect marking nodes due to \centering or similar. This PR fixes a couple of such cases, and updates some test cases which seem to have been missing ltx_centering (or similar).

@brucemiller brucemiller requested a review from dginev September 22, 2024 17:16
Copy link
Collaborator

@dginev dginev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting subtlety. I left a couple of comments, mostly wondering if we will need this kind of enhanced treatment of "content" children somewhere else.

But it also seems good to merge as-is. Thanks!

last if !$previous || $previous->isSameNode($skip); }
while (@children) {
my $skip = $children[0];
last if !$previous;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since $previous is not changed during this while loop, maybe the check that it's defined can be lifted up? As in:

if ($previous) {
  while (my $skip = shift(@children)) {
    last if $previous->isSameNode($skip);
  }
}

@@ -368,8 +368,8 @@ sub insertBlock {
my $inline = $is_svg || $document->canContain($context_tag, '#PCDATA');
my $container = $document->openElement('ltx:_CaptureBlock_', '_vertical_mode_' => 1, %blockattr);
$document->absorb($contents);
my @nodes = $container->childNodes;
my @node_tags = map { $document->getNodeQName($_); } @nodes;
my @nodes = grep { $_->nodeType != XML_COMMENT_NODE; } $container->childNodes;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we make a new method akin to element_nodes that gets "material" child nodes?
Actually that may be easier with an accept-list rather than a reject-list, as in

sub content_nodes {
  my ($node) = @_;
  return ($node ? grep { my $t = $_->nodeType; $t == XML_ELEMENT_NODE || $t == XML_TEXT_NODE } 
    $node->childNodes : ()); }

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good suggestion!

@brucemiller
Copy link
Owner Author

Thanks for the good suggestions; adopted.

Yes, it's possible there are other cases; a superficial scan suggests it's done correctly in most cases, but it's an easy thing to overlook when developing new code and you're obsessed with finding the nodes you want.

@brucemiller brucemiller merged commit 7643037 into master Sep 23, 2024
26 checks passed
@brucemiller brucemiller deleted the stray-comments branch September 23, 2024 17:44
teepeemm pushed a commit to teepeemm/LaTeXML that referenced this pull request Oct 29, 2024
* Be careful of unexpected Comment nodes when scanning element children for specific patterns

* Update tests to recover missing block alignments (center, left,...)

* Code cleanup (suggested by D.Ginev)

* Code cleanup (suggested by D.Ginev)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants