Changes to build-toc to address issue #628 #629

Merged
merged 3 commits into from
Dec 29, 2023

Conversation

drgrigg
Contributor

@drgrigg drgrigg commented Dec 28, 2023

I've altered where we check for a language tag in build-toc which seems to fix the problem in issue #628.

I've run a comparison between the new and old ToCs, and the only productions which don't match are either ones that always required a manual adjustment to the ToC or a handful of others which have unrelated issues (though some of these mean I should recheck my code in other areas).

For reference, the non-matching ToCs after this change to build-toc are as follows. The line numbers are in the ToC for each production.

  • anatole-france_penguin-island_a-w-evans : line 132
  • arthur-w-pinero_the-second-mrs-tanqueray : line 20
  • friedrich-nietzsche_beyond-good-and-evil_helen-zimmern : line 236
  • george-macdonald_phantastes : line 17
  • honore-de-balzac_shorts-from-scenes-from-private-life_clara-bell_ellen-marriage : line 27
  • james-branch-cabell_domnei : line 17
  • jean-toomer_cane : line 27
  • john-locke_some-thoughts-concerning-education : line 20
  • ludwig-wittgenstein_tractatus-logico-philosophicus_c-k-ogden : line 32
  • p-g-wodehouse_short-fiction : line 82
  • t-e-lawrence_seven-pillars-of-wisdom : line 453
  • wilfred-owen_poetry : line 76
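
A comparison of this kind, reporting the first line at which two ToCs diverge, could be sketched roughly as below. This is only an illustration and not the script actually used for the comparison; the function name `first_mismatch` is hypothetical, and it assumes both ToCs are available as plain strings.

```python
from itertools import zip_longest


def first_mismatch(old_toc: str, new_toc: str):
    """Return the 1-indexed number of the first differing line, or None if identical.

    Hypothetical helper for illustration only; not part of build-toc.
    """
    old_lines = old_toc.splitlines()
    new_lines = new_toc.splitlines()
    # zip_longest pads the shorter ToC with None, so a missing line counts as a mismatch.
    pairs = zip_longest(old_lines, new_lines)
    for number, (old, new) in enumerate(pairs, start=1):
        if old != new:
            return number
    return None
```

Running it over each production's released ToC and the freshly generated one would give a list of production names and line numbers in the same shape as the one above.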

@acabal
Member

acabal commented Dec 28, 2023

To clarify, which ones had "unrelated issues"? I know many of those always required manual fixes, but are there some now that have some kinds of issues?

@drgrigg
Contributor Author

drgrigg commented Dec 28, 2023 via email

@drgrigg
Contributor Author

drgrigg commented Dec 28, 2023 via email

@drgrigg
Contributor Author

drgrigg commented Dec 29, 2023

OK, first point: there are zero differences in the ToCs generated with the old master branch build-toc and those generated with the new code in the fixtoc branch. So I think that's OK to merge.

The ToCs in the released versions which differ are as follows:

  • anatole-france_penguin-island_a-w-evans : a known issue where the naming of the Book parts needs to use the subtitles rather than the titles in the hgroup, e.g. "Trinco" rather than "Modern Times", as the latter is repeated for several of the book parts.
  • arthur-w-pinero_the-second-mrs-tanqueray : the halftitlepage entry in the release ToC is missing abbreviation tags around "Mrs." Do you want me to raise a pull request to fix that?
  • friedrich-nietzsche_beyond-good-and-evil_helen-zimmern : the two entries for chapter-4-65a and chapter-4-73a show up with lower-case "a" in the ToC of the release version, whereas the source has an uppercase "A", which is what build-toc therefore uses.
  • george-macdonald_phantastes : the release ToC correctly has tags for the roman numerals in Epigraph I, Epigraph II and Epigraph III. But build-toc is reduced to using the <title> to get the text for the ToC and of course that doesn't include the "roman" tag. Rather than trying tricky code on this, it's rare enough for it to be left as a manual task, I think.
  • honore-de-balzac_shorts-from-scenes-from-private-life_clara-bell_ellen-marriage : definitely something odd going on here: when a short story starts with a letter or other material in a separate section before the actual story begins in a new section, build-toc is indenting it as though it were a sub-chapter in the previous story. I'll need to look at this closely in the code.
  • james-branch-cabell_domnei : same issue as for george-macdonald_phantastes.
  • jean-toomer_cane : build-toc isn't correctly indenting the ToC entries in each Part, even though data-parent is correctly used. This definitely needs to be looked at in the code; I think it's due to the fact that the short stories are in articles rather than sections.
  • ludwig-wittgenstein_tractatus-logico-philosophicus_c-k-ogden : this has a complex structure, as you know, and is one of our known exceptions to automatic build-toc.
  • p-g-wodehouse_short-fiction : it looks as though the order and number of stories in the collection has changed without build-toc having been run again. Do you want me to raise a pull request to fix that?
  • t-e-lawrence_seven-pillars-of-wisdom : another odd problem to do with indenting in the ToC, this time a spurious indent of the first Appendix, as though it were a subchapter of the Endnotes. I'll need to have a look at the code to see what's going on.
  • wilfred-owen_poetry : the release ToC shows a subtitle for the poem "Strange Meeting", that subtitle being "Another Version". This definitely has to be a manual exception to the build-toc rules.
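
To illustrate the suspected cause of the jean-toomer_cane indenting problem: if nesting depth is worked out by looking only at section elements, anything wrapped in an article would be missed. The sketch below is a simplified, hypothetical illustration of treating both element types as sectioning content. It is not build-toc's actual code, it assumes the whole structure sits in one XML document (real productions are split across files and rely on data-parent), and the names `SECTIONING`, `local_name` and `toc_depths` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: both element types should create a ToC level.
SECTIONING = {"section", "article"}


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def toc_depths(element, depth=0, found=None):
    """Collect (id, depth) pairs for every section or article, in document order."""
    if found is None:
        found = []
    for child in element:
        if local_name(child.tag) in SECTIONING:
            found.append((child.get("id"), depth + 1))
            toc_depths(child, depth + 1, found)
        else:
            # Non-sectioning wrappers do not add a level of indentation.
            toc_depths(child, depth, found)
    return found
```

With this approach an article nested inside a Part's section lands one level deeper than the Part itself, which is the indenting the release ToC has.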

So, to sum up, I think the changes in the pull request should be applied, but there are several new issues thrown up by my comparison between the released ToCs and what build-toc is generating. But I think these should be the subject of new Issues, which I'll create.

@acabal acabal merged commit 95020b2 into master Dec 29, 2023
2 checks passed
@acabal
Member

acabal commented Dec 29, 2023

OK great, thanks. Some of those can't be fixed as the ebooks are exceptions, like Phantastes. If you have time though you can look at the ones that aren't to see if we can get things into shape.

@drgrigg drgrigg deleted the fixtoc branch December 29, 2023 02:43