Skip to content

[ISSUE-12929] Handle BibTeX reserved characters in comments. #13566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

ankamde
Copy link
Contributor

@ankamde ankamde commented Jul 21, 2025

Closes #12929

Steps to test

  1. Download https://github.com/GuenterPartosch/Convert_CTAN/blob/master/output/CTAN.bib
  2. open in JabRef
  3. See that JabRef says it could not parse one entry

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • [/] Screenshots added in PR description (if change is visible to the user)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@ankamde
Copy link
Contributor Author

ankamde commented Jul 21, 2025

@koppor I've implemented this according to your suggestion:
A) Check if line starts with comment indicator string: %, #, --, *, // - and then treat the whole line as comment
but I had to disable one of the unit test - it looks like after my change it is not a valid scenario.

@ankamde ankamde marked this pull request as ready for review July 21, 2025 16:16
@jabref-machine
Copy link
Collaborator

Note that your PR will not be reviewed/accepted until you have gone through the mandatory checks in the description and marked each of them them exactly in the format of [x] (done), [ ] (not done yet) or [/] (not applicable).

Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too general handling.

Not in line with BibTeX

The solution with checking for column 1 was better.

No tests included for all cases.

Tests seem to contain multiple assert statements

@@ -1079,6 +1079,7 @@ void parsePreambleAndEntryWithoutNewLine() throws IOException {
}

@Test
@Disabled
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this is disabled .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It tests definition of preamble in a comment, here's tested line:

\% Encoding: US-ASCII@preamble{some text and \latex}

According to 'Check if line starts with comment indicator string: %, #, --, *, // - and then treat the whole line as comment' this test should not parse preamble but treat it as commented out definition.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then modify the test or remove it? Or add a reason in Disabled("disabled becuase of...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Siedlerchr I didn't remove it because I wanted to start discussion. This test contradicts proposed solution - as I said, I'm not familiar with the BibTeX standart - someone wrote this test as a valid test case and I got puzzled. I think it is better to remove it since it is not a valid test case or leave it but assert that preable won't be parsed but treated as commented out line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please give us time to.think.

We need to inspect existing bib files, setup a CI pipeline to test this behavior. Somewhere wr have started this.

You fan also propose something....

Copy link

trag-bot bot commented Jul 21, 2025

@trag-bot didn't find any issues in the code! ✅✨

@ankamde
Copy link
Contributor Author

ankamde commented Jul 22, 2025

@koppor
Too general handling. -> Fair enough

Not in line with BibTeX -> I'm not familiar with BibTeX, could you elaborate on this so I culd better understand the issue I'm trying to fix?

The solution with checking for column 1 was better. -> Will have a look and refactor

No tests included for all cases. -> What are the missing test cases? I'm not familiar with BibTeX and description of this bug mentions only single case. If you could hint me with the missing cases It will help me to better understand the problem.

Tests seem to contain multiple assert statements -> Do you mean checking various comments in sinlge test or invocation of 'assertEquals' method?

@koppor
Copy link
Member

koppor commented Jul 22, 2025

@ankamde its a good third issue, thus we assume some familiarity with BibTeX. Google for "tame the beast"/BibTeX (you will find a decent description of it). And then search for the biblatex manual.

(On the road, sorry for the short answer; wanted to give you pointers early)

@ThiloteE
Copy link
Member

@ankamde
Copy link
Contributor Author

ankamde commented Jul 22, 2025

Thanks @ThiloteE will have a look. @koppor this seems to be a simple change in the code and if I'm missing test cases perhaps more experienced developers could hint me? Unit test for the parser looks complete to me but any help would be appreciated.

@Test
void parseWithBibLaTeXReservedCharacterInComments() throws IOException {
String bibtexEntry = """
% Type of BibLaTeX entries : @article
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All types of comments checked in a single test. Needs to be split.

Maybe parameterized test?

ParserResult result = parser.parse(Reader.of(bibtexEntry));
Collection<BibEntry> entries = result.getDatabase().getEntries();
BibEntry entry = entries.iterator().next();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See testing at DevDocs that one can also do a comparison of rich objects. Should be done here.

@ankamde
Copy link
Contributor Author

ankamde commented Jul 24, 2025

Thanks for the hints. I unasigned myself from this task - need to learn more about biblatex and I don't want to block this issue. Perhaps someone will provide correct solution faster? I will leave this PR as open for a while.

@Siedlerchr
Copy link
Member

@ankamde Some hints https://mirrors.ctan.org/macros/latex/contrib/biblatex/doc/biblatex.pdf
and https://mirrors.ctan.org/info/bibtex/tamethebeast/ttb_en.pdf

@koppor
Copy link
Member

koppor commented Jul 25, 2025

Idea:

  1. Fork https://github.com/JabRef/jabref-demonstration-libraries
  2. Create folder "testing"
  3. Add .bib and .tex file pairs
  4. Adapt https://github.com/JabRef/jabref-demonstration-libraries/blob/main/.github/workflows/check.yml to check the new .tex files (pdflatex, bibtex, pdflatex, pdflatex) and then upload the result using github upload

A .tex file could look like

\documentclass{article}

\begin{document}

\nocite{*}

\bibliographystyle{plain}
\bibliography{references}

\end{document}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CTAN.bib - @online parsed, but its comment
5 participants