Skip to content

Fix exclude consistency fields 13131 #13305

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,7 @@ Note that this project **does not** adhere to [Semantic Versioning](https://semv
- We added "Hanging Indent" as the default selected bibliography body format for CSL styles that specify it (e.g. APA). [#13074](https://github.com/JabRef/jabref/pull/13074)
- We fixed an issue where bibliography entries generated from CSL styles had leading spaces. [#13074](https://github.com/JabRef/jabref/pull/13074)
- We fixed an issue where the preview area in the "Select Style" dialog of the LibreOffice integration was too small to display full content. [#13051](https://github.com/JabRef/jabref/issues/13051)
- We excluded specific fields (e.g., `comment`, `pdf`, `sortkey`) from the consistency check to reduce false positives [#13131](https://github.com/JabRef/jabref/issues/13131)
- We fixed an issue where the tab showing the fulltext search results was not displayed. [#12865](https://github.com/JabRef/jabref/issues/12865)
- We fixed an issue showing an empty tooltip in maintable. [#11681](https://github.com/JabRef/jabref/issues/11681)

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -8,17 +8,62 @@
import java.util.Map;
import java.util.SequencedCollection;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.function.BiConsumer;


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checkstyle

import org.jabref.logic.bibtex.comparator.BibEntryByCitationKeyComparator;
import org.jabref.logic.bibtex.comparator.BibEntryByFieldsComparator;
import org.jabref.logic.bibtex.comparator.FieldComparatorStack;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.Field;
import org.jabref.model.entry.field.SpecialField;
import org.jabref.model.entry.field.StandardField;
import org.jabref.model.entry.field.UserSpecificCommentField;
import org.jabref.model.entry.types.EntryType;

public class BibliographyConsistencyCheck {

// ----------------------------------------------------------------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use proper Java doc, keep comment style

// Fields that we do *not* want to check for consistency:
// – every field in StandardField.AUTOMATIC_FIELDS
// – COMMENT, CROSSREF, CITES, PDF, REVIEW, SORTKEY, SORTNAME, TYPE, XREF, GROUPS
// – plus any SpecialField or UserSpecificCommentField (filtered at runtime)
// ----------------------------------------------------------------
private static final Set<Field> FILTERED_FIELDS = Stream
// start with JabRef’s built‐in AUTOMATIC_FIELDS set
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this comment, code is self-explanatory

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue with AI comments is that if humans and AI both work in the code, it is not ensured if the comments match the implementation if the code is updated.

.concat(
StandardField.AUTOMATIC_FIELDS.stream(),
// then add the extra fields to ignore
Stream.of(
StandardField.COMMENT,
StandardField.CROSSREF,
StandardField.CITES,
StandardField.PDF,
StandardField.REVIEW,
StandardField.SORTKEY,
StandardField.SORTNAME,
StandardField.TYPE,
StandardField.XREF,
StandardField.GROUPS
)
)
.collect(Collectors.toSet());

/**
* Drop any field we don’t want to check.
*/
private static Set<Field> filterFields(Collection<Field> fields) {
Copy link
Member

@subhramit subhramit Jun 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: Can you use a better method name, hinting at what we are filtering against? "filterFIelds" is a bit generic. Lets consider filterXFields or removeXFields - think what one word can "X" be, based on the logic?
If no such crisp word is possible, you can ignore this suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we can be more specific here wrt the parameter

Suggested change
private static Set<Field> filterFields(Collection<Field> fields) {
private static Set<Field> filterFields(SequencedSet<Field> fields) {

return fields.stream()
.filter(f -> !FILTERED_FIELDS.contains(f))
.filter(f -> !(f instanceof SpecialField))
.filter(f -> !(f instanceof UserSpecificCommentField))
Comment on lines +59 to +61
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We avoid abbreviations

Suggested change
.filter(f -> !FILTERED_FIELDS.contains(f))
.filter(f -> !(f instanceof SpecialField))
.filter(f -> !(f instanceof UserSpecificCommentField))
.filter(field -> !FILTERED_FIELDS.contains(field))
.filter(field -> !(field instanceof SpecialField))
.filter(field -> !(field instanceof UserSpecificCommentField))

.collect(Collectors.toSet());
}



public record Result(Map<EntryType, EntryTypeResult> entryTypeToResultMap) {
}

Expand Down Expand Up @@ -69,7 +114,10 @@ public Result check(List<BibEntry> entries, BiConsumer<Integer, Integer> entries

List<BibEntry> differingEntries = entryTypeToEntriesMap
.get(entryType).stream()
.filter(entry -> !entry.getFields().equals(commonFields))
.filter(entry -> {
Set<Field> entryFiltered = filterFields(entry.getFields());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name 'entryFiltered' is not descriptive enough. It should convey the purpose or content of the variable more clearly.

return !entryFiltered.equals(commonFields);
})
.sorted(comparatorStack)
.toList();

Expand All @@ -80,17 +128,25 @@ public Result check(List<BibEntry> entries, BiConsumer<Integer, Integer> entries
}

private static void collectEntriesIntoMaps(List<BibEntry> entries, Map<EntryType, Set<Field>> entryTypeToFieldsInAnyEntryMap, Map<EntryType, Set<Field>> entryTypeToFieldsInAllEntriesMap, Map<EntryType, Set<BibEntry>> entryTypeToEntriesMap) {
for (BibEntry entry : entries) {
EntryType entryType = entry.getType();

Set<Field> fieldsInAnyEntry = entryTypeToFieldsInAnyEntryMap.computeIfAbsent(entryType, _ -> new HashSet<>());
fieldsInAnyEntry.addAll(entry.getFields());

Set<Field> fieldsInAllEntries = entryTypeToFieldsInAllEntriesMap.computeIfAbsent(entryType, _ -> new HashSet<>(entry.getFields()));
fieldsInAllEntries.retainAll(entry.getFields());

Set<BibEntry> entriesOfType = entryTypeToEntriesMap.computeIfAbsent(entryType, _ -> new HashSet<>());
entriesOfType.add(entry);
}
entries.forEach(entry -> {
EntryType entryType = entry.getType();

// 1) filter out AUTOMATIC_FIELDS + COMMENT, CROSSREF, CITES, PDF, REVIEW,
// SORTKEY, SORTNAME, TYPE, XREF, GROUPS
// 2) remove any SpecialField or UserSpecificCommentField
Set<Field> filteredFields = filterFields(entry.getFields());

Set<Field> fieldsInAnyEntry = entryTypeToFieldsInAnyEntryMap
.computeIfAbsent(entryType, k -> new HashSet<>());
fieldsInAnyEntry.addAll(filteredFields);

Set<Field> fieldsInAllEntries = entryTypeToFieldsInAllEntriesMap
.computeIfAbsent(entryType, k -> new HashSet<>(filteredFields));
fieldsInAllEntries.retainAll(filteredFields);

Set<BibEntry> entriesOfType = entryTypeToEntriesMap
.computeIfAbsent(entryType, k -> new HashSet<>());
entriesOfType.add(entry);
});
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,10 @@
import org.jabref.logic.importer.ImportFormatPreferences;
import org.jabref.logic.importer.fileformat.BibtexImporter;
import org.jabref.model.entry.BibEntry;
import org.jabref.model.entry.field.SpecialField;
import org.jabref.model.entry.field.StandardField;
import org.jabref.model.entry.field.UnknownField;
import org.jabref.model.entry.field.UserSpecificCommentField;
import org.jabref.model.entry.types.StandardEntryType;
import org.jabref.model.util.DummyFileUpdateMonitor;

Expand All @@ -18,6 +20,8 @@
import org.mockito.Answers;

import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.Mockito.mock;

class BibliographyConsistencyCheckTest {
Expand Down Expand Up @@ -102,4 +106,43 @@ void checkLibraryWithoutIssues(@TempDir Path tempDir) {
BibliographyConsistencyCheck.Result expected = new BibliographyConsistencyCheck.Result(Map.of());
assertEquals(expected, result);
}

/**
* Entries that differ only by automatic, comment, PDF, special or user-specific comment fields
* should produce no consistency results.
*/
@Test
void filteredFieldsAreIgnored() {
// MISC entries differing only in filtered fields:
BibEntry a = new BibEntry(StandardEntryType.Misc, "a")
.withField(StandardField.COMMENT, "note")
.withField(StandardField.PDF, "file.pdf")
.withField(new UserSpecificCommentField("XYZ"), "foo")
.withField(SpecialField.PRIORITY, "high");
BibEntry b = new BibEntry(StandardEntryType.Misc, "b")
.withField(StandardField.COMMENT, "another note")
.withField(StandardField.PDF, "other.pdf");

var result = new BibliographyConsistencyCheck().check(List.of(a, b));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use var. At JabRef, we prefer strict types.
Also, did the compiler not complain here? You sent one argument instead of two, so the unit test is failing. Did you run the test before filing the PR?


// Since all differences are in filtered fields, there should be no entries reported:
assertTrue(result.entryTypeToResultMap().isEmpty(),
"Differences only in filtered fields must be ignored");
Comment on lines +129 to +130
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of assertTrue to check for an empty map is less informative than using assertEquals to directly compare the map to an empty map, which would provide clearer test failure messages.

}

@SuppressWarnings("checkstyle:RegexpMultiline")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not suppress checkstyle warnings without having a good reason to do so.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi subhramit,
thank you for your advice. I will fix it.

@Test
void nonFilteredFieldDifferenceIsReported() {
// Two MISC entries differing in AUTHOR (not filtered)
BibEntry withAuthor = new BibEntry(StandardEntryType.Misc, "1")
.withField(StandardField.AUTHOR, "Knuth");
BibEntry withoutAuthor = new BibEntry(StandardEntryType.Misc, "2");

var result = new BibliographyConsistencyCheck().check(List.of(withAuthor, withoutAuthor));

// AUTHOR should be reported as a unique field
var typeResult = result.entryTypeToResultMap().get(StandardEntryType.Misc);
Comment on lines +141 to +144
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comments regarding var and number of arguments here as https://github.com/JabRef/jabref/pull/13305/files#r2140865139.

assertNotNull(typeResult);
assertTrue(typeResult.fields().contains(StandardField.AUTHOR));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using assertTrue to check if a collection contains an element is less informative than using assertEquals to compare the collection to an expected collection, which would provide clearer test failure messages.

}
}
Loading