Skip to content

Conversation

@ArshKumar84
Copy link
Contributor

Closes #12549

photo_1
photo_2

  • Added YearFieldValueValidityComparator to prefer DOI year when the local year is unrealistic (e.g., before 1800 or after current year) or differs by more than 10 years
  • Enhanced entry type comparison logic to select the DOI entry type if the local entry is misc
  • Applied auto-selection logic in the Merge Entries dialog when merging entries using the DOI fetcher

This PR improves the auto-selection of fields during entry merge from DOI fetch. It ensures that unrealistic or inaccurate local values are automatically replaced by more valid DOI-sourced ones, specifically for the year and entrytype fields.

Steps to test

  1. Open JabRef and create a new BibTeX entry with a year that has a large gap (e.g., 1990) or entry type set to @misc.
  2. Use the "Update with bibliographic data from DOI" feature with a valid DOI.
  3. The Merge Entries dialog should now automatically select the more accurate year (if the difference is >10 years or if the local year is unrealistic) and the better entrytype (not misc).
  4. Verify that the correct fields are selected from the right (DOI) side by default.

Mandatory checks

  • I own the copyright of the code submitted and I license it under the MIT license
  • Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • [/] Screenshots added in PR description (if change is visible to the user)
  • [/] Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • [/] Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

…JabRef#12549)

- Added YearFieldValueValidityComparator to prefer DOI year when local year is unrealistic or differs by more than 10 years
- Enhanced entry type comparison logic to select DOI entry type if local is "misc"
- Applied auto-selection in Merge Entries dialog when merging using DOI fetcher
Enhanced field selection logic in the Merge Entries dialog when fetching from DOI to prefer valid years and entry types
Comment on lines 14 to 20
/**
* Compares the validity or appropriateness of two field values.
*
* @param leftValue value from the library (or candidate)
* @param rightValue value from the fetcher (or existing record)
* @return 1 if right is better, 0 if undetermined or equal, -1 if right is worse
*/
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

JavaDoc is trivial and simply restates what can be derived from the method signature and implementation. It should provide additional context or reasoning.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, it is OK.

@koppor koppor mentioned this pull request Jul 7, 2025
1 task
@jabref-machine
Copy link
Collaborator

Your code currently does not meet JabRef's code guidelines. We use Checkstyle to identify issues. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Tests / Checkstyle (pull_request)" and click on it.

In case of issues with the import order, double check that you activated Auto Import. You can trigger fixing imports by pressing Ctrl+Alt+O to trigger Optimize Imports.

Please carefully follow the setup guide for the codestyle. Afterwards, please run checkstyle locally and fix the issues, commit, and push.

@ArshKumar84 ArshKumar84 changed the title Fix for issue 12549 Improve merge logic to prefer valid year and entry type Jul 8, 2025
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start. Some more strong typing.

Comment on lines 111 to 113
if (fetcher instanceof DoiFetcher) {
dialog.autoSelectBetterFields();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try with all entries - remove the if guard.

*
* @param leftValue value from the fetcher (or candidate)
* @param rightValue value from the library (or existing record)
* @return 1 if left is better, 0 if undetermined or equal, -1 if left is worse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use an enum for better readabiality. int offers more values which can be confusing for other developers.

--> strong typing!

We accept that not Comparator<String> can be used.

Futhermore, the semantics of Comparator is more on the values itself not on the semantics of the values :)

E.g. 2500 is smaller than 2025 in your code :)

Optional<Integer> leftYear = extractYear(leftValue);
Optional<Integer> rightYear = extractYear(rightValue);

boolean leftYearInRange = (leftYear.get() >= 1800) && (leftYear.get() <= Year.now().getValue());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a +2, because journals tend to publish one year later

private static final Pattern YEAR_PATTERN = Pattern.compile("(\\d{4})");

/**
* Compares the validity or appropriateness of two field values.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe really use another term everywhere. Also in the class name.

--> Use plausibility (class name, variable name)

"validtity" is used in other contexts...

* @return 1 if left is better, 0 if undetermined or equal, -1 if left is worse
*/
@Override
public int compare(String leftValue, String rightValue) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this abstract :) (And with the new return type)

Comment on lines 46 to 52
private Optional<Integer> extractYear(String value) {
Matcher matcher = YEAR_PATTERN.matcher(value);
if (matcher.find()) {
return Optional.of(Integer.parseInt(matcher.group(1)));
}
return Optional.empty();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plesae try to use existing Java

java.time.Year#parse(java.lang.CharSequence)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

YearChecker.checkValue() checks the validity of the year based on the last 4 characters of the string,
E.g : "About 2024" is also considered as a valid year and extractYear() extracts the year value from the strings like : "About 2024"

but java.time.Year#parse(java.lang.CharSequence) only expects strictly valid year string like "2025"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then just add a small Java comment noting this :)
Multiline is OK.

Other code readers might not be aware

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ArshKumar84 You requested a review without adressing my comment - maybe you overlooked it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Jabref has a Date class which can parse all kind of dates and use that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@ArshKumar84 ArshKumar84 Jul 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion! I looked into the Date class but couldn't find any method that handles year strings like "About 2000", which are treated as valid by the existing YearChecker#checkValue() method.

…JabRef#12549)

- applied enum for comparison result to improve the readability
- Refactored FieldValueValidityComparator to abstract FieldValuePlausibilityComparator
- Enabled plausibility comparison across all fetcher types
@ArshKumar84
Copy link
Contributor Author

ArshKumar84 commented Jul 8, 2025

@koppor I have addressed the changes, please review and suggest if any improvements are needed.

@jabref-machine
Copy link
Collaborator

Your code currently does not meet JabRef's code guidelines. We use OpenRewrite to ensure "modern" Java coding practices. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Tests / OpenRewrite (pull_request)" and click on it.

The issues found can be automatically fixed. Please execute the gradle task rewriteRun from the rewrite group of the Gradle Tool window in IntelliJ, then check the results, commit, and push.

@ArshKumar84 ArshKumar84 requested a review from koppor July 8, 2025 13:38
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, better

Please DO NOT request reviews, if tests are failing. 😅

Comment on lines 4 to 6
LEFT_BETTER(-1),
RIGHT_BETTER(1),
UNDETERMINED(0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need numbers. Just the constants.

Remove all code below, please

if (leftYearInRange) {
int diff = Math.abs(leftYear.get() - rightYear.get());
if (diff > 10) {
return ComparisonResult.fromInt(Integer.compare(rightYear.get(), leftYear.get()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Inline the method from abovce --- this is OK - no need to craft a method only for one caller FROM ANOTHER class (this is the real issue here)

- Removed unnecessary method call from ComparisonResult in YearFieldValuePlausibilityComparator
- Removed unnecessary code in enum ComparisonResult
- Added comment for extractYear() in YearFieldValuePlausibilityComparator
Co-authored-by: Oliver Kopp <kopp.dev@gmail.com>
@trag-bot
Copy link

trag-bot bot commented Jul 9, 2025

@trag-bot didn't find any issues in the code! ✅✨

Co-authored-by: Oliver Kopp <kopp.dev@gmail.com>
@koppor koppor added this pull request to the merge queue Jul 9, 2025
Merged via the queue into JabRef:main with commit f52fcb4 Jul 9, 2025
Siedlerchr added a commit that referenced this pull request Aug 2, 2025
* upstream/main:
  Also label PR if good first issue is made (#13526)
  New Crowdin updates (#13529)
  chore(deps): update dependency org.apache.logging.log4j:log4j-to-slf4j to v2.25.1 (#13528)
  chore: bump-okhttp-4.12.0-to-5.0.0 (#13521)
  Have the picker always on top (#13525)
  Refactor PushToApplications and split into logic and GUI (#13514)
  Add field to change HTTP port (#13479)
  Fix trigger of comment
  Use Java 11 isBlank (#13523)
  Add run openrewrite (#13524)
  Comment ion PR just opened (#13522)
  Improve merge logic to prefer valid year and entry type (#13506)
  Update dependency com.konghq:unirest-modules-gson to v4.4.12 (#13517)
  Add rpm target (#13516)
  Revert module name changes for remaining 'unnamed' Jars (#13515)
  fix: revert Java module names to restore Status Log compatibility in JabRef 5.15 (#13511)
  update java vendor in devcontainer and sdkmanrc (#13513)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

When fetching from DOI: Dialog should select field data based on validity

4 participants