-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Improve merge logic to prefer valid year and entry type #13506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…JabRef#12549) - Added YearFieldValueValidityComparator to prefer DOI year when local year is unrealistic or differs by more than 10 years - Enhanced entry type comparison logic to select DOI entry type if local is "misc" - Applied auto-selection in Merge Entries dialog when merging using DOI fetcher
Enhanced field selection logic in the Merge Entries dialog when fetching from DOI to prefer valid years and entry types
jabgui/src/main/java/org/jabref/gui/mergeentries/newmergedialog/ThreeWayMergeView.java
Show resolved
Hide resolved
| /** | ||
| * Compares the validity or appropriateness of two field values. | ||
| * | ||
| * @param leftValue value from the library (or candidate) | ||
| * @param rightValue value from the fetcher (or existing record) | ||
| * @return 1 if right is better, 0 if undetermined or equal, -1 if right is worse | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JavaDoc is trivial and simply restates what can be derived from the method signature and implementation. It should provide additional context or reasoning.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, it is OK.
|
Your code currently does not meet JabRef's code guidelines. We use Checkstyle to identify issues. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Tests / Checkstyle (pull_request)" and click on it. In case of issues with the import order, double check that you activated Auto Import. You can trigger fixing imports by pressing Ctrl+Alt+O to trigger Optimize Imports. Please carefully follow the setup guide for the codestyle. Afterwards, please run checkstyle locally and fix the issues, commit, and push. |
koppor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start. Some more strong typing.
| if (fetcher instanceof DoiFetcher) { | ||
| dialog.autoSelectBetterFields(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's try with all entries - remove the if guard.
| * | ||
| * @param leftValue value from the fetcher (or candidate) | ||
| * @param rightValue value from the library (or existing record) | ||
| * @return 1 if left is better, 0 if undetermined or equal, -1 if left is worse |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use an enum for better readabiality. int offers more values which can be confusing for other developers.
--> strong typing!
We accept that not Comparator<String> can be used.
Futhermore, the semantics of Comparator is more on the values itself not on the semantics of the values :)
E.g. 2500 is smaller than 2025 in your code :)
| Optional<Integer> leftYear = extractYear(leftValue); | ||
| Optional<Integer> rightYear = extractYear(rightValue); | ||
|
|
||
| boolean leftYearInRange = (leftYear.get() >= 1800) && (leftYear.get() <= Year.now().getValue()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a +2, because journals tend to publish one year later
| private static final Pattern YEAR_PATTERN = Pattern.compile("(\\d{4})"); | ||
|
|
||
| /** | ||
| * Compares the validity or appropriateness of two field values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe really use another term everywhere. Also in the class name.
--> Use plausibility (class name, variable name)
"validtity" is used in other contexts...
| * @return 1 if left is better, 0 if undetermined or equal, -1 if left is worse | ||
| */ | ||
| @Override | ||
| public int compare(String leftValue, String rightValue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this abstract :) (And with the new return type)
| private Optional<Integer> extractYear(String value) { | ||
| Matcher matcher = YEAR_PATTERN.matcher(value); | ||
| if (matcher.find()) { | ||
| return Optional.of(Integer.parseInt(matcher.group(1))); | ||
| } | ||
| return Optional.empty(); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plesae try to use existing Java
java.time.Year#parse(java.lang.CharSequence)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
YearChecker.checkValue() checks the validity of the year based on the last 4 characters of the string,
E.g : "About 2024" is also considered as a valid year and extractYear() extracts the year value from the strings like : "About 2024"
but java.time.Year#parse(java.lang.CharSequence) only expects strictly valid year string like "2025"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then just add a small Java comment noting this :)
Multiline is OK.
Other code readers might not be aware
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ArshKumar84 You requested a review without adressing my comment - maybe you overlooked it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jabref has a Date class which can parse all kind of dates and use that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public class Date { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion! I looked into the Date class but couldn't find any method that handles year strings like "About 2000", which are treated as valid by the existing YearChecker#checkValue() method.
…JabRef#12549) - applied enum for comparison result to improve the readability - Refactored FieldValueValidityComparator to abstract FieldValuePlausibilityComparator - Enabled plausibility comparison across all fetcher types
|
@koppor I have addressed the changes, please review and suggest if any improvements are needed. |
|
Your code currently does not meet JabRef's code guidelines. We use OpenRewrite to ensure "modern" Java coding practices. You can see which checks are failing by locating the box "Some checks were not successful" on the pull request page. To see the test output, locate "Tests / OpenRewrite (pull_request)" and click on it. The issues found can be automatically fixed. Please execute the gradle task |
koppor
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, better
Please DO NOT request reviews, if tests are failing. 😅
| LEFT_BETTER(-1), | ||
| RIGHT_BETTER(1), | ||
| UNDETERMINED(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need numbers. Just the constants.
Remove all code below, please
| if (leftYearInRange) { | ||
| int diff = Math.abs(leftYear.get() - rightYear.get()); | ||
| if (diff > 10) { | ||
| return ComparisonResult.fromInt(Integer.compare(rightYear.get(), leftYear.get())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inline the method from abovce --- this is OK - no need to craft a method only for one caller FROM ANOTHER class (this is the real issue here)
- Removed unnecessary method call from ComparisonResult in YearFieldValuePlausibilityComparator - Removed unnecessary code in enum ComparisonResult - Added comment for extractYear() in YearFieldValuePlausibilityComparator
jablib/src/main/java/org/jabref/logic/bibtex/comparator/ComparisonResult.java
Outdated
Show resolved
Hide resolved
...b/src/main/java/org/jabref/logic/bibtex/comparator/YearFieldValuePlausibilityComparator.java
Outdated
Show resolved
Hide resolved
Co-authored-by: Oliver Kopp <kopp.dev@gmail.com>
|
@trag-bot didn't find any issues in the code! ✅✨ |
Co-authored-by: Oliver Kopp <kopp.dev@gmail.com>
* upstream/main: Also label PR if good first issue is made (#13526) New Crowdin updates (#13529) chore(deps): update dependency org.apache.logging.log4j:log4j-to-slf4j to v2.25.1 (#13528) chore: bump-okhttp-4.12.0-to-5.0.0 (#13521) Have the picker always on top (#13525) Refactor PushToApplications and split into logic and GUI (#13514) Add field to change HTTP port (#13479) Fix trigger of comment Use Java 11 isBlank (#13523) Add run openrewrite (#13524) Comment ion PR just opened (#13522) Improve merge logic to prefer valid year and entry type (#13506) Update dependency com.konghq:unirest-modules-gson to v4.4.12 (#13517) Add rpm target (#13516) Revert module name changes for remaining 'unnamed' Jars (#13515) fix: revert Java module names to restore Status Log compatibility in JabRef 5.15 (#13511) update java vendor in devcontainer and sdkmanrc (#13513)
Closes #12549
YearFieldValueValidityComparatorto prefer DOI year when the local year is unrealistic (e.g., before 1800 or after current year) or differs by more than 10 yearsmiscThis PR improves the auto-selection of fields during entry merge from DOI fetch. It ensures that unrealistic or inaccurate local values are automatically replaced by more valid DOI-sourced ones, specifically for the
yearandentrytypefields.Steps to test
yearthat has a large gap (e.g.,1990) or entry type set to@misc.year(if the difference is >10 years or if the local year is unrealistic) and the betterentrytype(notmisc).Mandatory checks
CHANGELOG.mddescribed in a way that is understandable for the average user (if change is visible to the user)