-
-
Notifications
You must be signed in to change notification settings - Fork 394
feat(parity): migrate json libraries to sqlite #604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- fix: tag name column no longer has unique constraint - fix: tags are referenced by id in db queries - fix: tag_search_panel no longer queries db on initialization; does not regress to empty search window when shown - fix: tag name search no longer uses library grid FilterState object - fix: tag name search now respects tag limit
Have no complaints with my small library, but we should get some bigger samples. |
Addresses #298
Seems to work on my smaller library, but I agree with python on this one. |
Updated tests to account for the way the SQL DB loads multiples of the same tag field type on entries. Currently the SQL DB loads the correct number of fields, but merges and mirrors the tag lists across each similar tag field type. This seems to be a limitation of the DB schema. While I don't have much public documentation on the concept yet, I'm highly considering targeting the "Tag Categories" concept for 9.5 instead of 9.6 to not only address this issue, but to also address the currently broken mixed field editing that needs overhauling anyway. |
@CyanVoxel I'm not complaining, go for it if you need. |
A couple issues that I noticed immediately on my main library:
Note:
|
This is something I also don't like but unfortunately don't know a way around... I actually ended up creating an entire embedded progress bar that updated the values on screen as it the library processed, but discovered that SQLAlchemy's desire to be on a single thread directly headbutts against Qt's desire for progress bars to be on different threads. I was unable to come up with a solution where I could yield feedback from the library while not ruining the db connection.
I've noticed this as well. For me the white window only loads for a split second as it seems to be closed to fast that it never fully loads. There might be somewhere that an
Hmm, there was a bug from 332 that caused lag like this when selecting entries however it should've been fixed in this PR. Could this be thumbnails rendering in the "background" while you try to click on entries? I usually get some lag until everything on the page is rendered. |
Yikes that sounds like a pain to get working. Have you thought about instead of having an indicator for progress, just having an indicator that the program is still running (e.g. a spinner)? This way the window wouldn't be frozen and it would still convey that it isn't done yet
I was already seeing all the thumbnails at that point (although maybe offscreen ones?) |
That might work, assuming I can safely move the entire SQL library opening + conversion + closing process to a single separate thread. I think trying |
I think that would be worth trying. Windows freezing sketches me the hell out, I have wasted way to much time waiting on programs only to kill them after a while and conclude that they must be broken (especially since that part can take multiple minutes for larger libraries) |
Glad I kept around the progress bar code - I was able to get an indefinite progress bar that updates throughout the major migration events. I've also taken the opportunity to catch and display any errors encountered on the UI itself, as well as display a message box with any data discrepancies if necessary. |
I've tested this and it looks really good! I think there are probably still improvements that could be made (there are a bunch of log messages from I will try to read over the code and give a review later |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking quite good overall, just a couple comments
shorthand=tag.shorthand, | ||
color=TagColor.get_color_from_str(tag.color), | ||
) | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit weird to me, why are the sub tags and aliases added before the tags that their foreign keys point to?
I haven't tested this, but I suspect that, because every operation in those loops is its own DB Transaction, should TS crash after the Aliases, but before the Tags have been added, that there would be database inconsistencies stemming from foreign keys in the Aliases and Subtags pointing to tags that don't exist. This might not matter if the migration just restarts after such a crash (or killing by the user), but seems like it is just asking for problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might've been some negative reinforcement during testing that got me to believe that this was needed, but swapping it to do the tags first seems to work just fine (and logically makes more sense).
While the user shouldn't have the option to complete the migration without the full DB migrated, it's still a great observation.
) -> None: | ||
self.path = path | ||
self.folder = folder | ||
assert not self.id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Why is this necessary? Shouldn't the constructor run before anything can set this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went and followed the assertion that was present in the Entry constructor, but yes I don't think this has ever been tripped during debugging and can probably be taken out now
tagstudio/src/qt/ts_qt.py
Outdated
open_status: LibraryStatus = self.lib.open_library(path) | ||
|
||
# Migration is required | ||
if open_status.message and open_status.message.startswith("[JSON]"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure using this message to figure out whether migration is required is the best idea, maybe a migration_required
field in LibraryStatus
would be better? (This way changing what looks like a log message doesn't break functionality)
tag_id = tag.id # Tag IDs start at 0 | ||
sql_color = tag.color.name | ||
json_color = ( | ||
self.json_lib.get_tag(tag_id).color.upper().replace(" ", "_") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit:
.upper().replace(" ", "_")
this conversion is duplicated in TagColor.get_color_from_str
maybe that should be a static method of TagColor
?
* add files * fix: term was parsing ANDList instead of ORList * make mypy happy * ruff format * add missing todo * add more constraint types * add parent property to AST * add BaseVisitor class * make mypy happy * add __init__.py * Revert "make mypy happy" This reverts commit 926d0dd. * refactoring and fixes * rudimentary search field integration * fix: check for None properly * fix: Entries without Tags are now searchable * make mypy happy * Revert "fix: Entries without Tags are now searchable" This reverts commit 19b40af. * fix: changed joins to outerjoins and added missing outerjoin * use query lang instead of tag_id FIlterState * add todos * fix: remove uncecessary line that broke search when searching for exact name * fix tag search * refactoring * fix: path now uses GLOB operator for proper GLOBs * refactoring: remove FilterState.id and implement Library.get_entry_full as replacement * fix: use default value notation instead of if None statement in __post_init__ * remove obsolete Search Mode UI and related code * ruff fixes * remove obsolete tests * fix: item_thumb didn't query entries correctly * fix: search_library now correctly returns the number of *unique* entries * make mypy happy * implement NOT * remove obsolete filename search * remove summary as it is not applicable anymore * finish refactoring of FilterState * implement special:untagged * fix: make mypy happy * Revert changes to search_tags in favor of changes from #604 * fix: also port test changes * fix: remove unneccessary import * fix: remove unused dataclass * fix: AND now works correctly with tags * simplify structure of parsed AST * add performance logging * perf: Improve performance of search by reducing number of required joins from 4 to 1 * perf: double NOT is now optimized out of the AST * fix: bug where pages would show less than the configured number of entries * Revert "add performance logging" This reverts commit c3c7d75. * fix: tag_id search was broken * somewhat adapt the existing autocompletion to this PR * perf: Use Relational Division Queries to improve Query Execution Time * fix: raise Exception so as to not fail silently * fix: Parser bug broke parentheses * little bit of clean up * remove unnecessary comment * add library for testing search * feat: add basic tests * fix: and queries containing just one tag were broken * chore: remove debug code * feat: more tests * refactor: more consistent name for variable Co-authored-by: Travis Abendshien <46939827+CyanVoxel@users.noreply.github.com> * fix: ruff check complaint over double import --------- Co-authored-by: Travis Abendshien <46939827+CyanVoxel@users.noreply.github.com>
* add files * fix: term was parsing ANDList instead of ORList * make mypy happy * ruff format * add missing todo * add more constraint types * add parent property to AST * add BaseVisitor class * make mypy happy * add __init__.py * Revert "make mypy happy" This reverts commit 926d0dd. * refactoring and fixes * rudimentary search field integration * fix: check for None properly * fix: Entries without Tags are now searchable * make mypy happy * Revert "fix: Entries without Tags are now searchable" This reverts commit 19b40af. * fix: changed joins to outerjoins and added missing outerjoin * use query lang instead of tag_id FIlterState * add todos * fix: remove uncecessary line that broke search when searching for exact name * fix tag search * refactoring * fix: path now uses GLOB operator for proper GLOBs * refactoring: remove FilterState.id and implement Library.get_entry_full as replacement * fix: use default value notation instead of if None statement in __post_init__ * remove obsolete Search Mode UI and related code * ruff fixes * remove obsolete tests * fix: item_thumb didn't query entries correctly * fix: search_library now correctly returns the number of *unique* entries * make mypy happy * implement NOT * remove obsolete filename search * remove summary as it is not applicable anymore * finish refactoring of FilterState * implement special:untagged * fix: make mypy happy * Revert changes to search_tags in favor of changes from TagStudioDev#604 * fix: also port test changes * fix: remove unneccessary import * fix: remove unused dataclass * fix: AND now works correctly with tags * simplify structure of parsed AST * add performance logging * perf: Improve performance of search by reducing number of required joins from 4 to 1 * perf: double NOT is now optimized out of the AST * fix: bug where pages would show less than the configured number of entries * Revert "add performance logging" This reverts commit c3c7d75. * fix: tag_id search was broken * somewhat adapt the existing autocompletion to this PR * perf: Use Relational Division Queries to improve Query Execution Time * fix: raise Exception so as to not fail silently * fix: Parser bug broke parentheses * little bit of clean up * remove unnecessary comment * add library for testing search * feat: add basic tests * fix: and queries containing just one tag were broken * chore: remove debug code * feat: more tests * refactor: more consistent name for variable Co-authored-by: Travis Abendshien <46939827+CyanVoxel@users.noreply.github.com> * fix: ruff check complaint over double import --------- Co-authored-by: Travis Abendshien <46939827+CyanVoxel@users.noreply.github.com>
This PR adds the ability for users to migrate TagStudio v9.1-v9.4 JSON libraries to the v9.5 SQLite format.
Attempting to open a library folder with no
ts_library.sqlite
file and ats_library.json
file present will initiate the modal prompt for the user to start the migration process. Closing this prompt early stops the migration process and will not open the library. The migration is not completed until the user clicks the "Finish Migration" button, and thets_library.json
is never deleted as a backup precaution.Migration Checklist
Field Order(The SQL DB doesn't seem built for this but it's fine for now)