-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CSVImporter file upload bug fix #1541
Conversation
Working solution Refactor Tests
lib/importer/attach_files_to_work.rb
Outdated
@@ -0,0 +1,14 @@ | |||
# Called in 'object_factory' | |||
module Importer | |||
class AttachFilesToWork < AttachFilesToWorkJob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than doing this, would it be possible to coerce the input data to an Array of files (with only one element)? That means that any upstream changes won't have to be ported down to Hyku?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see how overriding upstream changes could become undesirable - I'll work on this thanks!
def update | ||
raise "Object doesn't exist" unless object | ||
run_callbacks(:save) do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are all the callbacks removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't understand their use! Everything seems to work without. With them the method's Assignment Branch Condition size throws a rubocop error - would you prefer me to change the rubocop threshold for this too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, adjust the thresholds as needed. If the callbacks aren't used, then make a separate PR to "remove unused callbacks" It's easer to understand PRs if they constrain themselves to a single change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks! I just restored the callbacks - that didn't cause a rubocop failure anymore. I think I was getting that on my initial solution before extracting logic in new methods.
msg = "Updated #{klass.model_name.human} #{obj.id}" | ||
Rails.logger.info("#{msg} (#{Array(attributes[system_identifier_field]).first})") | ||
end | ||
## below methods are commented out to pass rubocop inspection ("Class has too many line"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than commenting out the logging feature, how about changing the threshold in .rubocop_todo.yml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't occur to me to change the enforced practice - will do thanks!
Restore log methods, increase rubocop ClassLength
23a7f7d
to
e6a358a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unclear why you are moving functionality from the Actors into the ObjectFactory. The ObjectFactory should be using the actors and any changes necessary in the actors should be made there.
def update | ||
raise "Object doesn't exist" unless object | ||
@attr = update_attributes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why have you chosen to use an instance variable here? Why not a local?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is called in attach_file_to_work
(line 137).
end | ||
destroy_existing_file_set if object.file_sets.present? | ||
attach_file_to_work |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
attaching a file to a work is the responsibility of the work actor. Why are you adding it here in duplicate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The work_actor
doesn't do this - the feature is broken which is the reason for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is done in the actor stack here: https://github.com/samvera/hyrax/blob/f43e7048bf3961f48812809897b36d84b6577f68/app/actors/hyrax/actors/create_with_files_actor.rb#L10 now. So you may need to upload the file and then set the uploaded file id in the :uploaded_files
parameter before calling the actor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Will work on this.
end | ||
destroy_existing_file_set if object.file_sets.present? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you want to destroy an existing file set? Why not update it? If you replace one fileset with another, people who have cited an existing URI are going to have their references broken. What's the benefit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question; I am not sure that is the best way forward. This is how I envision the use of the csv importer:
- If I want to update a work with an added file, I would add columns to my csv. I could keep the 1st file or reorder file columns to pick a different one (the thumbnail will be rendered from the 1st).
- If I delete a file from the csv I would expect not to have it appear in the work anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that's universal though. What if I only want to update titles. Reuploading all the files is going to be very slow. And I don't want to have to do that just to update some metadata.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi! Apologies for a late reply. Good point.
The method actually destroys the file set only if the file itself needs to be changed - the check is line 142 f.destroy if attributes[:file] != f.title
. I will rename this method for clarity.
But the file is re-uploaded; which is inefficient for sure. I'll make changes.
I see; since the Actors are in Hyrax and we basically only point at them in Hyku the ObjectFactory seemed to be the right place to add code to fix this. I'll try moving the logic in an Actor instead. |
The code is currently broken - I am stuck and feedback is welcome. I tried to rely on an existing actor but for now I get an ActiveFedora::AssociationTypeMismatch Error: |
Hi @cldambrosio ... I had that issue with attributes[:files] so I switched to using attributes[:uploaded_files] and attributes[:remote_files], cf. https://github.com/research-technologies/leaf_addons/blob/master/lib/generators/leaf_addons/templates/lib/importer/factory/base_factory.rb#L15 I couldn't figure out a way to get [:files] working at all. |
Thank you @geekscruff for coming to the rescue :) |
|
Makes sense 😅 I just had not come across it in Hyku so far so I was not sure I'd need to include this. |
Remove method to replace existing file_set Restore code where change is no longer necessary
Hi @jcoyne, do you think this is ready to merge? |
🎉 🙏 😄 |
Per https://github.com/samvera/hyku/releases/tag/v5.0.0, these importers have been deprecated. The "Commit History of Directories" shows that there as been little activity, the recent activity being appeasing Rubocop and working through upgrades. Further, these files are conceptually subsumed by Bulkrax. <details><summary>Commit History of Directories</summary> ```bash ❯ git slog lib/importer * 6a48652 — Hyrax 5 upgrade rubocop fixes & get specs running in CI (#2065) LaRita Robinson, (2023-12-16) * 59626ef — 🧹 Rubocop'd lengths and other low hanging fruit Kirk Wang, (2023-12-13) * fecb4e9 — 🧹 Ran `bundle exec rubocop -a` Kirk Wang, (2023-12-13) * 0a6d047 — Upgrade to Hyrax 3.4.0 (#1811) Rob Kaufman, (2022-07-07) * edafd85 — Rubocop fixes (#1640) Rob Kaufman, (2020-08-04) * 9326387 — update to hyrax 2.2 (#1543) Rob Kaufman, (2018-10-04) * 7526936 — Merge pull request #1541 from ubiquitypress/hyku-csv-importer Rob Kaufman, (2018-08-21) |\ | * 904a369 — Remove :files from attributes Remove method to replace existing file_set Claire, (2018-07-20) (tag: v1.0.0) | * 1cf7f5c — WIP- set uploaded_files param to use existing Actor Claire, (2018-07-19) | * 9650e54 — Restore callbacks Claire, (2018-07-12) | * e6a358a — Coerce UploadedFile to an array and use upstream Job Claire, (2018-07-11) | * 93cb695 — Import files from CSV Claire, (2018-07-06) * | bc51174 — Hyrax 2.1 upgrade (#1529) Rob Kaufman, (2018-07-26) |/ * 116d4b9 — Upgrade Hyrax to 2.0.2 and upgrade gems with security vulnerabilities (#1519) Julie Allinson, (2018-03-22) * d85952b — Changes necessary to get simple CSV import to run Michael J. Giarlo, (2017-08-22) * a4024f4 — Make ObjectFactory rubocop-clean Joe Atzberger, (2017-08-22) * 7618c3a — Refactor csv_importer and improve test accuracy Joe Atzberger, (2017-08-22) * 21eb1e6 — Add a note about a broken method Justin Coyne, (2017-06-15) * 86eedaa — Update Hyrax and other dependencies Justin Coyne, (2017-04-13) * bfecea2 — Refactor importers to use Hyrax's actor stack Justin Coyne, (2017-02-23) * 554b8a8 — Pass ability to the actor Justin Coyne, (2017-02-17) * a8b25cd — Point at the Hyrax repo Justin Coyne, (2016-12-03) * 5d005c9 — [WIP] Hyrax! Justin Coyne, (2016-12-02) * fe229ca — Rename Lerna to Hyku. Fixes #461 Justin Coyne, (2016-11-07) * 1f5fe23 — Update dependencies, Sufia to 7.0.0.rc1 Justin Coyne, (2016-07-12) * d2c8fb1 — Rename hybox to lerna Michael J. Giarlo, (2016-06-08) * 5de0440 — Add a CSV importer Justin Coyne, (2016-05-23) * 5dcbb9b — Use User.batch_user instead of the deprecated batchuser Justin Coyne, (2016-05-20) * d2a6c3c — Update to latest rubocop and rubocop-rspec Chris Beer, (2016-05-20) * 642df79 — MODS/PURL importer Justin Coyne, (2016-05-19) ``` ```bash ❯ git slog spec/lib/importer/ * 1c04b9e — 🧹 Hyrax 5 upgrade additional specs (#2067) LaRita Robinson, (2023-12-18) * 6a48652 — Hyrax 5 upgrade rubocop fixes & get specs running in CI (#2065) LaRita Robinson, (2023-12-16) * fecb4e9 — 🧹 Ran `bundle exec rubocop -a` Kirk Wang, (2023-12-13) * 0a6d047 — Upgrade to Hyrax 3.4.0 (#1811) Rob Kaufman, (2022-07-07) * edafd85 — Rubocop fixes (#1640) Rob Kaufman, (2020-08-04) * f9a3a64 — Use paths based on Rails.root instead of pure relative paths Chris Colvard, (2020-07-08) * 1b3c5ed — rubocop fixes nicholas, (2018-10-12) * aeb9904 — update FactoryGirl to FactoryBot nicholas, (2018-10-09) * 93cb695 — Import files from CSV Claire, (2018-07-06) * 116d4b9 — Upgrade Hyrax to 2.0.2 and upgrade gems with security vulnerabilities (#1519) Julie Allinson, (2018-03-22) * 7618c3a — Refactor csv_importer and improve test accuracy Joe Atzberger, (2017-08-22) * 86eedaa — Update Hyrax and other dependencies Justin Coyne, (2017-04-13) * bfecea2 — Refactor importers to use Hyrax's actor stack Justin Coyne, (2017-02-23) * 8ecb861 — Require rails_helper in one place instead of 100 Joe Atzberger, (2017-02-14) * 7498ecb — Update to latest hyrax Justin Coyne, (2016-12-20) * a8b25cd — Point at the Hyrax repo Justin Coyne, (2016-12-03) * 5d005c9 — [WIP] Hyrax! Justin Coyne, (2016-12-02) * 5de0440 — Add a CSV importer Justin Coyne, (2016-05-23) * d2a6c3c — Update to latest rubocop and rubocop-rspec Chris Beer, (2016-05-20) * 642df79 — MODS/PURL importer Justin Coyne, (2016-05-19) ``` </details>
Enables uploading files via the
import_from_csv
script.refs #1238 #1164
I included assets which I used to see the CSV importer at work, with the command:
ruby bin/import_from_csv tenant ./lib/assets/csv_test.csv ./lib/assets/batch-upload-test/
(where
tenant
needs to be replaced by your actual one - for instancefoo.localhost
)If you try it you may need to adapt the command to your set up. These assets could be removed if preferred.
I commented out 4 breaking tests from the codebase because they don't appear useful. The functionality was broken while the tests passed previously.
Thanks @geekscruff for sharing your code in leaf_addons - I used it :)
@samvera/hyrax-code-reviewers