Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remediate fileset duplication in items list #371

Merged
merged 5 commits into from
Sep 25, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ def index_solr_doc(solr_doc)
solr_doc['is_child_bsi'] ||= object.try(:is_child)
solr_doc['split_from_pdf_id_ssi'] ||= object.try(:split_from_pdf_id)
solr_doc['is_page_of_ssim'] = iiif_print_lineage_service.ancestor_ids_for(object)
solr_doc['member_ids_ssim'] = iiif_print_lineage_service.descendent_member_ids_for(object)
solr_doc['descendent_member_ids_ssim'] = iiif_print_lineage_service.descendent_member_ids_for(object)
end
end
end
Expand Down
6 changes: 2 additions & 4 deletions app/models/concerns/iiif_print/solr_document_decorator.rb
Original file line number Diff line number Diff line change
Expand Up @@ -15,11 +15,9 @@ def respond_to_missing?(method_name, include_private = false)
iiif_print_solr_field_names.include?(method_name.to_s) || super
end

# @see https://github.com/samvera/hyrax/commit/7108409c619cd2ba4ae8c836b9f3b429a7e9837b
# consists of member_ids_ssim + its descendents' member_ids (recursively)
def file_set_ids
# Yes, this looks a little odd. But the truth is the prior key (e.g. `file_set_ids_ssim`) was
# an alias of `member_ids_ssim`.
self['member_ids_ssim']
self['descendent_member_ids_ssim'] || self['member_ids_ssim']
end

def any_highlighting?
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14,14 +14,21 @@ def build
presenter_class.for(solr_doc)
elsif Hyrax.config.curation_concerns.include?(solr_doc.hydra_model)
# look up file set ids and loop through those
file_set_docs = load_file_set_docs(solr_doc.try(:member_ids) || solr_doc.try(:[], 'member_ids_ssim'))
file_set_docs = load_file_set_docs(load_file_set_ids(solr_doc))
file_set_docs.map { |doc| presenter_class.for(doc) } if file_set_docs.length
end
end.flatten.compact
end

private

def load_file_set_ids(solr_doc)
solr_doc.try(:descendent_member_ids_ssim) ||
solr_doc.try(:[], 'descendent_member_ids_ssim') ||
solr_doc.try(:member_ids_ssim) ||
solr_doc.try(:[], 'member_ids_ssim')
end

# still create the manifest if the parent work has images attached but the child works do not
def load_file_set_docs(file_set_ids)
return [] if file_set_ids.nil?
Expand Down
9 changes: 8 additions & 1 deletion app/presenters/iiif_print/work_show_presenter_decorator.rb
Original file line number Diff line number Diff line change
Expand Up @@ -24,13 +24,20 @@ def iiif_viewer?
#
# @todo Review if this is necessary for Hyrax 5.
def members_include_viewable_image?
all_member_ids = solr_document.try(:member_ids) || solr_document.try(:[], 'member_ids_ssim')
all_member_ids = load_file_set_ids(solr_document)
Array.wrap(all_member_ids).each do |id|
return true if file_type_and_permissions_valid?(member_presenters_for([id]).first)
end
false
end

def load_file_set_ids(solr_doc)
Copy link
Contributor

@ShanaLMoore ShanaLMoore Sep 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not blocking, but we could extract this method into a module since I think it's the same one declared in app/presenters/iiif_print/iiif_manifest_presenter_factory_decorator.rb#L25

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed... I changed an existing method which was also duplicated. To save time waiting for another build now, maybe we can do it when we do the other ticket about going directly to the first page found?

solr_doc.try(:descendent_member_ids_ssim) ||
solr_doc.try(:[], 'descendent_member_ids_ssim') ||
solr_doc.try(:member_ids_ssim) ||
solr_doc.try(:[], 'member_ids_ssim')
end

# This method allows for overriding to add additional file types to mix in with IiifAv
# TODO: add configuration setting for file types to loop through so an override is unneeded.
def file_type_and_permissions_valid?(presenter)
Expand Down
4 changes: 2 additions & 2 deletions app/services/iiif_print/manifest_builder_service_decorator.rb
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ def apply_metadata_to_canvas(canvas:, presenter:, solr_doc_hits:)
# uses the 'id' property for v3 manifest and `@id' for v2, which is a URL that contains the FileSet id
file_set_id = (canvas['id'] || canvas['@id']).split('/').last
# finds the image that the FileSet is attached to and creates metadata on that canvas
image = solr_doc_hits.find { |hit| hit[:member_ids_ssim]&.include?(file_set_id) }
image = solr_doc_hits.find { |hit| (hit[:descendent_member_ids_ssim] || hit[:member_ids_ssim])&.include?(file_set_id) }
return unless image
# prevents duplicating the child and parent metadata
return if image.id == presenter.id
Expand Down Expand Up @@ -124,7 +124,7 @@ def sort_by_label_v2(hash)
end

def member_ids_for(presenter)
member_ids = presenter.try(:ordered_ids) || presenter.try(:member_ids)
member_ids = presenter.object.solr_document['descendent_member_ids_ssim'] || presenter.try(:ordered_ids) || presenter.try(:member_ids)
member_ids.nil? ? [] : member_ids
end

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@
module IiifPrint
module BlacklightIiifSearch
module AnnotationDecorator
INVALID_MATCH_TEXT = "#xywh=INVALID,INVALID,INVALID,INVALID".freeze
INVALID_MATCH_TEXT = "#xywh=0,0,0,0".freeze
##
# Create a URL for the annotation
# use a Hyrax-y URL syntax:
Expand Down
17 changes: 4 additions & 13 deletions lib/iiif_print/lineage_service.rb
Original file line number Diff line number Diff line change
Expand Up @@ -47,26 +47,17 @@ def self.ancestry_identifier_for(work)
##
# @param object [#ordered_works, #file_sets, #member_ids]
# @return [Array<String>] the ids of associated file sets and child works
#
# @see
# https://github.com/samvera/hyrax/blob/2b807fe101176d594129ef8a8fe466d3d03a372b/app/indexers/hyrax/work_indexer.rb#L15-L18
# for "clarification" of the comingling of file_set_ids and member_ids
def self.descendent_member_ids_for(object)
return unless object.respond_to?(:member_ids)

# enables us to return parents when searching for child OCR
#
# https://github.com/samvera/hydra-works/blob/c9b9dd0cf11de671920ba0a7161db68ccf9b7f6d/lib/hydra/works/models/concerns/work_behavior.rb#L90-L92
#
# The Hydara::Works implementation of file_set_ids is "members.select(&:file_set?).map(&:id)";
# so no sense doing `object.file_set_ids + object.member_ids`
file_set_ids = object.member_ids
child_ids = object.member_ids
# add in child works & their child works & filesets, recursively
IiifPrint.object_ordered_works(object)&.each do |child|
file_set_ids += Array.wrap(descendent_member_ids_for(child))
child_ids += Array.wrap(descendent_member_ids_for(child))
end
# We must convert these to strings as Valkyrie's identifiers will be cast to hashes when we
# attempt to write the SolrDocument.
file_set_ids.flatten.uniq.compact.map(&:to_s)
child_ids.flatten.compact.map(&:to_s).uniq
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.uniq was moved, because this can contain a combination of valkyrie ids and string ids before the mapping to_s and some duplications could have been missed.

end
class << self
alias descendent_file_set_ids_for descendent_member_ids_for
Expand Down
4 changes: 1 addition & 3 deletions lib/iiif_print/persistence_layer/valkyrie_adapter.rb
Original file line number Diff line number Diff line change
Expand Up @@ -12,9 +12,7 @@ def self.object_in_works(object)
# @param object [Valkyrie::Resource]
# @return [Array<Valkyrie::Resource>]
def self.object_ordered_works(object)
child_file_sets = Hyrax.custom_queries.find_child_file_sets(resource: object).to_a
child_works = Hyrax.custom_queries.find_child_works(resource: object).to_a
child_works + child_file_sets
Hyrax.custom_queries.find_child_works(resource: object).to_a
end

##
Expand Down
1 change: 1 addition & 0 deletions spec/factories/newspaper_page_solr_document.rb
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@
has_model_ssim: ['NewspaperPage'],
issue_id_ssi: 'abc123',
member_ids_ssim: [file_set.id],
descendent_member_ids_ssim: [file_set.id],
thumbnail_path_ss: '/downloads/123456?file=thumbnail')
end
end
Expand Down
2 changes: 1 addition & 1 deletion spec/models/solr_document_spec.rb
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
require 'spec_helper'
RSpec.describe SolrDocument do
let(:solr_doc) { described_class.new(id: 'foo', member_ids_ssim: ['bar']) }
let(:solr_doc) { described_class.new(id: 'foo', descendent_member_ids_ssim: ['bar']) }

describe 'file_set_ids' do
it 'responds to #file_set_ids' do
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
{ "id" => "child_work123",
"title_tesim" => ["My Child Image"],
"has_model_ssim" => ["Image"],
"member_ids_ssim" => ["child_image_fs123"] }
"descendent_member_ids_ssim" => ["child_image_fs123"] }
end
let(:child_fs_attributes) do
{ "id" => "child_fs123",
Expand Down
Loading