hOCR Files not being Indexed #15

Open
HyphenHook opened this issue Feb 20, 2024 · 0 comments

Setup

  1. Got the latest isle-dc stack up
  2. Installed and enabled the module
  3. Ran drush migrate:import islandora_hocr_media_uses
  4. Added the islandora_hocr_field:content property to be indexed in Solr via the Search API and set its Type to Fulltext ("islandora_hocr")
  5. Set up a Repository item with a .tif file and a .hocr file in media
     • The .hocr file has the hOCR media use on it
  6. Installed the Solr OCR Highlighting Plugin following the documentation (in case I messed up the config, here are my Solr configs)
     • The schema for the installation is here
     • I've also included the plugin's directive in solrconfig.xml and added the needed lines of config in solrconfig_extra.xml here
  7. Set the correct path for the SOLR_HOCR_PLUGIN_PATH environment variable
  8. Restarted the Solr container and indexed the nodes in Drupal

Problem

I cannot get the hOCR indexed into Solr even after all of the setup steps above. I've traced the code and found that the processor is doing its job: it reads the content out of the file and adds the value for Solr. However, when I run a query in the Solr web interface, the field does not appear in the results. The field does appear, containing the raw file content, if I change the islandora_hocr_field:content property type to plain Fulltext instead of Fulltext ("islandora_hocr"). The OCR highlighting also doesn't show anything.

Am I missing something in the setup steps that is preventing the module from working? Some guidance would be appreciated! Thanks!
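
For anyone trying to reproduce the check, here is a minimal sketch of how the Solr query could be scripted rather than run by hand in the web interface. The base URL, core name, and field name below are placeholders and not taken from this setup; the actual Solr field name is whatever Search API generated for islandora_hocr_field:content. The hl.ocr.fl parameter is the one the OCR highlighting plugin reads to decide which fields to highlight. Note that Solr only returns stored fields in the docs list, so a field that is indexed but not stored would be searchable yet absent from the results.

```python
from urllib.parse import urlencode

# Hypothetical values: adjust to the actual Solr host, core, and field name.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "default"
HOCR_FIELD = "ocr_text"  # placeholder for the Search API generated field name


def build_select_url(query, hocr_field=HOCR_FIELD, base=SOLR_BASE, core=CORE):
    """Build a /select URL that requests OCR highlighting on one field."""
    params = {
        "q": query,
        "fl": "*",                # ask for every stored field
        "hl": "true",             # turn highlighting on
        "hl.ocr.fl": hocr_field,  # field(s) the OCR highlighter should process
        "wt": "json",
    }
    return f"{base}/{core}/select?{urlencode(params)}"


def docs_with_field(response, field):
    """Return the docs from a parsed Solr JSON response that contain `field`."""
    docs = response.get("response", {}).get("docs", [])
    return [doc for doc in docs if field in doc]
```

Fetching the URL from build_select_url and passing the parsed JSON to docs_with_field shows whether any returned document carries the field at all, which separates "the value never reached Solr" from "the value is there but highlighting is not configured".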
