Installation #1332

Merged
merged 12 commits into from
Jul 15, 2020

@qadan commented Nov 13, 2019

GitHub Issue: #1040

What does this Pull Request do?

Adds an installation component overview, adds complete documentation for a base manual installation, and revises/updates some of the documentation for the Ansible playbook installation.

What's new?

  • Component overview section
  • Manual installation guide

How should this be tested?

Build and deploy as usual using mkdocs to follow along locally.

For the Ansible playbook section (now titled "Automatic Provisioning"), the instructions should still get you a functioning site from the Ansible playbook.

For the manual installation section:

  • Spin up an Ubuntu 18.04 box using VirtualBox
  • Give it OpenSSH during the install process for ease of use and fun-ness
  • Log in via a terminal emulator
  • Take 'er from the top and start installing/configuring stuff
  • See if you get an Islandora 8 site out the other end that is functionally similar or the same as the one the Ansible playbook gets you (albeit with some missing functionality like Matomo)

Additional Notes:

In general, it shouldn't contain lies, inaccuracies, missing steps, or unnecessary steps.

Of course, it also needs to conform to the documentation style guide.

@ianysong

Thank you, Daniel. The instructions for installing all of the components, with their details and examples, are very clear to me. One suggestion, particularly for readers who are new to Islandora: some technical terms may need more explanation or references, or could be added to the “Islandora Glossary” in coordination with the documenter who wrote the glossary.
Also, two sections may be missing: installing “Fedora Syn and Blazegraph” (487) and “Karaf and Apache” (346).
I found two typos:
“ro”: line 3 in Overview
“‘chmod’ ded”: line 28 in Introduction
Excellent manual writing!
Thanks from Ian

@qadan
Author

qadan commented Dec 6, 2019

Thank you! I made some changes based on the comments and fixed a merge conflict.


**Drush** and **Drupal** are installed simultaneously using [drupal-project](https://github.com/drupal-composer/drupal-project). Drupal will serve up webpages and manage Islandora content, and Drush will help us get some things done from the command-line.

## The Web Application Server - Tomcat, and Cantaloupe

Contributor

Extraneous comma after "Tomcat".

- Files or directories are not owned by the user who needs access to them, and can therefore not be written to. Check the ownership of files using `ls -la`, and ensure their ownership using `chown USER` for files, and `chown -R USER` for directories
- Replacement variables were left in place in files specified by the guide. Ensure any replacement variables such as server addresses and passwords are swapped out when writing files to the server
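
As a rough illustration of the ownership check in the first bullet, the sketch below inspects and then reassigns ownership inside a throwaway directory. The scratch directory and the `settings.php` file name are stand-ins rather than paths from the guide; on a real install you would run `chown` with `sudo` against the affected path, using the user the guide names for that component.

```bash
# Scratch directory standing in for a directory created by the guide (assumption).
workdir="$(mktemp -d)"
touch "${workdir}/settings.php"   # hypothetical file name

# Inspect ownership, as suggested above.
ls -la "${workdir}"

# Reassign ownership recursively. The target here is the current user so the
# sketch runs without sudo; on a server it would be e.g. www-data or tomcat.
target_user="$(id -un)"
chown -R "${target_user}" "${workdir}"

# Confirm the result (GNU stat, as shipped with Ubuntu).
new_owner="$(stat -c '%U' "${workdir}/settings.php")"
echo "owner is now: ${new_owner}"
```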

For any other issues, don't hesitate to email the [mailing list](islandora@googlegroups.com) to ask for help. If you think that a part of the installation documentation is incorrect or could be improved, please create an issue in the [http://github.com/Islandora/documentation/issues](documentation issues queue) and give it a `documentation` tag. Bear in mind that this guide is built for Ubuntu 18.04 and attempts to give generalized instructions; you will likely naturally encounter situations where your own environment needs to differ from the guide.

Contributor

The "documentation issues queue" link is reversed. It should be [documentation issues queue](http://github.com/Islandora/documentation/issues)

sudo chown www-data:www-data /opt/drupal
sudo chmod 775 /opt/drupal
# Clone drupal-project and build it in our newly-created folder.
git clone https://github.com/drupal-composer/drupal-project.git

Contributor

Doing this in my home directory caused a number of Composer warnings about "Cannot create cache directory" when running composer as www-data.

Solr includes an installer that does most of the heavy lifting of ensuring we have a Solr user, a location where Solr lives, and configurations in place to ensure it’s running on boot.

```bash
sudo UNTARRED_SOLR_FOLDER/bin/install_solr_service.sh solr.tgz
```

Contributor

'solr.tgz' should be 'SOLR_TARBALL' as above.


![Specifying the Solr Server](../../assets/specifying_the_solr_server.png)

Click **Save** to add your index and kick off indexing of existing items.

Contributor

Something went wrong with the configuration steps. Saving my configuration did make the original "incompatible Solr schema" message disappear; but saving the index configuration resulted in the following "Server index status" message:

Error while checking server index status: An error occurred while trying to search with Solr: { "error":{ "metadata":[ "error-class","org.apache.solr.core.SolrCoreInitializationException", "root-error-class","java.lang.ClassNotFoundException"], "msg":"SolrCore 'islandora8' is not available due to init failure: Could not load conf for core islandora8: Can't load schema /var/solr/data/islandora8/conf/schema.xml: Plugin init failure for [schema.xml] fieldType \"collated_en\": Error loading class 'solr.ICUCollationField'",

(trimmed for brevity)

Contributor

I knew this seemed familiar. The default value for solr.install.dir in the config files provided by solr-gsc is ../../.. which does not resolve to the solr install directory. You need to manually edit /var/solr/data/SOLR_CORE/conf/solrcore.properties so that solr.install.dir=/opt/solr.

See this comment on the search_api_solr issues queue.
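
A minimal sketch of the manual edit described above, run against a scratch copy so it is safe to try anywhere. The scratch file stands in for `/var/solr/data/SOLR_CORE/conf/solrcore.properties`; on a real server the same `sed` expression would be run with `sudo` against that file, and the core would likely need to be reloaded (or Solr restarted) afterwards.

```bash
# Scratch copy standing in for /var/solr/data/SOLR_CORE/conf/solrcore.properties.
props="$(mktemp)"
printf '%s\n' 'solr.install.dir=../../..' > "${props}"

# Rewrite the solr.install.dir line in place (GNU sed).
sed -i 's|^solr\.install\.dir=.*$|solr.install.dir=/opt/solr|' "${props}"

# Show the resulting setting.
install_dir_line="$(grep '^solr\.install\.dir=' "${props}")"
echo "${install_dir_line}"
```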

Contributor

AH! I see what happened. You have the solr.install.dir field configured correctly in the screen-shot for setting up the server config in Drupal, but I missed it. Perhaps some narrative text to point out the key fields?


```bash
cd /opt/drupal
drush -y en rdf responsive_image devel syslog serialization basic_auth rest restui search_api_solr search_api_solr_defaults facets content_browser pdf admin_toolbar islandora_defaults controlled_access_terms_defaults islandora_breadcrumbs islandora_iiif islandora_oaipmh
```

Contributor

You may hit the following error when running this command if the islandora_core_feature module hasn't been enabled yet:

Error: Call to a member function getSettings() on null in /opt/drupal/web/modules/contrib/islandora/modules/islandora_text_extraction/islandora_text_extraction.install on line 16 #0 [internal function]: islandora_text_extraction_install()
#1 /opt/drupal/web/core/lib/Drupal/Core/Extension/ModuleHandler.php(392): call_user_func_array('islandora_text_...', Array)
#2 /opt/drupal/web/core/lib/Drupal/Core/Extension/ModuleInstaller.php(323): Drupal\Core\Extension\ModuleHandler->invoke('islandora_text_...', 'install')

The enable command doesn't seem to enable modules in a specific order, so it might be best to add an instruction to enable islandora_core_feature first, and then the other modules.

Contributor

@seth-shaw-unlv Dec 10, 2019

Actually, enabling islandora_core_feature beforehand doesn't resolve this; I need to run it twice... maybe this is a Drupal 8.8 thing...

Contributor

I also had to run this twice, although I think the offending file itself can just be deleted: https://github.com/Islandora/islandora/blob/8.x-1.x/modules/islandora_text_extraction/islandora_text_extraction.install

.txt is added by default. Feels crufty.
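
Until the install hook is fixed, the "run it twice" workaround reported above could be wrapped in a small helper. This is only a sketch: `run_twice` and the `flaky` stand-in are hypothetical names introduced here, and the stand-in simply fails once and then succeeds so the helper can be exercised without a Drupal site.

```bash
# Run a command; if it fails, run it exactly once more.
run_twice() {
  "$@" || "$@"
}

# Stand-in for the failing command: fails on the first call, succeeds on the second.
marker="$(mktemp -u)"
flaky() {
  if [ -e "${marker}" ]; then
    echo "second attempt succeeded"
  else
    touch "${marker}"
    echo "first attempt failed" >&2
    return 1
  fi
}

run_twice flaky
status=$?
echo "exit status: ${status}"
```

On a real site the call would be `run_twice drush -y en ...` with the module list from the guide. This hides the error rather than fixing it, so it is a stopgap at best.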

drush -y config-set system.theme default carapace
# After all of this, run updates and rebuild the cache.
cd /opt/drupal
sudo -u www-data composer update

Contributor

I imagine you would have done the composer update before you did all the composer requires above...


```bash
cd /opt/drupal
drush -y -l localhost --userid=1 mim --group=islandora
```

Contributor

Make sure the user you run this as has read permissions on the private key, or else it won't work. (I had to run it as `sudo -u www-data drush -y -l localhost:8000 --userid=1 mim --group=islandora`.)

Contributor

I ran mine on port 80 so I didn't need the :8000 in the url param, but otherwise, yes, you need to set --userid=1 and run it as the apache user.

Contributor

Clarification: the userid flag is to run it as the Drupal Admin user (for Drupal permissions). The sudo -u www-data is what runs it as the apache user (for filesystem permissions).

Contributor

@seth-shaw-unlv left a comment

There are a few bits here and there as noted in comments.

There are a few bits here and there where it assumes you are using port 8000 and others port 80 for apache; but someone watching their step should catch them.

In the end the Drupal configuration isn't working quite right with the instructions. (E.g. the various media fields didn't get imported for some reason and I had to do a feature import through the web UI.) I'll run through that section again; although not today.

Note: I used Drupal 8.8.0 for this review run-through.


## The Front-Facing CDM - Composer, Drush, and Drupal

Composer will be used to install both Drupal and Drush simultaneously using the [drupal-project](https://github.com/drupal-composer/drupal-project) repository.

Contributor

Any reason why we aren't using the Islandora Drupal Project repo?

sudo wget -O tomcat.tar.gz TOMCAT_TARBALL_LINK
sudo tar -zxvf tomcat.tar.gz
sudo mv /opt/TOMCAT_DIRECTORY/* /opt/tomcat
sudo chown -R tomcat:tomcat /opt/TOMCAT_DIRECTORY

Contributor

This should be sudo chown -R tomcat:tomcat /opt/tomcat since we already moved the contents of TOMCAT_DIRECTORY there.

Contributor

@dannylamb Dec 12, 2019

Just ran through this and /opt/tomcat is already owned by the tomcat user before this. Maybe make this sudo rmdir /opt/TOMCAT_DIRECTORY instead?

Contributor

@dannylamb Dec 12, 2019

🤦‍♂️ Nope. You need to chown all the new files in /opt/tomcat and get rid of the empty apache-tomcat-X-X-X.tar.gz type directory.
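
Putting the three comments above together, the sequence appears to be: move, `chown` the destination, then remove the empty source directory. The sketch below rehearses that order in a scratch directory standing in for `/opt` (an assumption made so it runs without `sudo`); `TOMCAT_DIRECTORY` is the guide's own placeholder and `catalina.sh` is just a sample file.

```bash
# Scratch layout standing in for /opt (assumption, so the sketch needs no sudo).
opt="$(mktemp -d)"
mkdir -p "${opt}/TOMCAT_DIRECTORY/bin" "${opt}/tomcat"
touch "${opt}/TOMCAT_DIRECTORY/bin/catalina.sh"

# 1. Move the unpacked contents into place.
mv "${opt}"/TOMCAT_DIRECTORY/* "${opt}/tomcat"

# 2. Fix ownership of everything that now lives in the destination.
#    Real install: sudo chown -R tomcat:tomcat /opt/tomcat
chown -R "$(id -un)" "${opt}/tomcat"

# 3. Remove the now-empty source directory.
#    Real install: sudo rmdir /opt/TOMCAT_DIRECTORY
rmdir "${opt}/TOMCAT_DIRECTORY"

ls -la "${opt}/tomcat"
```

`rmdir` only succeeds on an empty directory, so it doubles as a check that the `mv` picked up everything (hidden files are not matched by `*`).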

```bash
export JAVA_OPTS="-Djava.awt.headless=true -server -Xmx1500m -Xms1000m"
```
- `PATH_TO_JAVA_HOME`: This will vary a bit depending on the environment, but will likely live somewhere in `/usr/lib/jvm` (e.g., `/usr/lib/jvm/java-8-openjdk-amd64` on a 64-bit x86 machine); again, in an Ubuntu environment you can check part of this using `update-alternatives --list java`, which will give you the path to the JRE binary within the Java home

Contributor

Just a suggestion, not a requirement: the Solr install text indicates you should configure your system's open file limit "to 65000 to avoid operational disruption". A short description of how to do that so it persists can be found on Medium.
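
One common way to make such a limit persist is a pair of `nofile` entries in `/etc/security/limits.conf`. The sketch below only reports the current limit and prints candidate entries; it does not modify the system. The `solr` user name assumes the Solr installer's default service user, and this is not necessarily the method from the article mentioned above.

```bash
# Report the open file limit in effect for the current shell.
current_limit="$(ulimit -n)"
echo "current open file limit: ${current_limit}"

# Candidate lines for /etc/security/limits.conf so the limit persists.
# The "solr" user name is an assumption (the installer's default).
limits_lines='solr soft nofile 65000
solr hard nofile 65000'
printf '%s\n' "${limits_lines}"
```

Note that `limits.conf` is applied through PAM at session start; a service launched directly by systemd takes its limit from `LimitNOFILE` in the unit file instead, so which mechanism applies depends on how Solr is started.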

For the Karaf features we’re going to install, we need a few different repositories to be added to the list:

```bash
/opt/karaf/bin/client repo-add mvn:org.apache.activemq/activemq-karaf/ACTIVEMQ_KARAF_VERSION/xml/features
```

Contributor

Any reason we shouldn't use LATEST for the versions instead of looking them up? I.e. mvn:org.apache.activemq/activemq-karaf/LATEST/xml/features is a working repository ID for me. Similarly with APACHE_CAMEL_VERSION?

Author

Thinking about this; we can probably put in LATEST as a default for these two and then hop in here and specify if anything ever blows up later

Contributor

FWIW: I've done multiple installs using LATEST and it seems to work fine.

<cm:default-properties>
<cm:property name="error.maxRedeliveries" value="5"/>
<cm:property name="in.stream" value="activemq:queue:islandora-connector-ocr"/>
<cm:property name="derivative.service.url" value="http://localhost:8000/hypercube"/>

Contributor

This assumes apache running on 8000 instead of 80, like the rest of the docs.

<cm:default-properties>
<cm:property name="error.maxRedeliveries" value="5"/>
<cm:property name="in.stream" value="activemq:queue:islandora-connector-houdini"/>
<cm:property name="derivative.service.url" value="http://localhost:8000/houdini/convert"/>

Contributor

and here

<cm:default-properties>
<cm:property name="error.maxRedeliveries" value="5"/>
<cm:property name="in.stream" value="activemq:queue:islandora-connector-homarus"/>
<cm:property name="derivative.service.url" value="http://localhost:8000/homarus/convert"/>

Contributor

and here

<cm:default-properties>
<cm:property name="error.maxRedeliveries" value="5"/>
<cm:property name="in.stream" value="activemq:queue:islandora-connector-fits"/>
<cm:property name="derivative.service.url" value="http://localhost:8000/crayfits"/>

Contributor

aaaaaand here.
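
For anyone serving Apache on port 80, a search-and-replace over the blueprint files is one way to catch all four occurrences flagged above. The sketch below works on a scratch directory with a single sample file; the directory and the file name `houdini.blueprint.xml` are illustrative assumptions, not the guide's actual paths.

```bash
# Scratch directory standing in for wherever the blueprint files were written
# (assumption); the file name is illustrative only.
blueprints="$(mktemp -d)"
cat > "${blueprints}/houdini.blueprint.xml" <<'EOF'
<cm:property name="derivative.service.url" value="http://localhost:8000/houdini/convert"/>
EOF

# List every hard-coded reference to port 8000.
grep -rn 'localhost:8000' "${blueprints}"

# If Apache listens on port 80, drop the explicit port (GNU sed, in place).
sed -i 's|http://localhost:8000/|http://localhost/|g' "${blueprints}"/*.xml

# Check whether any reference to port 8000 is left.
if grep -rq 'localhost:8000' "${blueprints}"; then
  port_8000_left="yes"
else
  port_8000_left="no"
fi
echo "references to port 8000 remaining: ${port_8000_left}"
```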

@seth-shaw-unlv
Contributor

Another note: to get data into Fedora, your user needs the fedoraAdmin role applied. E.g. none of the migrations on the final page will push those terms to Fedora unless we first edit the Admin user to have the fedoraAdmin role.


**Karaf**’s job is similar to Tomcat, except where Tomcat is a web-accessible endpoint for Java applets, Karaf is simply meant to be a container for system-level applets to communicate via its OSGI. Alpaca is one such applet; it will broker messages between Fedora and Drupal, and between Drupal and various derivative generation applications.

**Alpaca** contains Karaf services to manage moving information between Islandora, Fedora, and Blazegraphm as well as kicking off derivative services in Crayfish. These will be configured to broker between Drupal and Fedora using an ActiveMQ queue.

Member

Hi - there's an extra 'm' after Blazegraph


In order for content to be indexed back into Solr, a search index needs to be added to our server. Navigate to `/admin/config/search/search-api/add-index` and check off the things you'd like to be indexed.

![Adding a Search Index](../../assets/adding_a_search_index.png)

Contributor

At this point, the media module has not been enabled yet and doesn't show up in the list like it does in this screenshot.

```bash
/opt/karaf/bin/client repo-add mvn:org.apache.activemq/activemq-karaf/ACTIVEMQ_KARAF_VERSION/xml/features
/opt/karaf/bin/client repo-add mvn:org.apache.camel.karaf/apache-camel/APACHE_CAMEL_VERSION/xml/features
/opt/karaf/bin/client repo-add mvn:ca.islandora.alpaca/islandora-karaf/1.0.1/xml/features
```

Contributor

@dannylamb Dec 13, 2019

The 1.0.1 is left hardcoded here. It should be "pushed to a variable" like the camel and activemq versions above.

Contributor

This also highlights that we're using 1.0.2 on dev in islandora-playbook, which we're compiling instead of pulling from an artifact. We should just do another alpaca release so we can bump that up.

Contributor

FYI I just released Alpaca 1.0.2 and it's now available via mvn

Author

Gonna move this over to use LATEST as well, I think ... not sure I really remember a reason why this was hardcoded

Author

For the other ones, it looks like it's going to be safest to hard-code versions when repo-add-ing ... looks like at least at the moment for karaf, you end up with some unreleased SNAPSHOT builds that don't quite work correctly. Probably going to be safest all around to recommend that level of specificity for repos Islandora isn't in charge of
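
The policy this thread converges on (pin versions for repositories Islandora does not control, use `LATEST` for Islandora's own) could be sketched as below. `feature_repo` is a hypothetical helper introduced here, and the version defaults are the guide's placeholders rather than real version numbers. The script prints the `repo-add` commands instead of running them so they can be reviewed first.

```bash
# Build a Karaf feature-repository URL from Maven coordinates (hypothetical helper).
feature_repo() {
  printf 'mvn:%s/%s/%s/xml/features' "$1" "$2" "$3"
}

# Pinned versions for repositories Islandora does not control. The defaults
# are the guide's placeholders, not real version numbers.
ACTIVEMQ_KARAF_VERSION="${ACTIVEMQ_KARAF_VERSION:-ACTIVEMQ_KARAF_VERSION}"
APACHE_CAMEL_VERSION="${APACHE_CAMEL_VERSION:-APACHE_CAMEL_VERSION}"

activemq_repo="$(feature_repo org.apache.activemq activemq-karaf "${ACTIVEMQ_KARAF_VERSION}")"
camel_repo="$(feature_repo org.apache.camel.karaf apache-camel "${APACHE_CAMEL_VERSION}")"
# LATEST only for the Islandora-maintained repository.
alpaca_repo="$(feature_repo ca.islandora.alpaca islandora-karaf LATEST)"

# Print the commands rather than running them.
for repo in "${activemq_repo}" "${camel_repo}" "${alpaca_repo}"; do
  echo "/opt/karaf/bin/client repo-add ${repo}"
done
```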

/opt/karaf/bin/client repo-add mvn:org.apache.camel.karaf/apache-camel/APACHE_CAMEL_VERSION/xml/features
/opt/karaf/bin/client repo-add mvn:ca.islandora.alpaca/islandora-karaf/1.0.1/xml/features
# XXX: This shouldn't be strictly necessary, but appears to be a missing
# upstream dependency for some fcrepo features.

Contributor

Probably fcrepo-camel or something in fcrepo-camel-toolbox.

/opt/karaf/bin/client repo-add mvn:ca.islandora.alpaca/islandora-karaf/1.0.1/xml/features
# XXX: This shouldn't be strictly necessary, but appears to be a missing
# upstream dependency for some fcrepo features.
/opt/karaf/bin/client repo-add mvn:org.apache.jena/jena-osgi-features/3.1.1/xml/features

Contributor

Needs to be 3.13.1

```xml
</blueprint>
```

### Restarting Karaf

Contributor

This is unnecessary. Karaf hot-loads configuration and blueprints when they are deployed, and restarting it doesn't do much of anything at all. When you shut it down it just suspends and then picks back up where you left it.

Contributor

In fact, if anything, I'd suggest telling the reader to run `la | grep islandora` and confirm that all of the bundles are in Active status and not just Resolved or, even worse, the dreaded Grace Period. After having run through this in its entirety, at the end fcrepo indexing and derivatives failed to deploy properly.

Author

Today I learned!

> 3 | export JAVA_OPTS="-Djava.awt.headless=true -Dcantaloupe.config=/opt/cantaloupe_config/cantaloupe.properties -Dfcrepo.modeshape.configuration=file:///opt/fcrepo/config/repository.json -Dfcrepo.home=/opt/fcrepo/data -Dfcrepo.spring.configuration=file:///opt/fcrepo/config/fcrepo-config.xml -server -Xmx1500m -Xms1000m"

**After**:
> 3 | export JAVA_OPTS="-Djava.awt.headless=true -Dcantaloupe.config=/opt/cantaloupe_config/cantaloupe.properties -Dfcrepo.modeshape.configuration=file:///opt/fcrepo/config/repository.json -Dfcrepo.home=/opt/fcrepo/data -Dfcrepo.spring.configuration=file:///opt/fcrepo/config/fcrepo-config.xml -Dcom.bigdata.rdf.sail.webapp.ConfigParams.propertyFile=/opt/blazegraph/conf/RWStore.properties -server -Xmx1500m -Xms1000m"

Contributor

At least on CentOS 7, you also need to pass in an argument for the log4j.properties: -Dlog4j.configuration=file:/opt/blazegraph/conf/log4j.properties or else tomcat won't pick it up.
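
A sketch of how the extra flag might be appended, using an abbreviated `JAVA_OPTS` for readability. The real line would also carry the Cantaloupe, Fedora, and Blazegraph options from the diff above; whether the flag is needed outside CentOS 7 is not established in this thread.

```bash
# Abbreviated JAVA_OPTS for illustration only; the real line also carries the
# Cantaloupe, Fedora, and Blazegraph options from the diff.
JAVA_OPTS="-Djava.awt.headless=true -server -Xmx1500m -Xms1000m"

# Append the log4j configuration flag suggested in the comment.
JAVA_OPTS="${JAVA_OPTS} -Dlog4j.configuration=file:/opt/blazegraph/conf/log4j.properties"
export JAVA_OPTS

echo "${JAVA_OPTS}"
```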

Before we can configure the features we’re going to use, they need to be installed. Some of these installations may take some time.

```bash
/opt/karaf/bin/client feature:install fcrepo-service-activemq
```

Contributor

You need to add feature:install camel-blueprint to this list. I needed it before any of the derivatives or indexers would deploy properly.

- `ISLANDORA_SYN_TOKEN`: This should be the same token that was established during the installation of Syn in your `syn-settings.xml` file

`/opt/karaf/etc/org.fcrepo.camel.indexing.triplestore | karaf:karaf/644`

Contributor

@dannylamb Dec 17, 2019

The filename is missing its .cfg extension.

```
triplestore.baseUrl=http://localhost:8080/blazegraph/namespace/islandora/sparql
```

`/opt/karaf/etc/ca.islandora.alpaca.indexing.triplestore | karaf:karaf/644`

Contributor

Missing .cfg extension here too

```
triplestore.baseUrl=http://localhost:8080/blazegraph/namespace/islandora/sparql
```

`/opt/karaf/etc/ca.islandora.alpaca.indexing.fcrepo | karaf:karaf/644`

Contributor

Here too
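
Anyone who already created these three files without the extension could rename them rather than recreate them. The sketch below rehearses the rename in a scratch directory standing in for `/opt/karaf/etc` (an assumption so it runs without `sudo`); the file names are the three flagged in the comments above.

```bash
# Scratch directory standing in for /opt/karaf/etc (assumption, no sudo needed).
etc_dir="$(mktemp -d)"
touch "${etc_dir}/org.fcrepo.camel.indexing.triplestore" \
      "${etc_dir}/ca.islandora.alpaca.indexing.triplestore" \
      "${etc_dir}/ca.islandora.alpaca.indexing.fcrepo"

# Add the .cfg extension to each file, skipping any that are already renamed.
for name in org.fcrepo.camel.indexing.triplestore \
            ca.islandora.alpaca.indexing.triplestore \
            ca.islandora.alpaca.indexing.fcrepo; do
  if [ -e "${etc_dir}/${name}" ] && [ ! -e "${etc_dir}/${name}.cfg" ]; then
    mv "${etc_dir}/${name}" "${etc_dir}/${name}.cfg"
  fi
done

ls "${etc_dir}"
```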

@manez
Member

manez commented Feb 3, 2020

@qadan Do you possibly have any cycles for these changes this week? It would be awesome to get this pull in for the next release!

@qadan
Author

qadan commented Feb 12, 2020

@manez indeed, i'll start trying to hammer this out this week! updates to the PR should be incoming

@qadan
Author

qadan commented Feb 18, 2020

Whew, after a slew of changes I think I've incorporated all the feedback from @seth-shaw-unlv and @dannylamb - thanks kindly, this is pretty crazy! Hopefully the above few commits should address all the outstanding issues.

Contributor

@dannylamb left a comment

This is great and has sat for far too long. I'm pulling this in now, and if there are any gaps, we'll identify them as folks in the community come to us.

Many many thanks @qadan 🙇‍♂️

@dannylamb dannylamb merged commit cf87d3c into Islandora:master Jul 15, 2020

6 participants