Installation of a Release
To use xtriples-micro in the CI/CD pipeline of an edition (or another
XML-based data project), we recommend using
Tooling. (For alternatives, see
other toolings.) It provides a fully reproducible,
extensible and adaptable environment for downloading and setting up
XML and RDF tools from remote sources. It can download
xtriples-micro and install it as a dependency.
Please note that the following instructions do not use the Tooling
environment configured in the xtriples-micro project. Instead, we
tell the Tooling environment of the edition to fetch the released
package of xtriples-micro. This is the clean way, because we only
ever get a version of xtriples-micro that has undergone the
software release cycle of development, testing and releasing. Cloning
the repo could give us the code in some undefined inter-release state,
which may contain bugs.[^1] Adding it as a git submodule is
considered an anti-pattern. Downstream projects should depend on a
release.
Also note that you should never add the downloaded tools to the project's git/commit history! Only the configuration, which states which tool versions to use, is to be added to the history.
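One way to keep the downloaded tools out of the history is an ignore rule for the Maven build directory. The following is a minimal sketch, assuming the rule goes into the .gitignore file at the root of the edition's repository; the function name ignore_target_dir is hypothetical and not part of Tooling or xtriples-micro.

```shell
#!/bin/sh
# Hypothetical helper: append an ignore rule for the Maven build directory
# unless the rule is already present.
# Usage: ignore_target_dir <tooling-dir> [gitignore-file]
ignore_target_dir() {
    rule="$1/target/"
    file="${2:-.gitignore}"
    # -x: match the whole line, -F: fixed string, -q: no output.
    # A missing file makes grep fail, so the rule is then written to a new file.
    if ! grep -qxF -- "$rule" "$file" 2>/dev/null; then
        printf '%s\n' "$rule" >> "$file"
    fi
}
```

Calling `ignore_target_dir resources` (or `ignore_target_dir tooling`) from the repository root adds the rule once; repeated calls leave the file unchanged. Afterwards, `git check-ignore -v resources/target` can be used to confirm that git ignores the folder.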
Provided that Tooling is installed
in the edition's git repository for driving the CI/CD pipeline, and
provided that it is installed in the resources (or tooling)
directory of the edition, the edition's directory structure will look
like this example project:
tree . -a
.
├── ALEA.odd
├── common.xml
├── Diwan # the lyrics encoded in TEI
│ ├── ayn
│ │ ├── ayn1
│ │ │ └── ayn1.tei.xml
│ │ ├── ayn10
│ │ │ └── Ayn10.tei.xml
│ │ ├── ayn100
│ │ │ └── ayn100.tei.xml
│ . .
│
├── .gitignore
├── .gitlab
│ └── ci # CI/CD sub-pipelines
│ ├── build.yaml
│ ├── production.yaml
│ └── staging.yaml
├── .gitlab-ci.yml # Main CI/CD pipeline
├── …
├── resources # resources (or tooling) contains the technical resources for the edition
│ ├── build.xml # an Apache Ant build file that drives the CI/CD pipeline
│ ├── catalog.xml # an OASIS XML catalog for the project
│ ├── ci-prod.properties # Ant properties used for running the CI/CD pipeline
│ ├── ci-validation.properties # Ant properties used for the validation of TEI sources
│ ├── ci_settings.xml
│ ├── graph # the graph folder contains XTriples configuration files
│ │ ├── common.xml
│ │ ├── diwan.xml
│ │ ├── extract-collection-with-utils.xsl
│ │ ├── listWit.off
│ │ ├── persons.off
│ │ ├── places.xml
│ │ ├── utils.xsl
│ │ └── witnesses.xml
│ ├── .mvn
│ │ └── wrapper
│ │ ├── maven-wrapper.jar # the only jar required in your repo
│ │ └── maven-wrapper.properties
│ ├── mvnw # Maven wrapper script for Linux and Mac
│ ├── mvnw.cmd # Maven wrapper script for Windows
│ ├── oxbytei-config.xml
│ ├── pom.xml # Apache Maven configuration used for Tooling
│ ├── saxon-local.xml # Saxon configuration file
│ ├── scripts # scripts folder contains templates for the tools' wrapper scripts
│ │ ├── ant.sh
│ │ ├── classpath.sh
│ │ ├── query.sh
│ │ ├── riot.sh
│ │ ├── sparql.sh
│ │ └── xslt.sh
│ ├── solr # resources for generating the dataset for indexing on Apache Solr
│ │ ├── documents.jsonld
│ │ ├── merge.xsl
│ │ └── solr.sparql
. .
│ ├── target # NEVER ADD THIS FOLDER TO THE GIT HISTORY!
. .
The XTriples configuration files, which define how the RDF knowledge
graph is extracted from your TEI sources, are contained in the
resources/graph folder. There are some build targets in
resources/build.xml that run XTriples on these configuration files
and merge the extracted triples into one graph. We look at these in
Example Project. For now, let's concentrate on the installation of
xtriples-micro and of the RDF tools.
We can use Apache Maven, which is at the heart of
Tooling, to get
xtriples-micro. To do so, we add the following lines to
resources/pom.xml.
- Add a property that tells which version of xtriples-micro to use.
  <properties>
    <!-- ... other properties ... -->
    <xtriples-micro.version>0.5.4</xtriples-micro.version>
  </properties>
- Use a Maven plugin for downloading and unpacking xtriples-micro. The following lines tell Maven to download an xtriples-micro release package with the version defined above from the release assets and to unpack it into resources/target/dependencies/xtriples.
  <!-- ... -->
  <build>
    <plugins>
      <!-- ... other plugins ... -->
      <plugin>
        <groupId>io.github.download-maven-plugin</groupId>
        <artifactId>download-maven-plugin</artifactId>
        <version>2.0.0</version>
        <executions>
          <execution>
            <id>install-xtriples-micro</id>
            <phase>generate-resources</phase>
            <goals>
              <goal>wget</goal>
            </goals>
            <configuration>
              <url>https://github.com/SCDH/xtriples-micro/releases/download/${xtriples-micro.version}/xtriples-${xtriples-micro.version}-package.zip</url>
              <unpack>true</unpack>
              <outputDirectory>${project.build.directory}/dependencies/</outputDirectory>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <!-- ... other plugins ... -->
    </plugins>
  </build>
- Run ./mvnw package from the resources (or tooling) directory.
  cd resources
  ./mvnw package
After this, xtriples-micro is installed in resources/target/dependencies/xtriples:
tree target/dependencies/xtriples
target/dependencies/xtriples
├── saxon.ee.xml
├── saxon.he.xml
├── saxon-local.xml
├── saxon.pe.xml
├── saxon.xml
└── xsl
├── collection.xsl
├── extract-collection.xsl
├── extract-param-doc.xsl
├── extract.xsl
├── to-codepoints.xsl
├── to-seed-params.xsl
├── vocabularies.xsl
└── xtriples.xsl
2 directories, 13 files
For converting xtriples-micro's N-Triples output to other RDF
serializations, Apache Jena's RDF I/O Technology
(RIOT) is most
suitable. With Tooling, installing it and creating a wrapper script is
very simple.
- Add a version property to the pom.xml file:
  <!-- ... -->
  <properties>
    <!-- ... other properties ... -->
    <xtriples-micro.version>0.5.4</xtriples-micro.version>
    <jena.version>4.10.0</jena.version>
  </properties>
  <!-- ... -->
- Add a dependency to the `pom.xml` file:
```xml
<!-- ... -->
<dependencies>
  <!-- ... other dependencies ... -->
  <dependency>
    <groupId>org.apache.jena</groupId>
    <artifactId>jena-cmds</artifactId>
    <version>${jena.version}</version>
    <scope>test</scope>
  </dependency>
</dependencies>
<!-- ... -->
```
Note the test scope! It means that Jena will not become a transitive dependency, should you ever use the edition as a dependency of a downstream project.
- Assert that the wrapper scripts resources/scripts/*.sh are filtered and written to resources/target/bin/ and that their execute permission is set. When using Tooling, there should be appropriate instructions using the maven-resources-plugin and the maven-antrun-plugin in the pom.xml, lines 213 and 235.
- Add a wrapper script to resources/scripts/riot.sh:
  #!/bin/sh
  JAVAOPTS=-Ddebug="true"
  CP=$CLASSPATH
  for j in ${project.build.directory}/lib/*.jar; do
      CP=$CP:$j
  done
  java $JAVAOPTS -cp "$CP" jena.riot "$@"
- Run
./mvnw package
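The classpath loop of the wrapper script can be tried out in isolation. The following is a minimal sketch, not the script shipped with Tooling: the function name build_classpath is hypothetical, and the library directory is passed as an argument, whereas in the wrapper script Maven's resource filtering replaces ${project.build.directory}. Unlike the script above, the sketch avoids a leading colon when no initial classpath is given.

```shell
#!/bin/sh
# Hypothetical helper: build a colon-separated classpath from all jars
# found in a directory.
# Usage: build_classpath <lib-dir> [initial-classpath]
build_classpath() {
    cp="${2:-}"
    for j in "$1"/*.jar; do
        # Skip the unexpanded pattern when the directory holds no jars.
        [ -e "$j" ] || continue
        # Insert a colon only if the classpath is not empty yet.
        cp="${cp:+$cp:}$j"
    done
    printf '%s\n' "$cp"
}
```

After `./mvnw package`, a call such as `build_classpath resources/target/lib` prints the classpath that could be handed to `java -cp`, which helps when debugging a wrapper script that fails to find a class.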
You should now have Apache Jena RIOT installed, along with a wrapper script for conveniently using it:
resources/target/bin/riot.sh -h
riot [--help] [--time] [--base=IRI] [--syntax=FORMAT] [--out=FORMAT] [--count] file ...
Parser control
--sink Parse but throw away output
--syntax=NAME Set syntax (otherwise syntax guessed from file extension)
--base=URI Set the base URI (does not apply to N-triples and N-Quads)
--nobase Pass through relative URIs (best effort)
--check Additional checking of RDF terms
--strict Run with in strict mode
--validate Same as --sink --check --strict
--count Count triples/quads parsed, not output them
--merge Convert quads to triples
--rdfs=file Apply some RDFS inference using the vocabulary in the file
--nocheck Turn off checking of RDF terms
Output control
--output=FMT Output in the given format, streaming if possible.
--formatted=FMT Output, using pretty printing (consumes memory)
--stream=FMT Output, using a streaming format
--compress Compress the output with gzip
Time
--time Time the operation
Symbol definition
--set Set a configuration symbol to a value
General
-v --verbose Verbose
-q --quiet Run with minimal output
--debug Output information for debugging
--help
--version Version information
Note that the download-maven-plugin can be used for downloading and
unpacking multiple assets by defining multiple executions. The
following will not only get xtriples-micro, but also an executable
of the command line version of the
Titanium JSON-LD
processor. The executable will be available at
resources/target/bin/ld-cli.
<!-- ... -->
<properties>
<!-- ... other properties ... -->
<xtriples-micro.version>0.5.4</xtriples-micro.version>
<titanium-cli.version>0.10.0</titanium-cli.version>
</properties>
<!-- ... -->
<build>
<plugins>
<!-- ... other plugins ... -->
<plugin>
<groupId>io.github.download-maven-plugin</groupId>
<artifactId>download-maven-plugin</artifactId>
<version>2.0.0</version>
<executions>
<execution>
<id>install-xtriples-micro</id>
<phase>generate-resources</phase>
<goals>
<goal>wget</goal>
</goals>
<configuration>
<url>https://github.com/SCDH/xtriples-micro/releases/download/${xtriples-micro.version}/xtriples-${xtriples-micro.version}-package.zip</url>
<unpack>true</unpack>
<outputDirectory>${project.build.directory}/dependencies/</outputDirectory>
</configuration>
</execution>
<execution>
<id>install-titanium-cli</id>
<phase>generate-resources</phase>
<goals>
<goal>wget</goal>
</goals>
<configuration>
<url>https://github.com/filip26/ld-cli/releases/download/v${titanium-cli.version}/ld-cli-${titanium-cli.version}-ubuntu-latest.zip</url>
<unpack>true</unpack>
<outputDirectory>${project.build.directory}/bin/</outputDirectory>
</configuration>
</execution>
</executions>
</plugin>
<!-- ... other plugins ... -->
</plugins>
</build>
<!-- ... -->
Note that downloading release assets (and resources from other
arbitrary URLs) with the download-maven-plugin is the most
straightforward method. We could also get these packages from
Maven package registries. However, downloading from GitHub's Maven
package registry would require us to set up authentication, which
would only add complexity.
Please note again that you should never add the target folder
(resources/target or tooling/target) to the git history. The
tools can easily be downloaded again by running Maven in each
working copy of the edition.
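In a fresh working copy, a small guard can check whether the tools are already present and run Maven only if they are not. This is a minimal sketch under assumptions: the function name ensure_tools is hypothetical, and the presence of xsl/xtriples.xsl in the unpacked package is used as the marker for a completed installation. The guard does not notice a changed version property in pom.xml, so simply running ./mvnw package every time remains the safer choice.

```shell
#!/bin/sh
# Hypothetical helper: run the Maven wrapper only when the unpacked
# xtriples-micro stylesheets are missing.
# Usage: ensure_tools <tooling-dir>
ensure_tools() {
    marker="$1/target/dependencies/xtriples/xsl/xtriples.xsl"
    if [ ! -f "$marker" ]; then
        # The subshell keeps the caller's working directory unchanged.
        (cd "$1" && ./mvnw package)
    fi
}
```

From the repository root, `ensure_tools resources` (or `ensure_tools tooling`) would then trigger the download only once per working copy.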
After the above setup, the new tools can be used in local setups in the same way as in the CI/CD pipelines. The first step should always be installing your reproducible tooling environment:
cd resources # or cd tooling
./mvnw package
Of course, you don't have to use
Tooling; it's just a
recommendation. Instead, you can use GNU Make or simply your command
line to get and unpack a release of xtriples-micro.
export XTRIPLES_VERSION=0.5.4
wget https://github.com/SCDH/xtriples-micro/releases/download/$XTRIPLES_VERSION/xtriples-$XTRIPLES_VERSION-package.zip
wget https://your-provider.com/saxon-he.jar
wget https://your-provider.com/jena-riot.jar
wget https://your-provider.com/all-the-other-dependency.jars
export CP=.........
java -cp $CP net.sf.saxon.Transform ...
However, this quickly becomes very error-prone. Furthermore, using Ant's XSLT target has been shown to run 60 times faster than driving XSLT for many files through GNU Make, i.e., seconds instead of minutes.
[^1]: Cloning xtriples-micro and using its tooling environment should
only be done for playing around with it and for development.