Installation of a Release

Christian Lück edited this page Sep 22, 2025 · 4 revisions

To use xtriples-micro in the CI/CD pipeline of an edition (or another XML-based data project), we recommend Tooling. (For alternatives, see Other toolings.) It provides a fully reproducible, extensible and adaptable environment for downloading and setting up XML and RDF tools from remote sources, and it can download xtriples-micro and install it as a dependency.

Please note that the following instructions do not use the Tooling environment configured in the xtriples-micro project itself. Instead, we tell the edition's Tooling environment to fetch the released package of xtriples-micro. This is the clean way, because it only ever gets a version of xtriples-micro that has gone through the release cycle of development, testing and releasing. Cloning the repo could get us the code in some undefined inter-release state, which may contain bugs.[^1] Adding it as a git submodule is considered an anti-pattern. Downstream projects should depend on a release.

Also note that you should never add the downloaded tools to the project's git history! Only the configuration that states which tool versions to use belongs in the history.
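One way to enforce this is an ignore rule for the Maven output folder. The following is a minimal sketch, not part of Tooling: the helper name `git_ignore` is made up for illustration, and it assumes that Tooling lives in resources/ (use tooling/target/ otherwise) and that an existing .gitignore ends with a newline.

```shell
#!/bin/sh
# Hypothetical helper: add an entry to a .gitignore file unless it is
# already present as an exact line.
# Assumption: an existing file ends with a newline.
# Usage: git_ignore ENTRY [GITIGNORE_FILE]
git_ignore() {
    file="${2:-.gitignore}"
    touch "$file"
    grep -qxF -- "$1" "$file" || printf '%s\n' "$1" >> "$file"
}
```

Called from the root of the edition's repository as `git_ignore resources/target/`, it leaves the file unchanged if the entry is already there.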

Directory structure

Provided that Tooling is installed in the edition's git repository to drive the CI/CD pipeline, and that it lives in the resources (or tooling) directory of the edition, the edition's directory structure will look like this example project:

tree . -a
.
├── ALEA.odd
├── common.xml
├── Diwan                        # the lyrics encoded in TEI
│   ├── ayn
│   │   ├── ayn1
│   │   │   └── ayn1.tei.xml
│   │   ├── ayn10
│   │   │   └── Ayn10.tei.xml
│   │   ├── ayn100
│   │   │   └── ayn100.tei.xml
│   .   .   
│
├── .gitignore
├── .gitlab
│   └── ci                       # CI/CD sub-pipelines
│       ├── build.yaml
│       ├── production.yaml
│       └── staging.yaml
├── .gitlab-ci.yml               # Main CI/CD pipeline
├── …
├── resources                    # resources (or tooling) contains the technical resources for the edition
│   ├── build.xml                # an Apache Ant build file that drives the CI/CD pipeline
│   ├── catalog.xml              # an OASIS XML catalog for the project
│   ├── ci-prod.properties       # Ant properties used for running the CI/CD pipeline
│   ├── ci-validation.properties # Ant properties used for the validation of TEI sources
│   ├── ci_settings.xml
│   ├── graph                    # the graph folder contains XTriples configuration files
│   │   ├── common.xml
│   │   ├── diwan.xml
│   │   ├── extract-collection-with-utils.xsl
│   │   ├── listWit.off
│   │   ├── persons.off
│   │   ├── places.xml
│   │   ├── utils.xsl
│   │   └── witnesses.xml
│   ├── .mvn
│   │   └── wrapper
│   │       ├── maven-wrapper.jar          # the only jar required in your repo
│   │       └── maven-wrapper.properties
│   ├── mvnw                      # Maven wrapper script for Linux and Mac
│   ├── mvnw.cmd                  # Maven wrapper script for Windows
│   ├── oxbytei-config.xml
│   ├── pom.xml                   # Apache Maven configuration used for Tooling
│   ├── saxon-local.xml           # Saxon configuration file
│   ├── scripts                   # scripts folder contains templates for the tools' wrapper scripts
│   │   ├── ant.sh
│   │   ├── classpath.sh
│   │   ├── query.sh
│   │   ├── riot.sh
│   │   ├── sparql.sh
│   │   └── xslt.sh
│   ├── solr                      # resources for generating the dataset for indexing on Apache Solr
│   │   ├── documents.jsonld
│   │   ├── merge.xsl
│   │   └── solr.sparql
.   .
│   ├── target                    # NEVER ADD THIS FOLDER TO THE GIT HISTORY!
.   .

The XTriples configuration files, which define how the RDF knowledge graph is extracted from your TEI sources, are contained in the resources/graph folder. There are some build targets in resources/build.xml that run XTriples on these configuration files and merge the extracted triples into one graph. We look at these in Example Project. Let's concentrate on the installation of xtriples-micro and of RDF tools.

Downloading and Unpacking

We can use Apache Maven, which is at the heart of Tooling, to get xtriples-micro. To do so, add the following lines to resources/pom.xml.

  1. Add a property that tells which version of xtriples-micro to use.

     <properties>
         <!-- ... other properties ... -->        
         <xtriples-micro.version>0.5.4</xtriples-micro.version>
     </properties>
  2. Use a Maven plugin for downloading and unpacking xtriples-micro. The following lines tell Maven to download the xtriples-micro release package with the version defined above from the release assets and to unpack it into resources/target/dependencies/xtriples.

     <!-- ... -->
     
     <build>
         <plugins>
         
             <!-- ... other plugins ... -->
    
             <plugin>
                 <groupId>io.github.download-maven-plugin</groupId>
                 <artifactId>download-maven-plugin</artifactId>
                 <version>2.0.0</version>
                 <executions>
                     <execution>
                         <id>install-xtriples-micro</id>
                         <phase>generate-resources</phase>
                         <goals>
                             <goal>wget</goal>
                         </goals>
                         <configuration>
                             <url>https://github.com/SCDH/xtriples-micro/releases/download/${xtriples-micro.version}/xtriples-${xtriples-micro.version}-package.zip</url>
                             <unpack>true</unpack>
                             <outputDirectory>${project.build.directory}/dependencies/</outputDirectory>
                         </configuration>
                     </execution>
                 </executions>
             </plugin>
             
             <!-- ... other plugins ... -->
             
         </plugins>
     </build>
  3. Run ./mvnw package from the resources (or tooling) directory.

    cd resources
    ./mvnw package

After this, xtriples-micro is installed in resources/target/dependencies/xtriples:

tree target/dependencies/xtriples
target/dependencies/xtriples
├── saxon.ee.xml
├── saxon.he.xml
├── saxon-local.xml
├── saxon.pe.xml
├── saxon.xml
└── xsl
    ├── collection.xsl
    ├── extract-collection.xsl
    ├── extract-param-doc.xsl
    ├── extract.xsl
    ├── to-codepoints.xsl
    ├── to-seed-params.xsl
    ├── vocabularies.xsl
    └── xtriples.xsl

2 directories, 13 files
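
A quick sanity check after the Maven run can catch a failed download early, for instance in a CI job. The following is a minimal sketch, not part of Tooling or xtriples-micro: the function name `check_xtriples_install` is made up for illustration, and the checked files are simply taken from the listing above for version 0.5.4, so the list may need adjusting for other releases.

```shell
#!/bin/sh
# Hypothetical helper: verify that the unpacked xtriples-micro package
# contains some of the files shown in the listing above.
# Usage: check_xtriples_install [DIR]
check_xtriples_install() {
    dir="${1:-target/dependencies/xtriples}"
    missing=0
    for f in xsl/xtriples.xsl xsl/extract.xsl xsl/extract-collection.xsl saxon.he.xml; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            missing=1
        fi
    done
    # exit status 0 if all files were found, 1 otherwise
    return "$missing"
}
```

Run from the resources (or tooling) directory, `./mvnw package && check_xtriples_install` would fail the step if any of the listed files is absent.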

Installing additional Tools

Apache Jena RIOT

For converting xtriples-micro's N-Triples output to other RDF serializations, Apache Jena's RDF I/O Technology (RIOT) is well suited. With Tooling, installing it and creating a wrapper script is simple.

  1. Add a version property to the pom.xml file:
     <!-- ... -->
    
     <properties>
         <!-- ... other properties ... -->        
         <xtriples-micro.version>0.5.4</xtriples-micro.version>
         <jena.version>4.10.0</jena.version>
     </properties>
    
     <!-- ... -->
  2. Add a dependency to the pom.xml file:
     <!-- ... -->

     <dependencies>
         <!-- ... other dependencies ... -->

         <dependency>
             <groupId>org.apache.jena</groupId>
             <artifactId>jena-cmds</artifactId>
             <version>${jena.version}</version>
             <scope>test</scope>
         </dependency>
     </dependencies>

     <!-- ... -->

Note the test scope! It means that Jena will not become a transitive dependency, should you ever use the edition as a dependency of a downstream project.

  1. Make sure that the wrapper scripts resources/scripts/*.sh are filtered and written to resources/target/bin/ and that their executable permission is set. When using Tooling, the pom.xml should already contain appropriate instructions using the maven-resources-plugin and the maven-antrun-plugin (lines 213 and 235).
  2. Add a wrapper script to resources/scripts/riot.sh:
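    #!/bin/sh
    # Note: ${project.build.directory} is not a shell variable. It is replaced
    # by Maven resource filtering when the script is written to target/bin/.

    JAVAOPTS=-Ddebug="true"

    # Collect all jars from the lib directory into the classpath
    CP=$CLASSPATH
    for j in "${project.build.directory}"/lib/*.jar; do
        CP=$CP:$j
    done

    # Quote "$CP" and "$@" so that paths and arguments with spaces survive
    java $JAVAOPTS -cp "$CP" jena.riot "$@"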
  3. Run ./mvnw package

You should now have Apache Jena RIOT installed, along with a wrapper script for conveniently using it:

resources/target/bin/riot.sh -h
riot [--help] [--time] [--base=IRI] [--syntax=FORMAT] [--out=FORMAT] [--count] file ...
  Parser control
      --sink                 Parse but throw away output
      --syntax=NAME          Set syntax (otherwise syntax guessed from file extension)
      --base=URI             Set the base URI (does not apply to N-triples and N-Quads)
      --nobase               Pass through relative URIs (best effort)
      --check                Additional checking of RDF terms
      --strict               Run with in strict mode
      --validate             Same as --sink --check --strict
      --count                Count triples/quads parsed, not output them
      --merge                Convert quads to triples
      --rdfs=file            Apply some RDFS inference using the vocabulary in the file
      --nocheck              Turn off checking of RDF terms
  Output control
      --output=FMT           Output in the given format, streaming if possible.
      --formatted=FMT        Output, using pretty printing (consumes memory)
      --stream=FMT           Output, using a streaming format
      --compress             Compress the output with gzip
  Time
      --time                 Time the operation
  Symbol definition
      --set                  Set a configuration symbol to a value
  General
      -v   --verbose         Verbose
      -q   --quiet           Run with minimal output
      --debug                Output information for debugging
      --help
      --version              Version information
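
With the wrapper in place, a conversion from N-Triples to Turtle could look like the sketch below. It relies only on the `--output=FMT` option from the help text above and on the input syntax being guessed from the file extension. The function name `nt_to_turtle`, the `RIOT` variable and the file names are assumptions made for illustration.

```shell
#!/bin/sh
# Path to the wrapper script installed above; override RIOT if yours differs.
RIOT="${RIOT:-resources/target/bin/riot.sh}"

# Hypothetical helper: convert an N-Triples file to Turtle.
# The input syntax is guessed by riot from the .nt file extension.
# Usage: nt_to_turtle INPUT.nt OUTPUT.ttl
nt_to_turtle() {
    "$RIOT" --output=turtle "$1" > "$2"
}

# Example call (file names are placeholders):
#   nt_to_turtle graph.nt graph.ttl
```

Keeping the wrapper path in a variable means the same function works in local setups and in CI jobs where the working directory differs.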

Titanium for JSON-LD framing

Note that the download-maven-plugin can download and unpack multiple assets by defining multiple executions. The following configuration gets not only xtriples-micro but also an executable of the command-line version of the Titanium JSON-LD processor. The executable will be available at resources/target/bin/ld-cli.

    <!-- ... -->

    <properties>
        <!-- ... other properties ... -->        
        <xtriples-micro.version>0.5.4</xtriples-micro.version>
        <titanium-cli.version>0.10.0</titanium-cli.version>
    </properties>

    <!-- ... -->
    
    <build>
        <plugins>
        
            <!-- ... other plugins ... -->

            <plugin>
                <groupId>io.github.download-maven-plugin</groupId>
                <artifactId>download-maven-plugin</artifactId>
                <version>2.0.0</version>
                <executions>
                    <execution>
                        <id>install-xtriples-micro</id>
                        <phase>generate-resources</phase>
                        <goals>
                            <goal>wget</goal>
                        </goals>
                        <configuration>
                            <url>https://github.com/SCDH/xtriples-micro/releases/download/${xtriples-micro.version}/xtriples-${xtriples-micro.version}-package.zip</url>
                            <unpack>true</unpack>
                            <outputDirectory>${project.build.directory}/dependencies/</outputDirectory>
                        </configuration>
                    </execution>
                    <execution>
                        <id>install-titanium-cli</id>
                        <phase>generate-resources</phase>
                        <goals>
                            <goal>wget</goal>
                        </goals>
                        <configuration>
                            <url>https://github.com/filip26/ld-cli/releases/download/v${titanium-cli.version}/ld-cli-${titanium-cli.version}-ubuntu-latest.zip</url>
                            <unpack>true</unpack>
                            <outputDirectory>${project.build.directory}/bin/</outputDirectory>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            
            <!-- ... other plugins ... -->
            
        </plugins>
    </build>
            
    <!-- ... -->        
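
Whether an unpacked file keeps its executable permission may depend on the plugin version and the platform; this is an assumption, not something the configuration above guarantees. A minimal sketch of a check that also repairs the permission if needed (the helper name `ensure_executable` is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper: make sure a downloaded tool can be executed.
# Usage: ensure_executable PATH
ensure_executable() {
    if [ ! -f "$1" ]; then
        echo "not found: $1" >&2
        return 1
    fi
    # set the executable bit only if it is missing
    [ -x "$1" ] || chmod +x "$1"
}

# Example call:
#   ensure_executable resources/target/bin/ld-cli
```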

Note that downloading release assets (and resources from other arbitrary URLs) with the download-maven-plugin is the most straightforward method. We could also get these packages from Maven package registries. However, downloading from GitHub's Maven package registry would require us to set up authentication, which would only add complexity.

Keep your commit history clean

Please note again that you should never add the target folder (resources/target or tooling/target) to the git history. The tools can easily be downloaded again by running Maven in each working copy of the edition.

After the above setup, the new tools can be used in local setups in the same way as in the CI/CD pipelines. The first step should always be installing your reproducible tooling environment:

cd resources   # or cd tooling
./mvnw package

Other toolings

Of course, you don't have to use Tooling; it's just a recommendation. Instead, you can use GNU Make or simply your command line to get and unpack a release of xtriples-micro.

export XTRIPLES_VERSION=0.5.4
wget https://github.com/SCDH/xtriples-micro/releases/download/$XTRIPLES_VERSION/xtriples-$XTRIPLES_VERSION-package.zip
wget https://your-provider.com/saxon-he.jar
wget https://your-provider.com/jena-riot.jar
wget https://your-provider.com/all-the-other-dependency.jars
export CP=.........
java -cp $CP net.sf.saxon.Transform ...

However, this quickly becomes very error-prone. Furthermore, using Ant's XSLT target has been shown to run 60 times faster than driving XSLT for many files through GNU Make, i.e., seconds instead of minutes.
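
If you take the manual route anyway, it helps to derive the download URL from the version in a single place. The following is a minimal sketch: the function name `xtriples_release_url` is made up for illustration, and the URL pattern is the one used in the Maven configuration above.

```shell
#!/bin/sh
# Hypothetical helper: print the release asset URL for a given version,
# following the URL pattern used in the Maven configuration above.
# Usage: xtriples_release_url VERSION
xtriples_release_url() {
    printf 'https://github.com/SCDH/xtriples-micro/releases/download/%s/xtriples-%s-package.zip\n' "$1" "$1"
}

# Example (needs network access, wget and unzip):
#   wget -O xtriples.zip "$(xtriples_release_url 0.5.4)"
#   unzip -d dependencies xtriples.zip
```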

Footnotes

[^1]: Cloning xtriples-micro and using its tooling environment should only be done for playing around with it and for development.
