This repository is part of the Find Case Law project at The National Archives.
This folder specifies the configuration of the MarkLogic database used by the Case Law public access system. It uses ml-gradle to manage and maintain a versioned configuration.
For full details of what can be set in the files here, see the ml-gradle documentation. The file layout is explained in the project layout documentation.
- Install gradle. On macOS, you can use brew install gradle.
- If you're running against anything other than development, copy gradle-development.properties to gradle-{environment}.properties and set the credentials and hostname for your MarkLogic server.
To deploy a MarkLogic configuration, run gradle mlDeploy -PenvironmentName={environment}. The development environment will be used by default if you don't specify -PenvironmentName.
Deployment is idempotent, and will automatically configure databases, roles, triggers and modules.
Please also create a GitHub Release when you deploy.
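The deployment steps above could be wrapped in a small shell function. This is a minimal sketch, not a script shipped in this repository: the function name deploy is hypothetical, and the check for a properties file is an added safety step rather than something ml-gradle is known to require.

```shell
# Hypothetical helper (not part of this repo): deploy to a named environment.
# Falls back to "development", matching the default described above.
deploy() {
  env_name="${1:-development}"
  props="gradle-${env_name}.properties"

  # Assumed safety check: stop early if the properties file has not been created.
  if [ ! -f "$props" ]; then
    echo "Missing ${props}: copy gradle-development.properties and edit it first" >&2
    return 1
  fi

  gradle mlDeploy -PenvironmentName="$env_name"
}
```

Calling deploy staging from the repository root (where staging stands in for your own environment name) would then run gradle mlDeploy -PenvironmentName=staging.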
A docker-compose.yml file for running MarkLogic locally is included. Run docker-compose up -d to start it; it takes a minute or so, and will raise various HTTP errors if you visit localhost:8000 before that point.
Note: There is currently a known issue with marklogic-docker, so you might instead need to run development_scripts/run_local_docker.
You'll then need to deploy the configuration (see Deployment, above).
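Because the container takes a minute or so to become ready, it can help to poll it before deploying. The following is a minimal sketch, assuming curl is installed and that the local instance accepts the admin/admin credentials; the function name wait_for_marklogic and its default retry settings are hypothetical.

```shell
# Hypothetical helper: wait until the local query interface answers successfully.
# $1 = maximum number of attempts (default 30), $2 = seconds between attempts (default 5).
wait_for_marklogic() {
  attempts="${1:-30}"
  delay="${2:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP error responses;
    # --anyauth lets curl negotiate whichever auth scheme the server offers.
    if curl --silent --fail --output /dev/null \
        --anyauth --user admin:admin "http://localhost:8000/"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "MarkLogic did not become ready after ${attempts} attempts" >&2
  return 1
}
```

Something like wait_for_marklogic && gradle mlDeploy would then only deploy once the server responds.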
Ensure that MARKLOGIC_HOST in the .env files of the editor and public UI is set to host.docker.internal, and that the username and password are both admin, if you want to use them with the local instance.
To get some example documents onto the local database, there are development_scripts/populate_top_judgments_and_neighbours.py and development_scripts/populate_from_caselaw.py, which copy documents from the live caselaw site (they don't import or fake properties) into your local database. (Check https://caselaw.nationalarchives.gov.uk/terms-of-use and get in touch if you intend to download many more than these.)
There are also other ways of importing data, detailed further down the readme, but they haven't been tested for a while.
You can run the unit tests with gradle mlUnitTest. This relies on the tests being deployed; use gradle mlDeploy in the first instance, and make sure that you have gradle mlWatch -i running to automatically deploy changes as you make them.
gradle mlGenerateUnitTestSuite will create a new stub test suite, and gradle mlClearModulesDatabase might be needed if you create tests and then later delete them.
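A first test run therefore deploys and only then runs the suite. A minimal sketch of that order, assuming the default development environment; run_unit_tests is a hypothetical name, not a task or script in this repository.

```shell
# Hypothetical helper: deploy the configuration and test modules, then run the tests.
run_unit_tests() {
  # The tests must be deployed before they can run; stop if deployment fails.
  gradle mlDeploy || return 1
  gradle mlUnitTest
}
```

While iterating on tests, gradle mlWatch -i in a second terminal avoids repeating the full deploy each time.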
The releases are currently manually tagged. Please do not deploy to production without tagging a release. Currently there is no auto-deployment of releases, but we are using releases & tags to keep track of what has been deployed to production.
To create a versioned release, use GitHub's release process to create a tag and generate release notes.
When deploying to production, check out the tag you want to deploy using (for example) git checkout tags/v1.0.0, then deploy from there. Git will put you into a "detached HEAD" state, and once you have finished deploying you can switch back to the main branch (or any branch) by using git checkout branchname as normal.
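The tag-then-deploy sequence could be sketched as a shell function. This is an illustration under assumptions, not the team's actual procedure: deploy_release is a hypothetical name, and production is assumed to be the environment name used in your gradle-{environment}.properties file.

```shell
# Hypothetical helper: deploy a tagged release, then return to a branch.
# $1 = tag (for example v1.0.0), $2 = branch to switch back to (default main).
deploy_release() {
  tag="$1"
  return_branch="${2:-main}"

  [ -n "$tag" ] || { echo "usage: deploy_release <tag> [branch]" >&2; return 1; }

  # Check out the tagged release (detached HEAD); stop if the tag is unknown.
  git checkout "tags/${tag}" || return 1

  # Assumed environment name; substitute the one your properties file uses.
  gradle mlDeploy -PenvironmentName=production
  status=$?

  # Switch back to a branch whether or not the deployment succeeded.
  git checkout "$return_branch"
  return "$status"
}
```

Returning the deployment's exit status after switching branches means a failed deploy is still reported, without leaving the working copy on a detached HEAD.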
TODO: Automatically deploy main to staging, and tags to production using CodeBuild.
(This hasn't been used in a long time)
Place the XML files you want to import in the import folder of this repo, then run gradle importDocuments. The documents will be imported, and the URI will be set as the full file path and name within import.
You may want to run gradle publishAllDocuments (see below) afterwards. All files are automatically put under management on import, so there is no need to run the manage task.
To export the latest versions of all documents, for instance for bulk processing, you can use:
gradle mlExportToZip -PwhereUrisQuery="const dls = require('/MarkLogic/dls'); cts.uris('', [], dls.documentsQuery())" -PenvironmentName=<env> -PexportPath=export.zip
Three gradle tasks are available for bulk management of documents in a database using CoRB. In production these should not be necessary, but they are provided to automate some development tasks and to serve as examples for future data migrations.
- gradle manageAllDocuments: Enables version management for all documents.
- gradle publishAllDocuments: Sets the published flag for all documents.
- gradle addAllDocumentsToJudgmentsCollection: Adds all documents to the 'judgments' collection.
Rather than running an import of a set of files, you can restore from a shared backup. Note that this bucket is currently only available to dxw developers.
- First, navigate to http://localhost:8001/, which will ask for basic auth. Username and password are both admin.
- Then add AWS credentials to MarkLogic (under Security > Credentials), so it can pull the backup from a shared S3 bucket. The credentials (AWS access ID & secret key) should be for your dxwbilling account. You will need to create them in AWS if you haven't already.
- In the Backup/Restore tab in MarkLogic for the caselaw-content Judgments database, initiate a restore, using the following as the "directory": s3://tna-judgments-marklogic-backup/. Set Forest topology changed to true.
- Uncheck the security database when restoring or your passwords will be wiped.
Assuming you have entered the S3 credentials correctly, this will kick off a restore from S3. Once you have the data locally, you can then back it up locally using the path /var/opt/backup in the management console. It will be backed up to your local machine in docker/db/backup.
Depending on the backup state, you may need to run gradle manageAllDocuments and gradle publishAllDocuments after the restore has finished.
- http://localhost:8000/ is the query interface, where you can browse documents in the Judgments database.
- http://localhost:8001/ is the management console, where you can administer your database.
- http://localhost:8002/ is the monitoring dashboard.
- http://localhost:8011/ is the application server for the MarkLogic REST interface.
All four URLs use basic auth; username and password are both admin.
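To see at a glance which of these local services are responding, something like the following could be used. It is a minimal sketch, assuming curl is installed; check_local_marklogic is a hypothetical name, and --anyauth is used so that curl negotiates whichever auth scheme each server actually offers.

```shell
# Hypothetical helper: print the HTTP status code returned by each local service.
check_local_marklogic() {
  for port in 8000 8001 8002 8011; do
    # --write-out '%{http_code}' prints only the status code; the body is discarded.
    code=$(curl --silent --output /dev/null --write-out '%{http_code}' \
      --anyauth --user admin:admin "http://localhost:${port}/")
    echo "${port} ${code}"
  done
}
```

A status other than 200 is not necessarily a failure (a server may redirect, or have nothing to serve at its root path), but 000 means curl got no HTTP response at all, for example because nothing is listening on that port.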