This repository contains the parser used to import data from the Enlightenment Architectures XML files into the Sloane Lab knowledge base.
The parser code is in the Jupyter notebook Sloane_XML_Parser.ipynb. It requires the following libraries:
- Beautiful Soup and lxml, to parse the XML
- Pandas, to manipulate and analyse the data
- Matplotlib, to create charts
- Mapbox, to create maps
- Viapy, to query the VIAF API
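
As a rough illustration of how Beautiful Soup, lxml and Pandas fit together, the sketch below parses a small XML string and loads it into a DataFrame. It is not taken from the notebook: the `catalogue`, `entry` and `title` element names, the `id` attribute and the `entries_to_frame` helper are hypothetical, and the real catalogue schema may differ.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical sample data; the actual catalogue files may use a different schema.
SAMPLE_XML = """
<catalogue>
  <entry id="e1"><title>First item</title></entry>
  <entry id="e2"><title>Second item</title></entry>
</catalogue>
"""


def entries_to_frame(xml_text: str) -> pd.DataFrame:
    """Parse XML text and return one DataFrame row per <entry> element."""
    # The "xml" feature tells Beautiful Soup to use lxml's XML parser.
    soup = BeautifulSoup(xml_text, "xml")
    rows = [
        {
            "id": entry.get("id"),
            "title": entry.find("title").get_text(strip=True),
        }
        for entry in soup.find_all("entry")
    ]
    return pd.DataFrame(rows, columns=["id", "title"])


frame = entries_to_frame(SAMPLE_XML)
print(frame)
```

Once the records are in a DataFrame, they can be filtered, counted and charted with the usual Pandas and Matplotlib tools.
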
The XML catalogues are stored in the data directory and include the following:
- books_b.xml and books_c.xml are catalogues of books, now held at the British Library
- fossils_1.xml and fossils_5.xml are catalogues of fossils, now held at the Natural History Museum
- miscellanea.xml is a catalogue of miscellaneous objects, now held at the British Museum
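
The file list above can be captured in a small lookup table, which makes it easy to check that the expected catalogues are present before parsing. This is only a sketch: the `CATALOGUES` mapping and the two helper functions are hypothetical names, and the `data` path assumes the code runs from the repository root.

```python
from pathlib import Path

# Catalogue files named in this README, mapped to the institution that
# now holds the catalogued material.
CATALOGUES = {
    "books_b.xml": "British Library",
    "books_c.xml": "British Library",
    "fossils_1.xml": "Natural History Museum",
    "fossils_5.xml": "Natural History Museum",
    "miscellanea.xml": "British Museum",
}


def catalogue_paths(data_dir="data"):
    """Return a mapping from catalogue file name to its path in data_dir."""
    base = Path(data_dir)
    return {name: base / name for name in CATALOGUES}


def missing_catalogues(data_dir="data"):
    """Return the names of listed catalogue files not found in data_dir."""
    return [
        name
        for name, path in catalogue_paths(data_dir).items()
        if not path.is_file()
    ]


if __name__ == "__main__":
    missing = missing_catalogues()
    if missing:
        print("Missing catalogue files:", ", ".join(missing))
    else:
        print("All listed catalogue files are present.")
```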