Klaus Rettinghaus edited this page Mar 10, 2023 · 15 revisions

CSV2CMI

About

CSV2CMI is a small program that transforms a table of letters (supplied as a .csv file) into the CMI format. The CMI format is the underlying data format of the web service correspSearch, which facilitates searching across diverse, distributed letter repositories.

Usage

You have to name your columns as follows:

  • name of the sender: "sender"
  • name of the addressee: "addressee"
  • IDs of the named persons/organisations: "senderID" and "addresseeID"
  • the date on which the letter was sent: "senderDate"

You may provide places as additional information:

  • the place from which the letter was sent: "senderPlace" (with the appropriate "senderPlaceID" as a proper GeoNames URL)
  • the place at which the letter was received: "addresseePlace" (with the appropriate "addresseePlaceID" as a proper GeoNames URL)

Furthermore you can specify

  • the date on which the letter was received: "addresseeDate"
  • a note about the letter: "note"
  • dates given as free text, such as "by the end of the last winter": "senderDateText" or "addresseeDateText"
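
As an illustration of the column names listed above, the following sketch writes a one-row table with Python's csv module. Only the column names come from this page; the helper build_table, the sample values, and the placeholder IDs are invented for the example and do not refer to real records.

```python
import csv
import io

# Column names as listed on this page.
FIELDNAMES = [
    "sender", "senderID", "senderPlace", "senderPlaceID",
    "senderDate", "senderDateText",
    "addressee", "addresseeID", "addresseePlace", "addresseePlaceID",
    "addresseeDate", "addresseeDateText",
    "note",
]

# Invented sample data; the IDs are placeholders, not real records.
SAMPLE_ROWS = [
    {
        "sender": "Sender, Example",
        "senderID": "http://viaf.org/viaf/0000000",
        "senderPlace": "Example Town",
        "senderPlaceID": "http://www.geonames.org/0000000",
        "senderDate": "1884-08-11",
        "addressee": "Addressee, Example",
        "addresseeID": "http://viaf.org/viaf/0000001",
    }
]


def build_table(rows):
    """Return the rows as comma-separated CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)  # columns missing from a row are left empty
    return buffer.getvalue()


if __name__ == "__main__":
    print(build_table(SAMPLE_ROWS))
```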

If your letters are printed across different editions, add an "edition" column and enter the bibliographic records there. The numbering of the letters should be given in an additional column named "key". Alternatively, you may enter a link to the edited letter on the web in this column.

If a date is put in brackets, @cert is set to "medium"; for <persName>, <orgName>, and <placeName>, @evidence is set in the same way.

CSV2CMI also supports Extended Date/Time Format Level 0 features, i.e. in addition to specific dates you may specify an interval like 1884-08-11/1886. Some Level 1 and Level 2 features are supported, too: Extended Interval and Sets, i.e. dates like ../1985-04-12 or [1967-11,1969-10,1970-01..1973-05] may be entered. Interval dates are converted to their respective @from and @to attributes; the outermost dates of sets are stored in @notBefore and @notAfter.
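
The mapping from intervals and sets to date attributes can be sketched as follows. This is a minimal illustration, not CSV2CMI's actual code: the function name date_attributes is hypothetical, the use of @when for a single date and the omission of an attribute for an open interval end are assumptions, and the @cert handling for bracketed dates is left out.

```python
def date_attributes(value):
    """Map a date string to a dict of attribute names and values (sketch)."""
    value = value.strip()

    # Set, e.g. [1967-11,1969-10,1970-01..1973-05]: keep the outermost dates.
    if value.startswith("[") and value.endswith("]"):
        endpoints = []
        for member in value[1:-1].split(","):
            # A member is either one date or a range "start..end".
            endpoints.extend(part for part in member.strip().split("..") if part)
        # Plain string comparison; adequate for ISO-style dates in this sketch.
        return {"notBefore": min(endpoints), "notAfter": max(endpoints)}

    # Interval, e.g. 1884-08-11/1886 or ../1985-04-12 (open start).
    if "/" in value:
        start, end = value.split("/", 1)
        attributes = {}
        if start not in ("", ".."):
            attributes["from"] = start
        if end not in ("", ".."):
            attributes["to"] = end
        return attributes

    # Single date (assumption: stored in @when).
    return {"when": value}


if __name__ == "__main__":
    for example in ("1884-08-11/1886", "../1985-04-12",
                    "[1967-11,1969-10,1970-01..1973-05]"):
        print(example, date_attributes(example))
```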

Authority control

For the sender and addressee, IDs from the following authority files are currently supported:

⚠️ Use the full URL to the dataset! So instead of just 303480271 use http://viaf.org/viaf/303480271.
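
Before converting, you may want to check your table for bare IDs. The following sketch uses only the Python standard library; the function name is_full_url is made up for this example, and the check only tests the general shape of a URL, not whether the authority file is supported.

```python
from urllib.parse import urlparse


def is_full_url(identifier):
    """Return True if the identifier looks like a complete http(s) URL."""
    parts = urlparse(identifier.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


if __name__ == "__main__":
    for identifier in ("303480271", "http://viaf.org/viaf/303480271"):
        print(identifier, is_full_url(identifier))
```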

Misc

The resulting file is a minified XML file. Metadata can be given in the configuration file csv2cmi.ini.

⚠️ Windows users: CSV files exported from Excel will not work by default; use commas as separators.
⚠️ Check that your table uses UTF-8 encoding!
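
If your table was exported with a different separator or encoding, it can be rewritten with Python's csv module. The sketch below assumes a semicolon-separated file in the Windows-1252 (cp1252) encoding; both are guesses about a typical export and can be changed through the parameters. The function name convert_to_comma_utf8 is made up for this example.

```python
import csv


def convert_to_comma_utf8(source, target,
                          source_encoding="cp1252", source_delimiter=";"):
    """Rewrite a CSV file as comma-separated, UTF-8 encoded text."""
    with open(source, newline="", encoding=source_encoding) as infile, \
         open(target, "w", newline="", encoding="utf-8") as outfile:
        reader = csv.reader(infile, delimiter=source_delimiter)
        # csv.writer uses commas by default and quotes fields where needed.
        csv.writer(outfile).writerows(reader)
```

A call such as convert_to_comma_utf8("letters_excel.csv", "letters.csv") would then produce a file with commas as separators; the file names are placeholders.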

Options

By default, only edited letters (i.e. letters with a given edition) are transferred to the CMI output. If you want to convert your complete catalogue, use the -a option.
To transfer the given notes to the CMI output as well, use the -n option.
The --line-numbers option adds an @n attribute to <correspDesc> containing the corresponding line number in the table. This is especially useful for keeping track while editing the TEI file manually.
The --verbose mode (option -v) gives more information during the conversion.
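
The effect of the default selection and of the -a option can be pictured with the following sketch. It is not CSV2CMI's code: the function select_rows is hypothetical, and counting the header as line 1 (so that the first letter is on line 2) is an assumption about how line numbers are assigned.

```python
def select_rows(rows, include_all=False):
    """Yield (line_number, row) for every row that would be converted.

    rows: dictionaries as produced by csv.DictReader.
    include_all: mimics the described effect of the -a option.
    """
    # Assumption: the header is line 1, so data rows start at line 2.
    for line_number, row in enumerate(rows, start=2):
        has_edition = bool((row.get("edition") or "").strip())
        if include_all or has_edition:
            yield line_number, row


if __name__ == "__main__":
    sample = [
        {"sender": "A", "edition": "Example Edition, vol. 1"},
        {"sender": "B", "edition": ""},
    ]
    print([number for number, _ in select_rows(sample)])
    print([number for number, _ in select_rows(sample, include_all=True)])
```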

Configuration

An INI file (csv2cmi.ini) stores project-specific metadata such as the name and email address of the editor. The editor is declared in the format defined by RFC 5322, i.e. a display name followed by the email address in angle brackets.
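
To show what an RFC 5322 style declaration looks like, the following sketch reads an editor entry with the Python standard library. The section name "Project", the key "editor", and the person are invented for this example; check the csv2cmi.ini shipped with the program for the actual names.

```python
import configparser
from email.utils import parseaddr

# Hypothetical INI content; section and key names are assumptions.
SAMPLE_INI = """\
[Project]
editor = Jane Doe <jane.doe@example.org>
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)

# parseaddr splits "Display Name <address>" into its two parts.
name, address = parseaddr(config["Project"]["editor"])

if __name__ == "__main__":
    print(name)
    print(address)
```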

System requirements

CSV2CMI should run out of the box on every system. If it doesn't, make sure Python 3 is installed.

License

This program is available under The MIT License (MIT).
