French novel collection for the ELTeC (European Literary Text Collection)

COST-ELTeC/ELTeC-fra

ELTeC-fra

This is the French novel corpus for the ELTeC, the European Literary Text Collection, produced by the COST Action Distant Reading for European Literary History (CA16204, https://distant-reading.net). The current version is v1.0.1.

Note that this corpus is also available in a linguistically-annotated format prepared for direct import into the text analysis tool TXM; see here: 10.5281/zenodo.4274478. This format is based on v1.0.0 of the corpus.

An overview of the authors and works represented in the collection is available here: https://distantreading.github.io/ELTeC/fra/index.html.

Contributors

  • Collection editors: Christof Schöch and Lou Burnard
  • Contributors: Pia Geißel, Rezearta Murati, Evegnia Fileva
  • Sources: Bibliothèque nationale de France (Gallica), Ebooks libres et gratuits / Bibliothèque électronique du Québec, CLiGS textbox, Wikisource, Bibebook.com, Atramenta, OBVIL, Project Gutenberg.

Licence

All texts included in this collection are in the public domain. No claim to copyright or similar protections is made for the composition of the corpus, the collection and presentation of the metadata, or the transcription and encoding of the texts.

Citation suggestion

If you use this corpus in your research or teaching, please follow good scholarly practice and use the following citation suggestion to acknowledge your source:

  • French Novel Corpus (ELTeC-fra), edited by Christof Schöch and Lou Burnard. Version v1.0.1, April 2021. In: European Literary Text Collection (ELTeC). COST Action Distant Reading for European Literary History. DOI: https://doi.org/10.5281/zenodo.4662433
@collection{schoech_ELTeCfra_2020,
  title = {French Novel Collection (ELTeC-fra)},
  maintitle = {European Literary Text Collection (ELTeC)},
  editor = {Schöch, Christof and Burnard, Lou},
  version = {v1.0.1},
  year = {2021},
  month = {4},
  publisher = {COST Action Distant Reading for European Literary History},
  url = {https://github.com/COST-ELTeC/ELTeC-fra/},
  doi = {10.5281/zenodo.4662433},
}

Release notes

General information about ELTeC releases is available at https://github.com/COST-ELTeC/ELTeC.

The concept DOI for all versions of ELTeC-fra is the following: https://doi.org/10.5281/zenodo.3462535.

  • recent changes: A level-2-encoded version valid against the level2_strict schema (with <s>...</s> tags) is now available (June 2023).
  • recent changes: A linguistically-annotated version (level 2 encoding) is now available, ahead of a v2.0.0 release.
  • v1.0.1, April 2021: This release includes 100 novels in level 1 encoding. Minor updates to the metadata were provided with this release. The DOI of this release is: 10.5281/zenodo.4662433
  • v1.0.0, November 2020: This release includes 100 novels in level 1 encoding. With this release, a corpus compliance score (E5C) of 100 was reached. The DOI of this release is: 10.5281/zenodo.4264647
  • v0.9.1, June 2020: This release includes 100 novels in level 1 encoding. Some further enhancements remain planned as work towards v1.0.0. See: v0.9.1 and issues in milestone v1.0.0. The E5C score of this release is 97.7/100.
  • v0.9.0, May 2020: There are now 100 novels in level 1 encoding. The corpus composition criteria are met and major bugs are fixed, but some enhancements are still planned as work towards v1.0.0. See: v0.9.0 and issues in milestone v1.0.0.
  • v0.8.0 (deprecated), November 2019: The corpus contains 82 novels encoded at level 1. The corpus composition criteria are not yet fully met.
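
The release notes mention that the level-2 encoding marks sentences with <s>...</s> tags. As a rough illustration of how such markup could be read, here is a minimal Python sketch that extracts sentences from a TEI document using only the standard library. The sample document is a hypothetical, heavily simplified stand-in and not an actual file from this corpus; real level-2 files may contain further markup inside each sentence (for example word-level elements), so the way text is joined here is an assumption.

```python
import xml.etree.ElementTree as ET

# TEI namespace; ELTeC files are TEI documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, simplified sample (NOT taken from the corpus).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="chapter">
        <p><s>Il pleuvait.</s> <s>Elle sortit quand même.</s></p>
        <p><s>Personne ne la vit.</s></p>
      </div>
    </body>
  </text>
</TEI>"""


def sentences(xml_text):
    """Return the text of every <s> element, with whitespace normalized."""
    root = ET.fromstring(xml_text)
    result = []
    for s in root.iterfind(".//tei:s", TEI_NS):
        # itertext() also collects text from any child elements of <s>.
        text = " ".join("".join(s.itertext()).split())
        result.append(text)
    return result


if __name__ == "__main__":
    for number, sentence in enumerate(sentences(SAMPLE), start=1):
        print(number, sentence)
```

To process a file from disk instead, the same function could be called with the file's contents, for example `sentences(open(path, encoding="utf-8").read())`, where `path` is whatever location you saved a level-2 file to.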