Converts PDFs into the ATRIUM JSON format described below:
#########################################################
# T4.1.2 PDF report text extraction -
# suggested JSON output per PDF report.
# Each report may be stored as its own JSON file, or as
# one record (line) in a JSONL file.
# Note: the text of individual sections is not stored
# separately, because each section can be recovered from
# the full text using its start/end character positions.
#########################################################
{
	"meta": {
		(any metadata about the source document)
	},
	"text": (full text contents of the document),
	"sections": [
		{ 
			"start": (number),	<-- character position
			"end": (number),	<-- character position
			"type": (string)	<-- e.g. "title", "subtitle", "abstract", "conclusions", "references"
		}
	]
}
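
As a rough illustration of the layout above, the following Python sketch builds one record, recovers a section's text from its character positions, and serialises records as JSONL. The helper names (make_record, section_text, to_jsonl), the sample text, and the "filename" metadata key are hypothetical and not part of the specification. It also assumes that "start" and "end" are zero-based offsets into "text" with "end" exclusive (Python slice convention), which the format description does not state.

```python
import json


def make_record(text, sections, meta=None):
    """Build one report record in the layout shown above (hypothetical helper)."""
    for s in sections:
        # Assumption: zero-based offsets into `text`, with "end" exclusive.
        if not (0 <= s["start"] <= s["end"] <= len(text)):
            raise ValueError(f"section out of range: {s}")
    return {"meta": meta or {}, "text": text, "sections": sections}


def section_text(record, section):
    """Recover a section's text from the full text and its positions."""
    return record["text"][section["start"]:section["end"]]


def to_jsonl(records):
    """Serialise records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)


# Made-up sample document, for illustration only.
text = "Excavation Report\nAbstract: a short summary."
record = make_record(
    text,
    [
        {"start": 0, "end": 17, "type": "title"},
        {"start": 18, "end": len(text), "type": "abstract"},
    ],
    meta={"filename": "example.pdf"},  # hypothetical metadata key
)

title = section_text(record, record["sections"][0])
jsonl = to_jsonl([record])
```

Because json.dumps escapes newlines inside strings, each record stays on a single line even when the full text spans many lines, so the JSONL output can be read back line by line with json.loads.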