
Template Length metrics

Clay McLeod edited this page Sep 28, 2022 · 5 revisions

The Template Length metrics facet reports statistics on the template lengths of the records contained within the file. The report is delivered under the template_length key within the results.json file. You can easily examine the output of this facet by using jq:

cat results.json | jq .template_length

Overview

A histogram that supports the range [0, 1024] is instantiated with one counter per bin, each initialized to zero. For every record in the file, the facet examines the template length field of the record. If the template length falls within the range of the histogram, the counter for the matching bin is incremented and the processed record count is increased by one. If the template length falls outside the range of the histogram, no bins are incremented and the ignored record count is increased by one.
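The counting logic described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the class name `TemplateLengthTally`, its method `observe`, and the attribute names are hypothetical, and treating negative template lengths as out of range is an assumption (the page does not say how signed values are handled).

```python
from dataclasses import dataclass, field

MAX_TEMPLATE_LENGTH = 1024  # upper bound of the histogram range [0, 1024]


@dataclass
class TemplateLengthTally:
    """Hypothetical stand-in for the facet's internal state."""

    # One counter per bin, covering 0 through 1024 inclusive, all starting at zero.
    histogram: list = field(
        default_factory=lambda: [0] * (MAX_TEMPLATE_LENGTH + 1)
    )
    processed: int = 0
    ignored: int = 0

    def observe(self, template_length: int) -> None:
        """Tally one record's template length."""
        if 0 <= template_length <= MAX_TEMPLATE_LENGTH:
            # In range: bump the matching bin and the processed count.
            self.histogram[template_length] += 1
            self.processed += 1
        else:
            # Out of range (assumption: this includes negative values):
            # no bin changes, only the ignored count grows.
            self.ignored += 1


tally = TemplateLengthTally()
for tlen in [0, 150, 150, 1024, 1025, -200]:
    tally.observe(tlen)
```

With the six example values, four fall inside [0, 1024] and are counted as processed, while 1025 and -200 are counted as ignored.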

Output

This facet has the following top-level keys:

Key Description
histogram Contains a histogram with capacity [0, 1024] that represents how many records have a given template length within that range.
records Contains metrics related to simple record counting for this facet. Includes details on how many records were processed versus how many were ignored (typically due to the template length being out of range of the histogram).
summary Contains summary statistics regarding this QC facet, most notably percentages regarding how many template lengths were unknown and how many fell outside of the range of the histogram.

Histogram

As described in the overview of this QC facet above, the histogram field contains the record count for each template length supported by the histogram. These values can be used to plot a template length curve, which typically resembles a smooth normal or gamma distribution.
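Before plotting, it can help to reduce the histogram to a couple of numbers. The sketch below assumes the counts have already been loaded into a Python list indexed by template length; the function name `describe_histogram` is hypothetical, and the exact serialization of the histogram in results.json is not shown on this page, so the loading step is left out.

```python
def describe_histogram(counts):
    """Return the mode and mean template length, skipping bin 0.

    counts[i] is the number of records with template length i.
    Bin 0 is excluded because a template length of 0 means "unknown".
    """
    known_bins = range(1, len(counts))
    total = sum(counts[i] for i in known_bins)
    if total == 0:
        return None  # nothing but unknown template lengths
    mode = max(known_bins, key=lambda i: counts[i])
    mean = sum(i * counts[i] for i in known_bins) / total
    return {"mode": mode, "mean": mean}


# Toy data: 10 unknown records, 2 records at length 100, 6 at length 200.
counts = [0] * 1025
counts[0] = 10
counts[100] = 2
counts[200] = 6
stats = describe_histogram(counts)
```

A mode far from the expected library fragment size, or a mean pulled well away from the mode, is a hint that the curve is not the smooth single-peaked shape described above.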

Records

Contains the total number of processed and ignored records as defined in the overview of this QC facet above.

Summary

Contains useful statistics regarding (a) the percentage of records that have a template length of 0 (unknown) and (b) the percentage of records whose template length fell outside of the histogram's supported range.
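As a rough illustration of how these two percentages relate to the other fields, the sketch below derives them from the histogram and the record counts. The function name `summarize` and the output key names are hypothetical, and using processed plus ignored records as the denominator is an assumption; the page does not state which denominator the tool uses.

```python
def summarize(histogram, processed, ignored):
    """Derive the two summary percentages from the facet's counters.

    Assumption: percentages are taken over all records seen
    (processed + ignored).
    """
    total = processed + ignored
    if total == 0:
        return {"unknown_pct": 0.0, "out_of_range_pct": 0.0}
    return {
        # Bin 0 holds records whose template length is 0 (unknown).
        "unknown_pct": 100.0 * histogram[0] / total,
        # Ignored records are those that fell outside [0, 1024].
        "out_of_range_pct": 100.0 * ignored / total,
    }


# Toy data: 4 in-range records (one of them unknown) and 1 out-of-range record.
histogram = [0] * 1025
histogram[0] = 1
histogram[150] = 2
histogram[1024] = 1
summary = summarize(histogram, processed=4, ignored=1)
```

In this toy case, one of five records is unknown and one of five is out of range, so both percentages come out to 20.0.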
