SAM/BAM/CRAM readers don't support @CO header #315

@athos

Description

The SAM/BAM/CRAM specification provides the @CO header line for one-line comments. It is defined as follows:

@CO: One-line text comment. Unordered multiple @CO lines are allowed. UTF-8 encoding may be used.
https://github.com/samtools/hts-specs/blob/be74ef71f3fad34eb86af83bd66338d7d569af99/SAMv1.tex#L356

However, the current implementation of the SAM/BAM reader doesn't read @CO header lines properly.

Repro

$ samtools view -h header_comment.sam
@SQ	SN:chr1	LN:1000	M5:258e88dcbd3cd44d8e7ab43f6ecb6af0
@CO	This is a comment.
@CO	This is also a comment.

(require '[cljam.io.sam :as sam])

(with-open [r (sam/reader "header_comment.sam")]
  (sam/read-header r))
;=>
{:SQ [{:SN "chr1", :LN 1000, :M5 "258e88dcbd3cd44d8e7ab43f6ecb6af0"}],
 :CO [{:This is a comment. nil} {:This is also a comment. nil}]}

Note that each comment's text is read as a keyword (whitespace included) with a nil value, instead of being kept as a string.
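
For illustration, here is a minimal sketch of one way @CO lines could be special-cased while parsing header lines. It is not cljam's actual implementation: `parse-header-line` and `parse-header` are hypothetical helpers, and representing `:CO` as a vector of plain strings is an assumption about the desired result shape, not something the project has decided.

```clojure
(require '[clojure.string :as cstr])

(defn parse-header-line
  "Hypothetical helper (not part of cljam): parses one SAM header line
  into a [record-type value] pair."
  [line]
  (let [[tag body] (cstr/split line #"\t" 2)
        record-type (keyword (subs tag 1))]
    (if (= record-type :CO)
      ;; @CO carries free text, so keep everything after the first tab verbatim.
      [record-type (or body "")]
      ;; Other record types are tab-separated TAG:VALUE fields.
      ;; Values stay strings here; no numeric conversion is attempted.
      [record-type
       (into {}
             (map (fn [field]
                    (let [[k v] (cstr/split field #":" 2)]
                      [(keyword k) v])))
             (cstr/split body #"\t"))])))

(defn parse-header
  "Hypothetical helper: groups parsed header lines by record type,
  preserving the order in which lines appear."
  [lines]
  (reduce (fn [m line]
            (let [[k v] (parse-header-line line)]
              (update m k (fnil conj []) v)))
          {}
          lines))

(parse-header ["@SQ\tSN:chr1\tLN:1000\tM5:258e88dcbd3cd44d8e7ab43f6ecb6af0"
               "@CO\tThis is a comment."
               "@CO\tThis is also a comment."])
```

With this approach the comment text survives as an ordinary string, including any spaces, colons, or tabs it contains, because the line is split only at the first tab.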
