Skip to content

Various spec violations #24

Open
Open
@jakobnissen

Description

Dear @CiaranOMara

I've been working a bit with XAM.jl. Great package, but I've found quite a number of small errors and mistakes I think should be fixed. Making one PR for each would be overwhelming, especially since fixing some of them has implications that are not straightforward. So I'd like to discuss it a bit here:

List of errors/inconsistencies

  • Missing qualities are set to 0xff, but hasquality does not check this, so the result may be wrong
  • refname should be accessible even if the read is unmapped (per specs section 1.4.3)
  • isnextmapped checks if the record is filled, but ismapped doesn't
  • hasauxdata doesn't even check if the record has aux data
  • quality(record) raises an obscure error if a quality is > 127
  • function MetaInfo should check that a tag has two characters per the specs
  • ismapped, isprimary and ispositive strand errors when called on an unfilled record, BUT isnextmapped doesn't
  • hasrefid, hasrefname and hasposition just returns if the record is filled, not what is actually asked. But hasrightposition behaves differently. Worse, refname behaves differently too, meaning you could have hasrefname() == true, yet it still errors when the refname is fetched.

Proposals/enhancements

  • We should just get rid of unfilled records, they make little sense as a concept. Instead, we initialize every record as an empty, but valid record. This way, we never need to check if the record is filled, since it always is.
  • Then we can remove e.g. hasflag since every record always have flags
  • all hasX function should return a Bool
  • if hasX function returns false, the accessor (X) function should error.
  • Rewrite the memory layout of the BAM record so it exactly matches the specifications. This will not result in any performance issues, just make the underlying code simpler and make it easier to interface with C in the future

I've implemented many of these changes at https://github.com/jakobnissen/XAM.jl, but that "fork" is stale, and I'd rather merge the fixes into the original XAM.jl

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions