Large number of BAM files leads to Error in vec_interleave_indices(): #450

@NikoLichi

Dear Bambu team,

I am running a large project with 480 BAM files (~4.8 TB of data in total).
Following the previous suggestion for Bambu, I am first running the annotation extension step (quant = FALSE), with the idea of running the quantification later in batches.

However, there is a major issue when starting the extended annotations:

--- Start extending annotations ---
Error in `vec_interleave_indices()`:
! Long vectors are not yet supported in `vec_interleave()`. Result from interleaving would have size 8857886400, which is larger than the maximum supported size of 2^31 - 1.
Backtrace:
     ▆
  1. ├─bambu::bambu(...)
  2. │ └─bambu:::bambu.extendAnnotations(...)
  3. │   └─bambu:::isore.combineTranscriptCandidates(...)
  4. │     ├─... %>% data.table()
  5. │     └─bambu:::combineSplicedTranscriptModels(...)
  6. │       └─bambu:::updateStartEndReadCount(combinedFeatureTibble)
  7. │         └─... %>% mutate(sumReadCount = sum(readCount, na.rm = TRUE))
  8. ├─data.table::data.table(.)
  9. ├─dplyr::mutate(., sumReadCount = sum(readCount, na.rm = TRUE))
 10. ├─dplyr::group_by(., rowID)
 11. ├─tidyr::pivot_longer(...)
 12. ├─tidyr:::pivot_longer.data.frame(...)
 13. │ └─tidyr::pivot_longer_spec(...)
 14. │   └─vctrs::vec_interleave(!!!val_cols, .ptype = val_type)
 15. │     └─vctrs:::vec_interleave_indices(n, size)
 16. └─rlang::abort(message = message)
Execution halted
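
The numbers in the error message can be checked with a little arithmetic. The sketch below uses only values copied from the message and from this report. The observation that the reported size divides evenly by the number of BAM files is consistent with a pivot over per-sample columns, but that reading is an assumption, and the batch estimate at the end is a rough back-of-envelope figure, since the number of candidate rows will itself change with the number of samples.

```r
# Values copied from the error message and the issue text.
reported_size <- 8857886400   # size vec_interleave() would have produced
max_size      <- 2^31 - 1     # largest supported vector length (2147483647)
n_samples     <- 480          # number of BAM files in this run

# The requested vector is about 4.1 times larger than the limit.
exceeds_limit <- reported_size > max_size
ratio         <- reported_size / max_size

# The reported size is an exact multiple of the number of BAM files.
divides_evenly  <- reported_size %% n_samples == 0
rows_per_sample <- reported_size / n_samples   # 18453930

# ASSUMPTION: if the row count stayed fixed at this value, this is the
# largest number of samples that would stay under the limit.
max_batch <- floor(max_size / rows_per_sample)  # 116
```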

Is there anything I could do to run Bambu?

My code looks like:

library(bambu)

BAMlist <- BAMs_one_per_Line
fa.file <- "/refData/release46/GRCh38.primary_assembly.genome.fa"
gtf.file <- "/refData/release46/gencode.v46.primary_assembly.annotation.gtf"
bambuAnnotations <- prepareAnnotations(gtf.file)

extendedAnnotations <- bambu(reads = BAMlist, annotations = bambuAnnotations,
                             genome = fa.file, quant = FALSE, lowMemory = TRUE,
                             ncore = 14, rcOutDir = "MY_PATH/bambu_20241015_all")
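
For the planned batched quantification, splitting the file list could look roughly like the sketch below. The file names and the batch size of 100 are hypothetical placeholders, not values from this project or a recommendation from bambu, and the commented-out call only illustrates how each batch might be passed on once extended annotations are available.

```r
# Hypothetical stand-in for the real BAM paths (BAMlist in the code above).
bam_files <- sprintf("sample_%03d.bam", 1:480)

# ASSUMPTION: illustrative batch size only.
batch_size <- 100

# Assign each file a batch index (1, 1, ..., 2, 2, ...) and split on it.
batch_id <- ceiling(seq_along(bam_files) / batch_size)
batches  <- split(bam_files, batch_id)

n_batches   <- length(batches)    # 5
batch_sizes <- lengths(batches)   # 100 100 100 100 80

# Each batch could then be quantified against the extended annotations,
# with discovery switched off (not run here):
# se_list <- lapply(batches, function(b)
#   bambu(reads = b, annotations = extendedAnnotations,
#         genome = fa.file, discovery = FALSE, quant = TRUE))
```

Splitting on an integer index keeps the original file order within and across batches, so the results can be matched back to the sample list afterwards.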

As an additional note, I also get the same warning message that others have reported in issue #407.

This is with R 4.3.2 and Bioc 3.18 / bambu (3.4.1).
Platform: x86_64-conda-linux-gnu (64-bit)

All the best,
Niko

Labels: enhancement
