Release Bpipe Version 0.9.9.7 · ssadedin/bpipe

Summary

This release includes several major new features, including preliminary support for
running Bpipe pipelines on cloud providers (Google Cloud, Amazon Web Services) and a
new merge point operator that makes it easier to
construct parallel pipelines using scatter-gather parallelism. In addition,
significant work has been done to improve performance and reduce
resource consumption on very highly parallel pipelines with large numbers of
input / output files.

Features

Preliminary support for executing pipelines on Google Cloud Services
(Compute Engine) and mounting storage for pipelines from Google Cloud
Storage
Preliminary support for executing pipelines on Amazon Web Services
using EC2 and mounting storage for pipelines from S3
The 'groovy' command can now run embedded Groovy (executed outside
Bpipe) using the Groovy runtime bundled with Bpipe
Support aliasing to string values in addition to outputs
Experimental support for beforeRun hook in command config: execute
arbitrary Groovy code before a command executes
Many performance improvements, especially for large, highly
parallel pipelines
Support configuring the number of retries when polling the status
of HPC jobs (statusPollRetries setting)
Support for 'optional' inputs in pipelines: to make an input optional,
suffix it with 'optional'. 'flag' can also be appended to add flags
in commands, e.g. ${input.csv.optional.flag('--csv')}
New operator: merge point operator (>>>) automatically configures a stage
to merge outputs from a previous parallel split
Add region.bedFlag(flag) method for convenience when passing
regions to commands
'var' expressions may now be added in the main pipeline script,
not just in pipeline stages. These define optional
variables and provide a default value.
JMS support now responds to a 'ping' message with a 'pong' reply
if the JMS 'Reply-To' is set, to allow for status monitoring
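
A minimal sketch of how three of the features above might be combined in one pipeline script: a top-level 'var', an optional input with 'flag', and the merge point operator. The stage names, the variable min_qual, the tool my_tool and the file extensions are hypothetical and chosen only for illustration. The exact text that the optional 'flag' expression expands to is not stated in these notes, so the comments describe the intent rather than guaranteed behaviour.

```groovy
// Hypothetical example: my_tool, min_qual and the stage names are not part of Bpipe.

// A 'var' in the main script defines an optional variable with a default value.
var min_qual : 20

count_lines = {
    // The .optional suffix marks the csv input as optional; .flag('--csv') is
    // intended to add the --csv flag to the command (form taken from the notes above).
    exec """
        my_tool --min-qual $min_qual ${input.csv.optional.flag('--csv')} $input.txt > $output.count
    """
}

merge_counts = {
    // After a merge point this stage is expected to receive the outputs
    // of all the parallel branches as its inputs.
    exec "cat $inputs.count > $output.txt"
}

run {
    // Scatter: one branch per matching .txt input; gather: >>> merges into merge_counts.
    "%.txt" * [ count_lines ] >>> merge_counts
}
```

Such a script would be started in the usual way, for example with bpipe run followed by the script name and the input files.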

Fixes

Fix incorrect "abnormal termination" messages
printed to console when a pipeline is stopped with 'bpipe stop'
Fix incorrect 'pre-existing' printed for outputs that were
created by the pipeline
Fix genome not being accessible in a pipeline the first time it is downloaded,
which printed an error
Re-execute checks if a command in the same stage has executed
Synchronize initialization of dir watcher to fix sporadic
ConcurrentModificationExceptions
Fix empty embedded parallel stage list causing resolution of incorrect
downstream input
Fix leak of 'var' variables across branches when 'using' applied to
pipeline stage
Fix error if 4 or more arguments passed to "to" in transform
Fix Bpipe spuriously complaining that outputs were not created on retry,
though not on the original run
Fix some bugs where branch names were not being observed
Fix branch name sometimes inserted without separating period for transforms
Avoid redundantly putting branch name into files
Improved detail in error / log messages in a few places
Fix missing branch and '..' in filenames
Change: globally defined variables must now be held constant
once the pipeline starts
Fix split regions not being stable between runs; the region id is now set as the branch
name
Fix bed.split producing different splits if run repeatedly on same bed
Fix errors output if SLF4J referenced in user loaded libraries
Fix NPE / improve error message when filter is used with a mismatching output extension
Fix error in stage body resulting in confusing 'no associated storage'
assertion failure
Add 'allowForeign' option to 'from' to let it process non-outputs
Reduce the number of retries and the retry interval when a file cannot be cleaned
up
