Skip to content
66 changes: 65 additions & 1 deletion Code/RELICS.v2.R
Original file line number Diff line number Diff line change
Expand Up @@ -527,7 +527,7 @@ check_parameter_list <- function(input.parameter.list, data.file.split){
}
}

# type of area of effect. options are 'normal' (default), 'uniform', 'logistic' (need to implement)
# type of area of effect. options are 'normal' (default), 'uniform', 'slab_and_spike'
if(! 'areaOfEffect_type' %in% par.given){
out.parameter.list$areaOfEffect_type <- 'normal'
}
Expand Down Expand Up @@ -632,6 +632,26 @@ check_parameter_list <- function(input.parameter.list, data.file.split){
}
}
}

# set default parameters for slab and spike AoE
if (out.parameter.list$areaOfEffect_type == "slab_and_spike") {

# set default flanking distance to 500bp
if (! 'flanking.distance' %in% par.given) {
out.parameter.list$flanking.distance <- 500
}

# set default slab AoE to 0.1
if (! 'slab.aoe' %in% par.given) {
out.parameter.list$slab.aoe <- 0.1
}

# set default CRISPR effect range to zero if not provided
if (! 'crisprEffectRange' %in% par.given) {
out.parameter.crisprEffectRange <- 0
}
}


# if(out.parameter.list$areaOfEffect_type == 'normal'){
# if(! 'normal_areaOfEffect_sd' %in% par.given){
Expand Down Expand Up @@ -2866,6 +2886,27 @@ set_up_RELICS_data <- function(input.parameter.list, data.file.split, guide.offs
dnorm(0, mean = 0, sd = input.parameter.list$normal_areaOfEffect_sd)

x$dist_to_seg[which(x$dist_to_seg == max(x$dist_to_seg))] <- 1
x
})
} else if(input.parameter.list$areaOfEffect_type == 'slab_and_spike'){

# create variables to hold flanking distance and slab area of effect
flanking.distance <- input.parameter.list$flanking.distance
slab.aoe <- input.parameter.list$slab.aoe

# iterate through every guide in guide.to.seg.lst
guide.to.seg.lst <- lapply(guide.to.seg.lst, function(x){

# set all segments to have a default AoE of 0
x$dist_to_seg <- rep(0, length(x$dist_to_seg))

# set AoE to slab value if within flanking distance
within.flank <- x$dist_to_seg < flanking.distance
x$dist_to_seg[within.flank] <- input.parameter.list$slab.aoe

# set closest segment to have an AoE of 1 (spike)
x$dist_to_seg[which(x$dist_to_seg == min(x$dist_to_seg))] <- 1

x
})
}
Expand Down Expand Up @@ -2902,6 +2943,29 @@ set_up_RELICS_data <- function(input.parameter.list, data.file.split, guide.offs

x$dist_to_seg <- min.dist

x
})
} else if (input.parameter.list$areaOfEffect_type == 'slab_and_spike'){

# create variables to hold flanking distance and slab area of effect
flanking.distance <- input.parameter.list$flanking.distance
slab.aoe <- input.parameter.list$slab.aoe

# apply slab and spike area of effect for each guide in guide.to.seg.lst
guide.to.seg.lst <- lapply(guide.to.seg.lst, function(x){

# set baseline AoE of 0
min.dist <- rep(0, length(x$dist_to_seg))

# seg segments within flanking distance of 500bp to slab AoE
min.dist[x$dist_to_seg < flanking.distance] <- slab.aoe

# set segments that guide overlaps to 1
min.dist[x$sg_overl_idx] <- 1

# return modified dist_to_seg vector accounting for AoE
x$dist_to_seg <- min.dist

x
})
}
Expand Down
15 changes: 11 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -141,7 +141,7 @@ relics.parameters$FS0_label <- 'CD69_promoter' # use all sgRNAs that overlap the
# relics.parameters$background_label <- c('chr', 'exon') # specify what sgRNAs to use to initially train the background parameters
```

6. Specify the number of functional sequences to look for and the number of functional sequences you expect to find. RELICS will look for a total of `max_fs_nr` functional sequences and eight their signal according to the prior which is specified by the `expected_fs_nr`. We recommned setting the `max_fs_nr` to at least `expected_fs_nr` + 3. For larger `expected_fs_nr` we recommend setting `max_fs_nr` to 4/3*`expected_fs_nr`.
6. Specify the number of functional sequences to look for and the number of functional sequences you expect to find. RELICS will look for a total of `max_fs_nr` functional sequences and weight their signal according to the prior which is specified by the `expected_fs_nr`. We recommned setting the `max_fs_nr` to at least `expected_fs_nr` + 3. For larger `expected_fs_nr` we recommend setting `max_fs_nr` to 4/3*`expected_fs_nr`.
```r
# specify the expected number of functional sequences and how many to look for in total
relics.parameters$max_fs_nr <- 15
Expand All @@ -153,12 +153,12 @@ relics.parameters$expected_fs_nr <- 5 # expected based on previous findings by S
relics.parameters$crisprSystem <- 'CRISPRa' # other options: CRISPRcas9, CRISPRi, dualCRISPR
```

8. Give the location of the output directory by setting the `out_dir` flag. Either reference to full path or the path from the current working directory. In this example we will do the latter and assume you are in the `RELICS_tutorial` folder. We recommend you create a new file in which the results are saved. Note, RELICS will NOT create non-existent files for you. In this example, first create the `CD69_tutorial_output` folder, then set the flag:
8. Give the location of the output directory by setting the `out_dir` flag. Either reference to full path or the path from the current working directory. In this example we will do the latter and assume you are in the `RELICS_tutorial` folder. We recommend you create a new directory in which the results are saved. Note, RELICS will NOT create non-existent directories for you. In this example, first create the `CD69_tutorial_output` folder, then set the flag:
```r
relics.parameters$out_dir <- 'CD69_tutorial_output'
```

9. RELICS now explicityl models the count-dispersion relationship. This drastically imporves performance and helps reduce the number of false positives. See the section `Count-Dispersion modeling` below for details on how to best estimate `nr_disp_bins` and `repl_spline_df`:
9. RELICS now explicitly models the count-dispersion relationship. This drastically improves performance and helps reduce the number of false positives. See the section `Count-Dispersion modeling` below for details on how to best estimate `nr_disp_bins` and `repl_spline_df`:
```r
# specify the number of bins to group the guide counts into and the degrees of freedom of the spline function for each replicate
relics.parameters$nr_disp_bins <- 15
Expand Down Expand Up @@ -355,11 +355,18 @@ relics.parameters$normal_areaOfEffect_sd <- 170
relics.parameters$crisprEffectRange <- 415
```

It is also possible to nor model the AoE and assume a uniform parturbation instead where it's equally likely for any base pair to be perturbed
It is also possible to not model the AoE and assume a uniform perturbation instead where it's equally likely for any base pair to be perturbed
```r
relics.parameters$areaOfEffect_type <- 'uniform'
```

A third area of effect approach that can be used with RELICS is a "slab-and-spike" area of effect. In the slab and spike implementation, the genomic regions that overlap the guides are set to a "spike" area of effect of 1, while the base pairs flanking the guide are set to a "slab" area of effect. By default, RELICS uses a flanking distance of 500 base pairs and a "slab" of 0.1, but these parameters can be manually configured as well.
```r
relics.parameters$areaOfEffect_type <- 'slab_and_spike'
relics.parameters$flanking.distance <- 500
relics.parameters$slab.aoe <- 0.1
```


## Input data format contd. (for backward compatibility)
Instead of providing one joint file containing both coordinates and counts it is also possible to supply them separately. In this case the format is the following:
Expand Down