SIG: Bioconductor Infrastructure for Base Modifications

#### Introduction
I am a new PhD student at the Walter and Eliza Hall Institute in Melbourne, Australia. My project is based around methods and tools for the analysis of DNA methylation in long reads from Oxford Nanopore sequencers. My formal background is in statistics, but I mainly work on developing software and have a keen interest in efficient, user-friendly computational methods and visualisation.

#### Expected attendees
Researchers who are interested in base modifications of all kinds. I am interested in DNA, but the developed structure should equally support RNA modifications.

#### Should it be held during Developer Day
Probably

#### Description of the topic
(Will update this section after I do some more research and take suggestions)

I think there are a few things to keep in mind for this:
* Support for long reads. I don't think this is a big issue, as I'm not aware of GenomicRanges having any limitations on read length, but since I'm interested in Nanopore sequencing it's vitally important to have this support.
* Read-based tracking. With long reads I can potentially detect when two sites have correlated or anti-correlated methylation patterns on the same molecule, so I want to not only keep track of which read each call came from but also make efficient queries based on it.
* Support for RNA modifications. There are over a hundred of these. I think extended alphabets are sometimes used for representing DNA modifications, but that's likely not feasible here without creating FASTQ qual-string-like monstrosities.
* Interoperability with genomic data structures. Down the line it's very likely that methylation and mRNA expression will be analysed together, so facilitating this kind of analysis is of great interest.
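
To make the read-based tracking point a bit more concrete, here is a rough sketch of the kind of query I have in mind. It is written in plain Python purely for illustration and is not a proposal for the actual Bioconductor structure. The record layout `(read_id, position, is_modified)` and the names `index_by_read` and `pair_table` are made up for this example, and the data are toy values.

```python
from collections import defaultdict

# Hypothetical per-call records: (read_id, position, is_modified).
# Toy values for illustration only.
calls = [
    ("read1", 100, True), ("read1", 250, True),
    ("read2", 100, False), ("read2", 250, False),
    ("read3", 100, True), ("read3", 250, False),
    ("read4", 100, True),  # does not cover position 250
]


def index_by_read(calls):
    """Group calls by read so that per-molecule patterns can be queried."""
    by_read = defaultdict(dict)
    for read_id, pos, is_modified in calls:
        by_read[read_id][pos] = is_modified
    return by_read


def pair_table(by_read, pos_a, pos_b):
    """Count the 2x2 table of modification states at two sites,
    using only reads that cover both sites."""
    table = {(a, b): 0 for a in (True, False) for b in (True, False)}
    for sites in by_read.values():
        if pos_a in sites and pos_b in sites:
            table[(sites[pos_a], sites[pos_b])] += 1
    return table


by_read = index_by_read(calls)
table = pair_table(by_read, 100, 250)
```

The resulting 2x2 table is what a correlation or association measure between the two sites would be computed from. The point is that the read identifier has to survive into the stored representation, otherwise this query is impossible.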

As far as I'm aware there's no specialised, widely supported Bioconductor structure for storing base modification information that also makes common queries straightforward. The most basic query would be the methylation proportion in a specific region. Objects should carry metadata that separates samples into groups so the query can be asked per group, and the coverage at each locus should be reported alongside the proportion. Additionally, it would be useful to query within-read methylation patterns, to inspect correlation between methylation sites within molecules. Compactness of representation is also going to be important, so sparse or on-disk representations are worth considering. Features and query performance probably take second place to storage size.
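
As a rough sketch of the basic region query, assuming per-site counts are already summarised by group: the row layout `(chrom, pos, group, n_modified, n_total)` and the function `region_summary` are hypothetical, the numbers are toy values, and Python is used only to illustrate the shape of the query rather than to suggest an implementation.

```python
from collections import defaultdict

# Hypothetical per-site summary rows:
# (chrom, pos, group, n_modified, n_total). Toy values only.
sites = [
    ("chr1", 100, "control", 8, 10),
    ("chr1", 150, "control", 2, 10),
    ("chr1", 100, "treated", 3, 12),
    ("chr1", 900, "treated", 5, 5),
    ("chr2", 120, "control", 1, 4),
]


def region_summary(sites, chrom, start, end):
    """Per-group modification proportion and total call count
    for sites with start <= pos <= end on the given chromosome."""
    totals = defaultdict(lambda: [0, 0])  # group -> [n_modified, n_total]
    for s_chrom, pos, group, n_modified, n_total in sites:
        if s_chrom == chrom and start <= pos <= end:
            totals[group][0] += n_modified
            totals[group][1] += n_total
    return {
        group: {"coverage": n_total, "proportion": n_modified / n_total}
        for group, (n_modified, n_total) in totals.items()
        if n_total > 0  # skip groups with no calls in the region
    }


summary = region_summary(sites, "chr1", 1, 500)
```

Here "coverage" is the total number of calls across the sites in the region, which is only one of several reasonable definitions. Which definition to report is exactly the sort of thing I'd like to settle in the discussion.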

#### Desired outcome
I'd like to establish a set of queries of interest and a general abstract idea of what data structure(s) might be appropriate.
