Scripts to extract mod info from hdf5 files #111

callumparr · 2021-03-30T16:20:46Z

Is there any scripts or example code to extract mod probabilities outputted to hdf5 by the basecaller.py script using h5py. I do not have any experience with python code and I remember seeing some snippet about this but no longer can find.

Something about importing with h5py and then extracting.

import h5py
import numpy as np

with h5py.File('/home/minion/Documents/multi_reads/taiyaki_output/5EU_RNA_only_fastq/5EU_RNA_only_modbase_basecalls.hdf5', 'r') as h5:
        cond_logprobs = h5['/Reads/ffff85d3-dc1e-42d5-8ce8-2357112c1d86'][:120]
        print(h5['mod_long_names'][()]

print(np.where(~np.isnan(cond_longprobs))[0])

print(np.where(cond_probs[:,1] > np.log(0.5)))

print(np.where(cond_probs[:,1] > np.log(0.9)))

print(np.exp(np.nanmax(cond_probs[:,0])))

The text was updated successfully, but these errors were encountered:

lpryszcz · 2021-04-07T08:57:51Z

Hi, you can either use:

fast5mod - it's deprecated
megalodon - it didn't work with our custom-trained RNA models
modPhred - should work with custom-trained DNA/RNA models

Extracting information directly from Fast5 is a bit complicated, but you can do it with some experience in coding. For starting example code check here.

marcus1487 · 2021-04-12T17:57:05Z

There is no dedicated example of this extraction. This output is intended for research demonstration purposes and not necessarily robust downstream anaylsis.

Megalodon should work for custom-trained models. Dump to json and pass in via guppy config file. @lpryszcz have you opened an issue in megalodon with the specific failure mode? It would be great to get a fix into megalodon if that is indeed not working.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scripts to extract mod info from hdf5 files #111

Scripts to extract mod info from hdf5 files #111

callumparr commented Mar 30, 2021 •

edited

Loading

lpryszcz commented Apr 7, 2021

marcus1487 commented Apr 12, 2021

Scripts to extract mod info from hdf5 files #111

Scripts to extract mod info from hdf5 files #111

Comments

callumparr commented Mar 30, 2021 • edited Loading

lpryszcz commented Apr 7, 2021

marcus1487 commented Apr 12, 2021

callumparr commented Mar 30, 2021 •

edited

Loading