Biopython: Work with biological sequence data in Python

agmcfarland/biopython_workshop

Welcome to this Biopython workshop!

We will learn how to use Biopython for sequence manipulation, filtering, writing, and BLASTing!

Running workshop code

There are three options to run the workshop code:

  1. Click the link for the desired workshop day in the Files section.

  2. Open the main GitHub directory in Google Colab and select the notebook (.ipynb) you wish to run.

  3. Click the green Code button and choose Download ZIP, unzip the folder on your computer, open Jupyter Lab (through Anaconda Navigator or the command line), navigate to the folder you just downloaded, and open the appropriate notebook.

Lesson overview

  1. Day 1 – Introduction to strings, Biopython, and Biopython sequences

  2. Day 2 – Opening, closing, and saving sequence files with Biopython

  3. Day 3 – More sequence modification and data extraction

  4. Day 4 – Extracting and storing sequence data, working with GenBank files

  5. Day 5 – BLAST-ing against the NCBI database
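
As a taste of the Day 1 material, here is a minimal sketch of how a Biopython sequence behaves like a Python string while adding biology-aware methods. The sequence below is a made-up example, not workshop data.

```python
from Bio.Seq import Seq

# Hypothetical 12-nucleotide coding sequence, used only for illustration.
dna = Seq("ATGGCCATTTAA")

# Seq objects support familiar string operations such as slicing and counting.
first_codon = dna[:3]
gc_fraction = (dna.count("G") + dna.count("C")) / len(dna)

# They also provide methods that plain strings do not have.
reverse_complement = dna.reverse_complement()
protein = dna.translate(to_stop=True)  # stop at the first stop codon

print(first_codon, round(gc_fraction, 2), reverse_complement, protein)
```

Calling `str()` on a `Seq` returns an ordinary Python string when one is needed.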

Files

In the main directory you will find lesson notebooks, answer notebooks, example data, and annotated images.

Lesson Notebooks

There are five Jupyter notebooks, one for each day, with lessons and questions.

  1. Lesson Day 1

  2. Lesson Day 2

  3. Lesson Day 3

  4. Lesson Day 4

  5. Lesson Day 5

Answer Notebooks

There are five Jupyter notebooks, one for each day, with the answers to all the exercises in the lessons.

  1. Answers Day 1

  2. Answers Day 2

  3. Answers Day 3

  4. Answers Day 4

  5. Answers Day 5

Data

Example data files with the extensions .fasta, .fastq, .gbk, and .xml.

Biological sequence data types

Examples of biological sequence data types and how Biopython reads them. A useful companion to the lessons.

Fasta

[Image: annotated FASTA example]
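
A minimal sketch of reading FASTA records with `SeqIO.parse`, then filtering and writing them back out. The two records are invented for illustration and are held in memory with `StringIO`; with a real file you would pass a file path instead.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical FASTA text; a real workflow would read a .fasta file.
fasta_text = (
    ">seq1 first example record\n"
    "ATGGCC\n"
    "ATTTAA\n"
    ">seq2 second example record\n"
    "GGGCCC\n"
)

# Each record exposes the ID, the full header line, and the sequence.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))
for record in records:
    print(record.id, len(record.seq), record.description)

# Keep only records of at least 10 nucleotides and write them as FASTA.
long_records = [record for record in records if len(record.seq) >= 10]
output = StringIO()
written = SeqIO.write(long_records, output, "fasta")
print(written, "record(s) written")
```

Sequence lines that are wrapped in the file are joined into a single `Seq` per record.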

Fastq

[Image: annotated FASTQ example]
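
A minimal sketch of reading FASTQ records, which carry a quality score for every base. The reads and the mean-quality cutoff of 30 are assumptions chosen for illustration. The `"fastq"` format name assumes standard Sanger (Phred+33) quality encoding.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical reads; quality characters use Phred+33 encoding.
fastq_text = (
    "@read1\n"
    "ACGT\n"
    "+\n"
    "IIII\n"
    "@read2\n"
    "ACGT\n"
    "+\n"
    "II5!\n"
)

reads = list(SeqIO.parse(StringIO(fastq_text), "fastq"))

# Per-base Phred scores are stored in letter_annotations.
mean_quality = {}
for read in reads:
    scores = read.letter_annotations["phred_quality"]
    mean_quality[read.id] = sum(scores) / len(scores)

# Keep reads whose mean quality meets the (assumed) cutoff of 30.
passing_ids = [read_id for read_id, mean in mean_quality.items() if mean >= 30]
print(mean_quality, passing_ids)
```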

GenBank

[Image: annotated GenBank example]
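
A minimal sketch of pulling annotated features out of a GenBank record. To stay self-contained, it first builds a small hypothetical record in memory and writes it in GenBank format, then reads it back the same way you would read a .gbk file. The record ID, gene name, and sequence are all invented.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Build a hypothetical record with one CDS feature (not workshop data).
record = SeqRecord(
    Seq("ATGGCCATTTAA"),
    id="DEMO0001",
    name="DEMO0001",
    description="hypothetical demo record",
    annotations={"molecule_type": "DNA"},  # needed to write GenBank format
)
record.features.append(
    SeqFeature(FeatureLocation(0, 12, strand=1), type="CDS",
               qualifiers={"gene": ["demoA"]})
)

# Write GenBank text to memory, then parse it back as if it were a file.
handle = StringIO()
SeqIO.write(record, handle, "genbank")
handle.seek(0)
parsed = SeqIO.read(handle, "genbank")

# Collect the gene name and translated protein for every CDS feature.
cds_summaries = []
for feature in parsed.features:
    if feature.type == "CDS":
        gene = feature.qualifiers["gene"][0]
        protein = feature.extract(parsed.seq).translate(to_stop=True)
        cds_summaries.append((gene, str(protein)))

print(cds_summaries)
```

`SeqIO.read` expects exactly one record; for files containing several records, `SeqIO.parse` is the usual choice.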

BLAST

[Image: annotated BLAST example]

Credit and references

Alexander McFarland

Biopython

Northwestern University Information Technology Research Computing Services

Special thanks to Colby Witherup Wood for their assistance.
