Fall2009/Genome Hacking Exercise
Genome
The Ultimate Source Code....
Introduction
DNA
- Packing, Replication and Transcription
- http://www.youtube.com/watch?v=4PKjF7OumYo&NR=1
- Body Code (From Walter+Eliza Hall Medical Foundation)
Overview
Keeping up with the Human Genome
The Target: Hemoglobin
- Animation of Oxy-Deoxy state
- Animation of Molecular configuration
- Code Reuse
- Source Code
| Subunit | Gene | Chromosomal Locus |
|---|---|---|
| Hb α1 | HBA1 | Chromosome 16 p13.3 |
| Hb α2 | HBA2 | Chromosome 16 p13.3 |
| Hb β | HBB | Chromosome 11 p15.5 |
Chromosomal Locus
p13.3 means
- in the "p" arm (the short arm)
- band 13
- sub-band 3
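The reading of a locus given above can be sketched in code. This is a minimal illustration, not a full cytogenetic nomenclature parser: the `Locus` type, the `parse_locus` helper, and the compact input form `16p13.3` (chromosome, arm, band, sub-band written together) are assumptions made for the example.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    chromosome: str
    arm: str       # "short" for p, "long" for q
    band: str
    sub_band: str  # empty string when no sub-band is given


# Hypothetical pattern: handles only simple loci such as "16p13.3".
_LOCUS = re.compile(r"^(\d{1,2}|X|Y)([pq])(\d+)(?:\.(\d+))?$")


def parse_locus(text: str) -> Locus:
    """Split a locus string into chromosome, arm, band and sub-band."""
    match = _LOCUS.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised locus: {text!r}")
    chromosome, arm, band, sub_band = match.groups()
    arm_name = "short" if arm == "p" else "long"
    return Locus(chromosome, arm_name, band, sub_band or "")


print(parse_locus("16p13.3"))
print(parse_locus("11p15.5"))
```

The two calls use the loci from the table: HBA1 and HBA2 on chromosome 16, and HBB on chromosome 11.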
Download the Human Genome Code
Repository
- http://vimeo.com/1865535
- ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/
- ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/
Genome Reference Consortium Human (GRCh) Build 37
Chromosome 11
wget ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/hs_ref_GRCh37_chr11.agp.gz
wget ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/hs_ref_GRCh37_chr11.fa.gz
wget ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/hs_ref_GRCh37_chr11.mfa.gz
Chromosome 16
wget ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/hs_ref_GRCh37_chr16.agp.gz
wget ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/hs_ref_GRCh37_chr16.fa.gz
wget ftp://ftp.ncbi.nih.gov/genomes/H_sapiens/Assembled_chromosomes/hs_ref_GRCh37_chr16.mfa.gz
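Once the downloads finish, the `.fa.gz` files can be read without unpacking them first. The sketch below assumes they are ordinary gzip-compressed FASTA files (header lines starting with `>`, followed by sequence lines). The `read_fasta` helper is a hypothetical name introduced here, not part of any tool mentioned on this page.

```python
import gzip
import os
from typing import Iterator, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a gzip-compressed FASTA file."""
    header = None
    chunks = []
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new record starts: emit the previous one, if any.
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


if __name__ == "__main__":
    # Assumes the chromosome 11 file was downloaded to the current directory.
    path = "hs_ref_GRCh37_chr11.fa.gz"
    if os.path.exists(path):
        for header, sequence in read_fasta(path):
            print(header, len(sequence))
```

Each record's sequence is held in memory as one string, which for a whole chromosome means well over a hundred million characters. That is workable on most machines but worth keeping in mind.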
Look at the Language Reference
Install the Debugger / Editor
ugene
In Linux
sudo apt-get install ugene
Or download binaries for Mac and Windows from
http://ugene.unipro.ru/
See the 3D Structure
http://www.ncbi.nlm.nih.gov/sites/entrez?db=structure&term=HBA1&submit=Go
Sequence
- Open the GRCh37 files in ugene
- Press CTRL-G and go to position 226679, the start of the gene
- Enable the display of the amino acid translation
- Press CTRL-G and go to position 226716, the start of the translated region of the mRNA
- Compare the translation with the protein sequence below (the amino acid sequence of HBA1)
MVLSPADKTNVKAAWGKVGAHAGEYGAEALERMFLSFPTTKTYFPHFDLSHGSAQVKGHGKKVADALTNA
VAHVDDMPNALSALSDLHAHKLRVDPVNFKLLSHCLLVTLAAHLPAEFTPAVHASLDKFLASVSTVLTSK
YR
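The amino acid translation that ugene displays can be reproduced by hand with the standard genetic code. The following is a minimal sketch. The `translate` helper is a hypothetical name, and the DNA string is an illustrative stand-in chosen to encode the first eight residues of the protein above (MVLSPADK). It is not copied from the reference assembly.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in reading frame 0, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE.get(dna[i:i + 3], "X")  # X = unknown codon
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Illustrative codons only (assumption), not taken from GRCh37.
example = "ATGGTGCTGTCTCCTGCCGACAAG"
print(translate(example))  # MVLSPADK
```

Genomic DNA contains introns, so translating straight through the gene region will stop matching the protein sequence once the first intron begins. The sketch only covers a continuous coding stretch.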
GRCh37
- http://www.ncbi.nlm.nih.gov/nuccore/NC_000016.9?from=226637&to=227562&report=graph
- http://www.ncbi.nlm.nih.gov/nuccore/224589807?report=fasta
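The `from` and `to` values in the first link above select a region of the chromosome. Assuming they are 1-based and inclusive, which is the usual convention for sequence coordinates but is an assumption here, the same region can be cut out of a sequence already loaded as a string. The `region` and `region_length` helpers are hypothetical names used for illustration.

```python
def region_length(start: int, end: int) -> int:
    """Length of a 1-based, inclusive coordinate range."""
    return end - start + 1


def region(sequence: str, start: int, end: int) -> str:
    """Return sequence[start..end] using 1-based, inclusive coordinates."""
    if start < 1 or end < start or end > len(sequence):
        raise ValueError(f"invalid range {start}..{end}")
    # Python slices are 0-based and exclude the end index.
    return sequence[start - 1:end]


print(region_length(226637, 227562))  # 926
print(region("ACGTACGT", 3, 5))       # GTA
```

The off-by-one shift in `region` is the main thing to get right when moving between positions typed into ugene or a URL and Python string indices.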
HuRef
- Homo sapiens chromosome 16, alternate assembly HuRef, whole genome shotgun sequence
- http://www.ncbi.nlm.nih.gov/nuccore/AC_000148.1?&from=144721&to=145562&report=graph&strand=true
Protein Sequence of Amino Acids for Hemoglobin HBA1
Grep on it
Track a Bug
Sickle Cell Disease
- http://en.wikipedia.org/wiki/Sickle-cell_disease
- The disorder is caused by a single-nucleotide mutation (A to T) in the HBB gene, which changes the codon GAG (glutamic acid) to GTG (valine)
- The allele responsible for sickle-cell anaemia is autosomal and incompletely dominant, and is located on the short arm of chromosome 11 (the HBB locus, 11p15.5)
- http://www.nhlbi.nih.gov/health/dci/Diseases/Sca/SCA_WhatIs.html
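The effect of the GAG to GTG change can be checked against the standard genetic code. This is a small sketch of the single-codon substitution only. The `substitute` helper is a hypothetical name, and the example does not locate the codon within the chromosome 11 sequence.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def substitute(codon: str, position: int, base: str) -> str:
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + base + codon[position + 1:]


normal = "GAG"
mutant = substitute(normal, 1, "T")  # middle base: A -> T

# E is glutamic acid, V is valine (one-letter amino acid codes).
print(normal, CODON_TABLE[normal], "->", mutant, CODON_TABLE[mutant])
# GAG E -> GTG V
```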
