Tuesday, May 31, 2011

Fisher's Exact Test

http://en.wikipedia.org/wiki/Fisher%27s_exact_test

The test is useful for categorical data that result from classifying objects in two different ways; it is used to examine the significance of the association (contingency) between the two kinds of classification. So in Fisher's original example, one criterion of classification could be whether milk or tea was put in the cup first; the other could be whether Ms Bristol thinks that the milk or tea was put in first. We want to know whether these two classifications are associated – that is, whether Ms Bristol really can tell whether milk or tea was poured in first. Most uses of the Fisher test involve, like this example, a 2 × 2 contingency table.
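
A minimal sketch of the one-sided test for a 2 × 2 table, using only the Python standard library. The function names and the example counts (8 cups, 4 of each kind) are my own illustration, not taken from the Wikipedia page; the probability of each table comes from the hypergeometric distribution with all margins fixed.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of the table [[a, b], [c, d]] with fixed margins."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


def fisher_exact_one_sided(a, b, c, d):
    """P(top-left count >= a) when the two classifications are independent."""
    row1, row2, col1 = a + b, c + d, a + c
    p_value = 0.0
    for x in range(a, min(row1, col1) + 1):
        # Rebuild the remaining three cells from the fixed margins.
        y = row1 - x
        z = col1 - x
        w = row2 - z
        p_value += table_probability(x, y, z, w)
    return p_value


# Rows: what was really poured first; columns: what the taster said.
# All 8 cups classified correctly:
print(fisher_exact_one_sided(4, 0, 0, 4))  # 1/70
# 3 of 4 correct in each row:
print(fisher_exact_one_sided(3, 1, 1, 3))  # 17/70
```

With every cup classified correctly the p-value is 1/70, about 0.014; with one mistake in each row it rises to 17/70, about 0.24.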

Tips on applying for an NSERC scholarship or fellowship

http://www.nserc-crsng.gc.ca/Students-Etudiants/Videos-Videos/SFTips_eng.asp

Monday, May 30, 2011

Secret 2007 - Theme Song - Jay Chou

不能说的秘密 (Bu Neng Shuo De Mi Mi)


http://www.youtube.com/watch?v=hWCZaNxXunY


        by Bingrui on Mar 24th 2011   8:29 am
The cold coffee leaves the coaster
I hold my feelings very far back

I work hard wanting to get the past back
You can still see it as always clearly on my face

The most beautiful thing wasn't the rainy day
It was the eaves that you and I once took shelter under from the rain

The images of our memories
As I'm swinging on the swing
The dream starts to not be sweet

You say gradually let go of love
Then you will walk farther
Why go changing
The time that has already been missed

You use your fingertip
To stop me from saying goodbye
I imagine you by my side
Before I completely lose you

You say gradually let go of love
Then you will walk farther
Perhaps the lot of fate
Only let us meet

Only let us love each other
For this one season of autumn
I only discover after they float down
The fragments of this happiness
How am I going to pick them up?

Frédéric Chopin

http://en.wikipedia.org/wiki/Fr%C3%A9d%C3%A9ric_Chopin

Frédéric François Chopin (French pronunciation: [fʁe.de.ʁik ʃɔ.pɛ̃]; Polish: Fryderyk Franciszek Chopin;[1] 22 February or 1 March 1810[2] – 17 October 1849) was a Polish composer, virtuoso pianist, and music teacher, of French–Polish parentage. He was one of the great masters of Romantic music. He is also known as "the poet of the piano".

http://www.imdb.com/title/tt1037850/

Writing a paper: habits of successful authors

http://blogs.nature.com/naturejobs/2011/05/20/writing-a-paper-habits-of-successful-authors

Perl full path

> use Mail::Sender;
> print $INC{'Mail/Sender.pm'};

Saturday, May 28, 2011

HomoloGene

  HomoloGene is a system for automated detection of homologs among the annotated genes of several completely sequenced eukaryotic genomes. 

Friday, May 27, 2011

P.Eng - Professional Engineer

http://www.peng.ca/english/become/how.html

Only licensed engineers have the right to practice engineering in Canada. A P.Eng. is a professional in every sense of the word. Being a P.Eng. means being responsible and accountable for the work you do. It means the Canadian public will respect you as a professional—committed to public safety and continuous learning. And it gives you the mobility to work wherever your job or your career takes you. As the holder of a P.Eng. in one province or territory in Canada, you can easily obtain a licence in any other province or territory, or be licensed to practice in all 13 provinces and territories. You'll have the engineering expertise and licence you need to take full advantage of career opportunities as they arise. The mobility of professional licensure can even open the door to engineering employment possibilities worldwide. In short, the P.Eng. is your ticket to a great career in engineering.


With more than 20 disciplines, from software engineering to civil and industrial engineering, professional engineers design products, processes and systems that protect the environment, enhance the quality of life, health, safety and well-being of Canadians. Managing some of the world's top companies, professional engineers are leaders working at the forefront of emerging technologies, such as genetic modification, and assisted human reproduction. They design bridges, roads and buildings; systems to purify and deliver drinking water, treat waste, and reduce industrial pollution; safer cars, trains, airplanes and transportation networks; and a myriad of other things that people around the world rely on virtually every day. 

Bioinformatics and Integrated Oncology Program Retreat 2011

Ingenuity Pathway
TargetScan
cytopenias - few cells
Comrad - Comrad: a novel algorithmic framework for the integrated analysis of RNA-Seq and WGSS data http://bioinformatics.oxfordjournals.org/content/early/2011/04/09/bioinformatics.btr184.abstract
edge betweenness - edges that occur on many shortest paths between other nodes have higher betweenness than those that do not.
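
A brute-force sketch of edge betweenness for a small undirected graph, to make the definition concrete. This is not the faster Brandes algorithm that graph libraries typically use; the function names and the toy graph (two triangles joined by one bridge) are made up for illustration. Each node pair contributes a total of 1, split evenly across its shortest paths.

```python
from collections import deque
from itertools import combinations


def all_shortest_paths(graph, source, target):
    """Return every shortest path from source to target as a list of nodes."""
    # Breadth-first search recording distances and shortest-path predecessors.
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                preds[nbr] = [node]
                queue.append(nbr)
            elif dist[nbr] == dist[node] + 1:
                preds[nbr].append(node)
    if target not in dist:
        return []
    # Walk back from the target through the predecessor lists.
    paths, stack = [], [[target]]
    while stack:
        partial = stack.pop()
        if partial[-1] == source:
            paths.append(partial[::-1])
        else:
            for pred in preds[partial[-1]]:
                stack.append(partial + [pred])
    return paths


def edge_betweenness(graph):
    """Map each undirected edge (as a frozenset) to its betweenness score."""
    scores = {frozenset((u, v)): 0.0 for u in graph for v in graph[u]}
    for s, t in combinations(sorted(graph), 2):
        paths = all_shortest_paths(graph, s, t)
        for path in paths:
            for a, b in zip(path, path[1:]):
                scores[frozenset((a, b))] += 1.0 / len(paths)
    return scores


# Hypothetical graph: triangles a-b-c and d-e-f joined by the bridge c-d.
toy = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
scores = edge_betweenness(toy)
print(scores[frozenset(("c", "d"))])  # the bridge carries all 9 cross-triangle pairs
```

The bridge scores highest because every path from one triangle to the other must cross it.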

triple negative - most aggressive tumour subtype, can't be detected by common markers (ie oestrogen receptor (ER), Her2 - herceptin, progesterone receptor, MUC1, CEA)
  - BRCA1 and BRCA2 are less commonly used because these are found in germ line?

most mutations are found in tp53

http://www.foxnews.com/health/2011/06/02/scientists-testing-new-drug-for-triple-negative-breast-cancer/

The hallmarks of cancer. Hanahan D, Weinberg RA
www.ncbi.nlm.nih.gov/pubmed/10647931


MammaPrint

OncotypeDX

stroma - environment surrounding tumour (can be useful for prediction?)
epithelium - tumor region

PAM50 gene set  - Parker 2009
www.aruplab.com/pam50
Parker JS, et al. Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol 2009;27(8):1160–7.

Met: Chris Bajdik
http://www.bccrc.ca/dept/cc/chris-bajdik

keynote: Michael Hallett, McGill Centre for Bioinformatics

Wednesday, May 25, 2011

Ukiyoe

http://t3.gstatic.com/images?q=tbn:ANd9GcSKr6EEGPyaErAmf8CZLJOeNNjJE1Ill0-aCgdYkp6q0WZpTJXFOQ


http://t2.gstatic.com/images?q=tbn:ANd9GcRmaOvz9z5TscxtJ1CGsLJJ_w8CHJP-cZ8lIq0kKdP48CmCQcUw
Utagawa Kuniyoshi

Debug Perl

$ perl -d foo.pl

'-d' turns on debugger


Loading DB routines from perl5db.pl version 1.28
Editor support available.

Enter h or `h h' for help, or `man perldebug' for more help.

main::(./foo.pl:172): my $foo;
DB<1> h
DB<2> c 100

'c 100' means execute code until line 100

hgWiggle

$ cat ~/.hg.conf
db.host=genome-mysql.cse.ucsc.edu
db.user=genome
db.password=

http://genomewiki.ucsc.edu/index.php/Using_hgWiggle_without_a_database

Copying files over the network

rsync -ave ssh source.server:/path/to/source /destination/dir


http://www.crucialp.com/resources/tutorials/server-administration/how-to-copy-files-across-a-network-internet-in-unix-linux-redhat-debian-freebsd-scp-tar-rsync-secure-network-copy.php

Genome Analysis Toolkit

http://www.broadinstitute.org/gsa/wiki/index.php/Introduction

The Genome Analysis Toolkit (GATK) is a structured programming framework designed to enable rapid development of efficient and robust analysis tools for next-generation DNA sequencers. The GATK solves the data management challenge by separating data access patterns from analysis algorithms, using the functional programming philosophy of Map/Reduce. Consequently, the GATK is structured into data traversals and data walkers that interact through a programming contract in which the traversal provides a series of units of data to the walker, and the walker consumes each datum to generate an output for each datum.

Tuesday, May 24, 2011

Cheap flights

http://www.studentuniverse.com/fly/airsearchResults.jsp
http://netholidays.tawpartners.com/

ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data.

http://www.openbioinformatics.org/annovar/annovar_startup.html

http://www.ncbi.nlm.nih.gov/pubmed/20601685


Nucleic Acids Res. 2010 Sep;38(16):e164. Epub 2010 Jul 3.

ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data.

Source

Center for Applied Genomics, Children's Hospital of Philadelphia, PA 19104, USA. kai@openbioinformatics.org

Abstract

High-throughput sequencing platforms are generating massive amounts of genetic variation data for diverse genomes, but it remains a challenge to pinpoint a small subset of functionally important variants. To fill these unmet needs, we developed the ANNOVAR tool to annotate single nucleotide variants (SNVs) and insertions/deletions, such as examining their functional consequence on genes, inferring cytogenetic bands, reporting functional importance scores, finding variants in conserved regions, or identifying variants reported in the 1000 Genomes Project and dbSNP. ANNOVAR can utilize annotation databases from the UCSC Genome Browser or any annotation data set conforming to Generic Feature Format version 3 (GFF3). We also illustrate a 'variants reduction' protocol on 4.7 million SNVs and indels from a human genome, including two causal mutations for Miller syndrome, a rare recessive disease. Through a stepwise procedure, we excluded variants that are unlikely to be causal, and identified 20 candidate genes including the causal gene. Using a desktop computer, ANNOVAR requires ∼4 min to perform gene-based annotation and ∼15 min to perform variants reduction on 4.7 million variants, making it practical to handle hundreds of human genomes in a day. ANNOVAR is freely available at http://www.openbioinformatics.org/annovar/.

FASTQ data format

http://nar.oxfordjournals.org/content/38/6/1767.full

COXPRESdb: a database to compare gene coexpression in seven model animals.

Nucleic Acids Res. 2011 Jan;39(Database issue):D1016-22. Epub 2010 Nov 16.
COXPRESdb: a database to compare gene coexpression in seven model animals.
Obayashi T, Kinoshita K.
http://coxpresdb.jp

Phased vs Unphased Genotypes

http://www.nature.com/nrg/journal/v12/n10/full/nrg3054.html
phased haplotypes, which identify the alleles that are co-located on the same chromosome. Because sequence and SNP array data generally take the form of unphased genotypes, it is not directly observed which of the two parental chromosomes, or haplotypes, a particular allele falls on.

 
http://biostar.stackexchange.com/questions/7869/what-are-phased-and-unphased-genotypes

Phased data (phasing can be done with BEAGLE; Browning and Browning 2007) are ordered along one chromosome, so from these data you know the haplotype (set of SNPs in a region). Unphased data are simply the genotypes without regard to which one of the pair of chromosomes holds that allele. Phasing can be used, for example, to identify paternally transmitted alleles.


parent-offspring trio

 If what you are studying are correlations between, say, pairs of SNPs, and can be influenced by recombination, like linkage disequilibrium or selective sweeps, then you need phased data.
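
A small sketch of why unphased genotypes are ambiguous: given the two alleles seen at each SNP, it lists every pair of haplotypes that could have produced them. The function name and the example alleles are hypothetical; real phasing tools such as BEAGLE use population or family data to choose among these possibilities, which this sketch does not attempt.

```python
from itertools import product


def consistent_haplotype_pairs(genotypes):
    """List haplotype pairs compatible with unphased genotypes.

    genotypes: one 2-tuple of alleles per SNP; the order inside a tuple
    carries no information, because the data are unphased.
    """
    pairs = set()
    # For each SNP, choose which of the two alleles sits on the first chromosome.
    for choice in product((0, 1), repeat=len(genotypes)):
        hap1 = "".join(g[c] for g, c in zip(genotypes, choice))
        hap2 = "".join(g[1 - c] for g, c in zip(genotypes, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


# Heterozygous at both SNPs: two possible phasings.
print(consistent_haplotype_pairs([("A", "G"), ("C", "T")]))
# Homozygous at the first SNP: the phase is unambiguous.
print(consistent_haplotype_pairs([("A", "A"), ("C", "T")]))
```

Two heterozygous sites already allow two distinct haplotype pairs, and each extra heterozygous site doubles the count, which is why studies of linkage disequilibrium need phased data.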

http://www.biomedcentral.com/1471-2164/9/356

Single Nucleotide Polymorphism or SNP is a DNA sequence variation, occurring when a single nucleotide is altered [7].

 A SNP site that contains two different alleles is called biallelic, a SNP site that contains three different alleles is called triallelic and a SNP site that contains four different alleles is called tetraallelic.

Imputation has resulted in the detection of additional associations, particularly when combining data from multiple studies genotyped on different platforms



Sunday, May 22, 2011

ORICON charts

http://oriconcharts.livejournal.com/

OST

http://lets-look.com/index.php?/forum/109-ost-perf-mvs-lyrics/

Origami tips

http://ultimateorigami.net/231110.html

Invictus

http://en.wikipedia.org/wiki/Invictus

"Invictus" is a short Victorian poem by the English poet William Ernest Henley (1849–1903).


Out of the night that covers me,
Black as the pit from pole to pole,
I thank whatever gods may be
For my unconquerable soul.

In the fell clutch of circumstance
I have not winced nor cried aloud.
Under the bludgeonings of chance
My head is bloody, but unbowed.

Beyond this place of wrath and tears
Looms but the Horror of the shade,
And yet the menace of the years
Finds and shall find me unafraid.

It matters not how strait the gate,
How charged with punishments the scroll,
I am the master of my fate:
I am the captain of my soul.

stoicism

sto·i·cism n. 1. Indifference to pleasure or pain; impassiveness.

Saturday, May 21, 2011

Man by choice

We are male by birth, we are man by choice
Pst. Kong Hee

Albert Einstein

"There are two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle."

Friday, May 20, 2011

First Exon Finder

http://rulai.cshl.edu/tools/FirstEF/

FirstEF* (First Exon Finder) is a 5' terminal exon and promoter prediction program. It consists of different discriminant functions structured as a decision tree. The probabilistic models are optimized to find potential first donor sites and CpG-related and non-CpG-related promoter regions based on discriminant analysis. For every potential first donor site (GT) and an upstream promoter region, FirstEF decides whether or not the intermediate region can be a potential first exon, based on a set of quadratic discriminant functions.

DECIPHER - chromosomal imbalance

DECIPHER

The DECIPHER database of submicroscopic chromosomal imbalance collects clinical information about chromosomal microdeletions/duplications/insertions, translocations and inversions

DECIPHER is an online repository of CNV and phenotype data whose goal is to enable the clinical interpretation of CN variation (Corpas et al., 2012). The web interface includes a number of tracks (associated syndrome, CNV consensus track, haplo-insufficiency track) that facilitate data interpretation. 


http://decipher.sanger.ac.uk/

Transcription factors - Wasserman

http://www.google.com/url?sa=t&source=web&cd=4&ved=0CDAQFjAD&url=http%3A%2F%2Fwww.bioinformatics.ca%2Ffiles%2FCBW%2520-%2520presentations%2FGeneLists_2009_Module%25203%2FGeneLists_2009_Module%25203.ppt&rct=j&q=wasserman%20lab%20tools%20orcatk%20pazar%20jaspar&ei=7-_WTZy1JIy2sAOu8tyxBw&usg=AFQjCNHMoqtLZvQRi4Z24IfYbD04Nz3VOA&sig2=dBHUmlrQ970iLe39ZSaVJg&cad=rja

bioinformatics.ca/files/GeneLists_Day2-Module3.ppt

Allen Miner - Tool for identification of genes expressed in patterns of interest using the Allen Brain Atlas

http://research.janelia.org/davis/allenminer/

Bioinformatics conferences

http://www.conference-service.com/conferences/bioinformatics.html

http://kevin-gattaca.blogspot.com/2010/11/2011-bioinformatics-conferences.html

Thursday, May 19, 2011

Ralph Waldo Emerson

"He thought it happier to be dead, To die for Beauty, than live for bread."


Ralph Waldo Emerson

Wednesday, May 18, 2011

sed - delete first lines in the file

# delete the first 10 lines of a file
 sed '1,10d'
http://www.eng.cam.ac.uk/help/tpl/unix/sed.html

Record videos, create swf, demo

http://www.techsmith.com/jing/

Rsync for copying / backing up files

    rsync -avz foo:src/bar /data/tmp

This would recursively transfer all files from the directory src/bar on the machine foo into the /data/tmp/bar directory on the local machine. The files are transferred in "archive" mode, which ensures that symbolic links, devices, attributes, permissions, ownerships, etc. are preserved in the transfer. Additionally, compression will be used to reduce the size of data portions of the transfer.

Robert Tjian


Howard Hughes Investigator and Professor of Biochemistry and Molecular Biology*
*And Affiliate, Division of Genetics and Development

Biochemistry of Transcription and Chromatin Transactions: Over the past 20 years, our lab has identified, isolated and characterized a large number (~100) of essential Drosophila and human transcription factors. These regulatory proteins, which include enhancer/promoter recognition factors, core RNA pol II initiation factors and co-activators, form large multi-subunit complexes at promoter DNA to mediate transcription initiation and decode the genome. Our recent studies indicate that large co-activator complexes play a critical role in mediating both universal and cell-type-specific networks of gene transcription and can serve as the interface between transcription and chromatin regulation.

We recently completed a study revealing the mechanism by which the mutant Htt protein (responsible for Huntington's, a glutamine expansion neurodegenerative disease) disrupts specific interactions between the human transcription factor Sp1 and its target co-activator.

In particular, we have focused on the transcription factor Lmx1a, which appears to be a key regulator that drives the differentiation of dopaminergic neurons implicated in Parkinson's disease.

http://mcb.berkeley.edu/index.php?option=com_mcbfaculty&name=tjianr

Hongkai Lab ChIP software

http://jilab.biostat.jhsph.edu/index_files/software.htm

OpenPetra

OpenPetra is a free and easy-to-use administration software package for non-profit organizations.

http://sourceforge.net/blog/potm-201105/

Tuesday, May 17, 2011

Kill dandelions

http://www.plantea.com/dandelions.htm

Python bioinformatics toolkit

Python Bioinformatics Toolkit, PyCogent
http://pycogent.wordpress.com/

BioPython
http://biopython.org/wiki/Main_Page

Perl bioinformatics modules

libgd - graphics - https://bitbucket.org/pierrejoye/gd-libgd
ensembl core API - http://uswest.ensembl.org/info/docs/api/api_installation.html
bioperl-live - www.bioperl.org

http://pazar.cvs.sourceforge.net/viewvc/pazar/

Prime numbers

#include <stdio.h>
_(__,___,____){___/__<=1?_(__,___+1,____):!(___%__)?_(__,___+1,0):___%__==___/__&&!____?(printf("%d\t",___/__),_(__,___+1,0)):___%__>1&&___%__<___/__?_(__,1+
___,____+!(___/__%(___%__))):___<__*__?_(__,___+1,____):0;}main(){_(100,0,0);}

$ gcc -o foo foo.c
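
For comparison, a readable sketch in Python. Reading through the obfuscated program above, it appears to print the primes below 100 by trial division; the version here swaps in a sieve of Eratosthenes instead, so it reaches the same list by a different technique. The function name is my own.

```python
def primes_below(limit):
    """Return all primes strictly less than limit (sieve of Eratosthenes)."""
    if limit < 2:
        return []
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    for n in range(2, int(limit ** 0.5) + 1):
        if is_prime[n]:
            # Every multiple of n from n*n upward is composite.
            for multiple in range(n * n, limit, n):
                is_prime[multiple] = False
    return [n for n, flag in enumerate(is_prime) if flag]


print("\t".join(str(p) for p in primes_below(100)))
```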

IUB nucleotide codes

http://biocorp.ca/IUB.php

Code  Definition  Mnemonic
A     Adenine     A
C     Cytosine    C
G     Guanine     G
T     Thymine     T
R     AG          puRine
Y     CT          pYrimidine
K     GT          Keto
M     AC          aMino
S     GC          Strong
W     AT          Weak
B     CGT         Not A (B follows A)
D     AGT         Not C (D follows C)
H     ACT         Not G (H follows G)
V     ACG         Not T (U follows T but U is uracil, so V)
N     AGCT        aNy
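
One way to put the table to work: a sketch that turns a degenerate sequence written with these codes into a regular expression. The dictionary mirrors the table above; the function name and the example motif are hypothetical.

```python
import re

# Ambiguity codes and the bases they stand for, as in the table above.
IUB_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "GC", "W": "AT",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "AGCT",
}


def iub_to_regex(motif):
    """Translate a degenerate motif into a regular expression string."""
    parts = []
    for code in motif.upper():
        bases = IUB_CODES[code]  # raises KeyError for an unknown code
        parts.append(bases if len(bases) == 1 else "[" + bases + "]")
    return "".join(parts)


pattern = iub_to_regex("GAYR")
print(pattern)                               # GA[CT][AG]
print(bool(re.fullmatch(pattern, "GATA")))   # True
print(bool(re.fullmatch(pattern, "GAAA")))   # False: Y does not include A
```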

The Art of War by Sun Tzu

Attack is the secret of defense; defense is the planning of an attack.

- Chang Yu, The Art of War by Sun Tzu

http://www.online-literature.com/suntzu/artofwar/9/

Monday, May 16, 2011

wikidot.com - Free and Pro wiki hosting

wikidot.com

Penicillin from Moldy Bread

Muhammad Ali Quotes


"If they can make penicillin out of moldy bread, they can sure make something out of you."

Friday, May 13, 2011

Cancer from a Bioinformatics Perspective Computational Gene Regulation

http://www.mcb.mcgill.ca/~hallett/CompCancer2009home.htm


Mike Hallett, hallett@mcb.mcgill.ca
Bellini Building, Room 434
514-398-5928

Realtylink - Home listing

http://www.realtylink.org/

Promoter analysis

Regulatory regions, in general, tend to be DNase sensitive, and promoters are particularly DNase sensitive.

http://genome.ucsc.edu/cgi-bin/hgTrackUi?hgsid=195093397&c=chr12&g=wgEncodeReg

http://www.scfbio-iitd.res.in/tutorial/promoter.html
Conserved eukaryotic promoter elements and their consensus sequences:
CAAT box  = GGCCAATCT
TATA box  = TATAA
GC box    = GGGCGG
CAP site  = TAC
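
A rough sketch of scanning a sequence for exact matches to the consensus elements listed above. The test sequence is invented, and exact string matching is a simplification: real promoter elements vary around the consensus, so proper tools score matches with position weight matrices instead.

```python
import re

# Consensus sequences copied from the list above.
CONSENSUS = {
    "CAAT box": "GGCCAATCT",
    "TATA box": "TATAA",
    "GC box": "GGGCGG",
    "CAP site": "TAC",
}


def find_elements(sequence):
    """Return sorted (0-based position, element name) pairs for exact matches."""
    seq = sequence.upper()
    hits = []
    for name, motif in CONSENSUS.items():
        # A lookahead pattern reports overlapping occurrences as well.
        for match in re.finditer("(?=" + motif + ")", seq):
            hits.append((match.start(), name))
    return sorted(hits)


# Hypothetical sequence, made up for this example.
example = "ttGGGCGGaaGGCCAATCTccTATAAAAgTACg"
for position, name in find_elements(example):
    print(position, name)
```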

http://bip.weizmann.ac.il/toolbox/seq_analysis/promoters.html


  • EPD - Eukaryotic Promoter Database at EMBL, Heidelberg
  • dbTSS - database of Transcriptional Start Sites
  • Jaspar - The high-quality transcription factor binding profile database
  • PLACE - Database of Plant Cis-acting Regulatory DNA Elements
  • PlantCare - a Plant Promoters database
  • PlantPromDB - A Database of Plant Promoter Sequences
  • SCPD - the Promoter Database of Saccharomyces cerevisiae
  • TFD - the IFTI-MIRAGE website
  • TransFac - Gene Regulation
  • TRED - Transcriptional Regulatory Element Database
  • TRRD - Transcription Regulatory Regions Database
  • Dragon TF Integrator
  • cisRED database holds conserved sequence motifs identified by genome scale motif discovery



  • http://www.bios.net/daisy/promoters/239/g1/240.html

    PROMOTER ELEMENTS
    1. Core promoter - the minimal portion of the promoter required to properly initiate transcription
    • Transcription Start Site (TSS)
    • Approximately -34
    • A binding site for RNA polymerase
    • General transcription factor binding sites
    2. Proximal promoter - the proximal sequence upstream of the gene that tends to contain primary regulatory elements
    • Approximately -250
    • Specific transcription factor binding sites                

    THE RNA POLYMERASE II CORE PROMOTER

    Annual Review of Biochemistry
    Vol. 72: 449-479 (Volume publication date July 2003)
    DOI: 10.1146/annurev.biochem.72.121801.161520
    Stephen T. Smale and James T. Kadonaga
    http://www.annualreviews.org/doi/full/10.1146/annurev.biochem.72.121801.161520?url_ver=Z39.88-2003&rfr_id=ori:rid:crossref.org&rfr_dat=cr_pub%3dpubmed


    Wray GA, Hahn MW, Abouheif E, Balhoff JP, Pizer M, Rockman MV, Romano LA.
    The evolution of transcriptional regulation in eukaryotes.
    Mol Biol Evol. 2003 Sep;20(9):1377-419.
    PMID:12777501
    Highly recommended for integrating recent data on chromatin modifications with prior knowledge about transcription in eukaryotes. 43 pages long!

    Heintzman ND and Ren B.
    The gateway to transcription: identifying, characterizing and understanding promoters in the eukaryotic genome.
    Cell Mol Life Sci. 2007 Feb; 64(4):386-400. 15 pages long.
    PMID: 17171231

    http://onlinelibrary.wiley.com/doi/10.1111/j.1471-4159.2006.04111.x/full

    TATA-less core promoters, as discovered in the Snca proximal promoter, often initiate from numerous start sites (cluster) that can be distributed over a region of about 50–100 nucleotides (Butler and Kadonaga 2002). In contrast, transcription from TATA core promoters occurs from a single site.


    Often, alternate promoters are associated with tissue-specific or developmentally-regulated gene expression (Schibler and Sierra 1987; Valdenaire et al. 1994).

    Thursday, May 12, 2011

    Volunteer

    GoVolunteer.ca
    This online database managed by Vantage Point (formerly Volunteer Vancouver) has a collection of up-to-date positions with local organizations in Metro Vancouver. You can search by the type of activity you are looking for or by the type of organization you want to volunteer for, as well as by when and where you are available.

    VolWeb.ca
    Registering on this site allows you to find out about a variety of events from major international sport tournaments to community festivals and fundraisers that are all actively looking for volunteers. Most event-based volunteer opportunities are short term commitments, making them easy to fit into your already busy schedule. After you register you will be guided to opportunities that fit your schedule, location, interests, languages, and skills.

    RedBook Online
    This online database is a resource of all community service groups in a variety of communities, including non-profit agencies, advocacy groups, health care facilities, social clubs offering community service, and professional associations. You can search by city and key words to find organizations that interest you.

    Volunteer BC
    This website contains a list of all the volunteer centres in BC, some of which advertise volunteer opportunities unique to their community. There are also many volunteer-related workshops and events offered.

    Volunteer Canada
    This website can direct you to volunteer centres in most provinces and territories if you live outside of Metro Vancouver.

    Charity Village
    This website features volunteer positions as well as news and resources for the non-profit sector.

    Wednesday, May 11, 2011

    hmChIP is a database of genome-wide chromatin immunoprecipitation (ChIP) data in human and mouse.

    http://jilab.biostat.jhsph.edu/database/cgi-bin/hmChIP.pl

    Currently, the database contains >2000 samples from >500 ChIP-seq and ChIP-chip experiments, representing a total of >170 proteins and >10,000,000 protein-DNA interactions. A web server provides interface for database query.

    Tuesday, May 10, 2011

    Statistics notes

    http://faculty.chass.ncsu.edu/garson/PA765/statnote.htm

    Ubuntu show all open windows - Windows Key + A


    Research - Einstein

    “If we knew what it was we were doing, it would not be called research, would it?”

     Albert Einstein quotes (German born American Physicist who developed the special and general theories of relativity. Nobel Prize for Physics in 1921. 1879-1955)

    Monday, May 9, 2011

    ChIP-seq data

    http://www.wip.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GPL9052

    http://www.illumina.com/Documents/products/datasheets/datasheet_chip_sequence.pdf

    The Textile Plot: A New Linkage Disequilibrium Display of Multiple-Single Nucleotide Polymorphism Genotype Data

    http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0010207

    The Textile Plot: A New Linkage Disequilibrium Display of Multiple-Single Nucleotide Polymorphism Genotype Data

    Natsuhiko Kumasaka (1), Yusuke Nakamura (1, 2), Naoyuki Kamatani (1)

    (1) Center for Genomic Medicine, RIKEN, Tokyo, Japan; (2) The Institute of Medical Science, The University of Tokyo, Tokyo, Japan

    Advances in high-throughput genotyping technology have enabled us to identify remarkably dense SNP (Single Nucleotide Polymorphism) genotype markers on human chromosomes [1]. Linkage disequilibrium (LD) is a topic of interest because it impacts the search for disease-susceptibility loci in genome-wide association studies [2], [3] and can reveal underlying historical and biological processes, such as selection [4], [5], mutation [6], recombination [7], [8] and population history [9].
    Graphical representations of LD for multiple-SNP genotypes have been developed to assess the presence of LD in practical data sets. Various pairwise LD statistics, such as D′ or r² (reviewed in, e.g., [10]), can be shown by triangular heat map displays [11] in which the color shading indicates the strength and distribution of the pairwise LD. A segment with consistently high LD, a so-called LD block, is visually apparent in such displays. From these displays, it is clear that LD is discontinuous and heterogeneous over entire human chromosomes [7]. To incorporate such heterogeneity into further genetic and statistical analyses, the visualization of pairwise LD is now being recognized as a way to maximize insight into the LD present in multiple-SNP genotype data.
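
Two widely used pairwise LD statistics are D′ and r². Below is a minimal sketch of how they could be computed for one pair of biallelic SNPs from phased haplotypes, using the standard definitions D = p(AB) − p(A)p(B), r² = D² / (p(A)(1−p(A))p(B)(1−p(B))) and D′ = |D| / Dmax. The function name, the allele labels and the toy haplotypes are my own; this is not code from the Textile Plot or Haploview papers.

```python
def pairwise_ld(haplotypes, allele_a="A", allele_b="B"):
    """Return (D, D', r^2) for two biallelic SNPs.

    haplotypes: list of 2-character strings, one per phased chromosome;
    character 0 is the allele at the first SNP, character 1 at the second.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == allele_a for h in haplotypes) / n
    p_b = sum(h[1] == allele_b for h in haplotypes) / n
    p_ab = sum(h[0] == allele_a and h[1] == allele_b for h in haplotypes) / n

    d = p_ab - p_a * p_b
    # The largest |D| reachable with these allele frequencies depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared


# Perfect LD: the alleles always travel together.
print(pairwise_ld(["AB"] * 5 + ["ab"] * 5))
# No LD: every combination is equally common.
print(pairwise_ld(["AB", "Ab", "aB", "ab"]))
```

A heat map of the kind described above would simply colour each SNP pair by one of these values.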

    http://bioinformatics.oxfordjournals.org/content/21/2/263.long
    Haploview: analysis and visualization of LD and haplotype maps

    J. C. Barrett, B. Fry, J. Maller and M. J. Daly
    http://www.broad.mit.edu/mpg/haploview/

    The role of leucine-rich repeat kinase 2 (LRRK2) in Parkinson's disease

    http://www.nature.com/nrn/journal/v11/n12/full/nrn2935.html

    Nature Reviews Neuroscience 11, 791-797 (December 2010) | doi:10.1038/nrn2935

    The role of leucine-rich repeat kinase 2 (LRRK2) in Parkinson's disease

    Mark R. Cookson

    Sunday, May 8, 2011

    Important

    Don't lose sight of what is *really* important. - redxink

    Beauty - Havelock Ellis

    "The absence of flaw in beauty is itself a flaw."
    Havelock Ellis

    Thursday, May 5, 2011

    PLoS Ten Simple Rules

    PLoS 10 Simple Rules

    Here's the collection
    http://www.ploscollections.org/downloads/TenSimpleRulesCollection.pdf

    http://www.ploscollections.org/article/browseIssue.action?issue=info:doi/10.1371/issue.pcol.v03.i01

    http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002108

    http://www.sicb.org/newsletters/fa97nl/sicb/poster.html

    http://www.med.ubc.ca/__shared/assets/patrick_poster6964.pdf

    Manuscript review
    http://interactive.snm.org/docs/A_systematic_guide_to_reviewing_a_manuscript.pdf

    F1000 Posters
    http://f1000.com/search/posters_beta

    Succeed


    Always bear in mind that your own resolution to succeed is more important than any one thing.

    Abraham Lincoln
    16th president of US (1809 - 1865)

    CHI2011 - Computer Human Interface Conference

    http://chi2011.org/program/index.html

    Antibody finder

    http://www.antibodybeyond.com/abfinder/absearchengine.htm

    LRRK2 GENETICS AND EXPRESSION IN THE PARKINSONIAN BRAIN

    2010
    Queen Square Brain Bank, Department of Molecular
    Neuroscience, Institute of Neurology
    & Institute of Human Genetics and Health,
    University College London
    Simone Sharma

    Genotype Imputation

    This technique allows geneticists to accurately evaluate the evidence for association at genetic markers that are not directly genotyped.

    imputing missing genotypes for a set of individuals using information on their close relatives.

    Added predictive value of high-throughput molecular data to clinical data and its validation

    Brief Bioinform. 2011 Jan 18. [Epub ahead of print]
    Added predictive value of high-throughput molecular data to clinical data and its validation.
    Boulesteix AL, Sauerbrei W.

    In this article, we have reviewed a number of procedures that can be used to validate added predictive value based on validation data as well as methods to assess added predictive value using a single training data set.

    Wednesday, May 4, 2011

    The Google Story

    Vise, David A., and Mark Malseed. The Google Story: Inside the Hottest Business, Media and Technology Success of Our Time. Paperback ed. Dell Pub., 2006.

    The company name Google is a misspelling of the word "Googol"[3] made by founders Larry Page and Sergey Brin, as described in the book The Google Story by David A. Vise.

    Connectomics

    http://news.harvard.edu/gazette/story/2011/03/web-crawling-the-brain/
    http://www.ted.com/talks/sebastian_seung.html
    http://en.wikipedia.org/wiki/Connectome
    http://blog.ted.com/2010/07/15/report_from_ted_8/

    Leo Tolstoy

    "Everyone thinks of changing the world, but no one thinks of changing himself."

    Tuesday, May 3, 2011

    Education: The PhD factory

    http://www.nature.com/news/2011/110419/full/472276a.html

    The Sequence Alignment/Map format and SAMtools.

    BFAST facilitates the fast and accurate mapping of short reads to reference sequences, where mapping billions of short reads with variants is of utmost importance.
    http://sourceforge.net/projects/bfast/


    Tracetuner is a tool for base and quality calling of trace files from DNA sequencing instruments. Originally developed by Paracel, this code base was released as open source in 2006 by Celera.

    http://sourceforge.net/projects/tracetuner/



    Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
    http://bowtie-bio.sourceforge.net/index.shtml

    http://www.ncbi.nlm.nih.gov/pubmed/19505943


    Bioinformatics. 2009 Aug 15;25(16):2078-9. Epub 2009 Jun 8.

    The Sequence Alignment/Map format and SAMtools.

    Source

    Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, CB10 1SA, UK, Broad Institute of MIT and Harvard, Cambridge, MA 02141, USA.

    Abstract

    SUMMARY: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. AVAILABILITY: http://samtools.sourceforge.net.
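
To make the format concrete, here is a small sketch that splits one SAM alignment line into its 11 mandatory tab-separated fields (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL) and checks two FLAG bits. The class name, helper names and the example read are hypothetical; for real work SAMtools or a library built on it would be the better choice.

```python
from typing import NamedTuple, Tuple


class SamRecord(NamedTuple):
    qname: str
    flag: int
    rname: str
    pos: int      # 1-based leftmost mapping position
    mapq: int
    cigar: str
    rnext: str
    pnext: int
    tlen: int
    seq: str
    qual: str
    tags: Tuple[str, ...]  # optional TAG:TYPE:VALUE fields, left unparsed


def parse_sam_line(line):
    """Parse one alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("expected at least 11 tab-separated fields")
    return SamRecord(
        fields[0], int(fields[1]), fields[2], int(fields[3]), int(fields[4]),
        fields[5], fields[6], int(fields[7]), int(fields[8]),
        fields[9], fields[10], tuple(fields[11:]),
    )


def is_unmapped(record):
    return bool(record.flag & 0x4)    # FLAG bit 0x4: segment unmapped


def is_reverse_strand(record):
    return bool(record.flag & 0x10)   # FLAG bit 0x10: reverse complemented


# Made-up alignment line for illustration.
example_line = "read1\t16\tchr1\t100\t60\t5M\t*\t0\t0\tACGTA\tIIIII\tNM:i:0\n"
record = parse_sam_line(example_line)
print(record.rname, record.pos, record.cigar, is_reverse_strand(record))
```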