Wednesday, August 31, 2011

Mayo Clinic Finds Genetic Variation That Protects Against Parkinson's Disease

http://www.marketwatch.com/story/mayo-clinic-finds-genetic-variation-that-protects-against-parkinsons-disease-2011-08-31

"The idea that Parkinson's disease occurs mostly in a random sporadic fashion is changing," says lead investigator Owen Ross, Ph.D., a neuroscientist at Mayo Clinic Florida. "Our study, one of the largest to date in the study of the genetics of Parkinson's disease, shows that a single gene, LRRK2, harbors both rare and common variants that in turn alter the susceptibility to PD in diverse populations."

In 2004, Mayo researchers led by Dr. Wszolek discovered that the little-understood LRRK2 gene was responsible for causing a form of "familial" or inherited Parkinson's. "Through this study and subsequent follow-up investigation, we and others identified a LRRK2 variant (G2019S) which turned out to be the most common genetic cause of familial PD yet found. For example, it is found in more than 30 percent of Arab-Berber patients with the disease," Wszolek says. To date, seven such familial pathogenic LRRK2 variants have been discovered in different ethnic populations.

http://health.usnews.com/health-news/family-health/boomer-health/articles/2011/08/31/more-evidence-links-genes-to-parkinsons
TUESDAY, Aug. 30 (HealthDay News) -- A genetic variation that reduces the risk of Parkinson's disease by nearly 20 percent in many populations has been found by an international team of scientists.

"The finding that some variants in the LRRK2 gene can reduce the risk of Parkinson's disease is very interesting," said Dr. Andrew Feigin, associate professor of neurology and molecular medicine at the Feinstein Institute for Medical Research in Manhasset, N.Y. "[It] suggests that in addition to there likely being numerous genetic mutations that increase the risk of apparently sporadic disorders, there are also likely to be many genetic variations that reduce risk," he added.

Localizing language in the brain

http://www.labspaces.net/112920/Localizing_language_in_the_brain_

Fedorenko, E., & Kanwisher, N. (2011). Some Regions within Brocaʼs Area Do Respond More Strongly to Sentences than to Linguistically Degraded Stimuli: A Comment on Rogalsky and Hickok (2011). Journal of Cognitive Neuroscience. 23(10): 2632-2635.

http://web.mit.edu/bcs/nklab/publications.shtml

"Brains are different in their folding patterns, and where exactly the different functional areas fall relative to these patterns," Fedorenko says. "The general layout is similar, but there isn't fine-grained matching." So, she says, analyzing data by "aligning brains in some common space … is just never going to be quite right."

Ideally, then, data would be analyzed for each subject individually; that is, patterns of activity in one brain would only ever be compared to patterns of activity from that same brain. To do this, the researchers spend the first 10 to 15 minutes of each fMRI scan having their subject do a fairly sophisticated language task while tracking brain activity. This way, they establish where the language areas lie in that individual subject, so that later, when the subject performs other cognitive tasks, they can compare those activation patterns to the ones elicited by language.

Rare De Novo Variants Associated with Autism Implicate a Large Functional Network of Genes Involved in Formation and Function of Synapses

http://www.cell.com/neuron/abstract/S0896-6273%2811%2900439-9

Authors
Sarah R. Gilman, Ivan Iossifov, Dan Levy, Michael Ronemus, Michael Wigler, Dennis Vitkup

Highlights
* Rare de novo CNVs associated with autism contain functionally connected genes
* NETBAG method identifies a significant functional network affected by rare variants
* Identified network is related to synaptogenesis, axon guidance, and neuronal motility
* Genes perturbed in females carry more weight in the network than genes in males

Summary

Identification of complex molecular networks underlying common human phenotypes is a major challenge of modern genetics. In this study, we develop a method for network-based analysis of genetic associations (NETBAG). We use NETBAG to identify a large biological network of genes affected by rare de novo CNVs in autism. The genes forming the network are primarily related to synapse development, axon targeting, and neuron motility. The identified network is strongly related to genes previously implicated in autism and intellectual disability phenotypes. Our results are also consistent with the hypothesis that significantly stronger functional perturbations are required to trigger the autistic phenotype in females compared to males. Overall, the presented analysis of de novo variants supports the hypothesis that perturbed synaptogenesis is at the heart of autism. More generally, our study provides proof of the principle that networks underlying complex human phenotypes can be identified by a network-based functional analysis of rare genetic variants.

Janet Thornton, Director of the European Bioinformatics Institute

http://www.research-europe.com/index.php/2011/08/janet-thornton-director-of-the-european-bioinformatics-institute/

EMBRACE is working towards integrating major databases and software tools in bioinformatics, using existing methods and emerging Grid service technologies, driven by an expanding set of test problems representing key issues for bioinformatics service providers and end-user biologists.

BioSapiens supported a large-scale, concerted effort to annotate genome data by laboratories distributed around Europe, using both informatics tools and input from experimentalists. It developed new methods for functional analysis in silico and provided servers and clients for annotating genomes and proteomes with a range of different types of biological data. These methods were applied to specific challenges, including annotating the human genome.

ENFIN aims to provide Europe-wide integration of computational approaches to systems biology. It has developed a suite of analysis tools for systems biologists and platforms to integrate these tools. Perhaps most importantly, all three networks have contributed significantly towards building a European Research Area for bioinformatics, especially by being a major influence in the development of ELIXIR.

RankProd: a Bioconductor package for detecting differentially expressed genes in meta-analysis

Summary: While meta-analysis provides a powerful tool for analyzing microarray experiments by combining data from multiple studies, it presents unique computational challenges. The Bioconductor package RankProd provides a new and intuitive tool for this purpose in detecting differentially expressed genes under two experimental conditions. The package modifies and extends the rank product method proposed by Breitling et al. [(2004) FEBS Lett., 573, 83–92] to integrate multiple microarray studies from different laboratories and/or platforms. It offers several advantages over t-test based methods and accepts pre-processed expression datasets produced from a wide variety of platforms. The significance of the detection is assessed by a non-parametric permutation test, and the associated P-value and false discovery rate (FDR) are included in the output alongside the genes that are detected by user-defined criteria. A visualization plot is provided to view actual expression levels for each gene with estimated significance measurements.

Availability: RankProd is available at Bioconductor http://www.bioconductor.org. A web-based interface will soon be available at http://cactus.salk.edu/RankProd

Contact: fhong@salk.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
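
Note to self: the core rank product statistic is simple enough to sketch. The snippet below is only an illustration of the basic idea from Breitling et al. (geometric mean of a gene's ranks across replicates), not the RankProd package's implementation. The function name and the toy fold changes are made up, ties are broken arbitrarily, and the permutation-based P-value/FDR step is left out.

```python
import math

def rank_products(fold_changes):
    """fold_changes maps gene -> list of log fold changes, one per replicate.

    Returns gene -> geometric mean of the gene's ranks across replicates,
    where rank 1 is the most up-regulated gene in a replicate.
    """
    genes = list(fold_changes)
    n_reps = len(next(iter(fold_changes.values())))
    log_rank_sums = {g: 0.0 for g in genes}
    for r in range(n_reps):
        # Rank genes within this replicate, largest fold change first.
        ordered = sorted(genes, key=lambda g: fold_changes[g][r], reverse=True)
        for rank, g in enumerate(ordered, start=1):
            log_rank_sums[g] += math.log(rank)
    # exp(mean(log(rank))) is the geometric mean of the ranks.
    return {g: math.exp(s / n_reps) for g, s in log_rank_sums.items()}

# Hypothetical toy data: 4 genes, 3 replicates.
toy = {
    "geneA": [2.1, 1.8, 2.5],
    "geneB": [0.1, -0.2, 0.3],
    "geneC": [1.0, 2.0, 0.9],
    "geneD": [-1.5, -1.0, -2.0],
}
rp = rank_products(toy)
for gene in sorted(rp, key=rp.get):
    print(gene, round(rp[gene], 3))
```

A small rank product means the gene sits near the top of the list in every replicate, which is what makes the statistic robust to differences in scale between studies.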

Tuesday, August 30, 2011

Multiregional gene expression profiling identifies MRPS6 as a possible candidate gene for Parkinson's disease.

http://www.ncbi.nlm.nih.gov/pubmed/17193926

Combining large-scale gene expression approaches and bioinformatics may provide insights into the molecular variability of biological processes underlying neurodegeneration. To identify novel candidate genes and mechanisms, we conducted a multiregional gene expression analysis in postmortem brain. Gene arrays were performed utilizing Affymetrix HG U133 Plus 2.0 gene chips. Brain specimens from 21 different brain regions were taken from Parkinson's disease (PD) (n = 22) and normal aged (n = 23) brain donors. The rationale for conducting a multiregional survey of gene expression changes was based on the assumption that if a gene is changed in more than one brain region, it may be a higher probability candidate gene compared to genes that are changed in a single region. Although no gene was significantly changed in all of the 21 brain regions surveyed, we identified 11 candidate genes whose pattern of expression was regulated in at least 18 out of 21 regions. The expression of a gene encoding the mitochondrial ribosomal protein S6 (MRPS6) had the highest combined mean fold change and topped the list of regulated genes. The analysis revealed other genes related to apoptosis, cell signaling, and cell cycle that may be of importance to disease pathophysiology. High throughput gene expression is an emerging technology for molecular target discovery in neurological and psychiatric disorders. The top gene reported here is the nuclear encoded MRPS6, a building block of the human mitoribosome of the oxidative phosphorylation system (OXPHOS). Impairments in mitochondrial OXPHOS have been linked to the pathogenesis of PD.

Rostral, caudal

rostral "beak" (think rooster) = anterior = head
caudal "tail" = posterior

http://en.wikipedia.org/wiki/Anatomical_terms_of_location

Published studies examining alpha-synuclein expression in Parkinson's Disease

The ups and downs of alpha-synuclein mRNA expression
http://www.ncbi.nlm.nih.gov/pubmed/17094104

- selecting an appropriate "housekeeping" gene for expression normalization is very important
- used four housekeeping genes GAPDH, synaptophysin, HPRT, YWHAZ

brain regions with varying levels of alpha-synuclein pathology:
- occipital lobe (resistant)
- putamen (intermediate)
- amygdala (vulnerable)
- substantia nigra (highly vulnerable)

Island Getaways

Bora Bora, French Polynesia, Oceania (South Pacific Ocean)
http://wikitravel.org/en/Bora_Bora

Cook Islands, Oceania, Polynesia
http://wikitravel.org/en/Cook_Islands

Monday, August 29, 2011

Parkinson's Disease and alpha-synuclein papers

Variant in the 3' region of SNCA associated with Parkinson's disease and serum α-synuclein levels.
http://www.ncbi.nlm.nih.gov/pubmed/21853288?dopt=Abstract

The effect of S129 phosphorylation on the interaction of α-synuclein with synaptic and cellular membranes.
http://www.ncbi.nlm.nih.gov/pubmed/21849493?dopt=Abstract

Parkinson's disease induced pluripotent stem cells with triplication of the α-synuclein locus.
http://www.ncbi.nlm.nih.gov/pubmed/21863007?dopt=Abstract

Phosphorylated α-synuclein can be detected in blood plasma and is potentially a useful biomarker for Parkinson's disease.

Post mortem cerebrospinal fluid α-synuclein levels are raised in multiple system atrophy and distinguish this from the other α-synucleinopathies, Parkinson's disease and Dementia with Lewy bodies.
http://www.pdonlineresearch.org/resources/citations/post-mortem-cerebrospinal-fluid-synuclein-levels-are-raised-multiple-system-atro

Genetic Regulation of α-Synuclein mRNA Expression in Various Human Brain Tissues
http://www.plosone.org/article/info:doi%2F10.1371%2Fjournal.pone.0007480

SNPs that affect expression (eSNPs)

SNPs that affect expression (eSNPs) - not in protein-coding regions
http://www.nature.com/ng/journal/v43/n7/full/ng.859.html
http://en.wikipedia.org/wiki/Single-nucleotide_polymorphism

regulatory SNPs (rSNPs), or QTL (quantitative trait loci)

PhenoGen
http://www.ncbi.nlm.nih.gov/pubmed/17760997
http://phenogen.ucdenver.edu/PhenoGen/index.jsp

Review
http://www.ncbi.nlm.nih.gov/pubmed/18668207

Plant system (might reference human databases)
http://www.ncbi.nlm.nih.gov/pubmed/20964836

GeneNetwork (webqtl)
http://www.genenetwork.org/webqtl/main.py
http://www.genenetwork.org/tutorial/WebQTLTour/

GeneNetwork is a group of linked data sets and tools used to study complex networks of genes, molecules, and higher order gene function and phenotypes. GeneNetwork combines more than 25 years of legacy data generated by hundreds of scientists together with sequence data (SNPs) and massive transcriptome data sets (expression genetic or eQTL data sets). The quantitative trait locus (QTL) mapping module that is built into GN is optimized for fast on-line analysis of traits that are controlled by combinations of gene variants and environmental factors. GeneNetwork can be used to study humans, mice (BXD, AXB, LXS, etc.), rats (HXB), Drosophila, and plant species (barley and Arabidopsis). Most of these population data sets are linked with dense genetic maps (genotypes) that can be used to locate the genetic modifiers that cause differences in expression and phenotypes, including disease susceptibility.

Users are welcome to enter their own private data directly into GeneNetwork to exploit the full range of analytic tools and to map modulators in a powerful environment. This combination of data and fast analytic functions enables users to study relations between sequence variants, molecular networks, and function.

Friday, August 26, 2011

Shut down Linux automatically

www.cyberciti.biz/tips/howto-shutdown-linux-box-automatically.html

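# start the at daemon in the background (skip if it is already running)
$ sudo atd

# schedule a halt for 8pm: type the command at the at> prompt, then press Ctrl+D to finish
$ at 8pm
at> halt
Ctrl+D

# list pending jobs
$ atq

# remove job 1
$ atrm 1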

cron is the alternative for recurring scheduled shutdowns

Ninja Scroll (1993) - Jûbê ninpûchô

http://www.imdb.com/title/tt0107692/

A ninja-for-hire is forced into fighting an old nemesis who is bent on overthrowing the Japanese government. His nemesis is also the leader of a group of demons each with superhuman powers.

A journeyman ninja by the name of Jubei stumbles upon a plague, an evil clan of demons, a national crisis, and a beautiful ninja girl.

Comparative analysis of algorithms for next-generation sequencing read alignment.


http://www.ncbi.nlm.nih.gov/pubmed/21856737


MOTIVATION:

The advent of next-generation sequencing (NGS) techniques presents many novel opportunities for many applications in life sciences. The vast number of short reads produced by these techniques, however, poses significant computational challenges. The first step in many types of genomic analysis is the mapping of short reads to a reference genome, and several groups have developed dedicated algorithms and software packages to perform this function. As the developers of these packages optimize their algorithms with respect to various considerations, the relative merits of different software packages remain unclear. However, for scientists who generate and use NGS data for their specific research projects, an important consideration is choosing the software that is most suitable for their application.

RESULTS:

With a view to comparing existing short read alignment software, we develop a simulation and evaluation suite, Seal, which simulates NGS runs for different configurations of various factors, including sequencing error, indels, and coverage. We also develop criteria to compare the performances of software with disparate output structure (e.g., some packages return a single alignment while some return multiple possible alignments). Using these criteria, we comprehensively evaluate the performances of Bowtie, BWA, mr- and mrsFAST, Novoalign, SHRiMP and SOAPv2, with regard to accuracy and runtime.

CONCLUSION:

We expect that the results presented here will be useful to investigators in choosing the alignment software that is most suitable for their specific research aims. Our results also provide insights into the factors that should be considered to use alignment results effectively. Seal can also be used to evaluate the performance of algorithms that use deep sequencing data for various purposes (e.g., identification of genomic variants).

AVAILABILITY:

Seal is available as open-source at http://compbio.case.edu/seal/.

Emulate Galaxy's Join operation on intervals with R's merge()

http://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right

> df1 <- data.frame(CustomerId=c(1:6),Product=c(rep("Toaster",3),rep("Radio",3)))
> df2 <- data.frame(CustomerId=c(2,4,6),State=c(rep("Alabama",2),rep("Ohio",1)))
> df3 <- data.frame(CustomerId=c(3,4,1),Food=c(rep("Pancake",2),rep("Cereal",1)))

> merge(merge(df1, df2, all.x=TRUE), df3, all.x=TRUE)
  CustomerId Product   State    Food
1          1 Toaster    <NA>  Cereal
2          2 Toaster Alabama    <NA>
3          3 Toaster    <NA> Pancake
4          4   Radio Alabama Pancake
5          5   Radio    <NA>    <NA>
6          6   Radio    Ohio    <NA>

hmm... problem is that the intervals are not exactly the same (like in intersectBed)
but wait! you CAN use intersectBed and read each BED file into R!
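
For reference, the reason merge() falls short here is that it only joins on exactly equal keys, while interval tools join on overlap. A rough Python sketch of the overlap logic, assuming BED-style 0-based half-open coordinates. The function name and toy intervals are hypothetical, and this brute-force loop is only an illustration of the idea, not how intersectBed is implemented:

```python
def intersect_intervals(a, b):
    """a and b are lists of (chrom, start, end) in BED-style 0-based,
    half-open coordinates. Returns the overlapping segment for every
    overlapping pair, in the order the pairs are encountered."""
    out = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            # The shared segment runs from the later start to the earlier end.
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                out.append((chrom_a, start, end))
    return out

# Hypothetical toy intervals.
set_a = [("chr1", 100, 200), ("chr1", 300, 400)]
set_b = [("chr1", 150, 350), ("chr2", 100, 200)]
overlaps = intersect_intervals(set_a, set_b)
print(overlaps)
```

This is quadratic in the number of intervals, so it is only sensible for small files. For anything genome-scale, running intersectBed first and loading the result is the better route.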


You can also aggregate using the aggregate() function


> df1
  CustomerId Product
1          1 Toaster
2          2 Toaster
3          3 Toaster
4          4   Radio
5          5   Radio
6          6   Radio
> aggregate(df1$CustomerId, by=list(df1$Product), FUN=mean)
  Group.1 x
1   Radio 5
2 Toaster 2


http://www.statmethods.net/management/aggregate.html

Twins - Monozygotic (one egg, identical) vs Dizygotic (two eggs, fraternal)

http://www.wisegeek.com/what-is-the-difference-between-monozygotic-and-dizygotic-twins.htm
http://answers.yahoo.com/question/index?qid=20061119133111AAEtCnl

Fraternal twins occur when two separate eggs are fertilized at the same time. They can be boy/boy, girl/girl or boy/girl.

Identical twins occur when the egg splits into two embryos after it has been fertilized. They have to be of the same sex.

Roadtrip

http://www.hellobc.com/roadtrip

Relativity

"When you are courting a nice girl an hour seems like a second. When you sit on a red-hot cinder a second seems like an hour. That's relativity." --Albert Einstein

Thursday, August 25, 2011

Metabolomics (sort of like proteomics)

www1.imperial.ac.uk/medicine/people/e.want



Dr Elizabeth J Want - Faculty of Medicine - Imperial College London




study on inborn error http://en.wikipedia.org/wiki/Inborn_error_of_metabolism

screening: Plasma acylcarnitines analysis by mass spectrometry


metabolite DBs: hmp, ChemSpider


PCA loadings = variables


PLS, OPLS - supervised (as opposed to PCA) multivariate data analysis approach


microbiome analysis, toxicology

20 years of Linux

I'll be celebrating 20 years of Linux with
The Linux Foundation!

Using biomedical networks to prioritize gene–disease associations

http://www.dovepress.com/articles.php?article_id=8144

Understanding the genetic foundations of genetic diseases, such as cancer, Alzheimer disease, or Huntington’s disease, is critical to the development of new diagnostics and treatments. Several computational methods have been used to speed up the discovery process, e.g., by selecting the molecular targets for a given disease. However, despite the achievements obtained over recent years, better solutions are still required. This paper presents an innovative computational method that addresses the problem of using dispersed biomedical knowledge to select the best candidate genes associated with a disease. The method uses a network representation of current biomedical knowledge that includes biomolecular concepts such as genes, diseases, pathways, and biological processes. It also applies information extraction techniques to enrich the network with more dynamic and updated data. A biologically inspired algorithm is applied to this network in order to identify association levels between genes and diseases. The solution proposed here surpasses many limitations of previous methods such as the need for training data. The validation applied demonstrates that the proposed method has the best overall results compared with state-of-the-art methods as it also performs especially well for the critical top-rank positions. We believe this method represents a major advance over previous work and that it will be a key tool for future gene–disease association studies.


Keywords: gene–disease, biomedical networks, prioritization, computational method

Autism

http://www.sciencedirect.com/science/article/pii/S1071909111000040


The NeuroDevNet Autism Spectrum Disorders Demonstration Project 

WIG (wiggle) to BED format conversion

http://genomewiki.ucsc.edu/images/9/9d/FixStepToBedGraph_pl.txt

http://genomewiki.ucsc.edu/index.php/Wiggle_BED_to_variableStep_format_conversion
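
A rough Python sketch of the fixedStep case, for when the Perl script is not at hand. It assumes the UCSC conventions that wiggle positions are 1-based while bedGraph/BED coordinates are 0-based and half-open. The function name is made up, and variableStep blocks are deliberately not handled:

```python
def fixed_step_to_bedgraph(lines):
    """Convert fixedStep wiggle lines to (chrom, start, end, value) tuples
    in bedGraph coordinates (0-based, half-open)."""
    chrom, pos, step, span = None, 0, 1, 1
    out = []
    for line in lines:
        line = line.strip()
        # Skip blanks, comments, and track/browser header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        if line.startswith("fixedStep"):
            fields = dict(f.split("=", 1) for f in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"]) - 1  # 1-based wiggle -> 0-based bedGraph
            step = int(fields.get("step", 1))
            span = int(fields.get("span", 1))
            continue
        if chrom is None:
            raise ValueError("data line found before a fixedStep declaration")
        out.append((chrom, pos, pos + span, float(line)))
        pos += step
    return out

# Hypothetical input block.
wig = [
    "fixedStep chrom=chr3 start=400601 step=100 span=5",
    "11",
    "22",
    "33",
]
rows = fixed_step_to_bedgraph(wig)
for chrom, start, end, value in rows:
    print(chrom, start, end, value, sep="\t")
```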

Brain plasticity

http://blogs.psychcentral.com/relationships/2011/08/three-ways-your-brain-adapts-to-change-1-of-3/

Among the most amazing discoveries is plasticity, a remarkable capacity that allows the brain to generate new neurons all the time, according to neuroscientist Antonio Damasio, and to renew and to reorganize its basic structure of neurons indefinitely.

I. Strengthens existing behaviors.

Voronoi diagram


http://www.cs.sunysb.edu/~algorith/files/voronoi-diagrams.shtml

Input Description: A set S of points p_1,...,p_n.


Problem: Decompose the space into regions around each point, such that all the points in the region around p_i are closer to p_i than any other point in S.



http://en.wikipedia.org/wiki/Voronoi_diagram

In mathematics, a Voronoi diagram is a special kind of decomposition of a metric space determined by distances to a specified discrete set of objects in the space, e.g., by a discrete set of points.



In the simplest case, we are given a set of points S in the plane, which are the Voronoi sites. Each site s has a Voronoi cell, also called a Dirichlet cell, V(s) consisting of all points closer to s than to any other site. The segments of the Voronoi diagram are all the points in the plane that are equidistant to the two nearest sites. The Voronoi nodes are the points equidistant to three (or more) sites.
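
The cell definition above can be sketched by brute force: label every grid point with the index of its nearest site, and the points sharing a label form that site's cell. This is only an illustration with made-up function names and sites, using Euclidean distance. Real implementations use something like Fortune's sweep-line algorithm rather than checking every site for every point.

```python
import math

def nearest_site(point, sites):
    """Index of the site closest to point (ties go to the lower index)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))

def voronoi_labels(sites, width, height):
    """Label each integer grid point (x, y) with the index of its nearest site."""
    return [[nearest_site((x, y), sites) for x in range(width)]
            for y in range(height)]

# Hypothetical sites on a small grid.
sites = [(0, 0), (4, 0), (2, 4)]
grid = voronoi_labels(sites, 5, 5)
for row in grid:
    print("".join(str(label) for label in row))
```

The same nearest_site function is a naive nearest-neighbor query, which is the link to the point-location application mentioned further down.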

- One of the early applications of Voronoi diagrams was by John Snow to study the epidemiology of the 1854 Broad Street cholera outbreak in Soho, England. He showed the correlation between areas on the map of London using a particular water pump, and the areas with most deaths due to the outbreak.
- A point location data structure can be built on top of the Voronoi diagram in order to answer nearest neighbor queries, where one wants to find the object that is closest to a given query point. Nearest neighbor queries have numerous applications. For example, one might want to find the nearest hospital, or the most similar object in a database. A large application is vector quantization, commonly used in data compression.
- With a given Voronoi diagram, one can also find the largest empty circle amongst a set of points, and in an enclosing polygon; e.g. to build a new supermarket as far as possible from all the existing ones, lying in a certain city.
- The Voronoi diagram is useful in polymer physics. It can be used to represent free volume of the polymer.
- It is also used in derivations of the capacity of a wireless network.
- In climatology, Voronoi diagrams are used to calculate the rainfall of an area, based on a series of point measurements. In this usage, they are generally referred to as Thiessen polygons.
- Voronoi diagrams are used to study the growth patterns of forests and forest canopies, and may also be helpful in developing predictive models for forest fires.
- Voronoi diagrams are also used in computer graphics to procedurally generate some kinds of organic looking textures.
- In autonomous robot navigation, Voronoi diagrams are used to find clear routes. If the points are obstacles, then the edges of the graph will be the routes furthest from obstacles (and theoretically any collisions).
- In computational chemistry, Voronoi cells defined by the positions of the nuclei in a molecule are used to compute atomic charges. This is done using the Voronoi deformation density method.
- Voronoi polygons have been used in mining to estimate the reserves of valuable materials, minerals or other resources. Exploratory drillholes are used as the set of points in the Voronoi polygons.

Problem and Solution

"Every solution breeds new problems." 


-- Murphy's Law

Tuesday, August 23, 2011

SenseLab

http://senselab.med.yale.edu/

SenseLab is a long term project to build integrated, multidisciplinary models of neurons and neural systems. The project at present involves a set of eight interrelated databases, containing data on membrane properties of neurons and neuron compartments, with the aim of aiding neuroscientists to integrate this data into computational models of neurons and neuronal microcircuits. The project involves novel informatics approaches to constructing databases and database tools for archiving and comparing neuronal properties across neurons in different brain regions, and providing for efficient interoperability with other neuroscience databases. SenseLab was a member of the original Human Brain Project, and is a current member of the Neuroscience Information Framework.

Cistrome: an integrative platform for transcriptional regulation studies

http://genomebiology.com/2011/12/8/R83/abstract
The increasing volume of ChIP-chip and ChIP-seq data being generated creates a challenge for standard, integrative and reproducible bioinformatics data analysis platforms. We developed a web-based application called Cistrome, based on the Galaxy open source framework. In addition to the standard Galaxy functions, Cistrome has 29 ChIP-chip- and ChIP-seq-specific tools in three major categories, from preliminary peak calling and correlation analyses to downstream genome feature association, gene expression analyses, and motif discovery. Cistrome is available at http://cistrome.org/ap/.

Yourself

"Don't compromise yourself. You are all you've got."
--Janis Joplin

Monday, August 22, 2011

The brain-specific microRNA miR-128b regulates the formation of fear-extinction memory

http://www.nature.com/neuro/journal/vaop/ncurrent/full/nn.2891.html

miR-128b is highly expressed in the frontal cortex, and its host gene, regulator of calmodulin signaling (Rcs, also known as Arpp21), is essential for mediating dopamine transmission [9], which is critical in the infralimbic prefrontal cortex (ILPFC) for the formation of fear-extinction memories [10].

Installing packages in R

$ R CMD INSTALL cooccur-1471-2105-11-359-S1.GZ -l ~/R/library/

http://math.usask.ca/~longhai/software/installrpkg.html

How to Turn Your iPod touch into an iPhone: 4G Edition

http://lifehacker.com/5636976/how-to-turn-your-ipod-touch-into-an-iphone-4g-edition

Line2 is designed to be a complete phone application, and it delivers. The app itself is free to download and try, but after 30 days you'll need to pay $10/month.

Friday, August 19, 2011

Anime Movies

http://www.comicbookmovie.com/fansites/MelaninMan/news/?a=18610

Mapping Human Cortical Areas In Vivo Based on Myelin Content as Revealed by T1- and T2-Weighted MRI

Myelin evolved as a way to speed conduction along axons, and is present in most long-distance projection neurons of the CNS. Corticospinal projections may be more heavily myelinated because they must traverse long distances to reach their spinal cord targets. These distances are much greater for lower versus upper body projections, and larger axons and heavier myelination may help offset the additional time delay.

http://www.jneurosci.org/content/31/32/11597.full?sid=801aa77f-a396-4cf3-a1cc-262ee432e07b

Regions overlap significantly?

http://biostar.stackexchange.com/questions/5501/how-do-you-calculate-if-two-sets-of-genomic-regions-overlap-significantly

TF co-occurrence in the genome
conserved TSS
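
One common way to approach the question is a permutation test: keep one region set fixed, drop the other set at random positions many times, and ask how often the random overlap count reaches the observed one. The sketch below is an assumption-laden illustration and not necessarily what the linked thread recommends. It treats the genome as a single chromosome of known length, ignores gaps and mappability, and uses made-up function names and toy regions.

```python
import random

def count_overlaps(a, b):
    """Number of intervals in b that overlap at least one interval in a.
    Intervals are (start, end), 0-based half-open."""
    return sum(any(s < a_end and a_start < e for a_start, a_end in a)
               for s, e in b)

def overlap_pvalue(a, b, genome_len, n_perm=1000, seed=0):
    """Empirical p-value for the observed overlap count, obtained by
    re-placing each interval of b uniformly at random (length preserved)."""
    rng = random.Random(seed)
    observed = count_overlaps(a, b)
    hits = 0
    for _ in range(n_perm):
        shuffled = []
        for s, e in b:
            length = e - s
            new_start = rng.randrange(0, genome_len - length + 1)
            shuffled.append((new_start, new_start + length))
        if count_overlaps(a, shuffled) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy regions on a 1 Mb "genome".
tf_a = [(100, 200), (1000, 1100), (5000, 5100)]
tf_b = [(150, 160), (1050, 1060), (5050, 5060)]
observed, p = overlap_pvalue(tf_a, tf_b, genome_len=1_000_000)
print(observed, p)
```

The null model matters more than the code: uniform placement is the crudest choice, and restricting random positions to promoters or mappable sequence would give a more conservative answer.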

Cold Spring Harbor (CSH) Asia 2011 Abstracts

http://csh-asia.org/Abstract%20Status/a-system2011_absstat.html

A user's guide to the encyclopedia of DNA elements (ENCODE).

http://www.ncbi.nlm.nih.gov/pubmed/21526222

ChIP-seq and RNA-seq

http://www.nature.com/nmeth/journal/v6/n11s/full/nmeth.1371.html

Genome-wide measurements of protein-DNA interactions and transcriptomes are increasingly done by deep DNA sequencing methods (ChIP-seq and RNA-seq). The power and richness of these counting-based measurements comes at the cost of routinely handling tens to hundreds of millions of reads. Whereas early adopters necessarily developed their own custom computer code to analyze the first ChIP-seq and RNA-seq datasets, a new generation of more sophisticated algorithms and software tools are emerging to assist in the analysis phase of these projects. Here we describe the multilayered analyses of ChIP-seq and RNA-seq datasets, discuss the software packages currently available to perform tasks at each layer and describe some upcoming challenges and features for future analysis tools. We also discuss how software choices and uses are affected by specific aspects of the underlying biology and data structure, including genome size, positional clustering of transcription factor binding sites, transcript discovery and expression quantification.

Best Conferences -- GenomeWeb

http://www.genomeweb.com/best-conferences

Thursday, August 18, 2011

Whole Transcriptome Sequencing Reveals Gene Expression and Splicing Differences in Brain Regions Affected by Alzheimer's Disease

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0016266


Recent studies strongly indicate that aberrations in the control of gene expression might contribute to the initiation and progression of Alzheimer's disease (AD). In particular, alternative splicing has been suggested to play a role in spontaneous cases of AD. Previous transcriptome profiling of AD models and patient samples using microarrays delivered conflicting results. This study provides, for the first time, transcriptomic analysis for distinct regions of the AD brain using RNA-Seq next-generation sequencing technology. Illumina RNA-Seq analysis was used to survey transcriptome profiles from total brain, frontal and temporal lobe of healthy and AD post-mortem tissue. We quantified gene expression levels, splicing isoforms and alternative transcript start sites. Gene Ontology term enrichment analysis revealed an overrepresentation of genes associated with a neuron's cytological structure and synapse function in AD brain samples. Analysis of the temporal lobe with the Cufflinks tool revealed that transcriptional isoforms of the apolipoprotein E gene, APOE-001, -002 and -005, are under the control of different promoters in normal and AD brain tissue. We also observed differing expression levels of APOE-001 and -002 splice variants in the AD temporal lobe. Our results indicate that alternative splicing and promoter usage of the APOE gene in AD brain tissue might reflect the progression of neurodegeneration.

A Network of Genes, Genetic Disorders, and Brain Areas

http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0020907

The network-based approach has been used to describe the relationship among genes and various phenotypes, producing a network describing complex biological relationships. Such networks can be constructed by aggregating previously reported associations in the literature from various databases. In this work, we applied the network-based approach to investigate how different brain areas are associated to genetic disorders and genes. In particular, a tripartite network with genes, genetic diseases, and brain areas was constructed based on the associations among them reported in the literature through text mining. In the resulting network, a disproportionately large number of gene-disease and disease-brain associations were attributed to a small subset of genes, diseases, and brain areas. Furthermore, a small number of brain areas were found to be associated with a large number of the same genes and diseases. These core brain regions encompassed the areas identified by the previous genome-wide association studies, and suggest potential areas of focus in the future imaging genetics research. The approach outlined in this work demonstrates the utility of the network-based approach in studying genetic effects on the brain.

NeuronDB

Neurotransmitters by brain regions
http://senselab.med.yale.edu/neurondb/ndbRegions.aspx?sr=1

Divergence of human and mouse brain transcriptome highlights Alzheimer disease pathways

http://www.pnas.org/content/107/28/12698.full

Because mouse models play a crucial role in biomedical research related to the human nervous system, understanding the similarities and differences between mouse and human brain is of fundamental importance. Studies comparing transcription in human and mouse have come to varied conclusions, in part because of their relatively small sample sizes or underpowered methodologies. To better characterize gene expression differences between mouse and human, we took a systems-biology approach by using weighted gene coexpression network analysis on more than 1,000 microarrays from brain. We find that global network properties of the brain transcriptome are highly preserved between species. Furthermore, all modules of highly coexpressed genes identified in mouse were identified in human, with those related to conserved cellular functions showing the strongest between-species preservation. Modules corresponding to glial and neuronal cells were sufficiently preserved between mouse and human to permit identification of cross species cell-class marker genes. We also identify several robust human-specific modules, including one strongly correlated with measures of Alzheimer disease progression across multiple data sets, whose hubs are poorly-characterized genes likely involved in Alzheimer disease. We present multiple lines of evidence suggesting links between neurodegenerative disease and glial cell types in human, including human-specific correlation of presenilin-1 with oligodendrocyte markers, and significant enrichment for known neurodegenerative disease genes in microglial modules. Together, this work identifies convergent and divergent pathways in mouse and human, and provides a systematic framework that will be useful for understanding the applicability of mouse models for human brain disorders.                  

Neuroscience - NRSC 500

http://www.neuroscience.ubc.ca/nrsc500_001.htm#2

from notes posted ....

The Blood-Brain Barrier
The evolutionary selective pressure for developing a barrier came from the need to protect chemical homeostasis around the synaptic active zones from fluctuations.

BBB acts as a physical, metabolic, and transport barrier restricting traffic of nutrients and other molecules (through endothelial cells, astrocytes)

Poses challenges to drug therapy, as to how to transport drugs past this barrier
-> but lipid-soluble molecules can still pass through (e.g., 1st-gen antihistamines for allergies pass through, which is why they make you dizzy)

Stroke and inflammation may cause the BBB to break open and cause harm

In situ hybridization
The cDNA can also be used to determine which cells produce a particular mRNA using in situ hybridization. Labeled nucleic acid is incubated with fixed tissue or cells under conditions where only specifically bound hybrid is stable. Autoradiography reveals the position of endogenous RNA. Controls with RNase help prove that hybridization is to RNA, not genetic material.

Immunochemistry

The visualization of antigens in their normal cellular and tissue
environment. The primary antibody, which detects the antigen of interest,
is generally detected by a secondary antibody, which is linked to an
enzyme (e.g., horseradish peroxidase or alkaline phosphatase), a
chromogen (e.g., FITC, Fast Red, etc.), or, in the case of electron
microscopy, an electron dense material (a gold particle).

Knock-out, Knock-in, Transgenics

Knock-out: a procedure to disrupt the sequence of a gene and
interfere with its expression.
Knock-in: a variation of gene targeting that uses homologous
recombination but allows expression of altered genetic sequences
in place of the endogenous gene. This approach allows the test of
more subtle mutations than is allowed by a simple knock out.
Transgenics: Introduction of a gene under a particular promoter
into the germ line allows propagation of an organism that will
express or even conditionally express a particular gene. The
transgene is expressed in addition to the normal gene.



Molecular Biology of the Cell. Alberts et al.
Chapter:
Manipulating Proteins, DNA, and RNA
(Chapter 8 in 4th or 5th Ed.)
or
Recombinant DNA Technology
(Chapter 7 in 3rd Ed.)

IBM produces first 'brain chips'

http://www.bbc.co.uk/news/technology-14574747

IBM has developed a microprocessor which it claims comes closer than ever to replicating the human brain.

Australian biotechs rejoice

http://www.nature.com/nbt/journal/v29/n8/full/nbt0811-676b.html?WT.ec_id=NATUREjobs-20110818

The Australian Department of Innovation, Industry, Science & Research and The Treasury jointly announced an AUS $1.8 ($1.9) billion R&D tax credit last month aimed at boosting biotech companies and other innovation-oriented firms.

Freedom

"We don't appreciate what we have until it's gone. Freedom is like that. It's like air. When you have it, you don't notice it."
--Boris Yeltsin

Wednesday, August 17, 2011

Convert Clustal to MAF (Multiple Alignment Format) (TBD)

http://biopython.org/wiki/AlignIO
http://web.archiveorange.com/archive/v/5dAwXKUXOF6l2xgl9yNy

from Bio import AlignIO

# Mode "rU" was removed in Python 3.11; plain "r" is enough.
with open("example.clw", "r") as input_handle, open("example.maf", "w") as output_handle:
    alignments = AlignIO.parse(input_handle, "clustal")
    # TBD: writing "maf" needs a Biopython release that includes MAF support
    AlignIO.write(alignments, output_handle, "maf")

TFs

1. Nathaniel D. Heintzman et al., “Histone modifications at human enhancers reflect global cell-type-specific gene expression,” Nature 459, no. 7243 (May 7, 2009): 108-112.

CTCF - insulator
NRSF - repressor
STAT1 - IFN-gamma inducible
p300 - coactivator / enhancer

Mapping and analysis of chromatin state dynamics in nine human cell types

http://www.nature.com/nature/journal/v473/n7345/full/nature09906.html

Chromatin profiling has emerged as a powerful means of genome annotation and detection of regulatory activity. The approach is especially well suited to the characterization of non-coding portions of the genome, which critically contribute to cellular phenotypes yet remain largely uncharted. Here we map nine chromatin marks across nine cell types to systematically characterize regulatory elements, their cell-type specificities and their functional interactions. Focusing on cell-type-specific patterns of promoters and enhancers, we define multicell activity profiles for chromatin state, gene expression, regulatory motif enrichment and regulator expression. We use correlations between these profiles to link enhancers to putative target genes, and predict the cell-type-specific activators and repressors that modulate them. The resulting annotations and regulatory predictions have implications for the interpretation of genome-wide association studies. Top-scoring disease single nucleotide polymorphisms are frequently positioned within enhancer elements specifically active in relevant cell types, and in some cases affect a motif instance for a predicted regulator, thus suggesting a mechanism for the association. Our study presents a general framework for deciphering cis-regulatory connections and their roles in disease.

Processing microarray data with Bioconductor

http://www.math.ku.dk/~richard/courses/bioconductor2009/handout/18_08_Tuesday/tutorial-article.pdf

Statistical analysis of gene expression data with R and Bioconductor ... MAS5.0 is the historical way Affymetrix used to perform processing on its arrays. ...

What's Driving Specific Patterns Of Gene Expression Among Cell Types?

http://www.sciencedaily.com/releases/2009/03/090318140518.htm

"Our studies show that enhancers play much more prominent role than previously appreciated in cell-type-specific gene expression, helping to explain what causes cells to differentiate into liver or brain or skin cells, or why these cells might become cancerous," said principal investigator Bing Ren, PhD, associate professor of Cellular and Molecular Medicine at the University of California, San Diego School of Medicine and head of the Laboratory of Gene Regulation at the Ludwig Institute for Cancer Research (LICR).

The research team has performed a type of genome-wide analysis called ChIP-chip analysis to locate promoters, enhancers, insulators and other regulatory DNA sequences for each gene, using this approach to identify these elements in multiple cell types and investigate their roles in gene expression. ChIP-chip is used to localize protein binding sites that may help identify functional elements of the genome.

"Using this process, we described signatures, or distinguishing patterns, on histone proteins that enabled us to distinguish promoters and enhancers in the genome," said Ren. "In our analyses, we were surprised to find that the chromatin signatures at promoter sites were similar across all cells. However, we found that enhancers are marked with highly cell-type specific modification patterns. These patterns suggested that enhancers are of primary importance in the differentiation of specific cell types."



Nature 459, 108-112 (7 May 2009) | doi:10.1038/nature07829; Received 17 October 2008; Accepted 26 January 2009; Published online 18 March 2009

Histone modifications at human enhancers reflect global cell-type-specific gene expression

http://www.nature.com/nature/journal/v459/n7243/full/nature07829.html
The human body is composed of diverse cell types with distinct functions. Although it is known that lineage specification depends on cell-specific gene expression, which in turn is driven by promoters, enhancers, insulators and other cis-regulatory DNA sequences for each gene [1-3], the relative roles of these regulatory elements in this process are not clear. We have previously developed a chromatin-immunoprecipitation-based microarray method (ChIP-chip) to locate promoters, enhancers and insulators in the human genome [4-6]. Here we use the same approach to identify these elements in multiple cell types and investigate their roles in cell-type-specific gene expression. We observed that the chromatin state at promoters and CTCF-binding at insulators is largely invariant across diverse cell types. In contrast, enhancers are marked with highly cell-type-specific histone modification patterns, strongly correlate to cell-type-specific gene expression programs on a global scale, and are functionally active in a cell-type-specific manner. Our results define over 55,000 potential transcriptional enhancers in the human genome, significantly expanding the current catalogue of human enhancers and highlighting the role of these elements in cell-type-specific gene expression.

Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome
Nature Genetics 39, 311 - 318 (2007). Published online: 4 February 2007 | doi:10.1038/ng1966
Eukaryotic gene transcription is accompanied by acetylation and methylation of nucleosomes near promoters, but the locations and roles of histone modifications elsewhere in the genome remain unclear. We determined the chromatin modification states in high resolution along 30 Mb of the human genome and found that active promoters are marked by trimethylation of Lys4 of histone H3 (H3K4), whereas enhancers are marked by monomethylation, but not trimethylation, of H3K4. We developed computational algorithms using these distinct chromatin signatures to identify new regulatory elements, predicting over 200 promoters and 400 enhancers within the 30-Mb region. This approach accurately predicted the location and function of independently identified regulatory elements with high sensitivity and specificity and uncovered a novel functional enhancer for the carnitine transporter SLC22A5 (OCTN2). Our results give insight into the connections between chromatin modifications and transcriptional regulatory activity and provide a new tool for the functional annotation of the human genome.

Fermat's Last Theorem: x^n + y^n ≠ z^n

x^n + y^n ≠ z^n


3^2 + 4^2 = 5^2, or 9 + 16 = 25


Fermat said a^n + b^n = c^n has no positive whole-number solutions when n is greater than 2


http://www.telegraph.co.uk/technology/google/google-doodle/8706390/Pierre-de-Fermats-birthday-celebrated-in-Google-Doodle.html
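
A small brute-force check makes the statement concrete. This is only a sketch: the function name `power_sum_solutions` and the search limits are made up for illustration, and a finite search proves nothing about the theorem itself. It only shows that squares give hits such as 3, 4, 5 while cubes give none in the range tried.

```python
def power_sum_solutions(n, limit):
    """Return all (a, b, c) with a <= b, 1 <= a, b, c <= limit and a^n + b^n = c^n."""
    # Map every n-th power in range back to its base for O(1) lookup.
    powers = {c ** n: c for c in range(1, limit + 1)}
    found = []
    for a in range(1, limit + 1):
        for b in range(a, limit + 1):
            c = powers.get(a ** n + b ** n)
            if c is not None:
                found.append((a, b, c))
    return found


if __name__ == "__main__":
    print(power_sum_solutions(2, 20))  # includes (3, 4, 5)
    print(power_sum_solutions(3, 50))  # empty list
```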

Defeat - Michel de Montaigne

"There are some defeats more triumphant than victories."

-- Michel de Montaigne

Tuesday, August 16, 2011

Visual Understanding Environment

http://vue.tufts.edu/

The Visual Understanding Environment (VUE) is an Open Source project based at Tufts University. The VUE project is focused on creating flexible tools for managing and integrating digital resources in support of teaching, learning and research. VUE provides a flexible visual environment for structuring, presenting, and sharing digital information.

Like a mindmapper

Retrieve Sequence Given a Region using EnsEMBL

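use Bio::EnsEMBL::DBSQL::DBAdaptor;

# Connection arguments omitted in this note
my $db = Bio::EnsEMBL::DBSQL::DBAdaptor->new();
my $ta = $db->get_TranscriptAdaptor;
my $sa = $db->get_SliceAdaptor;
# Look up the transcript to learn which coordinate system it lives on
my $trans = $ta->fetch_by_stable_id($trans_id);
my $coord_sys_name = $trans->slice->coord_system->name;
# Fetch the requested region and wrap its sequence in a BioPerl object
my $myslice = $sa->fetch_by_region($coord_sys_name, $chr, $start, $end);
my $seq = Bio::LocatableSeq->new(-seq => $myslice->seq, ...);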

http://doc.bioperl.org/releases/bioperl-1.0/Bio/LocatableSeq.html
http://biostar.stackexchange.com/questions/9891/get-ensembl-gene-by-species-strain-for-e-coli
http://www.ensembl.org/info/docs/api/core/core_tutorial.html

http://www.kokocinski.net/bioinformatics/ensembl
use strict;
use warnings;
use Bio::EnsEMBL::DBSQL::DBAdaptor;

# Connect anonymously to the public Ensembl database server
my $db = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
    -host   => 'ensembldb.ensembl.org',
    -dbname => 'homo_sapiens_core_42_36b',
    -user   => 'anonymous',
);

my $chrom  = "X";
my $start  = 100000;
my $end    = 200000;
my $strand = 1;

# Fetch the slice covering chrX:100000-200000 on the forward strand
my $slice_adaptor = $db->get_SliceAdaptor;
my $slice = $slice_adaptor->fetch_by_region("chromosome", $chrom, $start, $end, $strand);
print "\nhave slice of " . $slice->seq_region_name() . " " . $slice->start() . "-" . $slice->end() . "\n";

Phylogeny tree

http://bioweb2.pasteur.fr/phylogeny/intro-en.html

http://phylemon.bioinfo.cipf.es/

Jalview

UGENE

Convert alignment formats (eg. clustalw -> MAF multiple alignment format (supported by UCSC Genome Browser, used by TBA and multiz))
http://biopython.org/wiki/Multiple_Alignment_Format
http://www.ibi.vu.nl/programs/convertalignwww/

Bioinformatics tools

http://guides.uflib.ufl.edu/content.php?pid=13898&sid=93373

http://www.oxfordjournals.org/nar/database/c/

http://bioinformatics.ca/links_directory/

Monday, August 15, 2011

Inferring cancer subnetwork markers using density-constrained biclustering

http://bioinformatics.oxfordjournals.org/content/26/18/i625.abstract


Phuong Dao*, Recep Colak*, Raheleh Salari, Flavia Moser, Elai Davicioni, Alexander Schonhuth**, Martin Ester**
9th European Conference on Computational Biology (ECCB 2010)
Also: Bioinformatics 26(13): 1608-1615 (2010)
* Joint first authorship, **Corresponding author

Abstract

Motivation: Recent genomic studies have confirmed that cancer is of utmost phenotypical complexity, varying greatly in terms of subtypes and evolutionary stages. When classifying cancer tissue samples, subnetwork marker approaches have proven to be superior over single gene marker approaches, most importantly in cross-platform evaluation schemes. However, prior subnetwork-based approaches do not explicitly address the great phenotypical complexity of cancer.

Results: We explicitly address this and employ density-constrained biclustering to compute subnetwork markers, which reflect pathways being dysregulated in many, but not necessarily all samples under consideration. In breast cancer we achieve substantial improvements over all cross-platform applicable approaches when predicting TP53 mutation status in a well-established non-cross-platform setting. In colon cancer, we raise prediction accuracy in the most difficult instances from 87% to 93% for cancer versus non-cancer and from 83% to (astonishing) 92%, for with versus without liver metastasis, in well-established cross-platform evaluation schemes.

Availability: Software is available on request.

Contact: alexsch@math.berkeley.edu; ester@cs.sfu.ca

Supplementary information: Supplementary data are available at Bioinformatics online.

ENCODE downloads

http://genome.ucsc.edu/ENCODE/downloads.html

Sunday, August 14, 2011

10 Tips for how to get the most from your PhD. -- fejes

http://blog.fejes.ca/2011/07/24/10-tips-for-how-to-get-the-most-from-your-phd/

1. Say yes, until you learn to say no.
2. Learn to communicate.
2b. Learn to teach.
3. Take advantage of your university’s resources.
4. Meet your future life partner.
4b. Make some great friends.
5. Decide what you stand for, and stand for it.
6. Keep good notes.
7. Be a sponge.
8. Face the challenges.
9. Don’t give up.
10. Enjoy the ride.

Friday, August 12, 2011

Bioinformatics analysis of microarray data.

http://www.ncbi.nlm.nih.gov/pubmed/19763933

Methods Mol Biol. 2009;573:259-84.
Bioinformatics analysis of microarray data.
Zhang Y, Szustakowski J, Schinke M.
Source

Novartis Institutes for BioMedical Research, Cambridge, MA, USA.

Project Nim, Rise of the Apes - chimps

http://www.ft.com/cms/s/2/89b06dac-c3fb-11e0-b302-00144feabdc0.html#axzz1UkH7Z3c8

Bear Mountain

http://www.trailpeak.com/trail-Bear-Mountain-near-Hope-Airport-BC-462

A 1.5-hour (150 km) drive east of Vancouver,
just past the Village of Harrison Hot Springs.


Round trip: 19 km (11.8 miles)
Elevation gain: 1010 meters (3315 feet)
Average grade: 10.6%
Book says 'Allow 6 hours'


www.vancouvertrails.com/

Team building activities

http://www.teamtechnology.co.uk/team-building-activities-outdoors.html

http://www.eventus.co.uk/outdoor.asp

http://www.singaporeteambuilding.com/teambuilding/

Ultimate Frisbee

Amazing Race

# Rocket Building
You construct a water-powered rocket, and the team that sends their rocket the furthest wins.

# Bridge building
The team have to get from one side of a beach or lawn to the other, using only the pieces of wood provided, and without them or the wood touching the ground. Requires good planning and balance.

100 minus 1 day

"If you live to be 100, I hope I live to be 100 minus 1 day, so I never have to live without you."



Winnie the Pooh

Confident people.

10 things you can do right now to improve your chances for a promotion
Beth Braccio Hering, Special to CareerBuilder

Hurzeler claims that confidence level is the first thing he notices when someone enters his office. He also claims that most confident people are faking it. "They are just as scared of the world as the next person, but they have learned to push that fear into the background so they can reach over their heads and pull themselves up. Confident people are willing to explore, to lead, to speak up, to try new things, to innovate, to fail and to get up off the ground and try again."

"Winners know who the players are in the company and throughout the whole industry. They work to understand it all so that someday they can run it all."

http://www.careerbuilder.ca/Article/CB-805-Workplace-Issues-10-things-you-can-do-right-now-to-improve-your-chances-for-a-promotion/?sc_extcmp=cbca_9805&cblang=CAEnglish&SiteId=cbca_9805

Wednesday, August 10, 2011

Local File URL

file:////home/bob/123.jpg

That's four / after :
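
For comparison, a file URL can also be built in code. The sketch below (the helper name `to_file_url` is hypothetical) produces the standard three-slash form: two slashes after the scheme, an empty host, then the leading slash of the absolute path. It percent-encodes characters such as spaces along the way.

```python
from urllib.parse import quote


def to_file_url(path):
    """Build a file URL from an absolute POSIX path (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path such as /home/bob/123.jpg")
    # quote() leaves "/" untouched and escapes spaces and other unsafe characters.
    return "file://" + quote(path)


if __name__ == "__main__":
    print(to_file_url("/home/bob/123.jpg"))  # file:///home/bob/123.jpg
```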

Google Chart API / Infographics

http://code.google.com/apis/chart/infographics/docs/overview.html

QR code for "Hello world"
https://chart.googleapis.com/chart?chs=150x150&cht=qr&chl=Hello%20world


https://chart.googleapis.com/chart?chst=d_bubble_icon_text_small&chld=ski|bb|Wheeee!|FFFFFF|000000

Google Image Chart
https://chart.googleapis.com/chart?chs=250x100&chd=t:60,40&cht=p3&chl=Hello|World
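
The QR code URL above is just a base address plus three query parameters (`chs` for size, `cht` for chart type, `chl` for the text). A minimal sketch of assembling it in Python follows; the helper name `qr_chart_url` is an assumption for illustration, and the code only builds the string without contacting the service.

```python
from urllib.parse import quote, urlencode

CHART_BASE = "https://chart.googleapis.com/chart"


def qr_chart_url(text, size=150):
    """Assemble a QR-code chart URL like the "Hello world" example (hypothetical helper)."""
    params = {"chs": f"{size}x{size}", "cht": "qr", "chl": text}
    # quote_via=quote encodes spaces as %20 rather than "+", matching the example URL.
    return CHART_BASE + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    print(qr_chart_url("Hello world"))
```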

Cytoscape custom graphic pass-through mapper
http://proteomics-ms.blogspot.com/2011/07/custom-graphics-for-node-in-cytoscape.html

Cytoscape gallery unleashed

http://cytoscape.wodaklab.org/wiki/Cytoscape_User_Manual/Visual_Styles

Online course on Protein-Protein Interactions

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

BI211 Protein-Protein Interactions
September 19-23, 2011
Online at Bioinformatics.Org

http://www.bioinformatics.org/wiki/BI211_Protein-Protein_Interactions

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

OBJECTIVES:
This course will dig into some of the fundamental issues concerning protein-protein interactions (PPIs), including their need and use in research. It will introduce various tools and provide examples for finding true, positive interactors from Web searches and interfaces. The exercises and take-home messages will encourage participants to give the topics some thought and then hopefully build some excitement about the top-down approach of systems biology involving protein-protein interactions.

INSTRUCTORS:
Prashanth Suravajhala is on the Board of Directors of Bioinformatics.Org. His home page is at http://www.bioinformatics.org/wiki/Prash.

Gary D. Bader works on biological network analysis and pathway information resources as an Assistant Professor at The Donnelly Centre at the University of Toronto. His home page is at http://baderlab.org/.

TOPICS:
* Assays, tools and techniques in PPIs: Pros and Cons
* Types of interactions and networks
* Data validation and integration
* Capabilities of networks
* A look at sample PPI data
* Future challenges
* Introduction to Cytoscape, Osprey and Ingenuity
* Networkology: Data representations
* Exercises using Osprey: A short project
* Exercises with iHOP, String, Bind, PreBind, Genecards, MINT, HPRD

REGISTRATION:
http://www.bioinformatics.org/edu/ACAA

FOR MORE INFORMATION:
Please write to edu@bioinformatics.org.

Tuesday, August 9, 2011

Chimpanzees Not as Selfish as We Thought

New research debunks a previous theory about how and why chimps share.

http://news.discovery.com/animals/chimps-share-110808.html

For each experiment, the tokens came in two different colors. Choosing tokens of a certain color would result in a "selfish outcome," which was a food reward for just the participant. Choosing tokens of the other color resulted in food rewards for both the token selector and another nearby chimp in an adjacent compartment.

The food rewards consisted of banana slices wrapped in butcher paper that made a loud sound when unwrapped.

The chimp participants nearly always chose the tokens that would yield food rewards for both the selector and the nearby observing chimp.

compendium

com·pen·di·um (kəm-pĕn′dē-əm)
n. pl. com·pen·di·ums or com·pen·di·a (-dē-ə)
1. A short, complete summary; an abstract.
2. A list or collection of various items.

GHMM Library

http://ghmm.org/

The General Hidden Markov Model library (GHMM) is a freely available C library implementing efficient data structures and algorithms for basic and extended HMMs with discrete and continuous emissions. It comes with Python wrappers which provide a much nicer interface and added functionality. The GHMM is licensed under the LGPL.

Olga Troyanskaya

http://imperio.princeton.edu/cm/node/14
http://www.molbio.princeton.edu/index.php?option=content&task=view&id=243

Olga Troyanskaya
Troyanskaya Lab Webpage
ogt@princeton.edu
Computer Science Bldg-204
Phone: 609-258-1749 Phone: 609-258-7014
Bioinformatics and genomics

The new era of high-throughput experimental methods in molecular biology has created exciting challenges for computer science to develop novel algorithms for complex, accurate, and consistent interpretation of diverse biological information. In the next decades, large-scale explorations of complex molecular, cellular, and organismic systems at complementary levels of resolution will allow us to integrate our understanding of macroscopic physiology and microscopic biology. To realize the full potential of these developments, we need to develop sophisticated bioinformatics frameworks to integrate and synthesize diverse biological data produced by these methods.

The goal of the research in my laboratory is to bring the capabilities of computer science and statistics to the study of gene function and regulation in the biological networks through integrated analysis of biological data from diverse data sources--both existing and yet to come (e.g. from diverse gene expression data sets and proteomic studies). We are designing systematic and accurate computational and statistical algorithms for biological signal detection in high-throughput data sets. More specifically, our lab is interested in developing methods for better gene expression data processing and algorithms for integrated analysis of biological data from multiple genomic data sets and different types of data sources (e.g. genomic sequences, gene expression, and proteomics data).

My laboratory combines computational methods with an experimental component in a unified effort to develop comprehensive descriptions of genetic systems of cellular controls, including those whose malfunctioning becomes the basis of genetic disorders, such as cancer, and others whose failure might produce developmental defects in model systems. The experimental component the lab focuses on is S. cerevisiae (baker's yeast).

Global Prediction of Tissue-Specific Gene Expression and Context-Dependent Gene Networks in Caenorhabditis elegans

http://www.ploscompbiol.org/article/info:doi%2F10.1371%2Fjournal.pcbi.1000417

Abstract

Tissue-specific gene expression plays a fundamental role in metazoan biology and is an important aspect of many complex diseases. Nevertheless, an organism-wide map of tissue-specific expression remains elusive due to difficulty in obtaining these data experimentally. Here, we leveraged existing whole-animal Caenorhabditis elegans microarray data representing diverse conditions and developmental stages to generate accurate predictions of tissue-specific gene expression and experimentally validated these predictions. These patterns of tissue-specific expression are more accurate than existing high-throughput experimental studies for nearly all tissues; they also complement existing experiments by addressing tissue-specific expression present at particular developmental stages and in small tissues. We used these predictions to address several experimentally challenging questions, including the identification of tissue-specific transcriptional motifs and the discovery of potential miRNA regulation specific to particular tissues. We also investigate the role of tissue context in gene function through tissue-specific functional interaction networks. To our knowledge, this is the first study producing high-accuracy predictions of tissue-specific expression and interactions for a metazoan organism based on whole-animal data.

CPAWS | Canadian Parks and Wilderness Society

www.cpaws.org

Founded in 1963, the Canadian Parks and Wilderness Society (CPAWS) has helped protect over 300,000 square kilometers of Canada's threatened wild areas.

Philippine Films

pinoyindiefilms.blogspot.com/

Monday, August 8, 2011

Sample Wiggle UCSC custom track

track type=wiggle_0 name=fooWig description="bla"
variableStep chrom=chr4
265 0.202
270 0.202
275 0.202
280 0.202
285 0.202
290 0.202

http://genome.ucsc.edu/goldenPath/help/wiggle.html
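
A track like the sample above can be generated from (position, value) pairs. The following is a minimal sketch, not a full wiggle writer: the function name `wiggle_track` is hypothetical, and it covers only the variableStep layout with an optional `span` setting.

```python
def wiggle_track(name, description, chrom, points, span=None):
    """Render (position, value) pairs as a variableStep wiggle custom track."""
    declaration = f"variableStep chrom={chrom}"
    if span is not None:
        # span sets how many bases each data value covers
        declaration += f" span={span}"
    lines = [
        f'track type=wiggle_0 name={name} description="{description}"',
        declaration,
    ]
    lines.extend(f"{position} {value}" for position, value in points)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [(pos, 0.202) for pos in range(265, 295, 5)]
    print(wiggle_track("fooWig", "bla", "chr4", sample), end="")
```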

wig tools

http://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/

bedGraphToBigWig v 4 -
Convert a bedGraph file to bigWig.

bedItemOverlapCount

bedToBigBed

http://genome.ucsc.edu/goldenPath/help/bigWig.html

ENCODE cell types

http://genome.ucsc.edu/ENCODE/cellTypes.html

SH-SY5Y
neuroblastoma clonal subline of the neuroepithelioma cell line SK-N-SH that had been established in 1970 from the bone marrow biopsy of a 4-year-old girl with metastatic neuroblastoma

SK-N-MC
neuroepithelioma cell line derived from a metastatic supra-orbital human brain tumor.

SK-N-SH
Human neuroblastoma

BE2_C
neuroblastoma

PFSK-1
neuroectodermal cell line derived from a human cerebral brain tumor. (PMID: 1316433)

Gliobla
glioblastoma, these cells (aka H54 and D54) come from a surgical resection from a patient with glioblastoma multiforme (WHO Grade IV).

HBMEC
Human Brain Microvascular Endothelial Cells

Kendall tau distance rank correlation coefficient

Specifically, it is a measure of rank correlation: that is, the similarity of the orderings of the data when ranked by each of the quantities. It is named after Maurice Kendall, who developed it in 1938,[1] though Gustav Fechner had proposed a similar measure in the context of time series in 1897.[2]

http://en.wikipedia.org/wiki/Kendall_tau_rank_correlation_coefficient
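
To make the idea of comparing orderings concrete, here is a minimal sketch of the coefficient for data without ties (the tau-a form): count the pairs that two score lists order the same way, subtract the pairs they order oppositely, and divide by the total number of pairs. The function name `kendall_tau` is chosen for illustration; in practice a statistics library would handle ties and significance testing.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall tau-a for two equally long score lists without ties."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1   # both quantities order the pair the same way
        elif sign < 0:
            discordant += 1   # the two quantities disagree on the pair
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    print(kendall_tau([1, 2, 3, 4], [1, 3, 2, 4]))
```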

Rank aggregation methods -- Shili Lin
http://onlinelibrary.wiley.com/doi/10.1002/wics.111/pdf


Kendall’s Tau top-k distance for ranking set of genes, classification or clustering of datasets based on patterns of differential expression

theory.stanford.edu/~sergei/papers/www10-metrics.pdf


For top k lists T1 and T2, the minimizing Kendall distance Kmin(T1, T2)
between T1 and T2 is defined to be the minimum value of K(σ1, σ2),
where σ1 and σ2 are each permutations of D_T1 ∪ D_T2
and where σ1 ⪰ T1 and σ2 ⪰ T2
(that is, σ1 and σ2 are extensions of T1 and T2 to the combined domain)
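
The definition can be followed literally for very small lists: enumerate every way of extending each top-k list to the combined set of items, and keep the smallest Kendall distance found. The sketch below does exactly that by brute force, so it is only practical for a handful of items. The function names are hypothetical, and lists are assumed to be ordered best-first with no repeated items.

```python
from itertools import combinations, permutations


def kendall_distance(s1, s2):
    """Number of item pairs that two full rankings of the same items order differently."""
    pos1 = {item: rank for rank, item in enumerate(s1)}
    pos2 = {item: rank for rank, item in enumerate(s2)}
    return sum(
        1
        for a, b in combinations(s1, 2)
        if (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0
    )


def extensions(top_k, domain):
    """All rankings of domain that keep top_k as their prefix, in order."""
    missing = [item for item in domain if item not in top_k]
    for tail in permutations(missing):
        yield list(top_k) + list(tail)


def k_min(t1, t2):
    """Minimizing Kendall distance between two top-k lists (brute force)."""
    domain = list(dict.fromkeys(list(t1) + list(t2)))  # union, order preserved
    return min(
        kendall_distance(s1, s2)
        for s1 in extensions(t1, domain)
        for s2 in extensions(t2, domain)
    )


if __name__ == "__main__":
    print(k_min(["a", "b", "c"], ["b", "a", "d"]))
```

For gene lists of realistic size the number of extensions grows factorially, so this enumeration is meant only to illustrate the definition.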

Music - Elvis

"Music should be something that makes you gotta move, inside or outside." -- Elvis Presley

Advice on things to put on your CV

http://biostar.stackexchange.com/questions/10818/is-bioinformatics-on-decline-only-in-the-web-or-also-within-institutions/10826#10826

Put the most relevant things on top, or in your letter (or both). Be aware that people will probably make up their mind in the first 5 seconds when reading your letter.

Motivations that say you want to improve the world as a whole may sound nice, but they don't help.

Sunday, August 7, 2011

Amorita Resort

I'd recommend Amorita Resort

http://www.pasyalera.com/hotels-and-accommodations/amorita-resort-review/

Love all, trust a few, do wrong to none. -- Shakespeare


Tastes - Oscar Wilde

"I have the simplest tastes. I am always satisfied with the best."
-- Oscar Wilde

Saturday, August 6, 2011

Emily Dickinson

"Not knowing when the dawn will come, I open every door."

Friday, August 5, 2011

chungaivancouver

http://chungaivancouver.com/PhotoAlbums/album_1191443867/

Romance

One Day
http://www.imdb.com/video/imdb/vi3449396249/
Description: After spending the night together on the night of their college graduation, Dexter and Em are revisited each year on the same date to see where they are in their lives. They are sometimes together, sometimes not, on that day.

Crazy Stupid Love
http://www.imdb.com/title/tt1570728/
A father's life unravels while he deals with a marital crisis and tries to manage his relationship with his children.

Risk factors for autism: translating genomic discoveries into diagnostics -- Stephen W. Scherer and Geraldine Dawson

http://www.springerlink.com/content/b286184612181424/

Autism spectrum disorders (ASDs) are a group of conditions characterized by impairments in communication and reciprocal social interaction, and the presence of restricted and repetitive behaviors. The spectrum of autistic features is variable, with severity of symptoms ranging from mild to severe, sometimes with poor clinical outcomes. Twin and family studies indicate a strong genetic basis for ASD susceptibility. Recent progress in defining rare highly penetrant mutations and copy number variations as ASD risk factors has prompted early uptake of these research findings into clinical diagnostics, with microarrays becoming a ‘standard of care’ test for any ASD diagnostic work-up. The ever-changing landscape of the generation of genomic data coupled with the vast heterogeneity in cause and expression of ASDs (further influenced by issues of penetrance, variable expressivity, multigenic inheritance and ascertainment) creates complexity that demands careful consideration of how to apply this knowledge. Here, we discuss the scientific, ethical, policy and communication aspects of translating the new discoveries into clinical and diagnostic tools for promoting the well-being of individuals and families with ASDs.

Thursday, August 4, 2011

Stormo Lab Center for Genome Sciences - Washington University

http://ural.wustl.edu/pubs.html

Detecting and profiling tissue-selective genes

Tissue Distributions DB
http://genome.dkfz-heidelberg.de/menu/tissue_db/
TissueDistributionDBs is a repository of tissue distribution profiles for identifying and ranking genes across the spectrum of tissue specificity, based on Expressed Sequence Tags (ESTs). This repository is currently available for several model organisms across the animal and plant kingdoms and is fundamentally based on the UniGene database.

http://physiolgenomics.physiology.org/content/26/2/158.full

Received 19 December 2005; accepted in final form 4 May 2006.
Physiological Genomics 26:158-162 (2006)
Detecting and profiling tissue-selective genes
Shuang Liang, Yizheng Li, Xiaobing Be, Steve Howes and Wei Liu
Bioinformatics, Wyeth Research, Cambridge, Massachusetts

ABSTRACT

The widespread use of DNA microarray technologies has generated large amounts of data from various tissue and/or cell types. These data set the stage to answer the question of tissue specificity of the human transcriptome in a comprehensive manner. Our focus is to uncover the tissue-gene relationship by identifying genes that are preferentially expressed in a small number of tissue types. The tissue selectivity would shed light on the potential physiological functions of these genes and provide an indispensable reference to compare against disease pathophysiology and to identify or validate tissue-specific drug targets. Here we describe a systematic computational and statistical approach to profile gene expression data to identify tissue-selective genes with the use of a more extensive data set and a well-established multiple-comparison procedure with error rate control. Expression data of 35,152 probe sets in 97 normal human tissue types were analyzed, and 3,919 genes were identified as selective to one or a few tissue types. We present results for these tissue-selective genes and compare them to those identified by other studies.
Keywords: tissue selectivity; differential expression; transcription profiling; Tukey; honest significant difference
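
A rough sketch of the idea in the abstract, for my own reference. This is not the paper's procedure: the paper uses Tukey's honest significant difference with error rate control, whereas the code below swaps in a fixed cutoff on a pooled-variance, t-like statistic so that it runs on the Python standard library alone. The function name, the `threshold` and `max_tissues` parameters, and the toy expression values are all assumptions.

```python
from math import sqrt
from statistics import mean, variance


def selective_tissues(expr, threshold=3.0, max_tissues=1):
    """Return tissues in which one gene looks preferentially expressed.

    expr maps tissue name -> list of replicate expression values (>= 2 each).
    A tissue is flagged when its mean exceeds the mean of all but at most
    (max_tissues - 1) other tissues by more than `threshold` standard errors.
    The fixed threshold is a stand-in for a Tukey HSD critical value.
    """
    means = {t: mean(v) for t, v in expr.items()}
    # Pooled within-tissue variance (the error term of a one-way ANOVA).
    dof = sum(len(v) - 1 for v in expr.values())
    pooled_var = sum((len(v) - 1) * variance(v) for v in expr.values()) / dof

    needed = len(expr) - max_tissues  # how many other tissues must be exceeded
    selected = []
    for t, m in means.items():
        exceeded = 0
        for u, mu in means.items():
            if u == t:
                continue
            se = sqrt(pooled_var * (1 / len(expr[t]) + 1 / len(expr[u])))
            if (m - mu) / se > threshold:
                exceeded += 1
        if exceeded >= needed:
            selected.append(t)
    return selected


# Hypothetical log-scale expression values for a single gene.
toy_gene = {
    "liver": [10.1, 9.9, 10.0],
    "kidney": [2.0, 2.2, 1.8],
    "brain": [2.1, 1.9, 2.0],
}
print(selective_tissues(toy_gene))
```

With a real data set the cutoff would come from the studentized range distribution (for example via a statistics package) instead of a hand-picked constant.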

Alignment of gene expression profiles from test samples against a reference database: New method for context-specific interpretation of microarray data.
http://www.ncbi.nlm.nih.gov/pubmed/21453538
Abstract
Background
Gene expression microarray data have been organized and made available as public databases, but the utilization of such highly heterogeneous reference datasets in the interpretation of data from individual test samples is not as developed as e.g. in the field of nucleotide sequence comparisons. We have created a rapid and powerful approach for the alignment of microarray gene expression profiles (AGEP) from test samples with those contained in a large annotated public reference database and demonstrate here how this can facilitate interpretation of microarray data from individual samples.
Methods
AGEP is based on the calculation of kernel density distributions for the levels of expression of each gene in each reference tissue type and provides a quantitation of the similarity between the test sample and the reference tissue types as well as the identity of the typical and atypical genes in each comparison. As a reference database, we used 1654 samples from 44 normal tissues (extracted from the Genesapiens database).
Results
Using leave-one-out validation, AGEP correctly defined the tissue of origin for 1521 (93.6%) of all the 1654 samples in the original database. Independent validation of 195 external normal tissue samples resulted in 87% accuracy for the exact tissue type and 97% accuracy with related tissue types. AGEP analysis of 10 Duchenne muscular dystrophy (DMD) samples provided quantitative description of the key pathogenetic events, such as the extent of inflammation, in individual samples and pinpointed tissue-specific genes whose expression changed (SAMD4A) in DMD. AGEP analysis of microarray data from adipocytic differentiation of mesenchymal stem cells and from normal myeloid cell types and leukemias provided quantitative characterization of the transcriptomic changes during normal and abnormal cell differentiation.
Conclusions
The AGEP method is a widely applicable method for the rapid comprehensive interpretation of microarray data, as proven here by the definition of tissue- and disease-specific changes in gene expression as well as during cellular differentiation. The capability to quantitatively compare data from individual samples against a large-scale annotated reference database represents a widely applicable paradigm for the analysis of all types of high-throughput data. AGEP enables systematic and quantitative comparison of gene expression data from test samples against a comprehensive collection of different cell/tissue types previously studied by the entire research community.
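
To make the Methods and Results paragraphs concrete, here is a minimal sketch of the general idea: estimate a kernel density for each gene in each reference tissue, score a test sample against every tissue, and check the assignments with leave-one-out validation. The abstract does not spell out the actual AGEP scoring, so the summed log-density score, the fixed Gaussian bandwidth, the function names and the toy data are all assumptions of mine.

```python
from math import exp, log, pi, sqrt


def kde(values, x, bandwidth=0.5):
    """Gaussian kernel density estimate at x from a list of reference values."""
    norm = len(values) * bandwidth * sqrt(2 * pi)
    return sum(exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm


def match_tissue(reference, sample, bandwidth=0.5):
    """Score a sample against each reference tissue.

    reference: tissue -> gene -> list of expression levels
    sample:    gene -> expression level
    Returns (best tissue, dict of summed log densities per tissue).
    """
    scores = {}
    for tissue, genes in reference.items():
        # The tiny constant avoids log(0) when the density underflows.
        scores[tissue] = sum(
            log(kde(genes[g], x, bandwidth) + 1e-300) for g, x in sample.items()
        )
    return max(scores, key=scores.get), scores


def leave_one_out_accuracy(samples, bandwidth=0.5):
    """samples: list of (tissue, {gene: level}); returns fraction assigned correctly."""
    correct = 0
    for i, (true_tissue, profile) in enumerate(samples):
        # Rebuild the reference from every sample except the held-out one.
        reference = {}
        for j, (tissue, other) in enumerate(samples):
            if j == i:
                continue
            for g, x in other.items():
                reference.setdefault(tissue, {}).setdefault(g, []).append(x)
        predicted, _ = match_tissue(reference, profile, bandwidth)
        correct += predicted == true_tissue
    return correct / len(samples)


# Hypothetical two-gene profiles for two tissues.
samples = [
    ("muscle", {"GENE_A": 8.0, "GENE_B": 1.0}),
    ("muscle", {"GENE_A": 8.5, "GENE_B": 1.2}),
    ("muscle", {"GENE_A": 9.0, "GENE_B": 0.9}),
    ("liver", {"GENE_A": 2.0, "GENE_B": 7.0}),
    ("liver", {"GENE_A": 2.5, "GENE_B": 7.5}),
    ("liver", {"GENE_A": 1.5, "GENE_B": 8.0}),
]
print(leave_one_out_accuracy(samples))
```

The per-gene log densities also show which genes are typical or atypical for the matched tissue, which is roughly the kind of output the abstract describes.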

A Comparative Study of Mouse Hepatic and Intestinal Gene Expression Profiles under PPARα Knockout by Gene Set Enrichment Analysis.
http://www.ncbi.nlm.nih.gov/pubmed/21811494

Gene expression profiling of PPARα has been used in several studies, but few
studies have gone further to identify, genome-wide, the tissue-specific pathways
or genes involved in PPARα activation. Here, we applied gene set enrichment
analysis to two PPARα-related microarray datasets, one from mouse liver and one
from mouse intestine. We suggest that the regulatory mechanism of PPARα
activation by WY14643 in mouse small intestine is more complicated than in liver
because more pathways are involved. Several pathways were cancer related, such as
pancreatic cancer and small cell lung cancer, which indicates that PPARα may have
an important role in preventing cancer development. Twelve PPARα-dependent
pathways and four PPARα-independent pathways were identified as common to both
liver and intestine of mice. Most of them were metabolism related: fatty acid
metabolism, tryptophan metabolism and pyruvate metabolism were PPARα dependent,
whereas gluconeogenesis and propanoate metabolism were independent of PPARα
regulation. Keratan sulfate biosynthesis, the pathway of regulation of actin
cytoskeleton, and the pathways associated with prostate cancer and small cell
lung cancer were not identified as PPARα independent in the hepatic study but
were identified as WY14643 dependent in the intestinal study. We also provide
some novel hepatic tissue-specific marker genes.
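
For reference, a minimal sketch of the running-sum enrichment score at the heart of gene set enrichment analysis, in its unweighted (Kolmogorov-Smirnov-like) form. It assumes the genes are already ranked by differential expression; the gene names are hypothetical placeholders, and the permutation step that GSEA uses to assess significance is left out.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted GSEA-style enrichment score.

    Walk down the ranked list, stepping up by 1/n_hits for each gene in the
    set and down by 1/n_miss for each gene outside it. The score is the
    running sum's largest deviation from zero (sign preserved).
    """
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Hypothetical ranking, most up-regulated gene first.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(enrichment_score(ranked, {"g1", "g2"}))  # set concentrated at the top
print(enrichment_score(ranked, {"g5", "g6"}))  # set concentrated at the bottom
```

A positive score means the pathway's genes cluster near the top of the ranking, a negative score means they cluster near the bottom.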

Jian Pei - SFU Bioinformatics, data mining, gene expression

http://www.cs.sfu.ca/~jpei/publications.htm

Nature Articles

Interdisciplinary studies: Seeking the right toolkit
* Bryn Nelson
http://www.nature.com/naturejobs/2011/110804/full/nj7358-115a.html?WT.ec_id=NATUREjobs-20110804

Barry Bozeman, a policy analyst at the University of Georgia in Athens who studies scientists' career trajectories, says that for now, an interdisciplinary background is “very rarely an advantage” when looking for a faculty position. Biotechnology and pharmaceutical firms might be more accommodating, as long as the applicant's unconventional research fits within the company's overall scientific aims. But formal interdisciplinary training may be less important than informal learning experiences in labs, institutes and universities that encourage the intermingling of a broad range of ideas.

“If you have ideas that the department likes and people think that what you're proposing to do is vigorous and interesting, then you will get a job,” says Anikeeva, who did her postdoctoral research at the Clark Center. “I don't think it really depends on if you have interdisciplinary training or not.”

It is much harder to get interdisciplinary faculty positions.

“gives them full citizenry in terms of access to financial and physical resources”, says the programme's website. And when they complete their graduate studies, students are “strongly advised to be strategic about their post-doctoral placement, since most must find a job in an existing more traditional field”.

“People who establish interdisciplinary degrees are also more likely to hire people with interdisciplinary degrees,” says Bozeman.

Scientists for sale:
http://www.nature.com/naturejobs/2011/110804/full/nj7358-117a.html?WT.ec_id=NATUREjobs-20110804

First, be realistic and make sure that your product fits the needs of your target audience.

Second, a sales meeting is a conversation. All the tips I found stressed that the salesperson must listen to potential buyers to understand their needs.

Finally, explain clearly what will happen after the sale. Buyers need to know how they will put you, the product, to use. Think of yourself as a new printer. Are you 'upgradable'? Be honest about what you need to get started. It's best to tell your department about the particle accelerator you'll need in your basement before the fleet of moving trucks arrives.

Of course, should everything else fail, you can always break out the car-salesman routine. Look the search-committee members squarely in the eye, give them your widest grin and ask, “Say, what will it take for me to get this job today?”


Graduate students: Aspirations and anxieties
* Gene Russo
http://www.nature.com/naturejobs/2011/110728/full/nj7357-533a.html?WT.ec_id=NATUREjobs-20110804

Across all disciplines, PhD students became less pleased with their experience as their degrees progressed. Of first-year students who responded to the survey, 76% were “satisfied” or “very satisfied”; that decreased to 66.8% for second-years and 61.3% for third-years, although the numbers varied with region (see 'Continental divide').

Hugh Kearns, a psychologist at Flinders University in Adelaide, Australia, who studies the graduate-student experience, says that the change could also be due to research results not turning out as expected. He notes that new students sometimes have unrealistically optimistic ideas about the feasibility of their research aims.

Also, getting a PhD typically takes three to four years in parts of Europe, whereas it can take five or more in the United States, which can cause dismay.

Adviser recognition is an “essential element” of quality supervision, says Marja Makarow.

They have found that a lack of direction and clear advice from an adviser leads to significant declines in student satisfaction.

Thomas Skalak, vice-president for research at the University of Virginia in Charlottesville, emphasizes the need to impress upon students that they are, in the end, responsible for their own education. He likes to suggest that they act as 'intellectual entrepreneurs' by fastidiously minding their own education, graduate project, research focus and career prospects.

The survey implies that the longer students spend in graduate education, the less attractive an academic career becomes.

Intense competition for original results, publications and jobs seems to be a major factor in this change.

Among the 469 respondents, 42% of first-years wanted to be a “principal investigator at a research-intensive institution”; that dropped to 25% for third-year students. Of those who gave reasons, many cited the long work hours required, the challenge of getting funding, a distaste for daily tasks such as grant writing and the slow pace of research, and the intense competition for tenure. Some also had what Fuhrmann terms “positive” reasons for their change of preference — such as learning about an exciting new job opportunity.