Wednesday, October 20, 2010

Splicing, chemical compound classification, structural variations

Deciphering the Splicing Code
Barash et al. Nature 465, 53-59 (6 May 2010)
http://www.nature.com/nature/journal/v465/n7294/full/nature09000.html
- A unique aspect of our approach is that it searches for a regulatory
code that maximizes a quantifiable measure of code quality, so as to
jointly account for many features and produce a predictive splicing
code.
- To achieve this, we introduce an information
theoretic measure of ‘code quality’
- Our method seeks a code that is able to predict the splicing patterns of
all exons as accurately as possible, based solely on the tissue type and
proximal RNA features.
- We use a measure of ‘code quality’ that is based on information
theory (ref. 31) (see Methods). It can be viewed as the amount of information
about genome-wide tissue-dependent splicing accounted for by
the code. A code quality of zero indicates that the predictions are no
better than guessing, whereas a higher code quality indicates
improved prediction capability.
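
A rough sketch of how such a score can behave, not the paper's implementation. It assumes code quality is the average reduction in Kullback-Leibler divergence (in bits) that the code's predictions achieve over a naive baseline guess. The function name `code_quality`, the three-way splicing pattern, and the toy numbers are all made up for illustration.

```python
import math


def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bits; q must be > 0 where p > 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def code_quality(observed, predicted, baseline):
    """Average information gain of `predicted` over `baseline` (hypothetical helper).

    Each argument is a list of probability distributions, one per
    (exon, tissue) pair, over splicing patterns such as
    (more inclusion, more exclusion, no change).
    """
    total = 0.0
    for obs, pred, base in zip(observed, predicted, baseline):
        # Positive when the prediction is closer to the observation than the guess.
        total += kl_bits(obs, base) - kl_bits(obs, pred)
    return total / len(observed)


# Toy data (invented): one exon in one tissue.
observed = [[0.8, 0.1, 0.1]]
guess = [[1 / 3, 1 / 3, 1 / 3]]

no_better_than_guessing = code_quality(observed, guess, guess)
perfect_prediction = code_quality(observed, observed, guess)
```

With predictions equal to the baseline the score is zero, and it grows as predictions move toward the observed patterns, which matches the behaviour described in the quote.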

SVDetect: a tool to identify genomic structural variations from paired-end and mate-pair sequencing data.
Zeitouni B, Boeva V, Janoueix-Lerosey I, Loeillet S, Legoix-né P, Nicolas A, Delattre O, Barillot E.
http://bioinformatics.oxfordjournals.org/content/26/15/1895.long

Semantic Similarity for Automatic Classification of Chemical Compounds
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000937
- Their semantic
similarity, as measured with a simGIC method in the whole
ontology, is 0.324, and their structural similarity, as measured with
the FP3 format, is 0.667.
- The best approaches existing today are based on the structure-
activity relationship premise (SAR), which states that biological
activity of a molecule is strongly related to its structural or
physicochemical properties.
- We dubbed the novel
approach Chym, for Chemical Hybrid Metric. We extract
semantic information from ChEBI, the Chemical Entities
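
The two similarity scores quoted above can be sketched as set operations. This is a minimal illustration, assuming simGIC is the information-content-weighted Jaccard index over the ontology ancestors of two compounds, and that the fingerprint score is a Tanimoto coefficient over set bits. The term names, information-content values, and bit positions below are invented, not taken from ChEBI or from the paper.

```python
def sim_gic(ancestors_a, ancestors_b, ic):
    """Weighted Jaccard: IC of shared ancestor terms over IC of all ancestor terms."""
    shared = ancestors_a & ancestors_b
    union = ancestors_a | ancestors_b
    denom = sum(ic[t] for t in union)
    return sum(ic[t] for t in shared) / denom if denom else 0.0


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = bits_a | bits_b
    return len(bits_a & bits_b) / len(union) if union else 0.0


# Hypothetical information content per ontology term (rarer terms score higher).
ic = {"root": 0.0, "term_a": 1.0, "term_b": 2.0, "term_c": 3.0}

semantic = sim_gic({"root", "term_a", "term_b"}, {"root", "term_a", "term_c"}, ic)
structural = tanimoto({1, 2, 3}, {2, 3})
```

Two compounds can share most fingerprint bits while sharing few informative ontology ancestors, which is how a pair can score higher structurally than semantically.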


Detection of splice junctions from paired-end RNA-seq data by SpliceMap
http://nar.oxfordjournals.org/content/38/14/4570.abstract
TopHat
http://tophat.cbcb.umd.edu/
~151 317 exon junctions in the human brain tissue, including 23 020 novel junctions that were not reported in RefSeq (19), Ensembl (20) and KnownGene (21)
- expression level (in RPKM)
- Novel junction discovery is the major function of SpliceMap, which therefore cannot be replaced by annotation-dependent ERANGE
- SpliceMap generates the seeding by using short-read alignment tools such
as ELAND and SeqMap, while BLAT makes use of a hash table.
- For the Illumina protocol used to produce our data, the distance between two
paired-end reads is about 200 nt in the mRNA
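
For reference, the expression unit in the notes above is reads per kilobase of exon model per million mapped reads (RPKM). A minimal sketch of the standard formula follows; the function name and the example counts are invented for illustration.

```python
def rpkm(read_count, feature_length_nt, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    # 1e9 = 1e3 (nt per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (feature_length_nt * total_mapped_reads)


# Hypothetical example: 500 reads on a 2 000 nt transcript, 10 million mapped reads.
example = rpkm(500, 2000, 10_000_000)
```

Normalising by both length and sequencing depth makes values comparable across genes of different sizes and across runs of different depth.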
