Free-shift (or semi-global) alignments do not penalize gaps at the beginning and end of the sequences, while global alignments try to align every position.
Use free-shift alignments when some of the sequences are terminally truncated. Local alignments (such as those produced by BLAST) are useful for
finding short stretches of homology, for example to find sequence overlap or to detect short internal stretches that
are shared.
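A minimal sketch of how the three modes differ in a dynamic-programming aligner. The function name `align_score` and the scoring values (match +1, mismatch -1, linear gap -1) are assumptions chosen for illustration, not taken from any particular tool.

```python
def align_score(a, b, mode="global", match=1, mismatch=-1, gap=-1):
    """Score the best alignment of a and b under one of three modes.

    global:      every position counts (Needleman-Wunsch style).
    semiglobal:  leading and trailing gaps are free (free shift).
    local:       best-scoring pair of substrings (Smith-Waterman style).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Only the global mode charges for gaps along the first row and column.
    if mode == "global":
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap

    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if mode == "local":
                cell = max(cell, 0)  # a local alignment may restart anywhere
            score[i][j] = cell
            best = max(best, cell)

    if mode == "global":
        return score[-1][-1]
    if mode == "semiglobal":
        # Free end gaps: take the best cell in the last row or last column.
        return max(max(score[-1]), max(row[-1] for row in score))
    return best


full, fragment = "AAACGTTT", "ACGT"
for mode in ("global", "semiglobal", "local"):
    print(mode, align_score(full, fragment, mode))
```

With these assumed scores, the truncated fragment scores 0 globally (four matches cancelled by four end gaps) but 4 in the semi-global and local modes, which is why free-shift alignment suits terminally truncated sequences.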
http://www.uoguelph.ca/plant/depttools/dnaanalysis.htm
www.cs.ecu.edu/hochberg/spring2006/LocalAlign.pdf
birg.cs.wright.edu/text/Ch2.ppt
Just a collection of some random cool stuff. PS. Almost 99% of the contents here are not mine and I don't take credit for them, I reference and copy part of the interesting sections.
Sunday, October 31, 2010
Friday, October 29, 2010
Rspec
$ jruby -S gem install rspec -v=1.1.12
$ spec -v
rspec 1.1.12
$ RAILS_ENV=test jruby -S rake db:test:prepare
$ RAILS_ENV=test jruby -S rake db:migrate
$ RAILS_ENV=development jruby -S rake db:migrate:redo VERSION=20101027194336
$ RAILS_ENV=development jruby -S rake db:rollback STEP=5
http://cheat.errtheblog.com/s/rspec/
http://guides.rubyonrails.org/migrations.html
Thursday, October 28, 2010
ncRNA non-coding RNA review papers
1: Galasso M, Elena Sana M, Volinia S. Non-coding RNAs: a key to future
personalized molecular therapy? Genome Med. 2010 Feb 18;2(2):12. PubMed PMID: 20236487; PubMed Central PMCID: PMC2847703.
http://www.ncbi.nlm.nih.gov/pubmed/20236487
1: Harrison BR, Yazgan O, Krebs JE. Life without RNAi: noncoding RNAs and their functions in Saccharomyces cerevisiae. Biochem Cell Biol. 2009 Oct;87(5):767-79. Review. PubMed PMID: 19898526.
http://www.ncbi.nlm.nih.gov/pubmed/19898526
1: Fabbri M, Calin GA. Beyond genomics: interpreting the 93% of the human genome that does not encode proteins. Curr Opin Drug Discov Devel. 2010 May;13(3):350-8. Review. PubMed PMID: 20443168.
http://www.ncbi.nlm.nih.gov/pubmed/20443168
1: Majer A, Booth SA. Computational methodologies for studying non-coding RNAs relevant to central nervous system function and dysfunction. Brain Res. 2010 Jun 18;1338:131-45. Epub 2010 Apr 8. Review. PubMed PMID: 20381467.
http://www.ncbi.nlm.nih.gov/pubmed/20381467
1: Zheng L, Qu L. Computational RNomics: structure identification and functional
prediction of non-coding RNAs in silico. Sci China Life Sci. 2010
May;53(5):548-62. Epub 2010 May 23. PubMed PMID: 20596938.
http://www.ncbi.nlm.nih.gov/pubmed/20596938
ncRNA, probe, GWAS
A method for automatically extracting infectious disease-related primers and probes from the literature
http://www.biomedcentral.com/1471-2105/11/410
Classification of ncRNAs using position and size information in deep
sequencing data
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2935403/?tool=pubmed
Forward-time simulation of realistic samples for genome-wide association studies
http://www.biomedcentral.com/1471-2105/11/442
Genetic drift or allelic drift is the change in the frequency of a gene variant (allele) in a population due to random sampling. The alleles in the offspring are a sample of those in the parents, and chance has a role in determining whether a given individual survives and reproduces. Contrast with natural selection, where allele frequencies change because of differences in fitness.
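The random-sampling idea can be sketched with a Wright-Fisher style simulation. This is an illustration under assumptions (a diploid population of constant size, no selection, no mutation); the function name `wright_fisher` and its parameters are hypothetical.

```python
import random


def wright_fisher(pop_size, start_freq, generations, seed=None):
    """Track one allele's frequency under pure drift (no selection, no mutation)."""
    rng = random.Random(seed)
    copies = 2 * pop_size          # diploid population: 2N gene copies
    freq = start_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each gene copy in the offspring is drawn at random from the parents.
        count = sum(1 for _ in range(copies) if rng.random() < freq)
        freq = count / copies
        trajectory.append(freq)
    return trajectory


# Small populations drift faster; compare a few runs with different seeds.
for seed in range(3):
    print(wright_fisher(pop_size=20, start_freq=0.5, generations=50, seed=seed)[-1])
```

Once the frequency reaches 0 or 1 it stays there, since there are no copies of the other allele left to sample.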
In population genetics, linkage disequilibrium is the non-random association of alleles at two or more loci, not necessarily on the same chromosome. It is not the same as linkage, which describes the association of two or more loci on a chromosome with limited recombination between them. Linkage disequilibrium describes a situation in which some combinations of alleles or genetic markers occur more or less frequently in a population than would be expected from a random formation of haplotypes from alleles based on their frequencies. Non-random associations between polymorphisms at different loci are measured by the degree of linkage disequilibrium (LD). Numerically, it is the difference between the observed frequency of a haplotype and the frequency expected if its alleles were combined at random (the product of the allele frequencies).
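As a rough sketch of that calculation for two biallelic loci: D is the observed AB haplotype frequency minus the product of the A and B allele frequencies, and r² normalizes D². The two-character haplotype encoding and the function name are assumptions made for this example.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    haplotypes: list of two-character strings such as "AB", "Ab", "aB", "ab",
    where the first character is the allele at locus 1 and the second at locus 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "A" for h in haplotypes) / n    # frequency of allele A
    p_b = sum(h[1] == "B" for h in haplotypes) / n    # frequency of allele B
    p_ab = sum(h == "AB" for h in haplotypes) / n     # observed AB haplotype frequency
    d = p_ab - p_a * p_b                              # observed minus expected
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Complete association: A always travels with B.
print(linkage_disequilibrium(["AB"] * 5 + ["ab"] * 5))
# Random association: all four haplotypes equally common.
print(linkage_disequilibrium(["AB", "Ab", "aB", "ab"]))
```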
Detection and characterization of novel sequence insertions using paired-end next-generation sequencing.
http://bioinformatics.oxfordjournals.org/content/26/10/1277.full
Wednesday, October 27, 2010
imagemagick convert - split single multi-pdf to many pdfs
http://ardvaark.net/useful-pdf-imagemagick-recipes
Split single multi-pdf to many pdfs
$ convert -quality 100 -density 300x300 in.pdf multi%d.pdf
# combine, may increase in size by a lot
$ convert -density 150 pdf1.pdf pdf2.pdf out.pdf
or better
$ pdftk pdf1.pdf pdf2.pdf cat output temp.pdf
Tuesday, October 26, 2010
R aggregation
> x
j word journ
1 1 p b
2 2 g b
3 3 p d
4 4 p b
5 5 p d
> with(x, tapply(word, journ, length))
b d
3 2
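For comparison, a rough Python equivalent of the tapply count, using only the standard library. The `rows` list simply retypes the five rows of the data frame `x`; the variable names are assumptions for this sketch.

```python
from collections import Counter

# The same five rows as the data frame x: (j, word, journ).
rows = [
    (1, "p", "b"),
    (2, "g", "b"),
    (3, "p", "d"),
    (4, "p", "b"),
    (5, "p", "d"),
]

# Count rows per journ value, like tapply(word, journ, length).
counts = Counter(journ for _j, _word, journ in rows)
print(dict(counts))
```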
Monday, October 25, 2010
Critical thinking
http://en.wikipedia.org/wiki/Critical_thinking
Critical thinking clarifies goals, examines assumptions, discerns hidden values, evaluates evidence, accomplishes actions, and assesses conclusions.
"Critical" as used in the expression "critical thinking" connotes the importance or centrality of the thinking to an issue, question or problem of concern. "Critical" in this context does not mean "disapproval" or "negative." There are many positive and useful uses of critical thinking, for example formulating a workable solution to a complex personal problem, deliberating as a group about what course of action to take, or analyzing the assumptions and the quality of the methods used in scientifically arriving at a reasonable level of confidence about a given hypothesis. Using strong critical thinking we might evaluate an argument, for example, as worthy of acceptance because it is valid and based on true premises. Upon reflection, a speaker may be evaluated as a credible source of knowledge on a given topic.
Critical thinking can occur whenever one judges, decides, or solves a problem; in general, whenever one must figure out what to believe or what to do, and do so in a reasonable and reflective way. Reading, writing, speaking, and listening can all be done critically or uncritically. Critical thinking is crucial to becoming a close reader and a substantive writer. Expressed most generally, critical thinking is "a way of taking up the problems of life."
Friday, October 22, 2010
Transcription factor, PROFESS, SPA
High Resolution Models of Transcription Factor-DNA Affinities Improve In Vitro and In Vivo Binding Predictions:http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1000916
Semi-supervised recursively partitioned mixture models for identifying cancer subtypes
http://bioinformatics.oxfordjournals.org/content/early/2010/08/15/bioinformatics.btq470.full.pdf+html
PROFESS: a PROtein Function, Evolution, Structure and Sequence database: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2911846/
The advantage of the ‘answering queries using views’ approach to the database integration problem is that it reduces the integration problem to two steps: (i) building wrappers of the source databases, thereby providing simple ‘views’, and (ii) applying standard database queries on the views. Thus, implementing wrappers enables a robust query system that incorporates a variety of similarity functions capable of generating data relationships not conceived during the creation of the database. This will allow the user to move beyond simple text-based queries. Therefore, the PROFESS (PROtein Function, Evolution, Structure and Sequence) database uses wrappers to assist in the structural, functional and evolutionary analysis of the abundant number of novel proteins continually identified from whole-genome sequencing.
eggNOG http://eggnog.embl.de/
evolutionary genealogy of genes: Non-supervised Orthologous Groups
multiple structural alignment program, MAMMOTH-mult
http://ub.cbm.uam.es/mammoth/mult/
PROFESS
http://bionmr-c1.unl.edu/
Edit distance, e.g. kitten -> sitting needs 3 single-character edits (two substitutions and one insertion); useful for autocomplete
http://en.wikipedia.org/wiki/Levenshtein_distance
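A short sketch of the standard dynamic-programming computation of Levenshtein distance, keeping only two rows of the table. The function name is an assumption; the kitten/sitting pair is the example from the note.

```python
def levenshtein(s, t):
    """Minimum number of single-character insertions, deletions, or substitutions."""
    previous = list(range(len(t) + 1))   # distance from "" to each prefix of t
    for i, cs in enumerate(s, start=1):
        current = [i]                    # distance from s[:i] to ""
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            current.append(min(previous[j] + 1,          # delete cs
                               current[j - 1] + 1,       # insert ct
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # prints 3
```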
SPA: Short peptide analyzer of intrinsic disorder status of short peptides
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2900848/?tool=pubmed
Biological Insights of Transcription Factor through Analyzing ChIP-Seq Data
Kaida Ning, 2009
LaTeX, Sweave, R
\usepackage{hyperref}
\usepackage{tabularx}
\usepackage{listings}
\usepackage{graphicx}
\usepackage{url}
\usepackage{cite}
$ R CMD Sweave foo.Rnw ; texi2pdf foo.tex
http://www.stat.umn.edu/~charlie/Sweave/
http://www.stat.umn.edu/~charlie/Sweave/foo.pdf
http://www.stat.umn.edu/~charlie/Sweave/foo.Rnw
\pagebreak[3]
\verb@Sweave@
\begin{verbatim}
latex foo
\end{verbatim}
Figure~\ref{fig:one} (p.~\pageref{fig:one})
<
\begin{figure}
\begin{center}
<
<<>>=
n <- 50
x <- seq(1, n)
a.true <- 3
b.true <- 1.5
y.true <- a.true + b.true * x
s.true <- 17.3
y <- y.true + s.true * rnorm(n)
out1 <- lm(y ~ x)
summary(out1)
@
The commands in package lattice have different behavior than the standard plot commands in
the base package: lattice commands return an object of class "trellis", the actual plotting is
performed by the print method for the class. Encapsulating calls to lattice functions in print()
statements should do the trick, e.g.:
<<>>=
library(lattice)
print(bwplot(1:10))
@
BibTeX and bibliography styles
http://amath.colorado.edu/documentation/LaTeX/reference/faq/bibstyles.html
The two LaTeX editors that I found most useful are:
1. Texmaker - lightweight, spell-check, have to press F1 and F3 to generate a PDF, need to click on the log window a lot
2. Kile - nicer, but I couldn't get syntax coloring to work for Sweave, spell-check, one-click gets you a PDF
Trick
- if you rename your '.Rnw' to '.tex' as a work around, it plays nicely with the editors
- then once all formatting is done, copy '.tex' to '.Rnw' and run the command
- or just create a tex symlink to Rnw! ln -s mydoc.Rnw mydoc.tex
R CMD Sweave mydoc.Rnw && texi2pdf mydoc.tex && evince mydoc.pdf
In R, call
> Stangle(file='foo.Rnw')
to extract the R code, WARNING: this will overwrite 'foo.R'!!!!!
quantitative trait loci (QTL)
A common approach to understanding the genetic basis of complex traits is through identification of associated quantitative trait loci (QTL). Fine mapping QTLs requires several generations of backcrosses and analysis of large populations, which is a time-consuming and costly effort.
http://www.biomedcentral.com/1471-2105/11/525/abstract
http://www.biomedcentral.com/1471-2105/11/526/abstract
Wednesday, October 20, 2010
Splicing, chemical compound classification, structural variations
Deciphering the Splicing Code
Barash et al. Nature 465, 53-59 (6 May 2010)
http://www.nature.com/nature/journal/v465/n7294/full/nature09000.html
- A unique aspect of our approach is that it searches for a regulatory
code that maximizes a quantifiable measure of code quality, so as to
jointly account for many features and produce a predictive splicing
code.
- To achieve this, we introduce an information
theoretic measure of ‘code quality’
- Our method seeks a code that is able to predict the splicing patterns of
all exons as accurately as possible, based solely on the tissue type and
proximal RNA features.
- We use a measure of ‘code quality’ that is based on information
theory [31] (see Methods). It can be viewed as the amount of information
about genome-wide tissue-dependent splicing accounted for by
the code. A code quality of zero indicates that the predictions are no
better than guessing, whereas a higher code quality indicates
improved prediction capability.
SVDetect: a tool to identify genomic structural variations from paired-end and mate-pair sequencing data.
Zeitouni B, Boeva V, Janoueix-Lerosey I, Loeillet S, Legoix-né P, Nicolas A, Delattre O, Barillot E.
http://bioinformatics.oxfordjournals.org/content/26/15/1895.long
Semantic Similarity for Automatic Classification of Chemical Compounds
http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000937
- Their semantic
similarity, as measured with a simGIC method in the whole
ontology, is 0.324, and their structural similarity, as measured with
the FP3 format, is 0.667.
- The best approaches existing today are based on the structure-
activity relationship premise (SAR), which states that biological
activity of a molecule is strongly related to its structural or
physicochemical properties.
- We dubbed the novel
approach Chym, for Chemical Hybrid Metric. We extract
semantic information from ChEBI, the Chemical Entities
Detection of splice junctions from paired-end RNA-seq data by SpliceMap
http://nar.oxfordjournals.org/content/38/14/4570.abstract
TopHat
http://tophat.cbcb.umd.edu/
~ 151 317 exon junctions in human brain tissue, including 23 020 novel junctions that were not reported in RefSeq (19), Ensembl (20) and KnownGene (21)
- expression level (in RPKM)
- Novel junction discovery is the major function of SpliceMap, which therefore cannot be replaced by annotation-independent ERANGE
- SpliceMap generates the seeding by using short-read alignment tools such
as ELAND and SeqMap, while BLAT makes use of a hash table.
- For the Illumina protocol used to produce our data, the distance between two
paired-end reads is about 200 nt in the mRNA
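The expression unit mentioned in the list, RPKM (reads per kilobase of transcript per million mapped reads), can be computed as in the sketch below. The function name and the example numbers are made up for illustration and are not from the SpliceMap paper.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_reads / 1_000_000
    return read_count / (kilobases * millions)


# Hypothetical example: 500 reads on a 2 kb transcript, 10 million mapped reads.
print(rpkm(500, 2_000, 10_000_000))  # prints 25.0
```

Dividing by both length and sequencing depth is what makes values comparable across transcripts and across runs.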
Sensitivity (identify true +) and Specificity (identify true -)
http://en.wikipedia.org/wiki/Sensitivity_and_specificity
Sensitivity and specificity are statistical measures of the performance of a binary classification test. Sensitivity (also called recall rate in some fields) measures the proportion of actual positives which are correctly identified as such (e.g. the percentage of sick people who are correctly identified as having the condition). Specificity measures the proportion of negatives which are correctly identified (e.g. the percentage of healthy people who are correctly identified as not having the condition). These two measures are closely related to the concepts of type I and type II errors. A theoretical, optimal prediction can achieve 100% sensitivity (i.e. predict all people from the sick group as sick) and 100% specificity (i.e. not predict anyone from the healthy group as sick).
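A small sketch of both measures computed from paired true and predicted labels. The function name and the toy data are assumptions; the code also assumes at least one actual positive and one actual negative, otherwise the divisions fail.

```python
def sensitivity_specificity(actual, predicted):
    """actual and predicted are equal-length sequences of booleans (True = positive)."""
    tp = sum(a and p for a, p in zip(actual, predicted))          # sick, flagged sick
    fn = sum(a and not p for a, p in zip(actual, predicted))      # sick, missed
    tn = sum(not a and not p for a, p in zip(actual, predicted))  # healthy, cleared
    fp = sum(not a and p for a, p in zip(actual, predicted))      # healthy, flagged sick
    sensitivity = tp / (tp + fn)   # share of actual positives caught
    specificity = tn / (tn + fp)   # share of actual negatives cleared
    return sensitivity, specificity


# Toy data: 4 sick people (one missed) and 6 healthy people (one false alarm).
actual = [True] * 4 + [False] * 6
predicted = [True, True, True, False] + [False] * 5 + [True]
print(sensitivity_specificity(actual, predicted))
```

A missed sick person (false negative) lowers sensitivity; a healthy person flagged as sick (false positive) lowers specificity.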
Paired-end (PE, 500bp) vs. Mate Pairs (longer PE, for structural variations, 2-10kbp)
http://seqanswers.com/forums/showthread.php?t=503&page=2
http://investor.illumina.com/phoenix.zhtml?c=121127&p=irol-newsArticle_print&ID=1248574&highlight=
Illumina refers to "paired end" as the original library preparation method they use, where you sequence each end of the same molecule. Because of the way the cluster generation technology works, it is limited to an inter-pair distance of ~300bp (200-600bp).
Illumina refers to "mate pairs" as sequences derived from their newer library prep method which is designed to provide paired sequences separated by a greater distance (between about 2 and 10kb). This method still actually only sequences the ends of ~400bp molecules, but this template is derived from both ends of a 2-10kb fragment that has had the middle section cut out and the 'internal' ends ligated in the middle. Basically, you take your 2-10kb random fragments, biotinylate the end, circularise them, shear the circles to ~400bp, capture biotinylated molecules, and then sequence those (they go into what is essentially a standard 'paired end' sample prep procedure).
http://www.nature.com/ng/journal/v37/n7/full/ng1562.html
http://www.nature.com/nature/journal/v431/n7011/full/nature03001.html
Used for studying structural variations
Tuesday, October 19, 2010
Python NetworkX
NetworkX is a Python package for the creation, manipulation, and study of the structure, dynamics, and functions of complex networks.
http://networkx.lanl.gov/
Data modelling in Rails
http://www.mongodb.org/display/DOCS/MongoDB+Data+Modeling+and+Rails
To find all the stories voted on by a given user:
Story.all(:conditions => {:voters => @user.id})
http://biodegradablegeek.com/2008/07/how-to-use-fixtures-to-populate-your-database-in-rails/
$ rake db:fixtures:load
jruby -S gem list
Guide to Rails command line
http://guides.rubyonrails.org/command_line.html
$ RAILS_ENV=development jruby -S gem list
*** LOCAL GEMS ***
actionmailer (2.3.5)
actionpack (2.3.5)
activerecord (2.3.5)
activerecord-jdbc-adapter (0.9.3)
activerecord-jdbcmysql-adapter (0.9.3)
activeresource (2.3.5)
activesupport (2.3.5)
block_helpers (0.3.2)
builder (2.1.2)
columnize (0.3.1)
cucumber (0.9.2)
cucumber-rails (0.3.2)
diff-lcs (1.1.2)
factory_girl (1.2.4)
gem_plugin (0.2.3)
gherkin (2.2.9)
haml (2.2.22)
jdbc-mysql (5.0.4)
jruby-jars (1.5.3)
jruby-openssl (0.6)
jruby-rack (1.0.3)
json (1.4.6)
rack (1.0.1)
rails (2.3.5)
rake (0.8.7)
rdoc (2.5.11)
rdoc-data (2.5.3)
rspec (1.3.0)
rspec-core (2.0.1)
rspec-expectations (2.0.1)
rspec-mocks (2.0.1)
rspec-rails (1.3.2)
ruby-debug (0.10.3)
ruby-debug-base (0.10.3.2)
ruby-openid (2.1.7)
rubyzip (0.9.4)
sources (0.0.1)
term-ansicolor (1.0.5)
warbler (1.2.1, 0.9.13)
$ RAILS_ENV=development jruby -S ./script/generate rspec_model users
http://lukeredpath.co.uk/blog/developing-a-rails-model-using-bdd-and-rspec-part-1.html
http://rspec.info/
http://rspec.info/documentation/
$ jruby -S spec/models/users_spec.rb
spec/models/users_spec.rb:1:in `require': no such file to load -- spec_helper (LoadError)
from spec/models/users_spec.rb:1
$ cd spec/
$ jruby -S models/users_spec.rb
http://github.com/rspec/rspec-dev
Sunday, October 17, 2010
ssh ssh-keygen proxy tunnel
Setup: first connect to host1, then tunnel from host1 to host2
$ cat ~/.ssh/config
Host host2 i host2
ProxyCommand ssh -q mylogin@host1 nc host2 28
User mylogin
Connect by:
$ ssh i -l mylogin
Copy by:
$ scp myfile.txt mylogin@host2:outfile.txt
Setup RSA keys by
$ ssh-keygen -t rsa
copy ~/.ssh/id_rsa.pub to host2's ~/.ssh/authorized_keys
Using pipes and tar
tar c somefiles*.txt | ssh user1@host1 tar xvp
tar c somefiles*.txt | ssh user1@host1 ssh user2@host2 tar xvp
'pv' command - progress bar view
tar c somefiles*.* | pv -s 75m | ssh user1@host1 tar xp
Thursday, October 14, 2010
PAM vs BLOSUM
http://www.ncbi.nlm.nih.gov/Education/BLASTinfo/Scoring2.html
The relationship between BLOSUM and PAM substitution matrices. BLOSUM matrices with higher numbers and PAM matrices with low numbers are both designed for comparisons of closely related sequences. BLOSUM matrices with low numbers and PAM matrices with high numbers are designed for comparisons of distantly related proteins. If distant relatives of the query sequence are specifically being sought, the matrix can be tailored to that type of search.
The PAM family
PAM matrices are based on global alignments of closely related proteins.
The PAM1 is the matrix calculated from comparisons of sequences with no more than 1% divergence.
BLOSUM matrices are based on local alignments.
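BLOSUM 62 is a matrix calculated from alignment blocks in which sequences that are 62% or more identical are clustered and counted as one, so it reflects substitutions between sequences that are less than 62% identical.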
All BLOSUM matrices are based on observed alignments; they are not extrapolated from comparisons of closely related proteins.
BLOSUM 62 is the default matrix in BLAST 2.0.
PeakPicker
http://genome.cshlp.org/content/15/11/1584.full
PeakPicker is developed for quantitative allele ratio analysis and can be used to determine differential allelic expression in cells heterozygous for a marker SNP expressed in mRNA by measuring and calculating the peak height ratios of the marker SNP.
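A minimal sketch of the peak height ratio idea, not PeakPicker's own code. The peak heights are made-up numbers, the function names are hypothetical, and dividing by the ratio seen in heterozygous genomic DNA (expected to be close to 1:1) is an assumption about how one might correct for unequal signal between the two alleles.

```ruby
# Ratio of the two allele peak heights at the marker SNP.
# Heights are in arbitrary units; both must be positive.
def allele_ratio(peak_a, peak_b)
  raise ArgumentError, "peak heights must be positive" unless peak_a > 0 && peak_b > 0
  peak_a.to_f / peak_b
end

# Assumption: the cDNA ratio is normalized by the ratio measured in
# heterozygous genomic DNA, where both alleles are present in equal amounts.
def normalized_ratio(cdna_a, cdna_b, gdna_a, gdna_b)
  allele_ratio(cdna_a, cdna_b) / allele_ratio(gdna_a, gdna_b)
end

# Hypothetical peak heights for one sample
cdna = allele_ratio(1200, 800)
norm = normalized_ratio(1200, 800, 1000, 1000)
puts format("cDNA ratio %.2f, normalized ratio %.2f", cdna, norm)
```

A normalized ratio far from 1 would suggest that one allele is expressed more than the other.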
WebLogo
http://weblogo.berkeley.edu/examples.html
WebLogo is a web-based application designed to make the generation of sequence logos as easy and painless as possible.
Sequence logos are a graphical representation of an amino acid or nucleic acid multiple sequence alignment developed by Tom Schneider and Mike Stephens. Each logo consists of stacks of symbols, one stack for each position in the sequence. The overall height of the stack indicates the sequence conservation at that position, while the height of symbols within the stack indicates the relative frequency of each amino or nucleic acid at that position. In general, a sequence logo provides a richer and more precise description of, for example, a binding site, than would a consensus sequence.
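The stack heights described above can be sketched in a few lines of Ruby. This is only an illustration, not WebLogo's code: the function names and the example sequences are made up, it assumes a 4-letter DNA alphabet by default, and it leaves out the small-sample correction that real logo tools apply.

```ruby
# Count how often each symbol occurs in one alignment column.
def symbol_counts(column)
  column.chars.each_with_object(Hash.new(0)) { |symbol, counts| counts[symbol] += 1 }
end

# Information content of a column in bits: log2(alphabet size) minus the
# Shannon entropy of the observed symbol frequencies.
# Assumption: no small-sample correction is applied.
def column_information(column, alphabet_size = 4)
  total = column.length.to_f
  entropy = symbol_counts(column).values.inject(0.0) do |sum, count|
    freq = count / total
    sum - freq * Math.log2(freq)
  end
  Math.log2(alphabet_size) - entropy
end

# Height of each symbol in the stack: its relative frequency times the
# information content of the column.
def symbol_heights(column, alphabet_size = 4)
  info = column_information(column, alphabet_size)
  total = column.length.to_f
  heights = {}
  symbol_counts(column).each { |symbol, count| heights[symbol] = count / total * info }
  heights
end

# Hypothetical aligned DNA sequences, one per row.
sequences = %w[TATAAT TATGAT TACAAT TATAAT]
columns = sequences.map(&:chars).transpose.map(&:join)
columns.each_with_index do |column, i|
  puts format("position %d: %.2f bits %s", i + 1, column_information(column), symbol_heights(column).inspect)
end
```

A fully conserved column reaches the maximum of 2 bits for DNA, while a column with all four bases in equal proportion has a height of 0.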
Tuesday, October 12, 2010
JRuby, Ant, Tomcat, Rails
http://gregmoreno.ca/deploy-a-rails-3-sqlite3-application-in-tomcat-using-jruby/
http://blog.emptyway.com/2008/04/08/120-seconds-guide-to-jruby-on-rails/
(jruby 1.5.3, Rails version 2.3.5)
jruby -S rails myapp -d mysql
cd myapp
jruby -S rake db:create:all
jruby script/generate scaffold post title:string body:text published:boolean
jruby script/generate model keywords keyword:string source:string
jruby script/generate migration add_word_to_keywords
jruby -S rake db:migrate
jruby script/server
http://wiki.rubyonrails.org/rails/pages/availablegenerators
http://www.tutorialspoint.com/ruby-on-rails/rails-and-rake.htm
Reset MySQL password
$ sudo service mysql stop
$ sudo mysqld_safe --skip-grant-tables
$ mysql -u root
mysql> update mysql.user set password=password('newpassword') where user='root';
mysql> flush privileges;
http://www.tech-faq.com/how-do-i-reset-a-mysql-password.html
To assign passwords to the root accounts using mysqladmin, execute the following commands:
shell> mysqladmin -u root password "newpwd"
shell> mysqladmin -u root -h host_name password "newpwd"
Monday, October 11, 2010
Word Clouds in R
http://www.r-bloggers.com/abstract-word-clouds-using-r/
http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/efetch_help.html
http://eutils.ncbi.nlm.nih.gov/corehtml/query/static/esearch_help.html
http://math.illinoisstate.edu/dhkim/rstuff/rtutor.html
> library(lattice)
> x
  a b
4 d 4
3 c 3
2 b 2
1 a 1
> x[order(x$b,decreasing=TRUE),]
> xyplot(b ~ a, data = x, groups = a, ylab = '', xlab = '',
+        scales = list(x = list(tck = 0, at = 0), y = list(tck = 0, at = 0)),
+        panel = function(x, y, subscripts, groups)
+          ltext(x = c(mean(y), sample(1:max(y - 1))),
+                y = c(mean(y), sample(1:max(y - 1))),
+                label = groups[subscripts], cex = 1 * y^1.5,
+                fontfamily = c("AvantGarde", "Bookman", "Courier", "Helvetica",
+                               "Helvetica-Narrow", "NewCenturySchoolbook",
+                               "Palatino", "Times"),
+                col = c('red', 'blue')))
Loss of heterozygosity (LOH)
Loss of heterozygosity (LOH) in a cell represents the loss of normal function of one allele of a gene in which the other allele was already inactivated. This term is mostly used in the context of oncogenesis; after an inactivating mutation in one allele of a tumor suppressor gene occurs in the parent's germline cell, it is passed on to the zygote resulting in an offspring that is heterozygous for that allele. In oncology, loss of heterozygosity occurs when the remaining functional allele in a somatic cell of the offspring becomes inactivated by mutation. This could cause a normal tumor suppressor to no longer be produced which could result in tumorigenesis.
Canon MP560 Scanner in Ubuntu
http://ubuntuforums.org/showthread.php?t=1264928&page=3
Install Canon MP560 ScanGear
http://software.canon-europe.com/products/0010756.asp
$ cd scangearmp-mp560series-1.40-1-i386-deb
$ vi ./install.sh # add sudo dpkg --force-architecture -i
$ sudo ./install.sh
$ scangearmp
scangearmp: error while loading shared libraries: libgimp-2.0.so.0: cannot open shared object file: No such file or directory
Install Gimp lib32 libraries
$ wget http://mirrors.kernel.org/ubuntu/pool/main/g/gimp/libgimp2.0_2.6.7-1ubuntu1_i386.deb
$ mkdir libgimp2.0_2.6.7-1ubuntu1_i386
$ dpkg -x libgimp2.0_2.6.7-1ubuntu1_i386.deb libgimp2.0_2.6.7-1ubuntu1_i386
$ cd libgimp2.0_2.6.7-1ubuntu1_i386/
$ sudo cp usr/lib/libgimp* /usr/lib32/
And if all else fails, you can still scan images to a USB stick!
Friday, October 8, 2010
Howl's Moving Castle
Nice movie!
When an unconfident young woman is cursed with an old body by a spiteful witch, her only chance of breaking the spell lies with a self-indulgent yet insecure young wizard and his companions in his legged, walking home.
Director: Hayao Miyazaki
Michael Sjoerdsma sfu technical writing
http://www.sfu.ca/immr/pmp/people.htm
Michael is currently a faculty member in the School of Engineering Science
at Simon Fraser University teaching courses related to technical writing,
group dynamics, graphical communication, and ethics and law.
Tuesday, October 5, 2010
Ruby create a class object with name determined at runtime, calling dynamic methods
http://ruby-doc.org/docs/ProgrammingRuby/html/ospace.html
Calling a method whose name is unknown until runtime.
If we only get the method's name at runtime, we can call it with the 'send' method:
ruby-1.8.7-p302 > 'John Coltran'.send('length')
=> 12
or using 'method' to call it later
or 'eval'
r = eval "'John Coltran'.length"
=> 12
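The 'method' variant has no example above, so here is a small sketch of all three approaches side by side. The helper name call_by_name is made up for illustration; note that public_send is only available from Ruby 1.9 on, so on 1.8.7 plain send would have to be used instead.

```ruby
# Hypothetical helper: call a method whose name is only known at runtime.
# public_send (Ruby 1.9+) refuses private methods, unlike send.
def call_by_name(receiver, method_name, *args)
  unless receiver.respond_to?(method_name)
    raise NoMethodError, "#{receiver.class} does not respond to #{method_name}"
  end
  receiver.public_send(method_name, *args)
end

name = 'John Coltran'
method_name = 'length' # pretend this arrived at runtime

# send: look the method up and call it in one step
puts name.send(method_name)

# method: grab a Method object now, call it later
length_of = name.method(method_name)
puts length_of.call

# the helper also passes arguments through
puts call_by_name(name, 'index', 'C')
```

Both send and method avoid building a string of code, which is why they are usually preferred over eval when the name comes from outside the program.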
-------------------------------------
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/190828
$ irb
ruby-1.8.7-p302 > b=Object::const_get('String').new()
=> ""
ruby-1.8.7-p302 > b
=> ""
ruby-1.8.7-p302 > b.class
=> String
# Print all User tuples
> b = Object::const_get('User').new()
> puts b.class
User
> puts b.class.all
#
$ rails console < my_script_helper.rb
--------class----------------
vs
--------module-------------
http://ruby-doc.org/core/classes/Module.html
A Module is a collection of methods and constants.
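A short sketch of the difference, using made-up names (Greeting, Robot): a class can be instantiated with new, while a module cannot and is instead mixed into a class with include.

```ruby
# A module: a collection of methods and constants (names are hypothetical).
module Greeting
  DEFAULT_NAME = "World"

  def greet(name = DEFAULT_NAME)
    "Hello #{name}!"
  end
end

# A class can be instantiated, and picks up the module's methods via include.
class Robot
  include Greeting
end

robot = Robot.new
puts robot.greet
puts robot.greet("Ruby")

# The module itself has no 'new', so it cannot be instantiated.
puts Greeting.respond_to?(:new)
puts Robot.include?(Greeting)
```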
# The Greeter class
class Greeter
  def initialize(name)
    @name = name.capitalize
  end
  def salute
    puts "Hello #{@name}!"
  end
end
# Create a new object
g = Greeter.new("world")
# Output "Hello World!"
g.salute
module Mod
  alias_method :orig_exit, :exit
  def exit(code=0)
    puts "Exiting with code #{code}"
    orig_exit(code)
  end
end
include Mod
exit(99)
Monday, October 4, 2010
R blogs and gallery
http://addictedtor.free.fr/graphiques/
http://www.statmethods.net/input/missingdata.html
# create new dataset without missing data
newdata <- na.omit(mydata)
Reshaping data
# example of melt function
library(reshape)
mdata <- melt(mydata, id=c("id","time"))
# Creating a Graph with a linear model regression line
attach(mtcars)
plot(wt, mpg)
abline(lm(mpg~wt))
title("Regression of MPG on Weight")
# Filled Density Plot
d <- density(mtcars$mpg)
plot(d, main="Kernel Density of Miles Per Gallon")
polygon(d, col="red", border="blue")
Comparing Groups via Kernel Density
The sm.density.compare( ) function in the sm package allows you to superimpose the kernel density plots of two or more groups. The format is sm.density.compare(x, factor) where x is a numeric vector and factor is the grouping variable.
# Compare MPG distributions for cars with
# 4,6, or 8 cylinders
library(sm)
attach(mtcars)
# create value labels
cyl.f <- factor(cyl, levels= c(4,6,8),
labels = c("4 cylinder", "6 cylinder", "8 cylinder"))
# plot densities
sm.density.compare(mpg, cyl, xlab="Miles Per Gallon")
title(main="MPG Distribution by Car Cylinders")
# add legend via mouse click
colfill<-c(2:(2+length(levels(cyl.f))))
legend(locator(1), levels(cyl.f), fill=colfill)
http://yihui.name/en/page/2/
http://yihui.name/en/2009/06/creating-tag-cloud-using-r-and-flash-javascript-swfobject/#more-224 - Tag Cloud
Side by side plot
# png(width = 500, height = 300)
x = rep(0, 1000)
par(mfrow = c(1, 2), mar = c(4, 4, 0.1, 0.1))
plot(density(x), main = "")
plot(density(x), main = "")
rug(jitter(x))
# dev.off()
# transparent colors (alpha = 0.1)
plot(x, col = rgb(0, 0, 0, 0.1))
http://processtrends.com/RClimate.htm
## Use regexp to replace all the occurrences of **** with NA
lines2 <- gsub("\\*{3,5}", " NA", lines, perl=TRUE)
## Select monthly data in first 13 columns
df <- df[,1:13]
## Remove rows where Year=NA from the dataframe
df <- df[!is.na(df$Year),]
Sunday, October 3, 2010
Using lattice in R
http://www.his.sunderland.ac.uk/~cs0her/Statistics/UsingLatticeGraphicsInR.htm
## Multiple variables in formula for grouped displays
xyplot(Sepal.Length + Sepal.Width ~ Petal.Length + Petal.Width | Species,
data = iris, scales = "free", layout = c(2, 2),
auto.key = list(x = .6, y = .7, corner = c(0, 0)))
## user defined panel functions
states <- data.frame(state.x77,
state.name = dimnames(state.x77)[[1]],
state.region = state.region)
xyplot(Murder ~ Population | state.region, data = states,
groups = state.name,
panel = function(x, y, subscripts, groups)
ltext(x = x, y = y, label = groups[subscripts], cex=1,
fontfamily = "HersheySans"))
http://learnr.wordpress.com/2009/08/18/ggplot2-version-of-figures-in-lattice-multivariate-data-visualization-with-r-part-13/
http://data.princeton.edu/R/gettingStarted.html
http://www.r-bloggers.com/5-minute-analysis-in-r-case-shiller-indices/
http://lmdvr.r-forge.r-project.org/figures/figures.html
Lucid update Ubuntu splash in grub
http://anonir.wordpress.com/2010/08/08/ubuntu-lucid-disable-boot-splash/
Open /etc/default/grub for editing and remove the “quiet splash” options from the GRUB_CMDLINE_LINUX_DEFAULT property.
For example, if your grub has this line:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"
Change it to this:
GRUB_CMDLINE_LINUX_DEFAULT=""
Then run this command to update grub2:
sudo update-grub
R removing NA
http://www.opensubscriber.com/message/r-help@stat.math.ethz.ch/7268077.html
> df<-data.frame(name=c('a','b'), age=c('1','2'))
> if (any(apply(df,1,function(x) any(is.na(x)))) == TRUE) { r <- df[-which(apply(df,1,function(x) any(is.na(x)))),] } else { r <- df }
> r
name age
1 a 1
2 b 2
> df<-data.frame(name=c('a','b'), age=c('1',NA))
> df
name age
1 a 1
2 b <NA>
> if (any(apply(df,1,function(x) any(is.na(x)))) == TRUE) { r <- df[-which(apply(df,1,function(x) any(is.na(x)))),] } else { r <- df }
> r
name age
1 a 1
## Remove rows where Year=NA from the dataframe
df <- df[!is.na(df$Year),]
Saturday, October 2, 2010
tm package on R
tm (tm_0.5-4.1.tar.gz) needs R 2.11, but the Ubuntu Lucid repo only has R 2.10,
so add the CRAN repository to /etc/apt/sources.list:
http://ubuntuforums.org/showthread.php?t=639710
deb http://cran.r-project.org/bin/linux/ubuntu lucid/
gpg --keyserver subkeys.pgp.net --recv-key E2A11821
gpg -a --export E2A11821 | sudo apt-key add -
My R libraries are installed to
$HOME/R/x86_64-pc-linux-gnu-library/2.11
So you need to add this path in the 'R console' Run configuration
then I got a 'checking for xml2-config... no' error when doing
install.packages('XML')
so do
$ sudo apt-get install libxml2-dev
Ubuntu 10.04.1 LTS Lucid Lynx
#enable canonical archive in /etc/apt/sources.list
sudo apt-get install sun-java6-jdk
sudo apt-get install r-base
sudo apt-get update && sudo apt-get install cairo-dock cairo-dock-plug-ins
Ubuntu Intrepid AMD64
https://launchpad.net/ubuntu/intrepid/amd64
/var/lib/dpkg$ sudo cp status-old2 status
Friday, October 1, 2010
/usr/bin/ld: cannot find -lgfortran
$ ld -lgfortran
ld: cannot find -lgfortran
$ sudo ln -s /usr/lib/libgfortran.so.3.0.0 /usr/lib/libgfortran.so
$ ld -lgfortran
ld: warning: cannot find entry symbol _start; not setting start address
School, Life, Lessons
"The difference between school and life? In school, you're taught a lesson and then given a test. In life, you're given a test that teaches you a lesson." - Tom Bodett