Tuesday, January 21, 2014

Taking R to the Limit (High Performance Computing in R)

http://www.slideshare.net/bytemining/r-hpc

bigmemory - ideal when the analysis in R runs on manageable subsets of the data, or when the analysis is conducted mostly in C++

the "big" family

biganalytics, synchronicity, bigtabulate, big.matrix, bigalgebra, bigvideo, shared.big.matrix, filebacked.big.matrix, bigsplit

linear models: biglm.big.matrix

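mwhich - like which(), but tests conditions on the columns of a big.matrix without building large temporary vectors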
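A minimal sketch of row selection with mwhich, assuming the bigmemory package is installed (the block skips that part otherwise). The toy matrix and the thresholds 5 and 18 are made up for illustration; the base R which() call is only there as a cross-check.

```r
# Toy data (hypothetical): a 10 x 3 integer matrix, columns 1:10, 11:20, 21:30
m <- matrix(1:30, nrow = 10, ncol = 3)

# Base R reference: rows where column 1 > 5 AND column 2 <= 18
base_idx <- which(m[, 1] > 5 & m[, 2] <= 18)

if (requireNamespace("bigmemory", quietly = TRUE)) {
  # Same selection on a big.matrix; mwhich takes the columns, the test
  # values, the comparisons ('eq', 'neq', 'le', 'lt', 'ge', 'gt') and
  # how to combine them ('AND' or 'OR')
  x <- bigmemory::as.big.matrix(m)
  big_idx <- bigmemory::mwhich(x, cols = 1:2,
                               vals = list(5, 18),
                               comps = list("gt", "le"),
                               op = "AND")
  print(all(big_idx == base_idx))
}
```

The point of mwhich over which() is that the comparison runs over the big.matrix directly, so no full-length logical vectors are created in R along the way.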

ff - "fast access files" - file-based access to datasets that cannot fit in memory

data.table

mapReduce - apply(map(data), reduce)
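The apply(map(data), reduce) idea can be sketched in base R. This is not the mapReduce package's API, just an illustration of the pattern; map_reduce, sales, region and amount are hypothetical names.

```r
# Hypothetical helper: map assigns a key to each record, the records are
# grouped by key, and reduce collapses each group to a single value
map_reduce <- function(data, map, reduce) {
  groups <- split(data, map(data))  # one data frame per key
  sapply(groups, reduce)            # the "apply" step
}

# Toy data (made up for illustration)
sales <- data.frame(
  region = c("east", "west", "east", "west", "east"),
  amount = c(10, 20, 30, 40, 50)
)

totals <- map_reduce(
  sales,
  map    = function(d) d$region,       # key = region
  reduce = function(g) sum(g$amount)   # total amount per region
)
print(totals)
```

Swapping sapply for a parallel apply function is what would let the same pattern spread the reduce step over several cores.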

HadoopStreaming
