Wednesday, December 12, 2012

Makefile and A Quick Guide to Organizing Computational Biology Projects

http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000424

Most bioinformatics coursework focuses on algorithms, with perhaps some components devoted to learning programming skills and learning how to use existing bioinformatics software. Unfortunately, for students who are preparing for a research career, this type of curriculum fails to address many of the day-to-day organizational challenges associated with performing computational experiments. In practice, the principles behind organizing and documenting computational experiments are often learned on the fly, and this learning is strongly influenced by personal predilections as well as by chance interactions with collaborators or colleagues.
The purpose of this article is to describe one good strategy for carrying out computational experiments. I will not describe profound issues such as how to formulate hypotheses, design experiments, or draw conclusions. Rather, I will focus on relatively mundane issues such as organizing files and directories and documenting progress. These issues are important because poor organizational choices can lead to significantly slower research progress. I do not claim that the strategies I outline here are optimal. These are simply the principles and practices that I have developed over 12 years of bioinformatics research, augmented with various suggestions from other researchers with whom I have discussed these issues.

Makefile in bioinformatics
http://www.slideshare.net/giovanni/makefiles-bioinfo#btnNext

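Syntax:
<targets>: <prerequisites>
<tab><commands>
(in a real makefile each command line must start with a literal tab character; the indented lines in the examples below stand for that tab)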

e.g.
print_hello.txt:
    echo 'hello'

$ make print_hello.txt

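Every time you call make, it checks whether the file "print_hello.txt" (the target) exists and is newer than its prerequisites; the commands run only if it is missing or out of date. Note that the rule above never writes print_hello.txt, so its command runs on every call. Redirect the output (echo 'hello' > print_hello.txt) if the target should become a real file.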
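
To make the up-to-date check concrete, here is a minimal sketch with a prerequisite. The file names upper.txt and input.txt are hypothetical, chosen only for illustration, and the recipe line starts with a tab.

```makefile
# upper.txt is rebuilt only when it is missing or older than input.txt.
upper.txt: input.txt
	tr 'a-z' 'A-Z' < input.txt > upper.txt
```

Running "make upper.txt" twice in a row should do the work only once; the second call finds the target up to date. After "touch input.txt" the target is older than its prerequisite, so the next call rebuilds it.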

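$ make -j 4 all
(-j runs independent targets in parallel, here up to 4 jobs at a time on the local machine; spreading jobs over a cluster needs a cluster-aware make or a scheduler on top)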
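
As a rough sketch of what -j can parallelize: the two files below do not depend on each other, so make is free to build them at the same time. The target names and the sleep calls are made up for illustration.

```makefile
# "all" needs both files; neither file depends on the other.
all: a.txt b.txt

# Each recipe pretends to be a slow job, then writes its output file.
a.txt:
	sleep 2; echo a > a.txt

b.txt:
	sleep 2; echo b > b.txt
```

With "make -j 2 all" the two recipes can run side by side; with plain "make all" they run one after the other.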

Multiple targets:
A rule with several targets acts like one copy of the rule per target.
$@ - expands to the name of the target being built

1.txt 2.txt:
    echo 'bla' > $@

$ make 1.txt
echo 'bla' > 1.txt

Ignore errors from a command by prefixing it with '-', e.g. '-mkdir /var': make carries on even if that command fails.

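In make, each command line of a rule runs in its own shell process, so a 'cd' on one line has no effect on the next. To run several commands in the same process, put them on a single line separated by ';' (the parentheses group them in a subshell):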

lsvar:
    (cd /var; ls)
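
A small sketch of the difference, using two hypothetical targets (the names "separate" and "together" are not from the slides):

```makefile
# Each recipe line gets its own shell, so the cd is forgotten
# before pwd runs: pwd reports the directory make was started in.
separate:
	cd /tmp
	pwd

# Both commands share one shell here, and pwd runs only if cd succeeded.
together:
	cd /tmp && pwd
```

Using '&&' instead of ';' is a deliberate choice in this sketch: if the cd fails, the following command is skipped rather than run in the wrong directory.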

Variables
$(name) - expands to the value of the variable "name"

(make keeps quotes as part of the value, so leave them out)
working_dir = /home/foo
FILES = 1.txt 2.txt

print_wd:
    @echo "dir is $(working_dir)"

Pass in variables from the command line (no spaces around '='; a value given this way overrides the assignment in the makefile):
$ make print_wd working_dir=/home/bar

Functions
$(addprefix <prefix>,<list>) - prepends <prefix> to each word in <list>
$(addsuffix <suffix>,<list>) - appends <suffix> to each word in <list>
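
The two functions can be nested to build a list of file names from a list of sample names. This is only a sketch; SAMPLES, RESULTS, the results/ directory and the show_results target are hypothetical names.

```makefile
# Hypothetical sample names, for illustration only.
SAMPLES = s1 s2

# First append .txt to each name, then prepend the directory.
RESULTS = $(addprefix results/,$(addsuffix .txt,$(SAMPLES)))

# Prints: results/s1.txt results/s2.txt
show_results:
	@echo $(RESULTS)
```

A list built this way is a natural prerequisite for an "all" target, so adding a sample means editing one line.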
