http://assemblathon.org/assemblathon-2-basic-assembly-metrics
N50 scaffold/contig length is
calculated by summing lengths of scaffolds/contigs from the longest to
the shortest and determining at what point you reach 50% of the total
assembly size. The length of the scaffold/contig at that point is the
N50 length.
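The procedure described above can be sketched in a few lines of Python. This is only an illustration of the longest-first cumulative sum; the function name `n50` and the choice to raise an error on empty input are assumptions, not part of the Assemblathon text.

```python
def n50(lengths):
    """Return the N50 of a list of positive contig/scaffold lengths.

    Walk the lengths from longest to shortest and return the length
    at which the running total first reaches half of the total size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length
    # Only reached when the input is empty (hypothetical design choice).
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    print(n50([10, 20, 30, 40]))
```

With lengths 10, 20, 30 and 40 the total is 100; the running total is 40 after the first contig and 70 after the second, so the length returned is 30.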
http://en.wikipedia.org/wiki/Contig
A sequence contig is a contiguous, overlapping sequence read resulting
from the reassembly of the small DNA fragments generated by bottom-up sequencing strategies.
http://en.wikipedia.org/wiki/N50_statistic
The N50 size is computed by sorting all contigs from largest to smallest
and by determining the minimum set of contigs whose sizes total 50% of
the entire assembly. For example, for a genome of 600Mb, if the assembled
sequences add up to 500Mb, the N50 would be calculated by sorting the
contigs from largest to smallest and finding the length of the contig
where the cumulative size reaches 250Mb (half of the 500Mb assembled, not half of the 600Mb genome).
http://seqanswers.com/forums/showthread.php?t=2332
https://www.broad.harvard.edu/crd/wiki/index.php/N50
Given a set of sequences of varying lengths, the N50 length is defined as the length N for which 50% of all bases in the sequences are in a sequence of length L < N.
The N50 (L50) is the weighted median contig length from a list of all the contig lengths in the assembly, with each contig weighted by its own length.
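N50 of {2, 2, 2, 3, 3, 4, 8, 8} is 6: the lengths sum to 32, the six shortest account for 16 bases and the two 8s for the other 16, so the length-weighted median is the mean of 4 and 8.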
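One way to make the weighted-median reading concrete is to repeat every length n exactly n times and take the ordinary median of the expanded list. The sketch below does that with Python's standard `statistics.median`; the function name `n50_weighted_median` is an assumption for illustration, and the expansion is only practical for toy inputs, not real assemblies.

```python
from statistics import median


def n50_weighted_median(lengths):
    """Length-weighted median of a list of positive integer lengths.

    Each length n is repeated n times, so every base contributes one
    entry; the median of that expanded list is then returned.
    """
    expanded = [n for n in lengths for _ in range(n)]
    # statistics.median averages the two middle values when the
    # expanded list has an even number of entries.
    return median(expanded)


if __name__ == "__main__":
    print(n50_weighted_median([2, 2, 2, 3, 3, 4, 8, 8]))
```

For the set above the expanded list has 32 entries whose 16th and 17th values are 4 and 8, so taking their mean gives 6. Any value above 4 and up to 8 satisfies the 50% condition in the definition, which is why the tie has to be settled by a convention. The longest-first cumulative rule quoted from the Assemblathon page would return 8 for the same set.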