Benchmark
Datasets and results are described at http://shenwei356.github.io/fakit/benchmark
The benchmark needs to be performed on a Linux-like operating system.
Install software
Software
- fakit. (Go).
Version v0.1.9.
- fasta_utilities. (Perl).
Version 3dcc0bc.
Lots of dependencies to install.
- fastx_toolkit. (Perl).
Version 0.0.13.
Can't handle multi-line FASTA files.
- seqmagick. (Python).
Version 0.6.1.
- seqtk. (C).
Version 1.0-r82-dirty.
Not used:
- pyfaidx. (Python).
Version 0.4.7.1. *Not used, because it
A Python script, memusg, was used
to measure the running time and peak memory usage of a process.
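To illustrate what such a measurement involves, here is a minimal sketch in Python. It is not the code of memusg; the function name `measure` is hypothetical, and the approach (reaping the child with `os.wait4` and reading `ru_maxrss`) is an assumption chosen for brevity. It reports wall-clock time and peak resident set size, which on Linux is given in kilobytes.

```python
import os
import sys
import time


def measure(cmd):
    """Run cmd (a list of strings); return (exit_code, seconds, peak_rss_kb).

    Hypothetical helper for illustration only, not part of memusg.
    """
    start = time.perf_counter()
    # Spawn without waiting, so the child can be reaped with wait4 below.
    pid = os.spawnvp(os.P_NOWAIT, cmd[0], cmd)
    # wait4 blocks until this child exits and returns its resource usage.
    _, status, usage = os.wait4(pid, 0)
    elapsed = time.perf_counter() - start
    if os.WIFEXITED(status):
        code = os.WEXITSTATUS(status)
    else:
        # Child was killed by a signal; report it as a negative number.
        code = -os.WTERMSIG(status)
    # ru_maxrss is in kilobytes on Linux (bytes on macOS).
    return code, elapsed, usage.ru_maxrss


if __name__ == "__main__" and len(sys.argv) > 1:
    code, elapsed, peak_kb = measure(sys.argv[1:])
    print(f"exit={code} time={elapsed:.3f}s peak_rss={peak_kb}KB")
```

Saved under a name of your choice, the sketch would be invoked with the command to measure as its arguments. It only accounts for the direct child and the descendants that child has waited for, so the numbers may differ from those reported by memusg.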
Attention: fasta_utilities uses the Perl module Term-ProgressBar,
which causes it to fail when run from the benchmark script run_benchmark_00_all.pl.
To work around this, edit the source code of ProgressBar.pm (for me, the path is
/usr/share/perl5/vendor_perl/Term/ProgressBar.pm) and add the code below after line 535:
$config{bar_width} = 1 if $config{bar_width} < 1;
The edited code is:
} else {
$config{bar_width} = $target;
$config{bar_width} = 1 if $config{bar_width} < 1; # new line
die "configured bar_width $config{bar_width} < 1"
if $config{bar_width} < 1;
}
Data preparation
http://shenwei356.github.io/fakit/benchmark/#datasets
Run tests
A Perl script, run.pl, is used to automatically run tests and generate data for plotting.
$ perl run.pl -h
Usage:
1. Run all tests:
perl run.pl run_benchmark*.sh --outfile benchmark.5test.csv
2. Run one test:
perl run.pl run_benchmark_04_remove_duplicated_seqs_by_name.sh -o benchmark.rmdup.csv
3. Custom repeate times:
perl run.pl -n 3 run_benchmark_04_remove_duplicated_seqs_by_name.sh -o benchmark.rmdup.csv
To compare performance between different software, run:
./run.pl run_benchmark*.sh -n 3 -o benchmark.5tests.csv
To test performance of other functions in fakit, run:
./run.pl run_test*.sh -n 1 -o benchmark.fakit.csv
Plot results
The R libraries dplyr, ggplot2, scales, ggthemes, and ggrepel are needed.
Plot the results of the five tests:
./plot2.R -i benchmark.5tests.csv
Plot the results of the tests of other functions in fakit:
./plot2.R -i benchmark.fakit.csv --width 5 --height 3