. It currently consists of
- analytic tools
  - lyndonfactorization: counts the number of Lyndon factors. Outputs the ending positions of all Lyndon factors when the environment variable RUST_LOG=debug is set
  - count_r: counts the number of runs in the BWT obtained by the suffix array
  - count_sigma: counts the number of different characters
  - count_z: counts the number of overlapping LZ77 factors
  - entropy: computes the k-th order empirical entropy
- generators
  - thuemorse: computes the n-th Thue-Morse word
  - fibonacci: computes the n-th Fibonacci word
  - perioddoubling: computes the n-th period-doubling sequence
  - debruijn: computes a binary de Bruijn word of order n
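The first three generators produce words that can all be obtained by iterating a letter-to-word morphism. The following is a minimal sketch of that idea, not the code of this repository: the alphabet {a, b}, the start letter a, the convention that index 0 is the single letter a, and the function names are all assumptions made for illustration, and the actual binaries may index or encode their output differently.

```rust
// Sketch only: names, alphabet, and indexing are assumptions,
// not taken from the repository's source code.

/// Thue-Morse morphism: a -> ab, b -> ba.
fn thue_morse_image(c: u8) -> &'static [u8] {
    if c == b'a' { b"ab" } else { b"ba" }
}

/// Fibonacci morphism: a -> ab, b -> a.
fn fibonacci_image(c: u8) -> &'static [u8] {
    if c == b'a' { b"ab" } else { b"a" }
}

/// Period-doubling morphism: a -> ab, b -> aa.
fn period_doubling_image(c: u8) -> &'static [u8] {
    if c == b'a' { b"ab" } else { b"aa" }
}

/// Applies `image` to every letter of `start`, `n` times in a row.
fn iterate_morphism(start: &[u8], image: fn(u8) -> &'static [u8], n: usize) -> Vec<u8> {
    let mut word = start.to_vec();
    for _ in 0..n {
        word = word
            .iter()
            .flat_map(|&c| image(c).iter().copied())
            .collect();
    }
    word
}

fn main() {
    let generators: [(&str, fn(u8) -> &'static [u8]); 3] = [
        ("thuemorse", thue_morse_image),
        ("fibonacci", fibonacci_image),
        ("perioddoubling", period_doubling_image),
    ];
    for (name, image) in generators {
        let word = iterate_morphism(b"a", image, 3);
        println!("{}: {}", name, String::from_utf8_lossy(&word));
    }
}
```

Each iteration rewrites the whole word, so the length doubles per step for the Thue-Morse and period-doubling morphisms and grows like the Fibonacci numbers for the Fibonacci morphism.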
Usage
Compile and run with cargo from rust-lang:
cargo build
cargo run --bin count_sigma -- --file ./data/tudocomp/einstein.en.txt
cargo run --bin fibonacci 5
Datasets can be found at http://dolomit.cs.tu-dortmund.de/tudocomp/
The output format of the analytic tools is compatible with sqlplot.
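To make two of the statistics reported by the analytic tools concrete, here is a minimal sketch of the number of distinct characters and of the k-th order empirical entropy. It is not the implementation used in this repository: the function names, the byte-wise treatment of the input, and the normalization by the text length n (the common textbook definition) are assumptions.

```rust
use std::collections::{HashMap, HashSet};

// Sketch only: function names and normalization are assumptions,
// not taken from the repository's source code.

/// Number of distinct bytes occurring in `text`.
fn count_sigma(text: &[u8]) -> usize {
    text.iter().collect::<HashSet<_>>().len()
}

/// k-th order empirical entropy of `text` in bits per character:
/// the sum over all length-k contexts w of |T_w| * H_0(T_w), divided by n,
/// where T_w collects the characters that directly follow w in the text.
fn empirical_entropy(text: &[u8], k: usize) -> f64 {
    let n = text.len();
    if n <= k {
        return 0.0;
    }
    // For every length-k context, count how often each byte follows it.
    let mut contexts: HashMap<&[u8], HashMap<u8, usize>> = HashMap::new();
    for i in k..n {
        *contexts
            .entry(&text[i - k..i])
            .or_default()
            .entry(text[i])
            .or_insert(0) += 1;
    }
    let mut bits = 0.0;
    for followers in contexts.values() {
        let total: usize = followers.values().sum();
        for &count in followers.values() {
            bits += count as f64 * (total as f64 / count as f64).log2();
        }
    }
    bits / n as f64
}

fn main() {
    let text = b"abracadabra";
    println!("sigma = {}", count_sigma(text));
    for k in 0..3 {
        println!("H_{} = {:.4}", k, empirical_entropy(text, k));
    }
}
```

The sketch keeps one hash map entry per distinct context, which is simple but memory-hungry for large k; it is meant to illustrate the definition rather than to scale to the datasets linked above.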