esl-selectn - Man Page
select random subset of lines from file
Synopsis
esl-selectn [options] nlines filename
Description
esl-selectn selects nlines lines at random from file filename and outputs them on stdout.
If filename is - (a single dash), input is read from stdin.
esl-selectn uses an efficient reservoir sampling algorithm that requires only a single pass through filename, and memory proportional to nlines (and, importantly, not to the size of filename itself). It can therefore be used to build large-scale statistical sampling experiments, especially in combination with other Easel miniapplications.
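To illustrate the single-pass, bounded-memory property described above, here is a minimal Python sketch of one common form of reservoir sampling (often called Algorithm R). It is not the esl-selectn source code: the function name reservoir_sample and its parameters are hypothetical, and the tool's actual implementation, output order, and handling of inputs shorter than nlines may differ.

```python
import random
from typing import Iterable, List, Optional


def reservoir_sample(lines: Iterable[str], nlines: int,
                     rng: Optional[random.Random] = None) -> List[str]:
    """Return up to `nlines` items chosen uniformly at random from `lines`.

    Hypothetical helper for illustration only. It reads the input once
    and never holds more than `nlines` items in memory.
    """
    rng = rng or random.Random()
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < nlines:
            # Fill the reservoir with the first `nlines` lines.
            reservoir.append(line)
        else:
            # Pick a slot uniformly from 0..i (inclusive). The new line
            # replaces an existing one with probability nlines / (i + 1).
            j = rng.randint(0, i)
            if j < nlines:
                reservoir[j] = line
    # In this sketch, an input shorter than `nlines` is returned whole.
    return reservoir
```

Because the function accepts any iterable, it could be fed an open file or sys.stdin line by line, which mirrors how a single pass keeps memory use independent of the input size.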
Options
- -h
Print brief help; includes version number and summary of all options, including expert options.
- --seed <d>
Set the random number seed to <d>, an integer >= 0. The default is 0, which means to use a randomly selected seed. Any seed > 0 makes the sample reproducible: repeated runs of the same command select the same lines.
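The seed convention above (0 for an arbitrary seed, any value > 0 for a reproducible one) can be sketched in Python as follows. The helper make_rng is hypothetical and uses Python's own random module, so the numbers it produces will not match those of esl-selectn; only the convention is illustrated.

```python
import random


def make_rng(seed: int = 0) -> random.Random:
    """Build a generator following the --seed convention (illustrative only)."""
    if seed < 0:
        raise ValueError("seed must be an integer >= 0")
    if seed == 0:
        # 0 means "pick a seed for me": Python seeds from system entropy.
        return random.Random()
    # Any seed > 0 yields the same sequence on every run.
    return random.Random(seed)
```

Two generators created with the same positive seed produce identical sequences, which is what makes a sampling run repeatable.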
See Also
http://bioeasel.org/
Copyright
Copyright (C) 2020 Howard Hughes Medical Institute. Freely distributed under the BSD open source license.
Author
http://eddylab.org