esl-seqstat - Man Page
summarize contents of a sequence file
Synopsis
esl-seqstat [options] seqfile
Description
esl-seqstat summarizes the contents of seqfile. It prints the file format, the alphabet type, the number of sequences, the total number of residues, and the mean, smallest, and largest sequence lengths.
If seqfile is - (a single dash), sequence input is read from stdin.
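As an illustration of the two input modes described above, the sketch below first passes a file name and then feeds the same data through stdin with a single dash. The file name example.fa and its contents are hypothetical, invented only for this example, and the sketch assumes esl-seqstat is installed and on your PATH.

```shell
# Create a small FASTA file for illustration (hypothetical name and contents).
cat > example.fa <<'EOF'
>seq1 first example sequence
ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
>seq2 second example sequence
ACGTACGTACGTACGTACGTACGT
EOF

if command -v esl-seqstat >/dev/null 2>&1; then
  # Summarize a sequence file given by name.
  esl-seqstat example.fa

  # Summarize the same data read from stdin; "-" stands in for seqfile.
  esl-seqstat - < example.fa
else
  echo "esl-seqstat not found on PATH" >&2
fi
```

Both invocations should report the same sequence counts and lengths, since only the way the input is delivered differs.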
Options
- -h
Print brief help, including the version number and a summary of all options (expert options included).
- -a
Additionally print one line per sequence giving its name, length, and description. Each of these lines is prefixed with an = character so that they can easily be grepped out of the output.
- -c
Additionally print the residue composition of the sequence file.
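The -a and -c options can be tried with a sketch like the following. It relies only on the documented = prefix to separate per-sequence lines from the rest of the output. The file example.fa and the helper function per_sequence_lines are hypothetical names introduced for this example, and esl-seqstat is assumed to be on your PATH.

```shell
# Create a small FASTA file for illustration (hypothetical name and contents).
cat > example.fa <<'EOF'
>seq1 first example sequence
ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
>seq2 second example sequence
ACGTACGTACGTACGTACGTACGT
EOF

# Hypothetical helper: keep only lines that start with "=",
# which is the prefix -a puts on each per-sequence line.
per_sequence_lines() {
  grep '^='
}

if command -v esl-seqstat >/dev/null 2>&1; then
  # Per-sequence lines only.
  esl-seqstat -a example.fa | per_sequence_lines

  # Everything except the per-sequence lines.
  esl-seqstat -a example.fa | grep -v '^='

  # Residue composition in addition to the usual summary.
  esl-seqstat -c example.fa
else
  echo "esl-seqstat not found on PATH" >&2
fi
```

Anchoring the pattern with ^ matters: it avoids matching an = character that happens to appear inside a sequence description.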
Expert Options
- --informat <s>
Assert that the input seqfile is in format <s>, bypassing format autodetection. Common choices for <s> include: fasta, embl, genbank. Alignment formats also work; common choices include: stockholm, a2m, afa, psiblast, clustal, phylip. For more information, and for the codes of some less common formats, see the main documentation. The string <s> is case-insensitive (fasta or FASTA both work).
- --amino
Assert that the seqfile contains protein sequences.
- --dna
Assert that the seqfile contains DNA sequences.
- --rna
Assert that the seqfile contains RNA sequences.
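The expert options can be combined on one command line, as in the sketch below, which names the format explicitly and then also asserts the alphabet. The file example.fa and its contents are hypothetical, and esl-seqstat is assumed to be on your PATH.

```shell
# Create a small FASTA file for illustration (hypothetical name and contents).
cat > example.fa <<'EOF'
>seq1 first example sequence
ACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT
>seq2 second example sequence
ACGTACGTACGTACGTACGTACGT
EOF

if command -v esl-seqstat >/dev/null 2>&1; then
  # Bypass format autodetection by stating the format.
  esl-seqstat --informat fasta example.fa

  # The format string is case-insensitive, so this is equivalent.
  esl-seqstat --informat FASTA example.fa

  # State both the format and the alphabet.
  esl-seqstat --informat fasta --dna example.fa
else
  echo "esl-seqstat not found on PATH" >&2
fi
```

Asserting the format and alphabet is mostly useful when the input is short or unusual enough that guessing might be unreliable.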
See Also
http://bioeasel.org/
Copyright
Copyright (C) 2020 Howard Hughes Medical Institute. Freely distributed under the BSD open source license.
Author
http://eddylab.org