esl-seqstat - Man Page

summarize contents of a sequence file

Synopsis

esl-seqstat [options] seqfile

Description

esl-seqstat summarizes the contents of the seqfile. It prints the format, alphabet type, number of sequences, total number of residues, and the mean, smallest, and largest sequence length.

If seqfile is - (a single dash), sequence input is read from stdin.
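
The length statistics listed above can be illustrated with a minimal Python sketch. It is not Easel's implementation: it assumes plain FASTA input only, and the names read_fasta, summarize, and EXAMPLE are hypothetical helpers introduced for this illustration.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header line closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split(None, 1)
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def summarize(records):
    """Return sequence count, total residues, and smallest/largest/mean length."""
    lengths = [len(seq) for _, seq in records]
    if not lengths:
        raise ValueError("no sequences found")
    total = sum(lengths)
    return {
        "nseq": len(lengths),
        "total_residues": total,
        "smallest": min(lengths),
        "largest": max(lengths),
        "mean_length": total / len(lengths),
    }


# Toy input made up for this sketch.
EXAMPLE = """>seq1 first toy sequence
ACGTACGT
ACGT
>seq2 second toy sequence
ACG
"""

if __name__ == "__main__":
    print(summarize(read_fasta(EXAMPLE)))
```

The sketch does not detect the file format or alphabet type, which esl-seqstat also reports.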

Options

-h: Print brief help; includes version number and summary of all options, including expert options.
-a: Additionally print one line per sequence giving its name, length, and description. Each of these lines is prefixed by an = character so that they can easily be extracted from the output with grep.
-c: Additionally print the residue composition of the sequence file.
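
What -a and -c add can be sketched in Python as follows. This is an illustration under stated assumptions, not the program's own code: per_sequence_lines and residue_composition are hypothetical helpers, PLACEHOLDER_OUTPUT is made-up text rather than real esl-seqstat output (only the leading = is taken from the description of -a), and folding residues to upper case before counting is an assumption of the sketch.

```python
from collections import Counter


def per_sequence_lines(output):
    """Keep only the lines that start with '=', much like grep '^=' would."""
    return [line for line in output.splitlines() if line.startswith("=")]


def residue_composition(sequences):
    """Count each residue symbol across all sequences.

    Upper-casing before counting is an assumption of this sketch.
    """
    counts = Counter()
    for seq in sequences:
        counts.update(seq.upper())
    return counts


# Placeholder text, NOT real esl-seqstat output; only the '=' prefix matters.
PLACEHOLDER_OUTPUT = """summary line one
= seqA 12 a made-up description
= seqB 3 another made-up description
summary line two
"""

if __name__ == "__main__":
    for line in per_sequence_lines(PLACEHOLDER_OUTPUT):
        print(line)
    print(sorted(residue_composition(["ACGTACGT", "acg"]).items()))
```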

Expert Options

--informat <s>: Assert that the input seqfile is in format <s>, bypassing format autodetection. Common choices for <s> include: fasta, embl, genbank. Alignment formats also work; common choices include: stockholm, a2m, afa, psiblast, clustal, phylip. For more information, and for the codes for some less common formats, see the main documentation. The string <s> is case-insensitive (fasta or FASTA both work).
--amino: Assert that the seqfile contains protein sequences.
--dna: Assert that the seqfile contains DNA sequences.
--rna: Assert that the seqfile contains RNA sequences.
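
A minimal Python sketch of assembling a command line from the options above, following the synopsis order (options first, then seqfile). The helpers build_command and run_seqstat are hypothetical names for this illustration. Allowing at most one alphabet assertion per call is a design choice of the sketch, and run_seqstat asserts the format explicitly so that it does not depend on autodetection over a pipe.

```python
import shutil
import subprocess

# Alphabet assertions documented above, keyed by a short name.
ALPHABET_FLAGS = {"amino": "--amino", "dna": "--dna", "rna": "--rna"}


def build_command(seqfile="-", informat=None, alphabet=None,
                  per_sequence=False, composition=False):
    """Build an esl-seqstat argument list; seqfile '-' means read stdin."""
    cmd = ["esl-seqstat"]
    if per_sequence:
        cmd.append("-a")
    if composition:
        cmd.append("-c")
    if informat is not None:
        cmd += ["--informat", informat]
    if alphabet is not None:
        if alphabet not in ALPHABET_FLAGS:
            raise ValueError(f"unknown alphabet: {alphabet!r}")
        cmd.append(ALPHABET_FLAGS[alphabet])
    cmd.append(seqfile)
    return cmd


def run_seqstat(fasta_text, alphabet=None):
    """Pipe FASTA text to esl-seqstat on stdin and return its stdout.

    Returns None when esl-seqstat is not found on PATH.
    """
    if shutil.which("esl-seqstat") is None:
        return None
    cmd = build_command("-", informat="fasta", alphabet=alphabet)
    result = subprocess.run(cmd, input=fasta_text, capture_output=True,
                            text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    print(build_command("-", informat="fasta", alphabet="dna"))
```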

Copyright

Copyright (C) 2020 Howard Hughes Medical Institute.
Freely distributed under the BSD open source license.

Author

http://eddylab.org

Info

Nov 2020 Easel 0.48 Easel Manual