ddpt_sgl - Man Page
helper for ddpt utility's scatter gather lists
Synopsis
ddpt_sgl [--action=ACT] [--a-sgl=SGL] [--b-sgl=SGL] [--chs=CHS] [--degen=DV] [--document] [--elem=SE[,LE]] [--extension=FNE] [--flexible] [--help] [--hex] [--iaf=IAF] [--index=IA] [--interleave=IL] [--non-overlap] [--out=O_SGL] [--quiet] [--round=RB] [--sort-cmp=SC] [--stats] [--verbose] [--version]
Description
This utility is a scatter gather list helper for the ddpt utility which copies data between or within SCSI devices (logical units). While ddpt's command line syntax is modelled on that of the POSIX dd command, this utility has a more standard Unix command line syntax with both short and long variants of each option.
Scatter gather lists (sgl_s) are made up of scatter gather elements. Each element is made up of a starting logical block address (LBA) and a number of blocks (NUM) from and including that LBA.
The scatter gather lists can also be viewed as arrays in which elements can be accessed by an index. Multiple sgl elements can be accessed with an array of indexes, hence index arrays. Indexes in this utility start at 0 and run to (n - 1) where n is the number of elements in the sgl. Also negative indexes are permitted where -1 is the index of the last sgl element, -2 is the index of the second last sgl element, etc.
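The indexing convention above can be sketched in a few lines of Python. This is only an illustration, not the utility's code: the function name resolve_index and the choice to raise an error are assumptions (the --index=IA option, for instance, is documented as ignoring out-of-range indexes rather than failing).

```python
def resolve_index(idx: int, n: int) -> int:
    """Map a possibly negative index onto the range 0..n-1.

    n is the number of elements in the sgl. Raising IndexError for an
    out-of-range index is a choice made for this sketch only.
    """
    if 0 <= idx < n:
        return idx
    if -n <= idx < 0:
        return idx + n          # -1 -> n-1, -2 -> n-2, ...
    raise IndexError(f"index {idx} out of range for sgl with {n} elements")


# A three element sgl held as (LBA, NUM) pairs
sgl = [(0x10, 5), (0x15, 2), (0x40, 8)]
last = sgl[resolve_index(-1, len(sgl))]          # the last element
second_last = sgl[resolve_index(-2, len(sgl))]   # the second last element
```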
For "twin" actions there is an assumed relationship between a-sgl and b-sgl as there is between two sgl_s used as the gather list (e.g. skip=) and the scatter list (e.g. seek=) in the ddpt utility. Breaking it down to individual logical blocks: LBAr0 is read and its data is written to LBAw0, LBAr1-->LBAw1, LBAr2-->LBAw2, etc; or more generally LBAr_n-->LBAw_n. Many actions will change the order in which those "read-write" items are performed; the twin portion of the action attempts to maintain the LBAr_n-->LBAw_n mapping. Generally speaking, copies are the same no matter what order the LBAs are read and written. One exception is an overlapping scatter list (i.e. on the write side) in which case the order of writes onto the same LBA does matter, hence there is an option to check sgl_s are well-formed in that respect: --non-overlap.
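The block by block pairing described above can be made concrete with a small Python sketch. The helper names blocks and twin_pairs are hypothetical and the code only models the mapping; it does not reproduce how ddpt or ddpt_sgl is implemented.

```python
from typing import Iterator, List, Tuple

Sgl = List[Tuple[int, int]]  # list of (LBA, NUM) elements


def blocks(sgl: Sgl) -> Iterator[int]:
    """Yield every LBA covered by the sgl, in list order."""
    for lba, num in sgl:
        yield from range(lba, lba + num)


def twin_pairs(gather: Sgl, scatter: Sgl) -> List[Tuple[int, int]]:
    """Pair the n-th block read (LBAr_n) with the n-th block written (LBAw_n)."""
    return list(zip(blocks(gather), blocks(scatter)))


a_sgl = [(100, 2), (200, 1)]   # read side (gather list)
b_sgl = [(0, 3)]               # write side (scatter list)
pairs = twin_pairs(a_sgl, b_sgl)

# Re-ordering both lists in unison changes the order of the copies
# but not the set of LBAr_n-->LBAw_n pairs.
a_reordered = [(200, 1), (100, 2)]
b_reordered = [(2, 1), (0, 2)]
pairs_reordered = twin_pairs(a_reordered, b_reordered)
```

Here pairs is [(100, 0), (101, 1), (200, 2)], and pairs_reordered holds the same three pairs in a different order.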
For background on scatter gather lists see the section of that name in the ddpt(8) manpage found in this package. There is a web page at https://sg.danny.cz/sg/ddpt.html .
Options
Arguments to long options are mandatory for short options as well.
- -a, --action=ACT
ACT is some action to perform on the given scatter gather list(s). To list the available actions set ACT to 'xxx' (or 'enum'). The available actions are listed in the Actions section below.
- -A, --a-sgl=SGL
SGL is a scatter gather list, a sequence of comma separated unsigned integers (up to 64 bits each). SGL has several forms, the simplest is: LBA0,NUM0,LBA1,NUM1,LBA2,NUM2... and there should be an even number of values with the exception of LBA0 appearing by itself. In this case NUM0 is assumed to be 0. Other SGL forms are '@<filename>' and 'H@<filename>' where the contents of the <filename> is parsed as a scatter gather list. Since there are two options for inputting SGLs, this one is termed as the 'a-sgl'.
See the section on File Formats below and the section on SCATTER GATHER LISTS in the ddpt(8) manpage for more information on sgl_s and their associated terminology.
- -B, --b-sgl=SGL
SGL is a scatter gather list, a second list termed as the 'b-sgl' to differentiate it from the other sgl (a-sgl).
- -C, --chs=CHS
CHS is a 3 element tuple, separated by commas. Currently 16 bit values from 1 to 0xffff are accepted (the cylinder can be one more: 0x10000 (or 65536)). The first value is the number of cylinders, the second value is the number of heads (limited to 16), and the final value is the number of sectors per track (limited to 255). Sectors are counted origin 1 according to CHS conventions (cf. normal LBAs which nearly always count from 0).
- -D, --degen=DV
DV of 0 (the default) means that all degenerate elements (apart from the last) are ignored (and dropped from the internal representation which may later be written to an output file). If DV is odd then a degenerate element's LBA is taken into account when calculating the highest and lowest LBA in a sgl (and may be included in an output file). If DV is even (apart from a DV of 0) then a degenerate element's LBA is taken into account when determining if a sgl is monotonic increasing, decreasing or neither (and may be included in an output file).
- -d, --document
this option causes information about the sgl or index array to be written as comments (i.e. lines starting with '#') to the beginning of output file(s) created by this utility.
If this option is given twice then the command line that caused the output is added to the file as a comment (before any numbers are output).
- -E, --elem=SE[,LE]
this option allows a single sgl element (at position SE (starting element index)) to be output to O_SGL or O_SGL.FNE (or IAF). SE is origin 0. If LE (last element index) is given then a range of sgl elements are output starting at index SE to index LE inclusive. If a "twin" operation is being performed then this option only applies to the "a" side output, not the "twin" side. This option is ignored by the output of the split_n and tsplit_n actions.
Negative values for either SE or LE count from the end of sgl. For example --elem=0,-1 refers to the whole of the list.
If LE is less than SE (after any negative indexes are converted to their equivalent positive index) then that range is output in reverse.
- -e, --extension=FNE
FNE is the filename extension used when output filenames are generated. For non-split actions the generated filenames are of the form: O_SGL.FNE . For the split_n action the generated filenames are of the form: O_SGL[1..n].FNE . For the tsplit_n action the a-sgl is named as per the previous sentence, while for the b-sgl the generated filenames are of the form: O_SGL[1..n]_t.FNE .
If O_SGL is '-' (by itself) then all output is sent to stdout and this option is ignored.
- -f, --flexible
this option affects the parsing (reading) of sgl_s and index arrays that are in files which are in hexadecimal. Such files should have a leading line (i.e. before any numbers) with 'HEX' on it. Without this option any such file must be invoked with 'H@' before the filename; in other words the 'H' in the invocation needs to match the 'HEX' in the file. With this option a file can be invoked with '@' and if a line with 'HEX' is parsed before any numbers then it switches to hexadecimal mode; so that all the parsed numbers are assumed to be in hexadecimal.
- -h, --help
outputs the usage message summarizing command line options then exits.
- -H, --hex
used to define the numeric format of sgl and index array elements written to output (often a file named O_SGL or stdout). If not given then only decimal values are written to output. If this option is given once then hexadecimal values, prefixed with '0x', are written. If this option is given twice then a line with the string 'HEX' is written to output, before any values, and those values are implicitly hexadecimal (i.e. no leading '0x' nor 'h' suffix).
- -I, --iaf=IAF
where IAF is a filename (or '-' for stdout) to write an index array to. The only action that generates an index array currently is --action=sort (and tsort). This option can be used together with, or in place of, the --out=O_SGL option.
The --document, --elem=SE[,LE] and --hex options affect what is written. See the section on File Formats below.
- -x, --index=IA
where IA is one or more indexes, comma separated or, if prefixed by "@" or "H@", a filename containing a list of indexes. These indexes are used by the --action=select and --action=tselect to select elements from the 'a-sgl'. Positive and negative indexes that are too large (in absolute terms) are ignored and create noise if the --verbose option is given. See the section on File Formats below.
- -i, --interleave=IL
IL is an integer, starting from 0. When IL is 0 (the default) there is no interleave. The interleave only affects the split_n and tsplit_n actions and when greater than zero is the maximum number of logical blocks written in each segment in the output file, prior to moving to the next output file.
For the case where IL is 1 and --action=split_1 is given then the output file will have every LBA (given by the a-sgl) as a separate sgl element (and thus each will have a NUM of 1).
For the tsplit_n action the interleave is only applied to the a-sgl but it does affect the twin sgl files.
- -N, --non-overlap
Checks any given sgl and any resulting sgl (from an action) to see if any portion of the sgl overlaps. This is done by first sorting each sgl by the LBA field, then checking every element against the previous one to determine if there is overlap. SCSI commands that accept sgl_s process degenerate elements without error but if two elements in a WRITE command overlap then it is the storage device's choice which one to WRITE first. The last one to be written will be the one read in subsequent read operations.
If no errors are detected and all checked sgl_s are non-overlapping then 0 is returned. If no errors are detected and any of them overlaps then 36 is returned.
- -o, --out=O_SGL
O_SGL is the name of a file to write a resultant scatter gather list to. If O_SGL is '-' then the output is directed to stdout. If O_SGL starts with '+' then the output is appended to the file whose name follows the '+'.
For the split and tsplit actions, a leading '+' means that output is appended to every file that matches the template and already exists; any file that does not exist is created. If '-' is given then all output is directed to stdout (and the --extension=FNE option, if given, is ignored).
- -q, --quiet
suppresses warnings and messages announcing that an action has succeeded. When this option is given, actions that have a logical (boolean) result don't output messages but still yield an indicative exit status. The exit status will typically be either 0 for true or 36 for false. The suppressed messages are typically sent to stderr.
- -r, --round=RB
RB is the number of round blocks. Without the option the split_n action will divide the number of blocks to be split by '<n>' (or use IL) to get a nominal value. This value is the number of blocks taken from the a-sgl before moving to the next output file. The RB value (default 0) is the maximum number of blocks the nominal value may be changed by to align with an existing element boundary in the a-sgl.
If the number of blocks in a-sgl is less than 10 or RB is greater than one third of the nominal value, then RB is ignored (with a notification written to stderr).
For the tsplit_n action this option only applies to the a-sgl.
- -S, --sort-cmp=SC
where SC is a value indicating what the sort action's comparison will be. When SC is 0 (the default) the sort is ascending based on the LBA; when it is 1 the sort is descending based on LBA. When SC is 2 the sort is ascending based on NUM; when it is 3 the sort is descending based on NUM. Any other value is mapped to 0. All sorts are stable which means that sgl elements with the same LBA (in the case of SC being 0 or 1) keep their same relative position. A side effect of this is that the ascending and descending sorts are not always reversals of one another.
- -s, --stats
print out sgl statistics on any given sgl and any resultant sgl.
- -v, --verbose
increase the level of verbosity (i.e. debug output).
- -V, --version
print the version string and then exit.
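The check described under --non-overlap (sort by LBA, then compare each element with the previous one) can be sketched as follows. The function name has_overlap is hypothetical, and skipping degenerate elements (NUM of 0) is an assumption of this sketch rather than documented behaviour.

```python
from typing import List, Tuple

Sgl = List[Tuple[int, int]]  # list of (LBA, NUM) elements


def has_overlap(sgl: Sgl) -> bool:
    """Return True if any two elements of the sgl share a logical block."""
    # Assumption: degenerate elements cover no blocks, so they are skipped.
    ordered = sorted((e for e in sgl if e[1] > 0), key=lambda e: e[0])
    for (prev_lba, prev_num), (lba, _num) in zip(ordered, ordered[1:]):
        if lba < prev_lba + prev_num:   # starts before the previous one ends
            return True
    return False


contiguous = [(0, 10), (10, 5)]            # touching but not overlapping
overlapping = [(20, 5), (0, 10), (8, 4)]   # (0,10) and (8,4) share LBAs 8 and 9
```

Comparing only neighbours is sufficient once the list is sorted by LBA: if any two elements overlap, then the earlier one also overlaps the element that immediately follows it.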
Actions
Actions are given on the command line as part of the --action=ACT option. Currently only one action is allowed per invocation. If more are allowed in the future, they will be comma separated and performed in the order in which they appear (i.e. left to right).
If no action is given and the --a-sgl=SGL and --out=O_SGL options (with no --b-sgl=SGL option) are given then the a-sgl is copied to O_SGL (or O_SGL.FNE if the --extension=FNE option is given).
The actions are listed below in alphabetical order.
- append-b2a
appends the b-sgl to the end of the a-sgl and outputs the result to O_SGL (or O_SGL.FNE if the --extension=FNE option is given). Requires the --a-sgl=SGL, --b-sgl=SGL and --out=O_SGL options.
- divisible<n>[,L|N] or divisible_<n>[,L|N]
where <n> is an integer, 1 or higher. This action checks if each LBA and NUM in a-sgl is divisible by <n> (where 'is divisible' is equivalent to having a remainder of zero). If all are divisible then true is returned (i.e. the exit status 0); otherwise false is returned (i.e. exit status 36).
If the optional ",L" suffix (or ",LBA") is given then only each LBA element in a-sgl is checked for divisibility. If the optional ",N" suffix (or ",NUM") then only each NUM element in a-sgl is checked for divisibility.
The output of the string to stderr announcing divisibility, or lack of it, can be suppressed by the --quiet option.
- enum
prints out the list of supported actions then exits. Giving the action 'xxx' has the same effect.
- equal
this action compares the sgl_s given to --a-sgl=SGL and --b-sgl=SGL. If the same LBAs are in the same order with the same overall number of blocks (but not necessarily the same number of elements) then true is returned (i.e. the exit status 0); otherwise false is returned (i.e. exit status 36). For example the two element sgl "0x10,0x5, 0x15,0x2" is 'equal' to the one element sgl "0x10, 0x7".
The output of the string to stderr announcing equality, or lack of it, can be suppressed by the --quiet option.
- none
this action does nothing. This is the default action. If --a-sgl=SGL and --out=O_SGL options are given and no other action, then a-sgl is copied to O_SGL.
It is a placeholder.
- part-equal
this action is similar to the equal action but relaxes the condition that both lists must have the same overall number of blocks. For example the two element sgl "0x10,0x5, 0x15,0x2" is 'part-equal' to the one element sgl "0x10, 0x12".
- part-same
this action is similar to the same action but relaxes the condition that both lists must have the same overall number of blocks. For example the two element sgl "0x15,0x2,0x10,0x5" is 'part-same' as the one element sgl "0x10, 0x12".
- same
this action is similar to the equal action but relaxes the condition that both lists must be in the same order. The implementation sorts both given lists before comparing them. For example the two element sgl "0x15,0x2, 0x10,0x5" is the 'same' as the one element sgl "0x10, 0x7".
- scale<n> or scale_<n>
where <n> is an integer, positive or negative but not zero. When <n> is positive then the starting LBA and the NUM in each a-sgl element are multiplied by <n>. The new (scaled) sgl is written to O_SGL (or O_SGL.FNE if the --extension=FNE option is given).
When <n> is negative then the absolute value of <n> is used as a divisor for each starting LBA and NUM in each a-sgl element.
As an example: converting a 512 byte logical block (LB) size sgl to a 4096 byte LB size and vice versa is relatively common. To convert from 4096 --> 512 byte LB size then --action=scale_8 is appropriate. To convert from 512 --> 4096 byte LB size then --action=scale_-8 is appropriate.
Note: because an integer division is used (that rounds 'towards zero') when <n> is negative then LBs or NUMs may be "lost" in this conversion. This can be checked beforehand with the --action=divisible<n>[,L|N] option. For example: for 512 --> 4096 conversions: --action=divisible_8 will report if any starting LBAs or NUMs are not divisible by 8 and hence are not able to be precisely represented as 4096 byte LB addresses or number of 4096 byte blocks.
- select
this action can be used to select a subset (or superset) of the a-sgl in the specified order. Alternatively it can be seen as re-ordering the elements in a-sgl such as is done toward the end of a sort operation. Assuming all the indexes in IA are valid, then the O_SGL file will have the same number of elements as there are indexes in IA.
This action requires non-empty --a-sgl=SGL and --index=IA options, plus the --out=O_SGL option.
- sort
this action will sort the sgl given by --a-sgl=SGL in ascending order by LBA. The resulting sgl is output to O_SGL (or O_SGL.FNE if the --extension=FNE option is given).
The sort is "stable", so if two elements have the same starting LBA then they will appear in the same relative order in the output.
- split<n> or split_<n>
where <n> is an integer, 1 or higher. This action divides --a-sgl=SGL into <n> roughly equal length (i.e. number of blocks) output sgl_s. The output files are named "O_SGL<1..n>" or "O_SGL<1..n>.FNE". Both the --interleave=IL and --round=RB options are taken into account during the split process.
- to-chs
this action takes the 'flat' LBA SGL given to --a-sgl=SGL and converts it into CHS (cylinder/head/sector) based SGL which is written out as directed to --out=O_SGL. This action requires the --chs=CHS option as well as the --a-sgl=SGL and --out=O_SGL options.
- tselect
this is a "twin select" action that selects from --a-sgl=SGL (a-sgl) then re-orders --b-sgl=SGL (b-sgl) in unison. The select from a-sgl is the same as described under the select action above. Additionally b-sgl is broken up so it has "breaks" at the same positions (i.e. number of blocks from the start of the sgl) as a-sgl does; plus the "breaks" b-sgl has already got. So the "broken up" b-sgl will have at least as many elements as a-sgl. The output of the re-ordered b-sgl is then written to O_SGL_t or O_SGL_t.FNE if the --extension=FNE option is given.
- tsort
this is a "twin sort" action that sorts --a-sgl=SGL (a-sgl) and re-orders --b-sgl=SGL (b-sgl) in unison. The sort of a-sgl is the same as described under the sort action above. Additionally b-sgl is broken up so it has "breaks" at the same positions (i.e. number of blocks from the start of the sgl) as a-sgl does; plus the "breaks" b-sgl has already got. So the "broken up" b-sgl will have at least as many elements as a-sgl. The re-ordering vector generated by the stable sort of a-sgl is then applied to the broken up b-sgl. The output of the re-ordered b-sgl is then written to O_SGL_t or O_SGL_t.FNE if the --extension=FNE option is given.
- tsplit<n> or tsplit_<n>
this is a "twin split" action that splits the --a-sgl=SGL and --b-sgl=SGL into separate series of output files. These separate series maintain the LBA to LBA correspondence of the original a-sgl and b-sgl lists. <n> is an integer, 1 or higher. This action divides --a-sgl=SGL into <n> roughly equal length (i.e. number of blocks) output sgl_s. The "roughly equal length" is influenced by the --interleave=IL and --round=RB options. The output filenames are generated the same way as described for the split action. The sgl from --a-sgl=SGL is expected to be a "hard" sgl which means its last element should not be degenerate (i.e. have a NUM of 0).
The second half of the "twin split" is to split the --b-sgl=SGL sgl. The same number of output files are used as for the 'A' side but the filenames have a slightly different form: "O_SGL<1..n>_t" or "O_SGL<1..n>_t.FNE" (if the --extension=FNE option is given). The critical point of this split is that it moves in lockstep with the 'A' side split in the sense that whatever block count an 'A' side segment uses, the following 'B' side segment split uses the same block count. The sgl from --b-sgl=SGL may be a "hard" or "soft" sgl. In the simplest case the 'B' side sgl can be just '0' which gets expanded to '0,0' (i.e. degenerate list starting at LBA 0); this will use the overall block count from the 'A' side.
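The comparison actions (equal, part-equal, same and part-same) can be modelled with a short Python sketch. All function names are hypothetical. The sketch expands each sgl block by block, which is only practical for small lists, and it assumes that the "part" variants mean the shorter list matches the start of the longer one; the utility's own algorithm may differ.

```python
from itertools import zip_longest
from typing import Iterator, List, Tuple

Sgl = List[Tuple[int, int]]  # list of (LBA, NUM) elements


def blocks(sgl: Sgl) -> Iterator[int]:
    """Yield every LBA covered by the sgl, in list order."""
    for lba, num in sgl:
        yield from range(lba, lba + num)


def sgl_equal(a: Sgl, b: Sgl) -> bool:
    """Same LBAs in the same order with the same overall block count."""
    return all(x == y for x, y in zip_longest(blocks(a), blocks(b)))


def sgl_part_equal(a: Sgl, b: Sgl) -> bool:
    """As sgl_equal, but one list may simply be longer than the other."""
    return all(x == y for x, y in zip(blocks(a), blocks(b)))


def by_lba(sgl: Sgl) -> Sgl:
    """Stable sort of the elements by starting LBA."""
    return sorted(sgl, key=lambda e: e[0])


def sgl_same(a: Sgl, b: Sgl) -> bool:
    """As sgl_equal, after sorting both lists."""
    return sgl_equal(by_lba(a), by_lba(b))


def sgl_part_same(a: Sgl, b: Sgl) -> bool:
    """As sgl_part_equal, after sorting both lists."""
    return sgl_part_equal(by_lba(a), by_lba(b))


# The examples given in the action descriptions above
two_elem = [(0x10, 0x5), (0x15, 0x2)]
swapped = [(0x15, 0x2), (0x10, 0x5)]
```

With these definitions the four examples from the action descriptions all evaluate to true, while sgl_equal(swapped, [(0x10, 0x7)]) is false because the order differs.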
File Formats
Both sgl_s and index arrays can be read from, or written to, files. The options that supply sgl_s or index arrays to be read (e.g. --a-sgl=SGL, --b-sgl=SGL and --index=IA) by default allow them to be given directly on the command line. These will typically be comma separated lists (although space and tab could be used as separators if they were appropriately escaped). So with these options when reading sgl_s and index arrays, a leading "@" or "H@" is needed to indicate that a file name follows.
By default, numbers given in this utility and other utilities in this package are assumed to be in decimal. Hexadecimal (hex) numbers can be given with either a leading "0x" or trailing "h". A whole file can be flagged as containing hex numbers (and thus not needing a leading "0x" nor trailing "h" on each number) by using "H@" on the command line before the filename. The file itself may contain a line with 'HEX' in it, prior to any numbers that are to be parsed. If the --flexible option is given then "@" can be used before the filename and when 'HEX' is detected in the file (before any numbers) the code switches to hex mode. Without the --flexible option "H@" must be used before the filename. As a convenience the 'HEX' string may appear after hex numbers have been decoded and it will be ignored. This is to allow hex sgl_s files to be concatenated together and still be parsed without error.
A file being parsed may contain comments following a "#" symbol. Everything from and including the hash mark to the end of a line is ignored. Blank lines and "whitespace" (spaces, tabs, CRs and LFs) are also ignored.
If large sgl_s or index arrays are being used it is better to have one element per line in the file to be read. This is because a line is not expected to be over 1024 bytes long with more than 254 parsable items on it. This utility imposes no limit on the number of lines a file to be parsed may have.
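The reading rules above can be approximated by the Python sketch below. The names parse_numbers and to_sgl and the parameters hex_mode (standing in for "H@") and flexible (standing in for --flexible) are hypothetical. Raising an error when a 'HEX' line is met without either of them is an assumption, and the limits on line length and items per line are not modelled.

```python
import re
from typing import List, Tuple


def parse_token(tok: str, hex_mode: bool) -> int:
    """Decode one number: 0x prefix or h suffix forces hex, else file mode."""
    if tok.lower().startswith("0x"):
        return int(tok, 16)
    if tok[-1] in "hH":
        return int(tok[:-1], 16)
    return int(tok, 16 if hex_mode else 10)


def parse_numbers(text: str, hex_mode: bool = False,
                  flexible: bool = False) -> List[int]:
    """Parse the contents of an sgl or index array file into integers."""
    numbers: List[int] = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        if line == "HEX":
            if not numbers and not hex_mode:
                if not flexible:
                    # Assumption: a mismatch is treated as an error here.
                    raise ValueError("'HEX' line needs H@ or --flexible")
                hex_mode = True
            continue                            # a later 'HEX' is ignored
        for tok in re.split(r"[,\s]+", line):
            if tok:
                numbers.append(parse_token(tok, hex_mode))
    return numbers


def to_sgl(numbers: List[int]) -> List[Tuple[int, int]]:
    """Pair numbers into (LBA, NUM) elements; a lone LBA gets a NUM of 0."""
    if len(numbers) == 1:
        return [(numbers[0], 0)]
    if len(numbers) % 2:
        raise ValueError("expected an even number of values")
    return list(zip(numbers[0::2], numbers[1::2]))


hex_file = "# example file\nHEX\n10,5\n15 2   # trailing comment\n"
elements = to_sgl(parse_numbers(hex_file, hex_mode=True))
```

In this example elements is [(16, 5), (21, 2)]; passing flexible=True instead of hex_mode=True gives the same result.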
Files to be written out by this utility have their names specified by the --out=O_SGL (optionally together with --extension=FNE) and the --iaf=IAF options. Unlike the file reading options, no "@" character should be placed in front of the filename on the command line. If a filename of "-" is given then output is written to stdout instead of a file. stdout is normally the console. If the filename starts with "+" then that character is skipped and the output will be appended to that file, if it exists. If the filename starts with "+" and the file does not exist then it is created. If "+" is not given and the file already exists then it is truncated (to 0) then overwritten. Some output file names have numbers (e.g. as a result of the --action=split_<n> option) or "_t" (e.g. as a result of "twin" actions) appended to them (before the extension, if any). Sgl elements are output one per line, with a comma separating the LBA and the NUM. Index arrays are output one element (an index) per line. The --hex option controls the form of those numbers output. If --hex is not given, the numbers are output in decimal. If the --hex option is given once the numbers are output in hex with a "0x" prefix. If the --hex option is given twice then the line 'HEX' is written to the file before any numbers and those numbers are in hex without any adornment (i.e. with no leading "0x").
If the --document option is given then some information including a date timestamp of generation is placed as comments at the beginning of files that are written out by this utility. If the --document option is given twice then the invocation line of this utility that caused the output is placed in the written file as an additional comment.
The written file format is compatible with the read file format. So, for example, a sgl generated by an invocation of this utility can later be used as a file to be read by another invocation of this utility.
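The three numeric output forms selected by the --hex option can be sketched as below. The function name format_sgl and its hex_count parameter (how many times --hex was given) are hypothetical, and the exact spacing of the utility's real output is not claimed here; --document comments are left out.

```python
from typing import List, Tuple


def format_sgl(sgl: List[Tuple[int, int]], hex_count: int = 0) -> str:
    """Render an sgl one element per line as 'LBA,NUM'."""
    lines: List[str] = []
    if hex_count >= 2:
        lines.append("HEX")            # flag line, then unadorned hex
        fmt = "{:x},{:x}"
    elif hex_count == 1:
        fmt = "0x{:x},0x{:x}"          # each value carries a 0x prefix
    else:
        fmt = "{:d},{:d}"              # default: decimal
    lines.extend(fmt.format(lba, num) for lba, num in sgl)
    return "\n".join(lines) + "\n"


sample = [(16, 5), (21, 2)]
as_decimal = format_sgl(sample)
as_prefixed_hex = format_sgl(sample, hex_count=1)
as_flagged_hex = format_sgl(sample, hex_count=2)
```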
Exit Status
The exit status of ddpt_sgl is 0 when it is successful. Note that some options and actions that return a boolean value return 0 for true and 36 for false. Otherwise the exit status for this utility is the same as that for ddpt. See the EXIT STATUS section in the ddpt man page.
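A script that calls ddpt_sgl for a boolean result might interpret the exit status along the following lines. This is a sketch: the wrapper names boolean_result and ddpt_sgl_check are hypothetical, and treating every status other than 0 and 36 as an error is a simplification of the ddpt exit status table.

```python
import subprocess
from typing import List


def boolean_result(exit_status: int) -> bool:
    """Translate an exit status from a boolean action: 0 is true, 36 is false."""
    if exit_status == 0:
        return True
    if exit_status == 36:
        return False
    raise RuntimeError(f"ddpt_sgl failed with exit status {exit_status}")


def ddpt_sgl_check(args: List[str]) -> bool:
    """Run ddpt_sgl quietly with the given options and return its boolean result."""
    completed = subprocess.run(["ddpt_sgl", "--quiet", *args])
    return boolean_result(completed.returncode)


# Possible use, assuming ddpt_sgl is installed and on the PATH:
#   ddpt_sgl_check(["--action=equal", "--a-sgl=0x10,0x7",
#                   "--b-sgl=0x10,0x5,0x15,0x2"])
```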
Examples
Examples are needed. See testing/test_sgl.sh script in this package. That script can be run without root permissions and places its work file (sgl_s) in the /tmp directory.
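Until command line examples are added, the arithmetic behind the scale and divisible actions can be illustrated with a Python sketch. The function names and the which parameter are hypothetical; the code models the documented behaviour (multiply for positive <n>, integer divide by the absolute value for negative <n>) and is not taken from the utility itself.

```python
from typing import List, Tuple

Sgl = List[Tuple[int, int]]  # list of (LBA, NUM) elements


def scale(sgl: Sgl, n: int) -> Sgl:
    """Multiply each LBA and NUM by n, or divide by -n when n is negative."""
    if n == 0:
        raise ValueError("n must not be zero")
    if n > 0:
        return [(lba * n, num * n) for lba, num in sgl]
    d = -n
    # Values are unsigned, so floor division equals rounding towards zero.
    return [(lba // d, num // d) for lba, num in sgl]


def divisible(sgl: Sgl, n: int, which: str = "both") -> bool:
    """Check LBAs and/or NUMs for a zero remainder; which is 'both', 'L' or 'N'."""
    check_lba = which in ("both", "L")
    check_num = which in ("both", "N")
    return all((not check_lba or lba % n == 0) and
               (not check_num or num % n == 0)
               for lba, num in sgl)


sgl_512 = [(8, 16), (100, 4)]        # addresses in 512 byte logical blocks
safe = divisible(sgl_512, 8)         # False: 100 and 4 are not multiples of 8
sgl_4096 = scale(sgl_512, -8)        # second element loses information
round_trip = scale(scale([(8, 16)], -8), 8)
```

Here sgl_4096 is [(1, 2), (12, 0)], showing how the second element's blocks are "lost", while round_trip returns to [(8, 16)] because both of its values are divisible by 8.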
Authors
Written by Douglas Gilbert.
Reporting Bugs
Report bugs to <dgilbert at interlog dot com>.
Copyright
Copyright © 2020-2021 Douglas Gilbert
This software is distributed under a FreeBSD license. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.