ndl-start - Man Page
NDL
- Date
2024-03-08
COMMAND 'ndl start'
usage: ndl start [-h] [-q] [-d] [-H HOSTNAME] [-U USERNAME] [-K PRIVKEY] [-T TIMEOUT] [-c COUNT] [--time-limit LIMIT] [-o OUTDIR] [--reportid REPORTID] [--stats STATS] [--stats-intervals STATS_INTERVALS] [--list-stats] [-l LDIST] [--cpunum CPUNUM] [--exclude EXCLUDE] [--include INCLUDE] [--keep-filtered] [--report] [--force] [--trash-cpu-cache] [--freq-noise FREQ_NOISE] [--freq-noise-sleep FREQ_NOISE_SLEEP] ifname
Start measuring and recording the latency data.
- ifname
The network interface backed by the NIC to use for latency measurements. Today only Intel I210 and I211 NICs are supported. Please, specify NIC's network interface name (e.g., eth0).
OPTIONS 'ndl start'
- -h
Show this help message and exit.
- -q
Be quiet.
- -d
Print debugging information.
- -H HOSTNAME, --host HOSTNAME
Name of the host to run the command on.
- -U USERNAME, --username USERNAME
Name of the user to use for logging into the remote host over SSH. The default user name is 'root'.
- -K PRIVKEY, --priv-key PRIVKEY
Path to the private SSH key that should be used for logging into the remote host. By default the key is automatically found from standard paths like '~/.ssh'.
- -T TIMEOUT, --timeout TIMEOUT
SSH connect timeout in seconds, default is 8.
- -c COUNT, --datapoints COUNT
How many datapoints should the test result include, default is 1000000. Note, unless the '--start-over' option is used, the pre-existing datapoints are taken into account. For example, if the test result already has 6000 datapoints and memory.
- --time-limit LIMIT
The measurement time limit, i.e., for how long the SUT should be measured. The default unit is minute, but you can use the following handy specifiers as well: d - days, h - hours, m - minutes, s -seconds. For example '1h25m' would be 1 hour and 25 minutes, or 10m5s would be 10 minutes and 5 seconds. Value '0' means "no time limit", and this is the default. If this option is used along with the '--datapoints' option, then measurements will stop as when either the time limit is reached, or the required amount of datapoints is collected.
- -o OUTDIR, --outdir OUTDIR
Path to the directory to store the results at.
- --reportid REPORTID
Any string which may serve as an identifier of this run. By default report ID is the current date, prefixed with the remote host name in case the '-H' option was used: [hostname-]YYYYMMDD. For example, "20150323" is a report ID for a run made on March 23, 2015. The allowed characters are: ACSII alphanumeric, '-', '.', ',', '_', '~', and ':'.
- --stats STATS
Comma-separated list of statistics to collect. The statistics are collected in parallel with measuring C-state latency. They are stored in the the "stats" sub-directory of the output directory. By default, only 'turbostat, sysinfo' statistics are collected. Use 'all' to collect all possible statistics. Use '--stats=""' or '--stats="none"' to disable statistics collection. If you know exactly what statistics you need, specify the comma-separated list of statistics to collect. For example, use 'turbostat,acpower' if you need only turbostat and AC power meter statistics. You can also specify the statistics you do not want to be collected by pre-pending the '!' symbol. For example, 'all,!turbostat' would mean: collect all the statistics supported by the SUT, except for 'turbostat'. Use the '--list-stats' option to get more information about available statistics. By default, only 'sysinfo' statistics are collected.
- --stats-intervals STATS_INTERVALS
The intervals for statistics. Statistics collection is based on doing periodic snapshots of data. For example, by default the 'acpower' statistics collector reads SUT power consumption for the last second every second, and 'turbostat' default interval is 5 seconds. Use 'acpower:5,turbostat:10' to increase the intervals to 5 and 10 seconds correspondingly. Use the '--list-stats' to get the default interval values.
- --list-stats
Print information about the statistics 'ndl' can collect and exit.
- -l LDIST, --ldist LDIST
The launch distance in microseconds. This tool works by scheduling a delayed network packet, then sleeping and waiting for the packet to be sent. This step is referred to as a "measurement cycle" and it is usually repeated many times. The launch distance defines how far in the future the delayed network packets are scheduled. By default this tool randomly selects launch distance in range of [5000, 50000] microseconds (same as '--ldist 5000,50000'). Specify a comma-separated range or a single value if you want launch distance to be precisely that value all the time. The default unit is microseconds, but you can use the following specifiers as well: ms - milliseconds, us - microseconds, ns - nanoseconds. For example, '--ldist 500us,100ms' would be a [500,100000] microseconds range. Note, too low values may cause failures or prevent the SUT from reaching deep C-states. The optimal value is system-specific.
- --cpunum CPUNUM
The CPU number to bind the helper to. The helper will use this CPU to send delayed packets. In normal conditions this means that network packet buffers will be allocated on the NUMA node local to the CPU, but not necessarily local to the network card. Use this option to measure different packet memory locations on a NUMA system. Special value 'local' can be used to specify a CPU with lowest CPU number local to the NIC, and this is the default value.a Special value
- --exclude EXCLUDE
Datapoints to exclude: remove all the datapoints satisfying the expression 'EXCLUDE'. Here is an example of an expression: '(WakeLatency < 10000) | (PC6% < 1)'. This filter expression will remove all datapoints with 'WakeLatency' smaller than 10000 nanoseconds or package C6 residency smaller than 1%. You can use any metrics in the expression.
- --include INCLUDE
Datapoints to include: remove all datapoints except for those satisfying the expression 'INCLUDE'. In other words, this option is the inverse of '--exclude'. This means, '--include expr' is the same as '--exclude "not (expr)"'.
- --keep-filtered
If the '--exclude' / '--include' options are used, then the datapoints not matching the selector or matching the filter are discarded. This is the default behavior which can be changed with this option. If '--keep-filtered' has been specified, then all datapoints are saved in result. Here is an example. Suppose you want to collect 100000 datapoints where RTD is greater than 50 microseconds. In this case, you can use these options: -c 100000 --exclude="RTD > 50". The result will contain 100000 datapoints, all of them will have RTD bigger than 50 microseconds. But what if you do not want to simply discard the other datapoints, because they are also interesting? Well, add the '--keep-filtered' option. The result will contain, say, 150000 datapoints, 100000 of which will have RTD value greater than 50.
- --report
Generate an HTML report for collected results (same as calling 'report' command with default arguments).
- --force
By default a network card is not accepted as a measurement device if it is used by a Linux network interface and the interface is in an active state, such as "up". Use '--force' to disable this safety mechanism. Use it with caution.
- --trash-cpu-cache
Trash CPU cache to make sure NIC accesses memory when measuring latency. Without this option, there is a change the data NIC accesses is in a CPU cache. With this option, ndl allocates a buffer and fills it with data every time a delayed packet is scheduled. Supposedly, this should push out cached data to the memory. By default, the CPU cache trashing buffer size a sum of sizes of all caches on all CPUs (includes all levels, excludes instruction cache).
- --freq-noise FREQ_NOISE
Add frequency scaling noise to the measured system. This runs a background process that repeatedly modifies CPU or uncore frequencies for given domains. The reason for doing this is because frequency scaling is generally an expensive operation and is known to impact system latency. 'FREQ_NOISE' is specified as 'TYPE:ID:MIN:MAX', where: TYPE should be 'cpu' or 'uncore', specifies whether CPU or uncore frequency should be modified; ID is either CPU number or uncore domain ID to modify the frequency for (e.g. 'cpu:12:...' would target CPU12); MIN is the minimum CPU/uncore frequency value; MAX is the maximum CPU/uncore frequency value. For example, to add frequency scaling noise for CPU0, add '-- freq-noise cpu:0:min:max'. To add uncore frequency noise for uncore domain 0, add '--freq-noise uncore:0:min:max'. The parameter can be added multiple times to specify multiple frequency noise domains.
- --freq-noise-sleep FREQ_NOISE_SLEEP
Sleep between frequency noise operations. This time is added between every frequency scaling operation executed by the 'freq-noise' feature. The default time unit is microseconds, but it is possible to use time specifiers as well, ms - milliseconds, us - microseconds, ns - nanoseconds. Default sleep time is 50ms.