CSVSTAT(1) User Commands CSVSTAT(1)

NAME

csvstat - manual page for csvstat 1.4.0

DESCRIPTION

usage: csvstat [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
               [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
               [-S] [--blanks] [--null-value NULL_VALUES [NULL_VALUES ...]]
               [--date-format DATE_FORMAT]
               [--datetime-format DATETIME_FORMAT] [-H] [-K SKIP_LINES] [-v]
               [-l] [--zero] [-V] [--csv] [--json] [-i INDENT] [-n]
               [-c COLUMNS] [--type] [--nulls] [--non-nulls] [--unique]
               [--min] [--max] [--sum] [--mean] [--median] [--stdev] [--len]
               [--max-precision] [--freq] [--freq-count FREQ_COUNT] [--count]
               [--decimal-format DECIMAL_FORMAT] [-G] [-y SNIFF_LIMIT] [-I]
               [FILE]

Print descriptive statistics for each column in a CSV file.

positional arguments:

FILE
    The CSV file to operate on. If omitted, will accept input as piped data via STDIN.

options:

-h
    Show this help message and exit.
-d DELIMITER
    Delimiting character of the input CSV file.
-t
    Specify that the input CSV file is delimited with tabs. Overrides "-d".
-q QUOTECHAR
    Character used to quote strings in the input CSV file.
-u {0,1,2,3}, --quoting {0,1,2,3}
    Quoting style used in the input CSV file. 0 = Quote Minimal, 1 = Quote All, 2 = Quote Non-numeric, 3 = Quote None.
-b, --no-doublequote
    Whether or not double quotes are doubled in the input CSV file.
-p ESCAPECHAR
    Character used to escape the delimiter if --quoting 3 ("Quote None") is specified and to escape the QUOTECHAR if --no-doublequote is specified.
-z FIELD_SIZE_LIMIT
    Maximum length of a single field in the input CSV file.
-e ENCODING
    Specify the encoding of the input CSV file.
-L LOCALE
    Specify the locale (en_US) of any formatted numbers.
-S
    Ignore whitespace immediately following the delimiter.
--blanks
    Do not convert "", "na", "n/a", "none", "null", "." to NULL.
--null-value NULL_VALUES [NULL_VALUES ...]
    Convert this value to NULL. --null-value can be specified multiple times.
--date-format DATE_FORMAT
    Specify a strptime date format string like "%m/%d/%Y".
--datetime-format DATETIME_FORMAT
    Specify a strptime datetime format string like "%m/%d/%Y %I:%M %p".
-H
    Specify that the input CSV file has no header row. Will create default headers (a,b,c,...).
-K SKIP_LINES
    Specify the number of initial lines to skip before the header row (e.g. comments, copyright notices, empty rows).
-v
    Print detailed tracebacks when errors occur.
-l
    Insert a column of line numbers at the front of the output. Useful when piping to grep or as a simple primary key.
--zero
    When interpreting or displaying column numbers, use zero-based numbering instead of the default 1-based numbering.
-V
    Display version information and exit.
--csv
    Output results as a CSV table, rather than plain text.
--json
    Output results as JSON text, rather than plain text.
-i INDENT
    Indent the output JSON this many spaces. Disabled by default.
-n
    Display column names and indices from the input CSV and exit.
-c COLUMNS
    A comma-separated list of column indices, names or ranges to be examined, e.g. "1,id,3-5". Defaults to all columns.
--type
    Only output data type.
--nulls
    Only output whether columns contain nulls.
--non-nulls
    Only output counts of non-null values.
--unique
    Only output counts of unique values.
--min
    Only output smallest values.
--max
    Only output largest values.
--sum
    Only output sums.
--mean
    Only output means.
--median
    Only output medians.
--stdev
    Only output standard deviations.
--len
    Only output the length of the longest values.
--max-precision
    Only output the most decimal places.
--freq
    Only output lists of frequent values.
--freq-count FREQ_COUNT
    The maximum number of frequent values to display.
--count
    Only output total row count.
--decimal-format DECIMAL_FORMAT
    %-format specification for printing decimal numbers. Defaults to locale-specific formatting with "%.3f".
-G
    Do not use grouping separators in decimal numbers.
-y SNIFF_LIMIT
    Limit CSV dialect sniffing to the specified number of bytes. Specify "0" to disable sniffing entirely, or "-1" to sniff the entire file.
-I
    Disable type inference when parsing the input. Disable reformatting of values.
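
The date, datetime and decimal format options take Python-style format strings. The sketch below uses only the Python standard library to show how the example strings quoted in the option descriptions behave. The sample values are invented for illustration, and the code does not call csvstat or reproduce its locale-specific number formatting; how csvstat applies these strings internally is not covered here.

```python
from datetime import datetime

# Format strings copied from the option descriptions above.
DATE_FORMAT = "%m/%d/%Y"
DATETIME_FORMAT = "%m/%d/%Y %I:%M %p"
DECIMAL_FORMAT = "%.3f"

# Sample inputs are hypothetical, chosen only to exercise each format.
parsed_date = datetime.strptime("02/29/2024", DATE_FORMAT).date()
parsed_datetime = datetime.strptime("02/29/2024 03:45 PM", DATETIME_FORMAT)

# Plain %-formatting rounds to three decimal places and adds no
# grouping separators.
formatted = DECIMAL_FORMAT % 1234.56789

print(parsed_date)
print(parsed_datetime)
print(formatted)
```

A value that does not match the given pattern makes strptime raise ValueError, so the format string should describe the input exactly, including the 12-hour clock and AM/PM marker in the datetime example.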

SEE ALSO

The full documentation for csvstat is maintained as a Texinfo manual. If the info and csvstat programs are properly installed at your site, the command

info csvstat

should give you access to the complete manual.

February 2024 csvstat 1.4.0