csvmaps
This is a command-line tool for manipulating map-like CSV data, where the first column is treated as a “key” for that row. Please see the README for the most up-to-date documentation: https://gitlab.com/transportengineering/csvmaps
Installation
Installation currently requires stack, because csvmaps depends on
cassava-streaming, a package that is not available on Hackage.
$ git clone git@gitlab.com:transportengineering/csvmaps.git
$ cd csvmaps
$ stack install
Usage
$ csvmaps --help
Usage: csvmaps [-i|--infiles STRING]... [-o|--outfile STRING]
[--has-header BOOL] [--op MAPOP]
[--expr MAPEXPR] [--save-labels BOOL]
[--verbose BOOL]
Available options:
-h,--help Show this help text
-i,--infiles STRING... Files to combine. The first column of each csv will
be used as a key. For non-csv files (files that don't
end in .csv), each line will be treated as a key.
-o,--outfile STRING .csv file to write output to. If omitted, use stdout.
--has-header BOOL whether to ignore header in csv files
--op MAPOP The operation to use to combine csv files (default:
union)
--expr MAPEXPR
The expression to use to combine csv files. Reference
the inputs with $N for the Nth document in the
"infiles". The operations available are:
1. Union with the + operator;
2. Difference with the - operator;
3. Intersection with the * operator;
4. Union combining all columns with the +. operator;
5. Intersection combining all columns with the *.
operator
6. Labels for expressions with the syntax
'"LABEL": EXPR'.
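The key-handling rule described for --infiles can be pictured with a small Python sketch. It is not how csvmaps itself is implemented; `load_map` is a hypothetical helper, and the choice to let a later duplicate key overwrite an earlier one is an assumption made only for illustration.

```python
import csv
import io


def load_map(name, text, has_header=False):
    """Build {key: [values...]} from file contents (illustrative only)."""
    if name.endswith(".csv"):
        rows = list(csv.reader(io.StringIO(text)))
        if has_header:
            rows = rows[1:]  # skip the header row
        # First column is the key; the remaining columns are the values.
        # Assumption: a repeated key simply overwrites the earlier row.
        return {row[0]: row[1:] for row in rows if row}
    # Non-csv input: every non-blank line is a key with no values.
    return {line: [] for line in text.splitlines() if line.strip()}


csv_map = load_map("v1.csv", "id,name\na,Alice\nb,Bob\n", has_header=True)
key_map = load_map("keys.txt", "a\nc\n")
print(csv_map)
print(key_map)
```

The sketch takes file contents as a string so that it runs without touching the filesystem; reading from a real path would work the same way.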
Examples
Take the rows that exist only in v1, in addition to the rows that exist in
both v1 and v2 (preferring v2).
$ csvmaps -i v1.csv -i v2.csv --expr '(($1 *| $2) +| ($1 - $2))'
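To make the expression above concrete, here is a rough Python model of what it computes on two toy maps. The data and helper names are hypothetical, and the reading of "unless they are empty" as "fall back to the other operand when every value cell is empty" is an assumption, not confirmed behaviour of csvmaps.

```python
# Toy stand-ins for v1.csv and v2.csv: key -> list of value columns.
v1 = {"a": ["1"], "b": ["2"], "c": ["3"]}
v2 = {"b": ["20"], "c": [""], "d": ["40"]}


def is_empty(values):
    # Assumption: a row is "empty" when every value cell is "".
    return all(cell == "" for cell in values)


def intersect_right(a, b):
    # Models $1 *| $2: keys in both maps, right values unless empty.
    return {k: (a[k] if is_empty(b[k]) else b[k]) for k in a if k in b}


def difference(a, b):
    # Models $1 - $2: rows of a whose keys are not in b.
    return {k: v for k, v in a.items() if k not in b}


def union_right(a, b):
    # Models A +| B: all keys, right values unless empty.
    out = dict(a)
    for k, v in b.items():
        if k not in out or not is_empty(v):
            out[k] = v
    return out


result = union_right(intersect_right(v1, v2), difference(v1, v2))
for key in sorted(result):
    print(key, result[key])
```

Under these assumptions the result holds keys a, b and c: a comes from v1 alone, b takes v2's value, and c falls back to v1 because v2's value is empty. Key d is dropped because it never appears in v1.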
MAPEXPR syntax
$N- References the Nth document specified, starting at 1.
keys A- Discards the values of all rows.
nulls A- Discards the rows that have non-empty values.
non-nulls A- Discards the rows that have empty values.
A : "label"- Assigns a label to the expression A. When used with the --save-labels option, this will also result in a file called label.csv being created using A.
const ["Col_1","Col_2",...] A- Replaces all values of A with the columns given in the first argument.
pad N A- Ensures the number of columns after each key is at least N; fills ones that don’t exist with empty strings.
col N A- Takes the Nth column.
col [N1,N2,...] A- Uses the columns given.
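The unary operations can be modelled in a few lines of Python, treating a document as a dict from key to a list of value columns. This is only a sketch: the function names are hypothetical, and several details are assumptions, namely that nulls and non-nulls test every cell of a row, that col counts value columns from 1 (excluding the key), and that a missing column yields an empty string.

```python
def keys(a):
    # Keep every key, drop all values.
    return {k: [] for k in a}


def nulls(a):
    # Drop rows that contain any non-empty value (assumed reading).
    return {k: v for k, v in a.items() if not any(c != "" for c in v)}


def non_nulls(a):
    # Drop rows that contain any empty value (assumed reading).
    return {k: v for k, v in a.items() if not any(c == "" for c in v)}


def const(cols, a):
    # Replace the values of every row with the given columns.
    return {k: list(cols) for k in a}


def pad(n, a):
    # Ensure at least n value columns, filling with empty strings.
    return {k: v + [""] * (n - len(v)) for k, v in a.items()}


def col(ns, a):
    # Select one column (int) or several (list); 1-based by assumption.
    if isinstance(ns, int):
        ns = [ns]
    return {k: [v[n - 1] if n <= len(v) else "" for n in ns]
            for k, v in a.items()}


doc = {"x": ["1", ""], "y": ["", ""], "z": ["7", "8"]}
print(nulls(doc))
print(non_nulls(doc))
print(col([2, 1], doc))
```

With this reading, a row that mixes empty and non-empty cells (such as x above) is discarded by both nulls and non-nulls.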
Unions
A + B- Left-biased union of A and B. When keys exist in both maps, the values at those keys are taken from the left operand.
A |+ B- Left-biased union of A and B. Like A + B, except that when keys exist in both maps, the left operand’s values at those keys are used unless they are empty.
A +| B- Right-biased union of A and B. When keys exist in both maps, the right operand’s values at those keys are used, unless they are empty.
A |+| B- Union of A and B that concatenates the values of all keys that both operands have in common.
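The four union operators differ only in how they resolve a key that appears in both operands, which the following Python sketch tries to capture. The names are hypothetical, and treating "unless they are empty" as "use the other operand's values when every cell is empty" is an assumption about the intended behaviour.

```python
def is_empty(values):
    # Assumption: "empty" means every value cell is "".
    return all(cell == "" for cell in values)


def union_with(pick, a, b):
    """Union of two maps; pick(left, right) resolves shared keys."""
    out = dict(a)
    for k, v in b.items():
        out[k] = pick(a[k], v) if k in a else v
    return out


def left(l, r):                 # models A + B
    return l


def left_unless_empty(l, r):    # models A |+ B
    return r if is_empty(l) else l


def right_unless_empty(l, r):   # models A +| B
    return l if is_empty(r) else r


def concat(l, r):               # models A |+| B
    return l + r


a = {"k1": ["a"], "k2": [""]}
b = {"k2": ["b"], "k3": ["c"]}
for pick in (left, left_unless_empty, right_unless_empty, concat):
    print(pick.__name__, union_with(pick, a, b))
```

Keys that occur in only one operand (k1 and k3 here) pass through unchanged in every variant; only k2 is affected by the choice of operator.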
Intersections
A * B- Left-biased intersection of A and B. When keys exist in both maps, the values at those keys are taken from the left operand.
A |* B- Left-biased intersection of A and B. Like A * B, except that when keys exist in both maps, the left operand’s values at those keys are used unless they are empty.
A *| B- Right-biased intersection of A and B. When keys exist in both maps, the right operand’s values at those keys are used, unless they are empty.
A |*| B- Intersection of A and B that concatenates the values of all keys that both operands have in common.
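As a rough illustration, the intersection operators can be written as a lookup table from operator to a rule for choosing values, applied only to keys present in both maps. The `intersect` helper and the table are hypothetical, and the fallback to the other operand when a row's cells are all empty is an assumed interpretation of "unless they are empty".

```python
def is_empty(values):
    # Assumption: "empty" means every value cell is "".
    return all(cell == "" for cell in values)


# Operator -> rule for picking the values of a shared key.
PICKERS = {
    "*": lambda l, r: l,
    "|*": lambda l, r: r if is_empty(l) else l,
    "*|": lambda l, r: l if is_empty(r) else r,
    "|*|": lambda l, r: l + r,
}


def intersect(op, a, b):
    """Keep only keys found in both maps, resolving values via op."""
    pick = PICKERS[op]
    return {k: pick(a[k], b[k]) for k in a if k in b}


a = {"k1": ["a"], "k2": [""], "k3": ["x"]}
b = {"k2": ["b"], "k3": ["y"], "k4": ["z"]}
for op in PICKERS:
    print(op, intersect(op, a, b))
```

Unlike the unions, every variant here returns the same key set (k2 and k3); the operators differ only in which values end up attached to those keys.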
Difference
A - B- Returns the rows of A whose keys are not in B.