Notgnoshi/csvizmo

csvizmo

Gizmos for working with CSVs

Table of contents

  • csvplot -- line and scatter plots from CSV files
  • csvstats -- histograms and summary statistics for CSV files
  • csvcat -- concatenate CSV files
  • csvdelta -- calculate inter-row deltas for CSV files
  • can2k -- parse NMEA 2000 GPS data from CAN logs
  • can2csv -- parse CAN logs into CSV files
  • canspam -- generate random CAN traffic
  • canstruct -- reconstruct NMEA 2000 Fast Packet / ISO 11783-3 Transport Protocol sessions
  • qgsdir -- generate QGIS projects from directories of CSV files
  • depconv -- convert dependency graphs between formats
  • depfilter -- filter or select subsets of dependency graphs
  • deptransform -- transform dependency graphs
  • depquery -- query properties of dependency graphs
  • depcluster -- cluster dependency graphs using community detection
  • graphdiff -- compare two dependency graphs
  • bbclasses -- generate BitBake recipe inheritance diagrams
  • minpath -- shorten file paths to minimal unique suffixes

Philosophy

Rather than build an infinitely flexible, highly optimized, does-everything-and-more toolkit (see https://github.com/dathere/qsv for that), these gizmos are targeted tools that solve specific problems I frequently encounter.

All tools operate on stdin/stdout in addition to files, and are designed to be chained together with pipes. Any ancillary output is emitted on stderr.

How to use

You can install the gizmos with the first command below, and uninstall them with the second:

./install --prefix ~/.local/
./install --uninstall --prefix ~/.local/

You can also just run

cargo install --path . --root ~/.local/

but that won't install any of the non-Rust scripts from this project.

You can also experiment with the gizmos without installing them by running

cargo run --release --bin can2csv -- ...

You likely want a release build. As an example, the can2k tool runs in 2.4s with a release build on a 1 hour candump, but 23s with a debug build.

Gizmos

See the issues labeled "gizmo" (an idea for a new gizmo) for the gizmos I have planned.

csvplot

Plot data from a CSV file. Supports line (X & Y) and time series (Y only) plots. Multiple columns can be plotted at once, so long as they share the same X axis.

$ head session-2.csv
roll
8
7
14
$ csvplot -y roll session-2.csv

[Figure: D&D rolls time series]

csvstats

Calculate summary statistics for CSV columns.

$ csvstats --column roll session-2.csv
Stats for column "roll":
    count: 21
    Q1: 5
    median: 8
    Q3: 14
    min: 2 at index: 0
    max: 18 at index: 19
    mean: 9.571428571428571
    stddev: 5.401964631566671

csvstats can also generate histogram plots:

$ csvstats --column roll session-2.csv --histogram --bins 20 --discrete

[Figure: D&D roll histogram]

csvcat

Concatenate multiple CSV files of the same shape together.

$ cat a.csv
foo,bar,baz
1,2,3
4,5,6
$ cat b.csv
foo,bar,baz
7,8,9
$ csvcat a.csv b.csv
foo,bar,baz
1,2,3
4,5,6
7,8,9
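
As a rough illustration of the idea (not csvcat's actual implementation), concatenation can be sketched in Python as "emit the header once, then every data row". The function name concat_csv and the choice to raise an error when headers differ are assumptions made for this sketch.

```python
import csv
import io


def concat_csv(texts):
    """Concatenate CSV documents that share a header (illustrative sketch)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = None
    for text in texts:
        rows = csv.reader(io.StringIO(text))
        this_header = next(rows, None)
        if this_header is None:
            continue  # skip empty inputs
        if header is None:
            # The first non-empty input decides the header, written once.
            header = this_header
            writer.writerow(header)
        elif this_header != header:
            # Assumption: mismatched shapes are treated as an error here.
            raise ValueError(f"header mismatch: {this_header} != {header}")
        writer.writerows(rows)
    return out.getvalue()


a = "foo,bar,baz\n1,2,3\n4,5,6\n"
b = "foo,bar,baz\n7,8,9\n"
print(concat_csv([a, b]), end="")
```

Run against the a.csv and b.csv contents above, this prints the same four lines as the csvcat example.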

csvdelta

Calculate the inter-row deltas for a CSV column. Useful for understanding the time between events. Also supports mean-centering a column, or centering it around a specific value.

$ csvdelta --column foo <<EOF
foo,bar
0,a
1,b
3,c
5,d
EOF

foo,bar,foo-deltas
0,a,
1,b,1
3,c,2
5,d,2
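
The delta calculation itself is small enough to sketch in Python. This is only an illustration of what "inter-row delta" means, not the tool's code; the function name add_deltas is hypothetical, and the number formatting (`:g`) is a simplification that would lose precision on long timestamps.

```python
import csv
import io


def add_deltas(text, column):
    """Append a '<column>-deltas' field holding value[i] - value[i-1]."""
    reader = csv.DictReader(io.StringIO(text))
    delta_name = f"{column}-deltas"
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=list(reader.fieldnames) + [delta_name], lineterminator="\n"
    )
    writer.writeheader()
    previous = None
    for row in reader:
        value = float(row[column])
        # The first row has no predecessor, so its delta is left empty.
        row[delta_name] = "" if previous is None else f"{value - previous:g}"
        previous = value
        writer.writerow(row)
    return out.getvalue()


print(add_deltas("foo,bar\n0,a\n1,b\n3,c\n5,d\n", "foo"), end="")
```

For the input shown above, this reproduces the foo-deltas column of the csvdelta output.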

can2k

Parse NMEA 2000 GPS data out of a candump into a CSV file that QGIS can load with minimal effort.

$ can2k ./data/n2k-sample.log
src,seq_id,longitude_deg,latitude_deg,altitude_m,sog_mps,cog_deg_cwfn,cog_ref,method,msg_timestamp,gps_timestamp,gps_age,msg
28,82,-139.6000461230086,-8.799010622654123,0.0,4.5,356.769359872061,0,4,1739920494.579828,,0.0,GNSS Position Data
28,82,-139.60004635356415,-8.799006583765234,0.0,4.5,356.769359872061,0,4,1739920494.675967,,0.0,Position Delta
28,82,-139.60004658411972,-8.799002542098567,0.0,4.5,356.769359872061,0,4,1739920494.775932,,0.0,Position Delta
...

Important

If you want to use can2k together with qgsdir, you need to use can2k --wkt.

can2csv

Parse basic data from a CAN frame into a CSV record. Faster than sed, and also parses the canid. Useful in conjunction with csvdelta to understand message timing.

can2csv is not a real CAN parser, and does not understand any of the data transmitted via CAN.

$ head -n 3 data/candump-random-data.log | can2csv
timestamp,interface,canid,dlc,priority,src,dst,pgn,data
1739229594.465994,can0,0xE9790B5,8,3,0xB5,0x90,0x29700,CA3F871A5A6EE75F
1739229594.467052,can0,0xD15F192,8,3,0x92,0xF1,0x11500,500B3766CB2DED7C
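
The priority, src, dst, and pgn columns follow from the usual J1939 / NMEA 2000 layout of a 29-bit CAN ID. The Python sketch below shows that decoding under stated assumptions; it is not taken from can2csv itself. In particular, reporting 0xFF as the destination of broadcast (PDU2) frames is an assumption, and decode_canid is a hypothetical name.

```python
def decode_canid(canid):
    """Split a 29-bit J1939-style CAN ID into its fields (illustrative sketch)."""
    priority = (canid >> 26) & 0x7
    pdu_format = (canid >> 16) & 0xFF
    pdu_specific = (canid >> 8) & 0xFF
    src = canid & 0xFF
    # 18 bits: extended data page, data page, PDU format, PDU specific.
    pgn = (canid >> 8) & 0x3FFFF
    if pdu_format < 0xF0:
        # PDU1: destination-specific; the low PGN byte is the destination.
        dst = pdu_specific
        pgn &= 0x3FF00
    else:
        # PDU2: broadcast. Assumption: represent the destination as 0xFF.
        dst = 0xFF
    return {"priority": priority, "src": src, "dst": dst, "pgn": pgn}


for canid in (0xE9790B5, 0xD15F192):
    fields = decode_canid(canid)
    print({name: hex(value) for name, value in fields.items()})
```

Both IDs from the example decode to the values in the CSV output above (priority 3, src 0xB5 / 0x92, dst 0x90 / 0xF1, pgn 0x29700 / 0x11500).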

If you pass --reconstruct, then can2csv will reconstruct any transport layer sessions it can understand. Right now that's just NMEA 2000 Fast Packet, but ISO-11783 Transport Protocol is planned.

canspam

The canspam script can generate random CAN traffic on a Linux CAN device. It's useful for inflating busload, or for generating random traffic to test can2csv against ;)

canstruct

The canstruct tool reconstructs NMEA 2000 Fast Packet / ISO 11783-3 Transport Protocol sessions. That is, you give it the individual 8-byte frames, and it gives you the reconstructed messages.

$ cat data/abort-then-full.log
(1750963033.251412) can0 18EC2A1C#101600040400EF00      // TP.CM_RTS
(1750963033.270725) can0 18EC1C2A#FF01FFFFFF00EF00      // TP.Conn_Abort
(1750963079.757877) can0 18EC2A1C#101600040400EF00      // TP.CM_RTS
(1750963079.775206) can0 18EC1C2A#110401FFFF00EF00      // TP.CM_CTS
(1750963079.778342) can0 14EB2A1C#0111111111111111      // TP.DT
(1750963079.779468) can0 14EB2A1C#0222222222222222      // TP.DT
(1750963079.780613) can0 14EB2A1C#0333333333333333      // TP.DT
(1750963079.781778) can0 14EB2A1C#0444FFFFFFFFFFFF      // TP.DT
(1750963079.795905) can0 18EC1C2A#13160004FF00EF00      // TP.CM_EndofMsgACK

$ canstruct data/abort-then-full.log
2025-06-28T15:36:19.051620Z  WARN csvizmo::can::tp: TP.CM_ABRT 0x1C <- 0x2A reason ExistingTransportSession pgn 0xEF00
(1750963079.795905) can0 18EF2A1C#11111111111111222222222222223333333333333344
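
To make the session handling in this example concrete, here is a heavily simplified Python sketch of connection-mode Transport Protocol reassembly. It is an illustration under assumptions, not canstruct's implementation: it ignores sequence numbers, timeouts, BAM, and Fast Packet, it emits the message as soon as enough bytes have arrived rather than at the EndofMsgACK, and the function names are hypothetical.

```python
TP_CM = 0xEC  # PDU format of connection management frames
TP_DT = 0xEB  # PDU format of data transfer frames


def parse_candump_line(line):
    """Parse '(ts) iface ID#DATA', ignoring any trailing // comment."""
    _timestamp, _interface, frame = line.split()[:3]
    canid_hex, data_hex = frame.split("#")
    return int(canid_hex, 16), bytes.fromhex(data_hex)


def reconstruct(frames):
    """Return (pgn, src, dst, payload) for each completed session."""
    sessions = {}  # keyed by (originator, receiver)
    messages = []
    for canid, data in frames:
        pdu_format = (canid >> 16) & 0xFF
        dst = (canid >> 8) & 0xFF
        src = canid & 0xFF
        if pdu_format == TP_CM:
            control = data[0]
            if control == 0x10:  # RTS: size is little-endian, PGN is 3 bytes
                size = data[1] | (data[2] << 8)
                pgn = data[5] | (data[6] << 8) | (data[7] << 16)
                sessions[(src, dst)] = {"size": size, "pgn": pgn, "payload": bytearray()}
            elif control == 0xFF:  # abort may come from either side
                sessions.pop((src, dst), None)
                sessions.pop((dst, src), None)
        elif pdu_format == TP_DT:
            session = sessions.get((src, dst))
            if session is None:
                continue  # data frame without an open session
            session["payload"] += data[1:]  # byte 0 is the sequence number
            if len(session["payload"]) >= session["size"]:
                # The last frame is padded with 0xFF; trim to the announced size.
                payload = bytes(session["payload"][: session["size"]])
                messages.append((session["pgn"], src, dst, payload))
                del sessions[(src, dst)]
    return messages


LOG = """\
(1750963033.251412) can0 18EC2A1C#101600040400EF00      // TP.CM_RTS
(1750963033.270725) can0 18EC1C2A#FF01FFFFFF00EF00      // TP.Conn_Abort
(1750963079.757877) can0 18EC2A1C#101600040400EF00      // TP.CM_RTS
(1750963079.775206) can0 18EC1C2A#110401FFFF00EF00      // TP.CM_CTS
(1750963079.778342) can0 14EB2A1C#0111111111111111      // TP.DT
(1750963079.779468) can0 14EB2A1C#0222222222222222      // TP.DT
(1750963079.780613) can0 14EB2A1C#0333333333333333      // TP.DT
(1750963079.781778) can0 14EB2A1C#0444FFFFFFFFFFFF      // TP.DT
(1750963079.795905) can0 18EC1C2A#13160004FF00EF00      // TP.CM_EndofMsgACK
"""

for pgn, src, dst, payload in reconstruct(map(parse_candump_line, LOG.splitlines())):
    print(hex(pgn), hex(src), hex(dst), payload.hex().upper())
```

The first RTS is discarded by the abort, and the second session yields the same 22-byte payload that canstruct prints above.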

qgsdir

Generate a QGIS project from a directory of CSV layer files. Each CSV file is assumed to have a column of WKT geometries named geometry (QGIS's geometry heuristics don't appear to be exposed via their Python API).

$ can2k --wkt ./data/n2k-sample.log ./data/n2k.csv
$ qgsdir --open ./data/n2k.csv

You may pass directories or files. If you pass a directory, the script is able to group layers by subdirectory, leading to an easier-to-use layer tree.

depconv

Convert dependency graphs between formats. Reads from stdin or a file, writes to stdout or a file. Input and output formats are auto-detected from file extensions or content when --input-format/--output-format are not specified. Defaults to DOT output when no output format can be inferred.

$ echo -e "a Node A\nb Node B\n#\na b depends on" | depconv --output-format dot
digraph {
    a [label="Node A"]
    b [label="Node B"]
    a -> b [label="depends on"]
}
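
The TGF-to-DOT conversion above can be approximated in a few lines of Python. This is a minimal sketch of the format mapping only, assuming simple node IDs that need no quoting; tgf_to_dot is a hypothetical name and says nothing about how depconv is implemented.

```python
def tgf_to_dot(text):
    """Convert Trivial Graph Format text to a DOT digraph (illustrative sketch)."""
    nodes, edges = [], []
    in_edges = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line == "#":
            in_edges = True  # '#' separates the node list from the edge list
            continue
        if in_edges:
            parts = line.split(maxsplit=2)  # "src dst optional label"
            label = parts[2] if len(parts) > 2 else None
            edges.append((parts[0], parts[1], label))
        else:
            parts = line.split(maxsplit=1)  # "id optional label"
            nodes.append((parts[0], parts[1] if len(parts) > 1 else None))

    out = ["digraph {"]
    for node_id, label in nodes:
        out.append(f'    {node_id} [label="{label}"]' if label else f"    {node_id}")
    for src, dst, label in edges:
        attrs = f' [label="{label}"]' if label else ""
        out.append(f"    {src} -> {dst}{attrs}")
    out.append("}")
    return "\n".join(out)


print(tgf_to_dot("a Node A\nb Node B\n#\na b depends on"))
```

For the three-line input used in the first example, the sketch prints the same digraph.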

$ cargo tree --depth 1 | depconv --output-format tgf
csvizmo v0.1.0
clap v4.5.39
...
#
csvizmo v0.1.0 clap v4.5.39
...

Supported formats

Format          --input-format  --output-format  Description
DOT (GraphViz)  yes             yes              digraph / graph syntax. Parses cmake, ninja, bitbake, and ad-hoc DOT output
Mermaid         yes             yes              flowchart / graph graph types
TGF             yes             yes              Trivial Graph Format
Depfile         yes             yes              Makefile .d depfile
Tree            yes             yes              Box-drawing trees (tree CLI output)
Pathlist        yes             yes              One path per line; hierarchy inferred from / separators
Cargo tree      yes             --               cargo tree output
Cargo metadata  yes             --               cargo metadata --format-version=1 JSON

What's preserved across formats

Not every format can represent the same information. The table below shows what each format preserves when parsing (P) and emitting (E):

Format          Labels  Node type  Attrs    Edge labels  Subgraphs
DOT             P+E     P+E        P+E      P+E          P+E
Mermaid         P+E     partial    partial  P+E          P+E
TGF             P+E     --         --       P+E          --
Depfile         --      --         --       --           --
Tree            P+E     --         --       --           --
Pathlist        P+E     --         --       --           --
Cargo tree      P       P          P        --           --
Cargo metadata  P       P          P        --           --

Converting from a rich format (DOT, cargo metadata) to a simpler one (TGF, depfile) silently drops unsupported attributes. Converting in the other direction preserves graph topology but cannot recover lost metadata.

depfilter

Filter or select subsets of dependency graphs. Works on the same formats as depconv, and is designed to be chained with pipes.

  • depfilter select keeps nodes matching --include patterns and/or removes --exclude patterns
  • depfilter between selects nodes connecting multiple sets of query nodes
  • depfilter cycles selects any cycles in the graph

Each subcommand has extra options to tune its behavior.

# From a cargo dependency tree, select the subtree rooted at "clap", excluding
# all the proc-macro crates:
$ cargo tree --depth 10 \
    | depfilter select -g "clap*" --deps -x "*derive*" -x "*proc*" -I cargo-tree -O dot
digraph {
    clap [label="v4.5.57 clap"];
    clap_builder [label="v4.5.57 clap_builder"];
    anstream [label="v0.6.21 anstream"];
    ...
}

deptransform

Structural transformations on dependency graphs. Works on the same formats as depconv, and is designed to be chained with pipes.

The deptransform tool supports the following subcommands:

  • deptransform reverse - reverse the direction of all edges in the graph
  • deptransform simplify - remove redundant edges (e.g. if A->B and B->C, then A->C is redundant)
  • deptransform shorten - shorten node IDs that look like paths (minpath, but for node IDs)
  • deptransform sub - sed, but for node IDs and node / edge attributes
  • deptransform merge - merge multiple graphs into one
  • deptransform flatten - recursively flatten subgraphs into the parent graph

# Collapse bitbake task-level node IDs (acl-native.do_compile -> acl-native), then remove the
# now-misleading node labels
$ cat data/depconv/bitbake.curl.task-depends.dot |
    deptransform sub --key=id 's/\.do_.*//' |
    deptransform sub --key=node:label 's/.*//'
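
The "redundant edge" notion behind deptransform simplify (A->C is redundant when A->B and B->C exist) is a transitive reduction. A minimal Python sketch follows, assuming the graph is acyclic; it illustrates the concept and is not the tool's algorithm, and the function name is hypothetical.

```python
def transitive_reduction(edges):
    """Drop every edge u->v for which another path from u to v exists.

    Assumes an acyclic graph; on a DAG this gives the unique transitive reduction.
    """
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set())

    def reachable_without(start, target, skipped_edge):
        # Depth-first search that is not allowed to use the edge being tested.
        stack, seen = [start], {start}
        while stack:
            node = stack.pop()
            for successor in adjacency[node]:
                if (node, successor) == skipped_edge:
                    continue
                if successor == target:
                    return True
                if successor not in seen:
                    seen.add(successor)
                    stack.append(successor)
        return False

    return [(u, v) for u, v in edges if not reachable_without(u, v, (u, v))]


print(transitive_reduction([("A", "B"), ("B", "C"), ("A", "C")]))
```

With the three edges from the bullet above, only A->B and B->C remain.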

depquery

Query properties of dependency graphs. Lists nodes, edges, and computes graph metrics. Supports the same input formats as depconv, and is designed to be used in pipelines.

# Show the 5 crates with the most dependencies:
$ cargo metadata --format-version=1 |
    depquery nodes --sort out-degree --limit 5
csvizmo-depgraph	20
csvizmo-stats	16
csvizmo-can	12
csvizmo-minpath	11
tracing-subscriber	10
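
Sorting by out-degree amounts to counting outgoing edges per node. The Python sketch below shows that counting on a plain edge list; the function name and the tie-breaking by node name are assumptions for illustration, not depquery's documented behavior.

```python
from collections import Counter


def top_out_degree(edges, limit):
    """Return the `limit` nodes with the most outgoing edges (illustrative sketch)."""
    out_degree = Counter()
    for src, dst in edges:
        out_degree[src] += 1
        out_degree.setdefault(dst, 0)  # nodes without outgoing edges still count
    # Assumption: ties are broken alphabetically to keep the output stable.
    ranked = sorted(out_degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:limit]


edges = [("a", "b"), ("a", "c"), ("b", "c")]
for node, degree in top_out_degree(edges, 2):
    print(f"{node}\t{degree}")
```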

The depquery tool supports outputting nodes, edges, and metrics. The output is intended to be machine-readable, and is tab-separated.

depcluster

Run community detection on a dependency graph to identify clusters of related nodes. Each cluster becomes a subgraph in the output, with cross-cluster edges at the top level. Supports Louvain (default), Leiden, and Label Propagation algorithms.

$ echo -e "a b\na c\nb c\nd e\nd f\ne f\n#\na b\na c\nb c\nd e\nd f\ne f" |
    depcluster -I tgf -O mermaid
flowchart LR
    subgraph cluster_0
        a
        b
        c
        a --> b
        a --> c
        b --> c
    end
    subgraph cluster_1
        d
        e
        f
        d --> e
        d --> f
        e --> f
    end

graphdiff

Compare two dependency graphs and report what changed. Nodes are matched by ID, and edges by their endpoints.

graphdiff supports several subcommands:

  • graphdiff annotate -- output the combined graph with changes highlighted (added, removed, changed nodes/edges get distinct attributes)
  • graphdiff list -- tab-delimited list of changes (+ added, - removed, ~ changed, > moved)
  • graphdiff summary -- tab-delimited counts of each change type
  • graphdiff subtract -- set difference: nodes and edges only in the first graph

$ cat before.tgf
a Alpha
b Bravo
#
a b

$ cat after.tgf
b Bravo
c Charlie
#
b c

$ graphdiff annotate before.tgf after.tgf -O mermaid
flowchart LR
    b["Bravo"]
    c["+ Charlie"]
    a["- Alpha"]
    b --> c
    a --> b
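
Matching nodes by ID and edges by endpoints reduces the comparison to set operations. The following Python sketch shows that idea on the before/after graphs above; the data layout (a dict of node labels plus a set of edge pairs) and the function name are assumptions, and "moved" detection is left out.

```python
def diff_graphs(before_nodes, before_edges, after_nodes, after_edges):
    """Compare two graphs given as {id: label} dicts and sets of (src, dst) pairs."""
    before_ids, after_ids = set(before_nodes), set(after_nodes)
    return {
        "added_nodes": sorted(after_ids - before_ids),
        "removed_nodes": sorted(before_ids - after_ids),
        # A node present in both graphs counts as changed if its label differs.
        "changed_nodes": sorted(
            node for node in before_ids & after_ids
            if before_nodes[node] != after_nodes[node]
        ),
        "added_edges": sorted(set(after_edges) - set(before_edges)),
        "removed_edges": sorted(set(before_edges) - set(after_edges)),
    }


before = ({"a": "Alpha", "b": "Bravo"}, {("a", "b")})
after = ({"b": "Bravo", "c": "Charlie"}, {("b", "c")})
for kind, items in diff_graphs(*before, *after).items():
    print(kind, items)
```

The result agrees with the annotated diagram: c and b->c are added, a and a->b are removed, and b is unchanged.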

bbclasses

The bbclasses script can parse BitBake recipes to generate an inheritance diagram. It tries to evaluate variable expansion, and needs to run in your BitBake environment to work properly.

bbclasses --group-by-layer curl >curl.dot
flowchart LR
    subgraph meta[meta]
        poky/meta/classes-global/debian.bbclass{{"poky/meta/classes-global/debian.bbclass"}}
        poky/meta/classes-global/package.bbclass{{"poky/meta/classes-global/package.bbclass"}}
        poky/meta/classes-global/package_pkgdata.bbclass{{"poky/meta/classes-global/package_pkgdata.bbclass"}}
        poky/meta/classes-global/package_rpm.bbclass{{"poky/meta/classes-global/package_rpm.bbclass"}}
        poky/meta/classes-global/packagedata.bbclass{{"poky/meta/classes-global/packagedata.bbclass"}}
        poky/meta/classes-recipe/autotools.bbclass{{"poky/meta/classes-recipe/autotools.bbclass"}}
        poky/meta/classes-recipe/binconfig.bbclass{{"poky/meta/classes-recipe/binconfig.bbclass"}}
        poky/meta/classes-recipe/multilib_header.bbclass{{"poky/meta/classes-recipe/multilib_header.bbclass"}}
        poky/meta/classes-recipe/multilib_script.bbclass{{"poky/meta/classes-recipe/multilib_script.bbclass"}}
        poky/meta/classes-recipe/pkgconfig.bbclass{{"poky/meta/classes-recipe/pkgconfig.bbclass"}}
        poky/meta/classes-recipe/ptest.bbclass{{"poky/meta/classes-recipe/ptest.bbclass"}}
        poky/meta/classes-recipe/siteinfo.bbclass{{"poky/meta/classes-recipe/siteinfo.bbclass"}}
        poky/meta/classes-recipe/update-alternatives.bbclass{{"poky/meta/classes-recipe/update-alternatives.bbclass"}}
        poky/meta/classes/chrpath.bbclass{{"poky/meta/classes/chrpath.bbclass"}}
        poky/meta/classes/image-buildinfo.bbclass{{"poky/meta/classes/image-buildinfo.bbclass"}}
        poky/meta/classes/siteconfig.bbclass{{"poky/meta/classes/siteconfig.bbclass"}}
        poky/meta/conf/distro/include/ptest-packagelists.inc[["poky/meta/conf/distro/include/ptest-packagelists.inc"]]
        poky/meta/recipes-support/curl/curl_8.7.1.bb["poky/meta/recipes-support/curl/curl_8.7.1.bb"]
    end
    subgraph meta-oem[meta-oem]
        meta-oem/classes/dynamic-packagearch.bbclass{{"meta-oem/classes/dynamic-packagearch.bbclass"}}
        meta-oem/classes/machine-overrides-extender.bbclass{{"meta-oem/classes/machine-overrides-extender.bbclass"}}
    end
    poky/meta-poky/classes/poky-sanity.bbclass{{"poky/meta-poky/classes/poky-sanity.bbclass"}}
    meta-work/recipes-support/curl/curl__.bbappend(["meta-work/recipes-support/curl/curl_%.bbappend"])
    meta-oem/classes/dynamic-packagearch.bbclass -->|"INHERIT"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    meta-oem/classes/machine-overrides-extender.bbclass -->|"INHERIT"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta-poky/classes/poky-sanity.bbclass -->|"INHERIT"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-global/debian.bbclass -->|"INHERIT"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-global/package.bbclass -->|"inherit"| poky/meta/classes-global/debian.bbclass
    poky/meta/classes-global/package.bbclass -->|"inherit"| poky/meta/classes-global/package_rpm.bbclass
    poky/meta/classes-global/package_pkgdata.bbclass -->|"inherit"| poky/meta/classes-global/package.bbclass
    poky/meta/classes-global/package_rpm.bbclass -->|"INHERIT"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-global/package_rpm.bbclass -->|"INHERIT"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-global/packagedata.bbclass -->|"inherit"| poky/meta/classes-global/package.bbclass
    poky/meta/classes-recipe/autotools.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/autotools.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/binconfig.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/binconfig.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/multilib_header.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/multilib_header.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/multilib_script.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/multilib_script.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/pkgconfig.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/pkgconfig.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/ptest.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/ptest.bbclass -->|"inherit"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes-recipe/siteinfo.bbclass -->|"inherit"| poky/meta/classes-recipe/autotools.bbclass
    poky/meta/classes-recipe/siteinfo.bbclass -->|"inherit"| poky/meta/classes-recipe/multilib_header.bbclass
    poky/meta/classes-recipe/update-alternatives.bbclass -->|"inherit"| poky/meta/classes-recipe/multilib_script.bbclass
    poky/meta/classes/chrpath.bbclass -->|"inherit"| poky/meta/classes-global/package.bbclass
    poky/meta/classes/image-buildinfo.bbclass -->|"INHERIT"| poky/meta/recipes-support/curl/curl_8.7.1.bb
    poky/meta/classes/siteconfig.bbclass -->|"inherit"| poky/meta/classes-recipe/autotools.bbclass
    poky/meta/conf/distro/include/ptest-packagelists.inc -->|"require"| poky/meta/classes-recipe/ptest.bbclass
    poky/meta/recipes-support/curl/curl_8.7.1.bb -->|"appends"| meta-work/recipes-support/curl/curl__.bbappend

minpath

Shorten file paths to the minimal unique suffix. Useful for displaying lists of files in a compact way while keeping them distinguishable.

$ minpath <<EOF
/home/user/project/src/main.rs
/home/user/project/src/lib.rs
/home/user/project/tests/main.rs
EOF

src/main.rs
lib.rs
tests/main.rs

Multiple options are available to customize and tune the output. See minpath --help for details.
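
One way to think about "minimal unique suffix" is to start from each file name and keep adding parent components to any path whose suffix still collides with another. The Python sketch below implements that idea under simplifying assumptions (only / separators, none of the tool's options); it is not minpath's actual algorithm, and the function name is hypothetical.

```python
from collections import Counter


def minimal_unique_suffixes(paths):
    """Shorten each path to the shortest trailing part that is unique in the list."""
    parts = [path.strip("/").split("/") for path in paths]
    depth = [1] * len(parts)  # number of trailing components kept per path
    while True:
        suffixes = ["/".join(p[-d:]) for p, d in zip(parts, depth)]
        counts = Counter(suffixes)
        changed = False
        for i, suffix in enumerate(suffixes):
            # Lengthen only colliding suffixes that still have components left.
            if counts[suffix] > 1 and depth[i] < len(parts[i]):
                depth[i] += 1
                changed = True
        if not changed:
            return suffixes


paths = [
    "/home/user/project/src/main.rs",
    "/home/user/project/src/lib.rs",
    "/home/user/project/tests/main.rs",
]
print("\n".join(minimal_unique_suffixes(paths)))
```

For the three paths in the example, this yields src/main.rs, lib.rs, and tests/main.rs.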
