All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- added sample regexset file for PII-screening.
- `apply geocode --formatstr` now accepts less US-centric format selectors.
- `searchset --flag` now shows which regexes match as a list (e.g. "[1, 3, 5]"), not just "1" or "0".
- `foreach` command now returns error message on Windows. `foreach` still doesn't work on Windows (yet), but at least it returns "foreach command does not work on Windows.".
- `apply geocode` was not accepting valid lat/longs below the equator. Fixed regex validator.
- more robust `searchset` error handling when attempting to load regexset files.
- `apply` link on README was off by one.
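
The below-the-equator fix concerns coordinates whose latitude is negative. As a minimal sketch only (the entry says qsv's own validator is regex-based, and this is not that code), a dependency-free check of the 'latitude, longitude' Location format that accepts both hemispheres might look like this. The function name `parse_location` is hypothetical.

```rust
/// Hypothetical helper: parse "latitude, longitude" into a pair of f64 values.
/// Returns None when the text is malformed or the values are out of range.
fn parse_location(s: &str) -> Option<(f64, f64)> {
    // Split on the first comma; anything without a comma is rejected.
    let (lat_str, lon_str) = s.split_once(',')?;
    let lat: f64 = lat_str.trim().parse().ok()?;
    let lon: f64 = lon_str.trim().parse().ok()?;
    // Negative latitudes (southern hemisphere) are valid down to -90.
    if (-90.0..=90.0).contains(&lat) && (-180.0..=180.0).contains(&lon) {
        Some((lat, lon))
    } else {
        None
    }
}

fn main() {
    // A point below the equator has a negative latitude.
    println!("{:?}", parse_location("-33.8688, 151.2093"));
    // A latitude above 90 degrees is rejected.
    println!("{:?}", parse_location("95.0, 10.0"));
}
```
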
- bumped `dateparser` to 0.1.6. This now allows `apply datefmt` to properly reformat dates without a time component. Before, when reformatting a date like "July 4, 2020", qsv returned "2020-07-04T00:00:00+00:00". It now returns "2020-07-04".
- minor clippy refactoring
- removed rust-stats submodule introduced in 0.17.1. It turns out crates.io does not allow publishing of crates with local dependencies on submodules. Published the modified rust-stats fork as qsv-stats instead. This allows us to publish qsv on crates.io
- removed unused `textwrap` dependency
- explicitly specified embedded modified rust-stats version in Cargo.toml.
- added `searchset` command. Run multiple regexes over CSV data in a single pass.
- added `--unicode` flag to `search`, `searchset` and `replace` commands. Previously, regex unicode support was on by default, which comes at the cost of performance. And since `qsv` optimizes for performance ("q is for quick"), it is now off by default.
- added quartiles calculation to `stats`. Pulled in upstream pending PRs from @m15a to implement.
- changed variance algorithm. For some reason, the previous variance algorithm was causing intermittent test failures on macOS. Pulled in pending upstream PR from @ruppertmillard.
- embedded rust-stats fork submodule which implements quartile and new variance algorithm.
- changed GitHub Actions to pull in submodules.
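
These entries do not say which variance algorithm replaced the old one. For readers wondering why the choice of algorithm can matter, here is a sketch of Welford's online method, one common numerically stable approach. It is an illustration only and is not claimed to be what the rust-stats fork implements; the type name `OnlineVariance` is hypothetical.

```rust
/// Running mean and variance via Welford's online algorithm.
/// Illustrative only; not taken from rust-stats or qsv.
#[derive(Default)]
struct OnlineVariance {
    n: u64,
    mean: f64,
    m2: f64, // sum of squared deviations from the running mean
}

impl OnlineVariance {
    /// Fold one more observation into the running statistics.
    fn add(&mut self, x: f64) {
        self.n += 1;
        let delta = x - self.mean;
        self.mean += delta / self.n as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Population variance, or None if no values were added.
    fn variance(&self) -> Option<f64> {
        if self.n == 0 {
            None
        } else {
            Some(self.m2 / self.n as f64)
        }
    }
}

fn main() {
    let mut acc = OnlineVariance::default();
    for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0] {
        acc.add(x);
    }
    println!("mean = {}, variance = {:?}", acc.mean, acc.variance());
}
```

The values are processed in a single pass without storing them, which suits streaming over CSV rows.
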
- the project was not following semver properly, as several new features were released in the 0.16.x series that should have been MINOR version bumps, not PATCH bumps.
- added `geocode` operation to `apply` command. It geocodes to the closest city given a column with coordinates in Location format ('latitude, longitude') using a static geonames lookup file. (see https://docs.rs/reverse_geocoder)
- added `currencytonum` operation to `apply` command.
- added `getquarter.lua` helper script to support `lua` example in Cookbook.
- added `turnaroundtime.lua` helper script to compute turnaround time.
- added `nyc311samp.csv` to provide sample data for recipes.
- added several Date Enrichment and Geocoding recipes to Cookbook.
- fixed `publish.yml` GitHub Action workflow to properly create platform-specific binaries.
- fixed variance test to eliminate false positives in macOS.
- added `docs` directory. For README reorg, and to add detailed examples per command in the future.
- added `emptyreplace` operation to `apply` command.
- added `datefmt` operation to `apply` command.
- added support for reading from stdin to `join` command.
- setup GitHub wiki to host Cookbook and sundry docs to encourage collaborative editing.
- added footnotes to commands table in README.
- changed GitHub Actions publish workflow so it adds the version to binary zip filename.
- changed GitHub Actions publish workflow so binary is no longer in `target/release` directory.
- reorganized README.
- moved whirlwind tour and benchmarks to `docs` directory.
- use zipped repo copy of worldcitiespop_mil.csv for benchmarks.
- fixed links to help text in README for `fixlengths` and `slice` cmds
- `exclude` not listed in commands table. Added to README.
- Removed `empty0` and `emptyNA` operations in `apply` command. Replaced with `emptyreplace`.
- changed Makefile to remove github recipe as we are now using GitHub Actions.
- Applied rustfmt to entire project #56
- Changed stats variance test as it was causing false positive test failures on macOS (details)
- removed `-amd64` suffix from binaries built by GitHub Actions.
- fixed publish GitHub Actions workflow to zip binaries before uploading.
- removed `.travis.yml` as we are now using GitHub Actions.
- removed scripts `build-release`, `github-release` and `github-upload` as we are now using GitHub Actions.
- removed `ci` folder as we are now using GitHub Actions.
- removed `py` command. #58
- Bumped qsv version to 0.16.1. Inadvertently released 0.16.0 with qsv version still at 0.15.0.
- Added a CHANGELOG.
- Added additional commands/options from @Yomguithereal xsv fork.
  - `apply` - Apply series of string transformations to a CSV column.
  - `behead` - Drop headers from CSV file.
  - `enum` - Add a new column enumerating rows by adding a column of incremental or uuid identifiers. Can also be used to copy a column or fill a new column with a constant value.
  - `explode` - Explode rows into multiple ones by splitting a column value based on the given separator.
  - `foreach` - Loop over a CSV file to execute bash commands.
  - `jsonl` - Convert newline-delimited JSON to CSV.
  - `lua` - Execute a Lua script over CSV lines to transform, aggregate or filter them.
  - `pseudo` - Pseudonymise the value of the given column by replacing them by an incremental identifier.
  - `py` - Evaluate a Python expression over CSV lines to transform, aggregate or filter them.
  - `replace` - Replace CSV data using a regex.
  - `sort` `--uniq` option - When set, identical consecutive lines will be dropped to keep only one line per sorted value.
  - `search` `--flag column` option - If given, the command will not filter rows but will instead flag the found rows in a new column named `column`.
- Added conditional compilation logic for `foreach` command to only compile on `target_family=unix` as it has a dependency on `std::os::unix::ffi::OsStrExt` which only works in unix-like OSes.
- Added `empty0` and `emptyNA` operations to `apply` command with corresponding test cases.
- Added GitHub Actions to check builds on `ubuntu-latest`, `windows-latest` and `macos-latest`.
- Added GitHub Action to publish binaries on release.
- Added `build.rs` build-dependency to check that Rust is at least version 1.50.0.
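
The conditional compilation described for `foreach` can be sketched with Rust's `cfg` attribute. This is a minimal illustration, not qsv's source: `run_foreach` is a hypothetical stand-in for the command's entry point, and the non-unix branch reuses the error message quoted elsewhere in this changelog.

```rust
/// Hypothetical stand-in for the `foreach` entry point on unix-like targets.
#[cfg(target_family = "unix")]
fn run_foreach() -> Result<(), String> {
    // OsStrExt exists only on unix-like targets, so code that uses it
    // must be gated or the build fails elsewhere.
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;
    let _raw: &[u8] = OsStr::new("echo").as_bytes();
    Ok(())
}

/// Fallback compiled on every other target family (e.g. Windows).
#[cfg(not(target_family = "unix"))]
fn run_foreach() -> Result<(), String> {
    Err("foreach command does not work on Windows.".to_string())
}

fn main() {
    match run_foreach() {
        Ok(()) => println!("foreach is available on this target"),
        Err(msg) => eprintln!("{msg}"),
    }
}
```

Exactly one of the two definitions is compiled for any given target, so callers can use the same function name everywhere.
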
- reformatted README listing of commands to use a table, and to link to corresponding help text.
- Removed appveyor.yml as qsv now uses GitHub Actions.
- `dedup` cmd from @ronohm.
- `table` cmd `--align` option from @alex-ozdemir.
- `fmt` cmd `--quote-never` option from @niladic.
- `exclude` cmd from @lalaithion
- Added `--dupes-output` option to `dedup` cmd.
- Added datetime type detection to `stats` cmd.
- Added datetime `min/max` calculation to `stats` cmd.
- es-ES translation from @ZeliosAriex.
- Updated benchmarks script.
- Updated whirlwind tour to include additional commands.
- Made whirlwind tour reproducible by using `sample` `--seed` option.
- Fixed `sample` percentage sampling to be always reproducible even if sample size < 10% when using `--seed` option.
- Fixed BOM issue with tests, leveraging unreleased xsv fix.
- Fixed count help text typo.
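
Reproducible percentage sampling comes down to driving every keep/skip decision from a generator initialized with the seed. The sketch below assumes a simple per-row coin flip and uses SplitMix64 as a dependency-free stand-in generator; neither is claimed to match what `sample` does internally, and `sample_indices` is a hypothetical name.

```rust
/// Small deterministic generator (SplitMix64), used here only as a stand-in.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }

    /// Uniform value in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical helper: indices of the rows kept when sampling `fraction`
/// of `total` rows with the given seed.
fn sample_indices(total: usize, fraction: f64, seed: u64) -> Vec<usize> {
    let mut rng = SplitMix64(seed);
    (0..total).filter(|_| rng.next_f64() < fraction).collect()
}

fn main() {
    // The same seed gives the same rows, even for a small fraction.
    let first = sample_indices(1000, 0.05, 42);
    let second = sample_indices(1000, 0.05, 42);
    println!("kept {} rows, identical runs: {}", first.len(), first == second);
}
```
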
- Removed `session.vim` file.
- Performance: enabled link-time optimization (`LTO="fat"`).
- Performance: used code generation units.
- Performance: used mimalloc allocator.
- Changed benchmark to compare xsv 0.13.0 and qsv.
- Changed chart from png to svg.
- Performance: Added note in README on how to optimize local compile by setting `target-cpu=native`.
- Renamed fork to qsv.
- Revised highlight note in README explaining the reason for renaming the fork to qsv.
- Added (NEW) and (EXPANDED) notations to command listing.
- Adapted to Rust 2018 edition.
- used serde derive feature.
Initial fork from xsv.
- `rename` cmd from @Kerollmops.
- `fill` cmd from @alexrudy.
- `transpose` cmd from @mintyplanet.
- `select` cmd regex support from @sd2k.
- `stats` cmd `--nullcount` option from @scpike.
- added percentage sampling to `sample` cmd.
- Updated README with additional commands.