tsv-select --exclude #267

Merged: 9 commits into eBay:master from tsv-select-exclude, Mar 2, 2020

@jondegenhardt jondegenhardt commented Mar 2, 2020

This PR adds a new feature to tsv-select: the ability to exclude fields. Fields to exclude are specified with the --e|exclude option (short form -e, long form --exclude). Some examples:

$   # Drop the first field, keep everything else.
$   # Equivalent to `cut -f 2- file.tsv`
$   tsv-select --exclude 1 file.tsv

$   # Drop fields 3-10, keep everything else
$   tsv-select --exclude 3-10 file.tsv

$   # Move field 2 to the start of the line, drop fields 10-15
$   tsv-select -f 2 -e 10-15 file.tsv

$   # Move field 2 to the end, dropping fields 10-15
$   tsv-select -f 2 --rest first -e 10-15 file.tsv
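
The field ordering in the four examples above can be modeled in a few lines of Python. This is only a sketch of the behavior as the examples describe it, not the tool's implementation. The function name `select_fields` and its parameters are hypothetical, and it assumes that the remaining fields follow the `-f` fields unless `--rest first` is given, as the third and fourth examples suggest.

```python
def select_fields(line, fields=(), exclude=(), rest="last"):
    """Model the four examples above on one tab-separated line.

    fields  -- 1-based field numbers to output explicitly, in order (like -f)
    exclude -- 1-based field numbers to drop (like -e)
    rest    -- "last" or "first": where the remaining fields are placed
    (Hypothetical helper; names and defaults are assumptions.)
    """
    values = line.split("\t")
    picked = [values[i - 1] for i in fields]
    # Fields already picked or excluded are not part of "the rest".
    skip = set(fields) | set(exclude)
    remaining = [v for i, v in enumerate(values, start=1) if i not in skip]
    ordered = remaining + picked if rest == "first" else picked + remaining
    return "\t".join(ordered)


# A 16-field sample row: f1, f2, ..., f16
row = "\t".join(f"f{i}" for i in range(1, 17))

print(select_fields(row, exclude=[1]))                       # like --exclude 1
print(select_fields(row, exclude=range(3, 11)))              # like --exclude 3-10
print(select_fields(row, fields=[2], exclude=range(10, 16)))  # like -f 2 -e 10-15
print(select_fields(row, fields=[2], exclude=range(10, 16), rest="first"))
```

Error handling (for example, lines with fewer fields than requested) is not modeled here.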

This PR also improves the performance of the --rest operator by bulk appending the fields from the last specified field until the end of the line. The difference is dramatic for data streams with many fields. These improvements apply to --exclude as well, since it uses the implementation of --rest. Ad hoc tests on OS X indicate meaningful improvement for operations like tsv-select -f 1 --rest first (move first field to end of line). In addition, tsv-select --exclude 1 is dramatically faster than cut -f 2- for files with a reasonable number of fields (tested on a 29 field file against GNU cut on OS X).
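
To illustrate the bulk-append idea, here is a minimal Python sketch. It is not the D code in this PR; the helper names `split_at_field`, `exclude_leading`, and `move_first_to_end` are hypothetical. The point is that once the last field of interest has been located, the remainder of the line is copied as a single slice rather than being split and re-joined field by field.

```python
def split_at_field(line, n):
    """Split off the first n fields; return (fields, untouched remainder).

    The remainder is None when the line has n or fewer fields.
    (Hypothetical helper, for illustration only.)
    """
    head = []
    start = 0
    for _ in range(n):
        end = line.find("\t", start)
        if end == -1:
            # Ran out of delimiters: the last field reaches end of line.
            head.append(line[start:])
            return head, None
        head.append(line[start:end])
        start = end + 1
    # Everything from `start` onward is returned as one slice, unparsed.
    return head, line[start:]


def exclude_leading(line, n):
    """Drop fields 1..n, similar in effect to `--exclude 1-n`."""
    _, tail = split_at_field(line, n)
    return tail if tail is not None else ""


def move_first_to_end(line):
    """Similar in effect to `-f 1 --rest first`."""
    head, tail = split_at_field(line, 1)
    return head[0] if tail is None else tail + "\t" + head[0]


print(exclude_leading("a\tb\tc\td", 1))
print(move_first_to_end("a\tb\tc\td"))
```

With this approach the cost of handling the trailing fields is one copy of the remaining bytes, regardless of how many fields they contain. Handling of lines with too few fields is simplified here and may differ from the tool.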

Documentation for tsv-select was also improved.

The new --exclude option implements enhancement request #72, though with some syntactic differences.

@jondegenhardt jondegenhardt merged commit 7d5f7da into eBay:master Mar 2, 2020
@jondegenhardt jondegenhardt deleted the tsv-select-exclude branch March 2, 2020 03:21