RBQL is a minimalistic but powerful SQL-like language that supports "select" queries with Python expressions. RBQL works with tsv and csv files, so you don't need a database to use it. RBQL is similar to the "awk" Unix tool.
- Use python expressions inside "select", "where" and "order by" statements
- Use "a1", "a2", ... , "aN" as column names to write select queries
- Output entries appear in the same order as in input unless "ORDER BY" is provided.
- The "NR" variable holds the record's line number
- Input csv/tsv table may contain a varying number of entries (but the select query must be written in a way that prevents output of missing values)
- select
- where
- order by
- desc/asc
- distinct
- "a1", "a2", ... , "aN" - column names
- "*" - whole line/entry
- "NR" - line number (1-based)
- "NF" - number of columns in current line/entry
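
To make the variables above concrete, here is a minimal sketch of how they could be derived from one tab-separated input line. This is an illustration only, not RBQL's actual implementation: the function name `bind_variables` and the `"star"` key (standing in for `*`) are hypothetical.

```python
def bind_variables(line, line_number, delimiter="\t"):
    """Build the names a query expression could refer to for one input line.

    Hypothetical helper for illustration; RBQL itself may work differently.
    """
    fields = line.rstrip("\n").split(delimiter)
    variables = {
        "NR": line_number,   # 1-based line number
        "NF": len(fields),   # number of columns in this line
        "star": fields,      # stands in for "*", the whole entry
    }
    # a1, a2, ..., aN hold the column values as strings
    for index, value in enumerate(fields, start=1):
        variables["a{}".format(index)] = value
    return variables


row = bind_variables("alice\t42\tparis\n", line_number=1)
assert row["a2"] == "42"
```

In this sketch column values are plain strings, which is why a numeric sort in a query needs an explicit conversion such as `int(a2)`.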
select * where NR <= 10
- this is an equivalent of bash command "head -n 10", NR is 1-based
select a1, a4
- this is an equivalent of bash command "cut -f 1,4"
select * order by int(a2) desc
- this is an equivalent of bash command "sort -k2,2 -r -n"
select * order by random.random()
- random sort, this is an equivalent of bash command "sort -R"
select NR, *
- enumerate lines, NR is 1-based
select * where re.match(".*ab.*", a1) is not None
- select entries where first column has "ab" pattern
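
For readers who think in Python rather than SQL, the following sketch shows what three of the example queries compute, written as plain Python over rows that are already split into columns. It describes the results only and makes no claim about how RBQL executes queries; the sample `rows` and the function names are made up for illustration.

```python
import re

# Hypothetical sample data: each row is a list of string columns.
rows = [
    ["cab", "10"],
    ["xyz", "7"],
    ["abc", "25"],
]


def first_n(rows, n=10):
    # select * where NR <= 10
    return [r for nr, r in enumerate(rows, start=1) if nr <= n]


def order_by_second_column_desc(rows):
    # select * order by int(a2) desc
    return sorted(rows, key=lambda r: int(r[1]), reverse=True)


def where_first_column_has_ab(rows):
    # select * where re.match(".*ab.*", a1) is not None
    return [r for r in rows if re.match(".*ab.*", r[0]) is not None]


matching = where_first_column_has_ab(rows)   # keeps "cab" and "abc"
ranked = order_by_second_column_desc(rows)   # 25, 10, 7
```

The `a1` and `a2` variables correspond to `r[0]` and `r[1]` here, because column names are 1-based while Python list indices are 0-based.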
The CLI is self-explanatory. See ./rbql.py --help for all options.
./rbql.py --query 'select a1, a2 order by a1' < input.tsv > output.tsv
- python2 or python3