Using duckdb to make CSV files talk
Sometimes you want to ask data questions. And often that data is in a CSV. Sure, you can write a quick Python script and use that to extract the information you want. Or you can import it into a database and use SQL.
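For comparison, the database route might look roughly like this. It is only a sketch using Python's standard library: the function name `load_csv_into_sqlite` and the table name `packages` are made up for illustration, every column is stored as text, and it assumes each row has as many fields as the header.

```python
import csv
import sqlite3


def load_csv_into_sqlite(csv_path, table="packages"):
    """Copy a CSV file with a header row into an in-memory SQLite table."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # first row holds the column names
        rows = list(reader)

    con = sqlite3.connect(":memory:")
    # Quote identifiers so unusual column names still work.
    columns = ", ".join('"{}" TEXT'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    con.execute('CREATE TABLE "{}" ({})'.format(table, columns))
    con.executemany('INSERT INTO "{}" VALUES ({})'.format(table, placeholders), rows)
    return con
```

That is a fair bit of ceremony before you even get to type a `SELECT`.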
But TIL the easiest thing is to just ask the duck.
The duck is DuckDB here.
Why? Because DuckDB can run SQL queries directly against CSV files, with no import step.
For an example, let's use a random CSV I have lying around, called luarocks-packages.csv.
It starts like this:
And how do I query it? Well, suppose I want to find all packages where alerque is one of the maintainers:
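Here is a sketch of that kind of lookup, run through the `duckdb` Python package (`pip install duckdb`). The helper name `packages_maintained_by` is hypothetical, and the column name `maintainers` is an assumption; check it against the actual header of your CSV.

```python
def packages_maintained_by(csv_path, maintainer, column="maintainers"):
    """Return the rows of a CSV whose `column` mentions `maintainer`."""
    # Third-party package; imported here so this file still loads
    # on machines where duckdb is not installed.
    import duckdb

    # `column` is an assumed column name; quote it as an identifier.
    quoted = '"{}"'.format(column.replace('"', '""'))
    # read_csv_auto sniffs the delimiter, header and column types.
    # contains() does a plain substring match on the column's text.
    query = (
        "SELECT * FROM read_csv_auto(?) "
        "WHERE contains(CAST({} AS VARCHAR), ?)".format(quoted)
    )
    return duckdb.execute(query, [csv_path, maintainer]).fetchall()
```

Calling `packages_maintained_by("luarocks-packages.csv", "alerque")` would return the matching rows as a list of tuples. In the `duckdb` shell the same idea is a one-liner, since a quoted file name can stand in for a table: `SELECT * FROM 'luarocks-packages.csv' WHERE contains(maintainers, 'alerque');` (again assuming a `maintainers` column).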
And boom! There you go. So, if you know even some very basic SQL (and you should!) you can leverage duckdb to extract information from CSV files quickly, reliably and in a repeatable manner.
Which is awesome!