27.09.20

DuckDB - a lightweight analytics database

DuckDB is a new lightweight database, designed to support data science.

DuckDB was built by a research group in the Netherlands. It is columnar, which means it is optimised for reading an entire column and operating on it. This is in contrast to most standard SQL engines (Postgres, MySQL, etc.), which are row-oriented and optimised for reading and writing individual rows. This makes DuckDB well suited to the analytical queries common in data science and machine learning.
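To make the row versus column distinction concrete, here is a small pure-Python sketch. It is only an illustration of the two layouts, not how DuckDB stores data internally, and the field names (id, age, income) and function names are made up for the example.

```python
# Row-oriented layout: all the fields of one record are stored together.
ROWS = [
    {"id": 1, "age": 34, "income": 50000.0},
    {"id": 2, "age": 45, "income": 60000.0},
    {"id": 3, "age": 29, "income": 40000.0},
]

# Column-oriented layout: all the values of one field are stored together.
COLUMNS = {
    "id": [1, 2, 3],
    "age": [34, 45, 29],
    "income": [50000.0, 60000.0, 40000.0],
}


def mean_income_row_store(rows):
    """Aggregate one field: every record has to be visited in full."""
    return sum(record["income"] for record in rows) / len(rows)


def mean_income_column_store(columns):
    """Aggregate one field: only the 'income' list is touched."""
    income = columns["income"]
    return sum(income) / len(income)


def fetch_record_row_store(rows, index):
    """Fetch one record: a single lookup in the row layout."""
    return rows[index]


def fetch_record_column_store(columns, index):
    """Fetch one record: one lookup per column in the column layout."""
    return {name: values[index] for name, values in columns.items()}


if __name__ == "__main__":
    print(mean_income_row_store(ROWS))
    print(mean_income_column_store(COLUMNS))
    print(fetch_record_column_store(COLUMNS, 1))
```

Both layouts give the same answers. The difference is how much data has to be read: the column layout only touches the one list it needs for an aggregate, while the row layout is the cheaper one for pulling out a single complete record.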

It is also embedded: it runs inside your application's own process, so installing it is as easy as running pip install duckdb. There is no need to set up a dedicated server, which makes prototyping and development quick and easy. I've been looking for something like this for a while to support applications that use machine learning.

Here is a little code snippet that shows how to make a table from a .csv file.

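import duckdb

# Open (or create) the database file test.db in read-write mode
con = duckdb.connect(database='test.db', read_only=False)
# Create a table from the CSV, letting DuckDB infer column names and types
con.execute("CREATE TABLE test AS SELECT * FROM read_csv_auto('test.csv');")
# Query the new table and print every row
con.execute("SELECT * FROM test;")
print(con.fetchall())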

If you are looking for a database to support your ML project, give it a go.
