Part 6: Querying Basics
Mastering SQL in ClickHouse: Aggregations, PREWHERE, and performance tips.
If you know SQL, you know ClickHouse. It supports standard SQL (ANSI 2011), but with some superpowers.
The Secret Weapon: PREWHERE
In standard SQL, you use WHERE to filter rows. In ClickHouse,
there is a more efficient variation called PREWHERE.
How it works
PREWHERE filters data before reading the rest
of the columns. It reads only the columns needed for filtering, drops
the non-matching rows, and then reads the other columns.
Good news: ClickHouse automatically moves conditions from WHERE to PREWHERE when possible!
Aggregations on Steroids
ClickHouse is an OLAP database, so it shines at aggregations.
SELECT toStartOfHour(event_time) AS hour, count() AS total_events, uniq(user_id) AS unique_users, avg(duration) AS avg_duration FROM events GROUP BY hour ORDER BY hour DESC LIMIT 10;
count(): Faster than
count(*).uniq(): Approximate unique count (HyperLogLog). Extremely fast and memory efficient.
uniqExact(): Exact unique count. Slower and uses more memory.
Date and Time Functions
ClickHouse has a rich library of functions for working with time series data.
toStartOfHour,toStartOfDay,toStartOfMonthformatDateTimedateDiff
Conclusion
Writing queries in ClickHouse feels familiar, but the performance is
unlike anything else. By understanding how ClickHouse executes
queries (like PREWHERE), you can write even faster SQL.
Next, we’ll look at some advanced features like arrays and maps.