ClickHouse is a fast open-source OLAP database management system

It is column-oriented and lets you generate analytical reports using SQL queries in real time, with sub-second latencies.

Upcoming Event

October 22, 2020  ClickHouse virtual office hours

ClickHouse works 100-1000x faster than traditional approaches

ClickHouse's performance exceeds that of comparable column-oriented database management systems currently available on the market. It processes hundreds of millions to more than a billion rows and tens of gigabytes of data per server per second.

Detailed comparison

Independent benchmarks

Why ClickHouse might be the right choice

Blazing fast

ClickHouse uses all available hardware to its full potential to process each query as fast as possible. Peak processing performance for a single query stands at more than 2 terabytes per second (after decompression, counting only the columns used). In a distributed setup, reads are automatically balanced among healthy replicas to avoid increasing latency.

Fault-tolerant

ClickHouse supports multi-master asynchronous replication and can be deployed across multiple datacenters. All nodes are equal, which avoids single points of failure. Downtime of a single node or of a whole datacenter won't affect the system's availability for either reads or writes.

Easy to use

ClickHouse is simple and works out of the box. It streamlines all your data processing: ingest all your structured data into the system and it becomes instantly available for building reports. The SQL dialect lets you express the desired result without resorting to the custom non-standard APIs found in some alternative systems.

Highly reliable

The ClickHouse DBMS can be configured as a purely distributed system located on independent nodes, without any single point of failure. It also includes many enterprise-grade security features and fail-safe mechanisms against human error.

Success stories

Hardware efficient

ClickHouse processes typical analytical queries two to three orders of magnitude faster than traditional row-oriented systems with the same available I/O throughput and CPU capacity. The columnar storage format fits more hot data in RAM, which leads to shorter typical response times.

Total cost of ownership can be lowered further by using commodity hardware with rotating disk drives instead of enterprise-grade NVMe or SSD, without significant sacrifices in latency for most kinds of queries.

Strives for CPU efficiency

Vectorized query execution uses the relevant SIMD processor instructions and runtime code generation. Processing data in columns increases the CPU cache hit rate.
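To illustrate why processing data in columns helps, here is a minimal sketch, not ClickHouse's implementation. It assumes a hypothetical three-column table of page-view events (the column names and sizes are invented for the example) and compares how many bytes an aggregate over one column has to touch in a row layout versus a column layout.

```python
import struct
from array import array

# Hypothetical table: (user_id: uint64, duration_ms: uint32, url_hash: uint64).
ROW_FORMAT = "<QIQ"                      # one packed row, 20 bytes, no padding
ROW_SIZE = struct.calcsize(ROW_FORMAT)

rows = [(i, i % 1000, i * 7919) for i in range(10_000)]

# Row-oriented layout: all fields of a row are stored next to each other.
row_store = b"".join(struct.pack(ROW_FORMAT, *r) for r in rows)

# Column-oriented layout: each column is one contiguous typed array.
column_store = {
    "user_id": array("Q", (r[0] for r in rows)),
    "duration_ms": array("I", (r[1] for r in rows)),
    "url_hash": array("Q", (r[2] for r in rows)),
}

def sum_duration_row_store(buf):
    """Sum duration_ms by walking every row; every byte of each row is read."""
    total = 0
    for offset in range(0, len(buf), ROW_SIZE):
        _, duration, _ = struct.unpack_from(ROW_FORMAT, buf, offset)
        total += duration
    return total, len(buf)

def sum_duration_column_store(columns):
    """Sum duration_ms by scanning only that column's contiguous array."""
    col = columns["duration_ms"]
    return sum(col), len(col) * col.itemsize

row_total, row_bytes = sum_duration_row_store(row_store)
col_total, col_bytes = sum_duration_column_store(column_store)
```

Both functions return the same total, but the column layout touches only the bytes of the one column the query needs (roughly a fifth of the row layout here), and those bytes are adjacent in memory.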

Optimizes disk drive access

ClickHouse minimizes the number of seeks for range queries, which makes rotational disk drives more efficient to use, because it maintains locality of reference for contiguously stored data.
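The idea can be sketched as follows, under stated assumptions: rows are kept sorted by a unique key, and a small sparse index remembers only the first key of every block. The block size, the keys and the function names are hypothetical, and this is an illustration of the general technique rather than ClickHouse's storage engine.

```python
from bisect import bisect_right

GRANULE = 4  # rows per block; hypothetical and tiny for readability

# Rows are stored sorted by key, so neighbouring keys are physically adjacent.
# Keys are assumed to be unique in this sketch.
sorted_keys = [3, 5, 8, 13, 21, 22, 30, 41, 47, 52, 60, 75]

# Sparse index: only the first key of every block is kept in memory.
marks = sorted_keys[::GRANULE]           # [3, 21, 47]

def blocks_for_range(lo, hi):
    """Return (first_block, last_block) that may contain keys in [lo, hi]."""
    first = max(bisect_right(marks, lo) - 1, 0)
    last = max(bisect_right(marks, hi) - 1, 0)
    return first, last

def range_query(lo, hi):
    """One seek to the first block, then one sequential read of adjacent blocks."""
    first, last = blocks_for_range(lo, hi)
    chunk = sorted_keys[first * GRANULE:(last + 1) * GRANULE]  # contiguous slice
    return [k for k in chunk if lo <= k <= hi]
```

For example, `range_query(20, 45)` reads blocks 0 and 1 as a single contiguous slice and filters the result, instead of seeking once per matching row.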

Minimizes data transfers

ClickHouse enables companies to manage their data and create reports without using specialized networks that are aimed at high-performance computing.

ClickHouse doesn't slow down

Feature-rich SQL database

1. User-friendly SQL dialect

ClickHouse features a SQL query dialect with a number of built-in analytics capabilities. In addition to common functions that could be found in most DBMS, ClickHouse comes with a lot of domain-specific functions and features for OLAP scenarios out of the box.

2. Efficient managing of denormalized data

The column-oriented nature of ClickHouse allows tables to have hundreds or thousands of columns without slowing down SELECT queries. It's possible to pack in even more data by using a wide range of data-organizing options, such as arrays, tuples and nested data structures.
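As a rough illustration of how arrays and nested structures fit into a single denormalized row, the sketch below models a nested structure as parallel arrays of equal length and expands it into flat rows, similar in spirit to what an ARRAY JOIN clause does. The `visits` row, its field names and the `unnest` helper are hypothetical and exist only for this example.

```python
# One denormalized row of a hypothetical `visits` table.
visit = {
    "visit_id": 42,
    "tags": ["mobile", "returning"],     # plain array column
    "goals.id": [7, 9, 11],              # nested structure modelled as
    "goals.price": [0.0, 19.99, 5.0],    # parallel arrays of equal length
}

def unnest(row, prefix):
    """Expand the nested structure `prefix` into one flat row per element."""
    nested = {k: v for k, v in row.items() if k.startswith(prefix + ".")}
    scalars = {k: v for k, v in row.items() if not isinstance(v, list)}
    lengths = {len(v) for v in nested.values()}
    if len(lengths) > 1:
        raise ValueError("arrays of one nested structure must have equal length")
    n = lengths.pop() if lengths else 0
    return [
        {**scalars, **{k: v[i] for k, v in nested.items()}}
        for i in range(n)
    ]

goal_rows = unnest(visit, "goals")
```

Each resulting row carries the scalar fields of the visit plus one `goals.id` / `goals.price` pair, so per-goal analysis needs no separate table or join.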

3. Join distributed or co-located data

ClickHouse provides various options for joining tables. Joins can be cluster-local, and they can also access data stored in external systems. There is also support for external dictionaries, which provide an alternative, simpler syntax for accessing data from an outside source.

4. Approximate query processing

Users can control the trade-off between result accuracy and query execution time, which is handy when dealing with multiple terabytes or petabytes of data. ClickHouse also provides probabilistic data structures for fast and memory-efficient calculation of cardinalities and quantiles.
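To give a feel for how a probabilistic data structure trades accuracy for memory, here is a minimal sketch of a k-minimum-values cardinality estimator. It is a generic textbook technique chosen for brevity, not a description of the algorithms ClickHouse uses internally; the function names and the default `k` are assumptions made for the example.

```python
import hashlib
import heapq

def hash01(value):
    """Map a value to a pseudo-uniform float in [0, 1) using a stable hash."""
    digest = hashlib.sha256(str(value).encode()).digest()
    return int.from_bytes(digest[:6], "big") / 2**48

def approx_distinct(values, k=1024):
    """Estimate the number of distinct values while remembering at most k hashes."""
    smallest = set()   # the k smallest distinct hashes seen so far
    heap = []          # the same hashes, negated, so heap[0] is the largest one
    for v in values:
        h = hash01(v)
        if h in smallest:
            continue
        if len(smallest) < k:
            smallest.add(h)
            heapq.heappush(heap, -h)
        elif h < -heap[0]:
            # Replace the current largest kept hash with the smaller new one.
            evicted = -heapq.heappushpop(heap, -h)
            smallest.discard(evicted)
            smallest.add(h)
    if len(smallest) < k:
        return len(smallest)             # fewer than k distinct values: exact
    kth_smallest = -heap[0]
    # If n uniform hashes are spread over [0, 1), the k-th smallest sits
    # near k / n, so n is approximately (k - 1) / kth_smallest.
    return int((k - 1) / kth_smallest)
```

Memory use is bounded by `k` regardless of input size, and a larger `k` gives a tighter estimate at the cost of more memory, which is the accuracy versus resources trade-off described above.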

ClickHouse. Just makes you think faster!

  • True column-oriented storage
  • Vectorized query execution
  • Parallel and distributed query execution
  • Real-time query processing
  • Real-time data ingestion
  • On-disk locality of reference
  • Secondary data-skipping indexes
  • Data compression
  • Hot and cold storage separation
  • SQL support
  • JSON documents query functions
  • Features for web and mobile analytics
  • High availability
  • Cross-datacenter replication
  • Local and distributed joins
  • Adaptive join algorithm
  • Pluggable external dimension tables
  • Arrays and nested data types
  • Focus on OLAP workloads
  • S3-compatible object storage support
  • Hadoop, MySQL, Postgres integration
  • Approximate query processing
  • Probabilistic data structures
  • Full support of IPv6
  • State-of-the-art algorithms
  • Detailed documentation
  • Clean documented code

Linearly scalable

ClickHouse scales well both vertically and horizontally. It adapts easily to run on a cluster with hundreds or thousands of nodes, on a single server, or even on a tiny virtual machine. Currently, there are installations with multiple trillion rows or hundreds of terabytes of data per single node.

There are many ClickHouse clusters consisting of several hundred nodes, including a few clusters at Yandex Metrica, while the largest known ClickHouse cluster is well over a thousand nodes.

When to use ClickHouse

For analytics over a stream of clean, well-structured and immutable events or logs. It is recommended to put each such stream into a single wide fact table with pre-joined dimensions.

  • ✓ Web and App analytics
  • ✓ Advertising networks and RTB
  • ✓ Telecommunications
  • ✓ E-commerce and finance
  • ✓ Information security
  • ✓ Monitoring and telemetry
  • ✓ Time series
  • ✓ Business intelligence
  • ✓ Online games
  • ✓ Internet of Things

When NOT to use ClickHouse

  • ✕ Transactional workloads (OLTP)
  • ✕ Key-value requests with a high rate
  • ✕ Blob or document storage
  • ✕ Over-normalized data

Quick start

System requirements for pre-built packages: Linux, x86_64 with SSE 4.2.

From DEB packages:

sudo apt-get install apt-transport-https ca-certificates dirmngr
sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv E0C56BD4

echo "deb https://repo.clickhouse.tech/deb/stable/ main/" | sudo tee \
    /etc/apt/sources.list.d/clickhouse.list
sudo apt-get update

sudo apt-get install -y clickhouse-server clickhouse-client

sudo service clickhouse-server start
clickhouse-client

From RPM packages:

sudo yum install yum-utils
sudo rpm --import https://repo.clickhouse.tech/CLICKHOUSE-KEY.GPG
sudo yum-config-manager --add-repo https://repo.clickhouse.tech/rpm/clickhouse.repo
sudo yum install clickhouse-server clickhouse-client

sudo /etc/init.d/clickhouse-server start
clickhouse-client

From tgz archives:

export LATEST_VERSION=$(curl -s https://repo.clickhouse.tech/tgz/stable/ | \
    grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -V -r | head -n 1)
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-common-static-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-common-static-dbg-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-server-$LATEST_VERSION.tgz
curl -O https://repo.clickhouse.tech/tgz/stable/clickhouse-client-$LATEST_VERSION.tgz

tar -xzvf clickhouse-common-static-$LATEST_VERSION.tgz
sudo clickhouse-common-static-$LATEST_VERSION/install/doinst.sh

tar -xzvf clickhouse-common-static-dbg-$LATEST_VERSION.tgz
sudo clickhouse-common-static-dbg-$LATEST_VERSION/install/doinst.sh

tar -xzvf clickhouse-server-$LATEST_VERSION.tgz
sudo clickhouse-server-$LATEST_VERSION/install/doinst.sh
sudo /etc/init.d/clickhouse-server start

tar -xzvf clickhouse-client-$LATEST_VERSION.tgz
sudo clickhouse-client-$LATEST_VERSION/install/doinst.sh

For other operating systems, the easiest way to get started is to use the official Docker images of ClickHouse, though this is not the only option. Alternatively, you can easily get a running ClickHouse instance or cluster at Yandex Managed Service for ClickHouse.

Once you are connected to your ClickHouse server, you can proceed to the Tutorial or the Documentation.

ClickHouse community

Like ClickHouse?

Help spread the word about it via Facebook, Twitter and LinkedIn!

Hosting ClickHouse Meetups

ClickHouse meetups are essential for strengthening the community worldwide, but they wouldn't be possible without the help of local organizers. Please fill out this form if you want to become one or want to meet the ClickHouse core team for any other reason.

If you have any more thoughts or questions, feel free to contact the Yandex ClickHouse team directly by email.