Introduction to ClickHouse
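ClickHouse is an open-source analytical database from Yandex, well suited to time-series data that is ingested as a stream or in batches. It should not be used as a general-purpose database. It is a distributed, real-time processing platform built for very fast queries over massive data sets, and it is especially fast at aggregation queries such as GROUP BY.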
ClickHouse = Click Event Stream + Data WareHouse
ClickHouse is a column-oriented database management system (DBMS) for online analytical processing of queries (OLAP).
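Characteristics of OLAP scenarios
· The vast majority of requests are reads
· Data is always written in fairly large batches (> 1000 rows)
· Data that has been added is not modified
· Each query reads a large number of rows from the database, but only a small number of columns
· Tables are wide, meaning each table contains a large number of columns
· Queries are relatively infrequent (usually hundreds of queries per second per server, or fewer)
· For simple queries, a latency of around 50 ms is acceptable
· Column values are fairly small: numbers and short strings (for example, 60 bytes per URL)
· Processing a single query requires high throughput (up to billions of rows per second per server)
· Transactions are not required
· Requirements for data consistency are low
· Each query involves one large table; all other tables in the query are small
· The query result is significantly smaller than the source data; in other words, the data is filtered or aggregated so that the result fits in the memory of a single server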
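To make the "many rows, few columns" point concrete, here is a minimal sketch in Python. It is a toy model and not how ClickHouse is implemented: the table, the column names, and the `values_touched` counter are all assumptions made for illustration. It compares how many values an aggregation has to visit in a row-oriented layout versus a column-oriented one.

```python
# A toy "wide" table with 5 columns; the query below needs only one of them.
# All names and values are made up for illustration.
rows = [
    {"user_id": 1, "url": "/home", "country": "RU", "duration_ms": 120, "bytes": 5120},
    {"user_id": 2, "url": "/docs", "country": "DE", "duration_ms": 340, "bytes": 20480},
    {"user_id": 1, "url": "/docs", "country": "RU", "duration_ms": 95, "bytes": 10240},
    {"user_id": 3, "url": "/home", "country": "US", "duration_ms": 210, "bytes": 4096},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list per column (column-oriented layout)."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def sum_row_oriented(rows, column):
    """Row layout: each whole row is loaded, so all of its values count as touched."""
    total = 0
    values_touched = 0
    for row in rows:
        values_touched += len(row)
        total += row[column]
    return total, values_touched


def sum_column_oriented(columns, column):
    """Column layout: only the values of the requested column are touched."""
    data = columns[column]
    return sum(data), len(data)


columns = to_columns(rows)
row_total, row_touched = sum_row_oriented(rows, "duration_ms")
col_total, col_touched = sum_column_oriented(columns, "duration_ms")

print(row_total, row_touched)  # 765 20
print(col_total, col_touched)  # 765 4
```

Both layouts give the same sum, but the column layout visits 4 values instead of 20. The gap grows with the number of columns in the table, which is why wide tables queried on a few columns favor a column-oriented store.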
Official site: https://clickhouse.tech/
Documentation: https://clickhouse.tech/docs/en/
GitHub repository: https://github.com/ClickHouse/ClickHouse
Source code browser: https://clickhouse.tech/codebrowser/html_report/ClickHouse/src/index.html
Installation
https://clickhouse.tech/docs/en/getting-started/install/
Quick start
Creating a table:
CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
(
    name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
    name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
    ...
    INDEX index_name1 expr1 TYPE type1(...) GRANULARITY value1,
    INDEX index_name2 expr2 TYPE type2(...) GRANULARITY value2
) ENGINE = MergeTree()
ORDER BY expr
[PARTITION BY expr]
[PRIMARY KEY expr]
[SAMPLE BY expr]
[TTL expr
    [DELETE|TO DISK 'xxx'|TO VOLUME 'xxx' [, ...] ]
    [WHERE conditions]
    [GROUP BY key_expr [SET v1 = aggr_func(v1) [, v2 = aggr_func(v2) ...]] ] ]
[SETTINGS name=value, ...]
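The template above is easier to read once its placeholders are filled in. The following sketch renders one concrete statement from it in Python. `render_create_table` is a hypothetical helper written for this example, and the `page_views` table, its columns, the 90-day TTL, and the settings value are assumptions. The helper only builds a string: it does not connect to a server and does not validate the SQL.

```python
def render_create_table(table, columns, order_by,
                        partition_by=None, ttl=None, settings=None):
    """Render a MergeTree CREATE TABLE statement in the template's clause order.

    Hypothetical helper for illustration only: it builds a string and
    neither validates the SQL nor talks to a ClickHouse server.
    """
    column_lines = ",\n".join(f"    {name} {type_}" for name, type_ in columns)
    lines = [
        f"CREATE TABLE IF NOT EXISTS {table}",
        "(",
        column_lines,
        ") ENGINE = MergeTree()",
        f"ORDER BY ({', '.join(order_by)})",  # the only clause the template does not bracket
    ]
    # The remaining clauses are optional in the template, so emit them only if given.
    if partition_by:
        lines.append(f"PARTITION BY {partition_by}")
    if ttl:
        lines.append(f"TTL {ttl}")
    if settings:
        rendered = ", ".join(f"{key} = {value}" for key, value in settings.items())
        lines.append(f"SETTINGS {rendered}")
    return "\n".join(lines)


# Example table (assumed): one row per page view, partitioned by month,
# sorted by date and user, with rows deleted 90 days after event_date.
ddl = render_create_table(
    table="page_views",
    columns=[
        ("event_date", "Date"),
        ("user_id", "UInt64"),
        ("url", "String"),
        ("duration_ms", "UInt32"),
    ],
    order_by=["event_date", "user_id"],
    partition_by="toYYYYMM(event_date)",
    ttl="event_date + INTERVAL 90 DAY DELETE",
    settings={"index_granularity": 8192},
)
print(ddl)
# CREATE TABLE IF NOT EXISTS page_views
# (
#     event_date Date,
#     user_id UInt64,
#     url String,
#     duration_ms UInt32
# ) ENGINE = MergeTree()
# ORDER BY (event_date, user_id)
# PARTITION BY toYYYYMM(event_date)
# TTL event_date + INTERVAL 90 DAY DELETE
# SETTINGS index_granularity = 8192
```

The printed statement can be pasted into a ClickHouse client. Only ORDER BY is emitted unconditionally, matching the template, where every other clause after ENGINE is shown in brackets as optional.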
System architecture
Source code walkthrough (top-level directories under src/):
· Access/
· AggregateFunctions/
· Bridge/
· Client/
· Columns/
· Common/
· Compression/
· Coordination/
· Core/
· DataStreams/
· DataTypes/
· Databases/
· Dictionaries/
· Disks/
· Formats/
· Functions/
· IO/
· Interpreters/
· Parsers/
· Processors/
· Server/
· Storages/
· TableFunctions/