I. Overview
ClickHouse is a database management system with its own architectural approach to reads and writes, storage, queries, mutation, replication, transactions, and efficiency. It is column-oriented: each column is the minimal storage unit, so a query reads only the columns it needs, which reduces I/O volume and improves performance. It is best suited to analyzing structured, clean, immutable streaming data, which makes it a strong OLAP choice.
II. Installation and Deployment
ClickHouse distributes data through shards plus replicas; consistency, high availability, and fault tolerance are coordinated through ZooKeeper. For production we therefore use four machines to build the cluster (2 shards × 2 replicas). The detailed deployment steps are as follows:
1. Check whether the machines support SSE 4.2
grep -q sse4_2 /proc/cpuinfo && echo "SSE 4.2 supported" || echo "SSE 4.2 not supported"
2. Install clickhouse-server and clickhouse-client via yum (on every machine). The installation may require a glibc upgrade; see https://cloud.tencent.com/developer/article/1463094 for a workaround.
yum install yum-utils
rpm --import https://repo.yandex.ru/clickhouse/CLICKHOUSE-KEY.GPG
yum-config-manager --add-repo https://repo.yandex.ru/clickhouse/rpm/stable/x86_64
yum install clickhouse-server clickhouse-client
3. Edit the default configuration file config.xml
<!-- Logging -->
<logger>
<!-- Possible levels: https://github.com/pocoproject/poco/blob/develop/Foundation/include/Poco/Logger.h#L105 -->
<level>trace</level>
<log>/home/hadoop/logs/clickhouse-server/clickhouse-server.log</log>
<errorlog>/home/hadoop/logs/clickhouse-server/clickhouse-server.err.log</errorlog>
<size>1000M</size>
<count>10</count>
<!-- <console>1</console> --> <!-- Default behavior is autodetection (log to console if not daemon mode and is tty) -->
</logger>
<!-- Ports -->
<http_port>8123</http_port>
<tcp_port>9000</tcp_port>
<!-- Port and address for inter-server (replica) communication;
     set interserver_http_host to this node's own address on each machine -->
<interserver_http_port>9009</interserver_http_port>
<interserver_http_host>192.168.16.1</interserver_http_host>
<listen_host>0.0.0.0</listen_host>
<!-- Data paths and access control -->
<path>/home/hadoop/data/clickhouse/</path>
<tmp_path>/home/hadoop/data/clickhouse/tmp/</tmp_path>
<user_files_path>/home/hadoop/data/clickhouse/user_files/</user_files_path>
<users_config>users.xml</users_config>
<!-- Default profile of settings. -->
<default_profile>default</default_profile>
<!-- Cluster configuration; these sections are included from metrika.xml -->
<remote_servers incl="clickhouse_remote_servers" />
<zookeeper incl="zookeeper-servers" optional="true" />
<macros incl="macros" optional="true" />
4. Add the cluster configuration file metrika.xml
<yandex>
<!-- ClickHouse cluster nodes -->
<clickhouse_remote_servers>
<hadooptest_clusters>
<!-- Shard 1 -->
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.16.1</host>
<port>9000</port>
</replica>
</shard>
<!-- Shard 2 -->
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>192.168.16.2</host>
<port>9010</port>
</replica>
</shard>
</hadooptest_clusters>
</clickhouse_remote_servers>
<!-- ZooKeeper configuration -->
<zookeeper-servers>
<node index="1">
<host>192.168.16.1</host>
<port>2281</port>
</node>
<node index="2">
<host>192.168.16.2</host>
<port>2281</port>
</node>
<node index="3">
<host>192.168.16.24</host>
<port>2281</port>
</node>
</zookeeper-servers>
<!-- Macros are per-node: each machine gets its own shard number and replica name -->
<macros>
<layer>hadooptest_clusters</layer>
<shard>01</shard> <!-- shard number -->
<replica>192.168.16.1</replica> <!-- IP of the current node -->
</macros>
<networks>
<ip>::/0</ip>
</networks>
<!-- Compression settings -->
<clickhouse_compression>
<case>
<min_part_size>10000000000</min_part_size>
<min_part_size_ratio>0.01</min_part_size_ratio>
<method>lz4</method> <!-- lz4 compresses faster than zstd but produces larger files on disk -->
</case>
</clickhouse_compression>
</yandex>
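After metrika.xml is in place and clickhouse-server has been restarted on every node, the cluster definition can be sanity-checked from clickhouse-client on any node; the cluster name is the one declared above:

```sql
-- Every shard/replica declared in metrika.xml should come back as one row.
SELECT cluster, shard_num, replica_num, host_address, port
FROM system.clusters
WHERE cluster = 'hadooptest_clusters';
```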
5. Add the disk configuration for hot/cold data separation (in a real deployment the fast and normal disks should point at different mount points, e.g. SSD vs. HDD; the listing below reuses one path)
<yandex>
<storage_configuration>
<disks>
<default>
<keep_free_space_bytes>1024</keep_free_space_bytes>
</default>
<fast>
<path>/home/hadoop/data/clickhouse/</path>
</fast>
<normal>
<path>/home/hadoop/data/clickhouse/</path>
<keep_free_space_bytes>10485760</keep_free_space_bytes>
</normal>
</disks>
<policies>
<ssd_and_hdd>
<volumes>
<hot>
<disk>fast</disk>
<max_data_part_size_bytes>1073741824</max_data_part_size_bytes>
</hot>
<cold>
<disk>normal</disk>
</cold>
</volumes>
<move_factor>0.2</move_factor>
</ssd_and_hdd>
</policies>
</storage_configuration>
</yandex>
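A table opts into this policy through the storage_policy setting. A minimal sketch (the table and columns are made up for illustration): new parts land on the hot volume (the fast disk), and parts move to cold once they exceed max_data_part_size_bytes or free space on hot drops below move_factor:

```sql
CREATE TABLE test.events_local
(
    event_date Date,
    user_id    UInt64
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id)
SETTINGS storage_policy = 'ssd_and_hdd';
```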
6. Configure users and permissions
<?xml version="1.0"?>
<yandex>
<!-- Profiles of settings. -->
<profiles>
<!-- Default settings. -->
<default>
<!-- Maximum memory usage for processing a single query, in bytes. -->
<max_memory_usage_for_all_queries>101310968832</max_memory_usage_for_all_queries>
<max_bytes_before_external_group_by>50655484416</max_bytes_before_external_group_by>
<max_memory_usage>101310968832</max_memory_usage>
<distributed_aggregation_memory_efficient>1</distributed_aggregation_memory_efficient>
<use_uncompressed_cache>0</use_uncompressed_cache>
<load_balancing>random</load_balancing>
</default>
<readonly>
<readonly>1</readonly>
</readonly>
</profiles>
<quotas>
<default>
<interval>
<duration>3600</duration>
<queries>0</queries>
<errors>0</errors>
<result_rows>0</result_rows>
<read_rows>0</read_rows>
<execution_time>0</execution_time>
</interval>
</default>
</quotas>
<users>
<!-- If user name was not specified, 'default' user is used. -->
<default>
<password>MVZqc4ne</password>
<networks incl="networks" replace="replace">
<ip>::/0</ip>
</networks>
<!-- Settings profile for user. -->
<profile>default</profile>
<!-- Quota for user. -->
<quota>default</quota>
<!-- For testing the table filters -->
<databases>
<test>
<!-- Simple expression filter -->
<filtered_table1>
<filter>a = 1</filter>
</filtered_table1>
<!-- Complex expression filter -->
<filtered_table2>
<filter>a + b &lt; 1 or c - d &gt; 5</filter>
</filtered_table2>
<filtered_table3>
<filter>c = 1</filter>
</filtered_table3>
</test>
</databases>
</default>
<bi_test>
<password_sha256_hex>645cb15583a65c5b7d89f02b37a97fe162e79dafdacb7450de2a679ff602c9ea</password_sha256_hex>
<networks incl="networks" replace="replace">
<ip>::/0</ip>
</networks>
<profile>default</profile>
<quota>default</quota>
</bi_test>
</users>
</yandex>
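The plaintext &lt;password&gt; used for the default user is readable by anyone who can open users.xml; the &lt;password_sha256_hex&gt; form used for bi_test is safer. A digest can be generated with standard tools (a sketch; any SHA-256 utility works):

```shell
# Generate a random password and the SHA-256 hex digest to paste into
# <password_sha256_hex>; keep the plaintext somewhere safe.
PASSWORD=$(head -c 18 /dev/urandom | base64)
echo "plaintext: $PASSWORD"
printf '%s' "$PASSWORD" | sha256sum | awk '{print $1}'
```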
7. ZooKeeper is required for replication coordination, so a standalone ZooKeeper ensemble is installed here.
7.1. Copy the Kafka installation package (which ships with ZooKeeper) from the production Tencent Cloud machines to the new machines.
7.2. Configure the zoo.cfg file
# The number of milliseconds of each tick
tickTime=2000
initLimit=10
# syncLimit is required when running a multi-node ensemble
syncLimit=5
dataDir=/home/hadoop2/zookeeper
clientPort=2181
minSessionTimeout=6000
maxSessionTimeout=180000
autopurge.snapRetainCount=10
autopurge.purgeInterval=1
server.21=hadoop1:2888:3888
server.22=hadoop2:2888:3888
server.23=hadoop3:2888:3888
maxClientCnxns=500
7.3. Start ZooKeeper
bin/zkServer.sh start
8. Start the clickhouse-server service
service clickhouse-server start
9. Check the logs for errors, then connect with clickhouse-client and test table creation, data loading, and queries.
clickhouse-client -h 192.168.16.1 --port 9000 -u bi_test --password Ny3jTUoTQUlQAb4i
hadoop2 :) select count(*) from dws_sb_olap_user_basic_1d;
SELECT count(*)
FROM dws_sb_olap_user_basic_1d
┌─count()─┐
│ 475835 │
└─────────┘
1 rows in set. Elapsed: 0.005 sec. Processed 475.83 thousand rows, 475.83 KB (92.74 million rows/s., 92.74 MB/s.)
III. ClickHouse Table Engines
The table engine determines how data is stored and read, and therefore the I/O efficiency. In particular, the engine mainly determines:
Where data is stored and read from
Which kinds of queries are supported
Whether concurrent data access is possible
Whether indexes can be used
Whether multi-threaded request execution is possible
Which parameters are used for data replication
Three engines are covered here: MergeTree, ReplicatedMergeTree, and Distributed.
- MergeTree
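As a sketch of how the three engines combine on this cluster (the database and table names are hypothetical; the {layer}, {shard}, and {replica} placeholders are filled from the &lt;macros&gt; section of metrika.xml on each node, and a plain MergeTree table is the same DDL without the replication arguments):

```sql
-- Replicated local table, created on every node of the cluster.
CREATE TABLE test.events_local ON CLUSTER hadooptest_clusters
(
    event_date Date,
    user_id    UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{layer}-{shard}/events_local', '{replica}')
PARTITION BY toYYYYMM(event_date)
ORDER BY (event_date, user_id);

-- Distributed table: a proxy over events_local on all shards,
-- spreading inserts across shards by rand().
CREATE TABLE test.events AS test.events_local
ENGINE = Distributed(hadooptest_clusters, test, events_local, rand());
```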
IV. Monitoring and Maintenance
1. Monitoring
Monitoring is implemented with clickhouse_exporter + Prometheus + Grafana.
1.1. Install Docker on the CentOS 7 machines
# Install
yum -y install docker
# Start
service docker start
# Verify
docker version
docker run hello-world
1.2. Rebuild the clickhouse_exporter image (the relevant environment variables need to be changed)
# Write a Dockerfile that passes the credentials in
# Dockerfile
FROM docker.io/f1yegor/clickhouse-exporter
ADD clickhouse_exporter_start.sh /opt/clickhouse_exporter_start.sh
ENTRYPOINT ["/opt/clickhouse_exporter_start.sh"]
#clickhouse_exporter_start.sh
#!/bin/sh
export CLICKHOUSE_USER=default
export CLICKHOUSE_PASSWORD=**********
/usr/local/bin/clickhouse_exporter $*
# Build the image
docker build -t ck_clickhouse_exporter .
# List the images
docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
ck_clickhouse_exporter latest 3384e729d116 23 hours ago 19.6 MB
docker.io/f1yegor/clickhouse-exporter latest 9d9bfc1c7cb2 9 months ago 19.6 MB
docker.io/hello-world latest fce289e99eb9 11 months ago 1.84 kB
1.3. Start one exporter container per ClickHouse node
docker run -d -p 9116:9116 ck_clickhouse_exporter -scrape_uri=http://ip1:8123/
docker run -d -p 9117:9116 ck_clickhouse_exporter -scrape_uri=http://ip2:8123/
docker run -d -p 9118:9116 ck_clickhouse_exporter -scrape_uri=http://ip3:8123/
docker run -d -p 9119:9116 ck_clickhouse_exporter -scrape_uri=http://ip4:8123/
1.4. Edit the Prometheus configuration file and add the exporters started above.
- job_name: 'clickhouse'
  scrape_interval: 30s
  static_configs:
    - targets: ['ck1:9116','ck1:9117','ck1:9118','ck1:9119']
      labels:
        env: 'tx_clickhouse'
1.5. Import a dashboard into Grafana
Import an open-source dashboard into Grafana first, then adjust it as needed.
1.6. Set up alerting with Prometheus Alertmanager.
Configure the rules:
groups:
- name: ClickHouse alerting rules
  rules:
  - alert: "ClickHouse instance status alert"
    expr: clickhouse_version_integer != 19016003
    for: 3m
    labels:
      severity: critical
    annotations:
      summary: "ClickHouse instance abnormal"
      description: "ClickHouse instance {{$labels.instance}} is in an abnormal state; current value: {{ $value }}"
  - alert: "ClickHouse query alert"
    expr: clickhouse_memory_tracking > 32212254720
    for: 3m
    labels:
      severity: critical
    annotations:
      summary: "ClickHouse query memory abnormal"
      description: "ClickHouse instance {{$labels.exported_instance}} query memory exceeds 30 GB; current value: {{ $value }}"
More alerts will be added as needed.
1.7. Write an Alertmanager webhook service that delivers alerts via email or SMS.
2. Scaling
2.1. CPU/memory: these are cloud hosts, so an in-place upgrade can be requested directly.
2.2. Disk: after adding disks, either configure data placement as in the hot/cold separation setup above, or manually move a table partition to another disk or volume:
ALTER TABLE table_name MOVE PARTITION|PART partition_expr TO DISK|VOLUME 'disk_name'
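For example, assuming the ssd_and_hdd policy above and a hypothetical table partitioned by month, one partition can be pinned to the cold volume like this:

```sql
ALTER TABLE test.events_local MOVE PARTITION '202001' TO VOLUME 'cold';
```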