[Featured] After a hardware restart, ClickHouse takes as long as 10 minutes to load data
Posted 2 months ago · by helloboy · 505 views · in Q&A

After a hardware restart, ClickHouse takes as long as 10 minutes to load its data. There are 9 tables and over 600 GB of data in total. How can this be optimized? Any pointers from the experts would be appreciated.

13 replies

Post your CREATE TABLE statements so we can take a look.

What is your hardware configuration?

Check with the top command: is the bottleneck CPU, memory, or disk?

My suspicion is that an unsuitable partition key was used, resulting in far too many partitions.
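One way to check this is to count the active parts and partitions per table from system.parts. A minimal sketch (the database name 'default' is a placeholder, swap in your own):

	SELECT
	    table,
	    count()              AS active_parts,
	    uniqExact(partition) AS partitions
	FROM system.parts
	WHERE active AND database = 'default'  -- 'default' is a placeholder
	GROUP BY table
	ORDER BY partitions DESC;

At startup ClickHouse has to load metadata for every part, so tables with thousands of small parts make restarts slow.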

Take a look at the log output; it is probably spending the time on a lot of verification work.

It runs on a server with 2 physical CPUs, 16 physical cores in total, and 32 hyperthreads; 8 of them are assigned to ClickHouse (screenshot 1: QQ图片20180607102758.png). It has 64 GB of RAM and the disk is only 30% used. The second screenshot (QQ图片20180607103631.png) shows the log output while the data is loading; from the log, the tables are loaded sequentially, one after another. One more thing: every table is partitioned by hour.

[CLICK_DB_TCP_SESSION]= {
	.time_interval = 3600,
	.data_save_time = 3600 * 24 * 90,
	.table_name = "t_log_tcp_session",
	.create_db_table = "create table if not exists t_log_tcp_session(\
			link_id UInt8,\
			user_name String,\
			src_mac UInt64,\
			dst_mac UInt64,\
			src_ip UInt32,\
			dst_ip UInt32,\
			src_port UInt16,\
			dst_port UInt16,\
			protonum UInt8,\
			app_id UInt16,\
			byte_up UInt64,\
			byte_down UInt64,\
			pkt_up UInt64,\
			pkt_down UInt64,\
			byte_up_payload UInt64,\
			byte_down_payload UInt64,\
			pkt_up_payload UInt64,\
			pkt_down_payload UInt64,\
			conn_status UInt8,\
			create_mseconds UInt64,\
			start_mseconds UInt64,\
			end_mseconds UInt64,\
			tcp_create_time UInt16,\
			rtt_up Double,\
			rtt_down Double,\
			ack_delay_up Double,\
			ack_delay_down Double,\
			tcp_syn_num UInt8,\
			tcp_syn_ack_num UInt8,\
			tcp_syn_ack_enable UInt8,\
			tcp_ack_num UInt8,\
			tcp_rst_up UInt8,\
			tcp_rst_down UInt8,\
			tcp_fin_up UInt8,\
			tcp_fin_down UInt8,\
			tcp_retrans_up UInt32,\
			tcp_retrans_down UInt32,\
			tcp_reack_up UInt32,\
			tcp_reack_down UInt32,\
			tcp_0win_up UInt32,\
			tcp_0win_down UInt32,\
			tcp_lose_up UInt32,\
			tcp_lose_down UInt32,\
			tcp_request_num UInt32,\
			tcp_response_num UInt32,\
			tcp_success_num UInt32,\
			http_transaction_total_num UInt32,\
			http_transaction_req_num UInt32,\
			http_transaction_rsp_num UInt32,\
			http_transaction_finish_num UInt32,\
			http_transaction_max_delay UInt32,\
			http_transaction_mean_delay UInt32,\
			server_max_delay Double,\
			server_ave_delay Double,\
			client_request_max_time UInt32,\
			client_request_ave_time UInt32,\
			server_response_max_time UInt32,\
			server_response_ave_time UInt32,\
			net_type UInt16,\
			net_type_num UInt16,\
			tcp_dscp UInt8,\
			pkt_64_num UInt64,\
			pkt_64_511_num UInt64,\
			pkt_512_1023_num UInt64,\
			pkt_1024_num UInt64,\
			unicast_pkt_num UInt64,\
			multicast_pkt_num UInt64,\
			broadcast_pkt_num UInt64,\
			app_delay Double,\
			hash_index UInt32,\
			create_time UInt32,\
			insert_time DateTime default toDateTime(now()))\
			ENGINE = MergeTree PARTITION BY create_time order by(end_mseconds, link_id, src_ip, dst_ip, dst_port, app_id, src_mac, dst_mac, protonum,hash_index) SAMPLE by (hash_index);",
	.insert = "/usr/local/sbin/clickhouse/clickhouse-client -h 127.0.0.1 --password v84bzd6C -d %s --query=\"\
			insert into t_log_tcp_session(link_id,user_name,src_mac,dst_mac,\
			src_ip,dst_ip,src_port,dst_port,protonum,app_id,byte_up,byte_down,\
			pkt_up,pkt_down,byte_up_payload,byte_down_payload,pkt_up_payload,pkt_down_payload,conn_status,\
			create_mseconds,start_mseconds,end_mseconds,tcp_create_time,rtt_up,rtt_down,\
			ack_delay_up,ack_delay_down,tcp_syn_num,tcp_syn_ack_num,tcp_syn_ack_enable,tcp_ack_num,tcp_rst_up,tcp_rst_down,tcp_fin_up,\
			tcp_fin_down,tcp_retrans_up,tcp_retrans_down,tcp_reack_up,tcp_reack_down,tcp_0win_up,tcp_0win_down,\
			tcp_lose_up,tcp_lose_down,tcp_request_num,tcp_response_num,tcp_success_num,\
			http_transaction_total_num,http_transaction_req_num,http_transaction_rsp_num,http_transaction_finish_num,http_transaction_max_delay,http_transaction_mean_delay,\
			server_max_delay,\
			server_ave_delay,client_request_max_time,client_request_ave_time,server_response_max_time,\
			server_response_ave_time,net_type,net_type_num,tcp_dscp,pkt_64_num,pkt_64_511_num,\
			pkt_512_1023_num,pkt_1024_num,unicast_pkt_num,multicast_pkt_num,broadcast_pkt_num,app_delay,hash_index,create_time) format CSV\" < %s/%s",
	.aggr = "",
},

[CLICK_DB_UDP_SESSION]= {
	.time_interval = 3600,
	.data_save_time = 3600 * 24 * 90,
	.table_name = "t_log_udp_session",
	.create_db_table = "create table if not exists t_log_udp_session(\
			link_id UInt8,\
			user_name String,\
			src_mac UInt64,\
			dst_mac UInt64,\
			src_ip UInt32,\
			dst_ip UInt32,\
			src_port UInt16,\
			dst_port UInt16,\
			protonum UInt8,\
			app_id UInt16,\
			byte_up UInt64,\
			byte_down UInt64,\
			pkt_up UInt64,\
			pkt_down UInt64,\
			conn_status UInt8,\
			net_delay Double,\
			app_delay Double,\
			create_mseconds UInt64,\
			start_mseconds UInt64,\
			end_mseconds UInt64,\
			net_type UInt16,\
			net_type_num UInt16,\
			udp_dscp UInt8,\
			pkt_64_num UInt64,\
			pkt_64_511_num UInt64,\
			pkt_512_1023_num UInt64,\
			pkt_1024_num UInt64,\
			unicast_pkt_num UInt64,\
			multicast_pkt_num UInt64,\
			broadcast_pkt_num UInt64,\
			hash_index UInt32,\
			create_time UInt32,\
			insert_time DateTime default toDateTime(now()))\
			ENGINE = MergeTree PARTITION BY create_time order by(end_mseconds, link_id, src_ip, dst_ip, dst_port, app_id, src_mac, dst_mac, protonum,hash_index) SAMPLE by (hash_index);",
	.insert = "/usr/local/sbin/clickhouse/clickhouse-client -h 127.0.0.1 --password v84bzd6C -d %s --query=\"\
			insert into t_log_udp_session(link_id,user_name,src_mac,dst_mac,\
			src_ip,dst_ip,src_port,dst_port,protonum,app_id,byte_up,byte_down,\
			pkt_up,pkt_down,conn_status,net_delay,app_delay,create_mseconds,start_mseconds,end_mseconds,\
			net_type,net_type_num,udp_dscp,pkt_64_num,pkt_64_511_num,\
			pkt_512_1023_num,pkt_1024_num,unicast_pkt_num,multicast_pkt_num,broadcast_pkt_num,hash_index,create_time) format CSV\" < %s/%s",
	.aggr = "",
},
Above are the table definitions for two of the tables.

@jackpgao clearly has experience with this; it does indeed look like there are too many partitions.

create_time UInt32 … PARTITION BY create_time

@helloboy create_time has many distinct values, and each value becomes its own partition, so you end up with far too many partitions. I suggest switching to daily partitioning: PARTITION BY toYYYYMMDD(create_time)
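In case it helps, here is a minimal sketch of what the daily-partitioned DDL could look like (the column list is heavily trimmed; and since create_time is a UInt32, it presumably holds a Unix timestamp, so it needs toDateTime before toYYYYMMDD):

	create table if not exists t_log_tcp_session(
	        link_id UInt8,
	        src_ip UInt32,
	        dst_ip UInt32,
	        app_id UInt16,
	        end_mseconds UInt64,
	        hash_index UInt32,
	        create_time UInt32,             -- assumed to be a Unix timestamp in seconds
	        insert_time DateTime default now())
	ENGINE = MergeTree
	PARTITION BY toYYYYMMDD(toDateTime(create_time))
	ORDER BY (end_mseconds, link_id, src_ip, dst_ip, app_id, hash_index)
	SAMPLE BY hash_index;

With data_save_time set to 90 days, that is roughly 90 partitions per table instead of around 2,000 hourly ones.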

@kikanjuu @jackpgao That was indeed the problem; I generated test data this morning to verify it, thank you both. Does everyone just keep the default of 8192 when creating tables? (screenshot: QQ图片20180607140841.png) From the screenshot, each file contains an index that gets read on every load. If the granularity were larger, would fewer files be generated and the index be read fewer times?

@helloboy If the table is very large and every query has to filter through millions or tens of millions of rows, you can reasonably increase the index granularity, for example by doubling it.

If the table is small, even setting it to 64 is fine.

"Each file contains an index" is not quite right: ClickHouse only has the primary key; there are no other indexes. The primary key is a sparse index. Put simply, imagine a rope with a mark drawn every fixed length; your query condition tells ClickHouse which segment of the rope the data falls in. So when there is a lot of data, increasing the granularity does make sense.
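For reference, that granularity is the index_granularity setting in the MergeTree SETTINGS clause; a minimal sketch of doubling the 8192 default (the table and columns here are placeholders, not from this thread):

	create table if not exists t_example(
	        hash_index UInt32,
	        create_time UInt32)
	ENGINE = MergeTree
	PARTITION BY toYYYYMMDD(toDateTime(create_time))
	ORDER BY (create_time, hash_index)
	SETTINGS index_granularity = 16384;  -- double the 8192 default

A larger granule means fewer primary-key marks to keep and read, at the cost of scanning more rows for each granule that matches.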

@jackpgao That is a very vivid explanation; my understanding is a bit deeper now, thanks. Since each table has to keep three months of data (close to 10 billion rows), I want to increase the index granularity a bit. I also have some trouble with query speed; any advice on how to optimize the queries?

@helloboy The granularity has little to do with how long you retain data; it is a logical concept. What affects compression efficiency is the partitioning: monthly partitions give a higher compression ratio, daily partitions a lower one. I mean the overall ratio of the data after compression, not the compression ratio of the algorithm itself.
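If you want to see the actual on-disk ratio, system.parts also exposes compressed and uncompressed sizes per part; a sketch, with column names as in current ClickHouse releases:

	SELECT
	    table,
	    formatReadableSize(sum(data_compressed_bytes))   AS compressed,
	    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed,
	    round(sum(data_uncompressed_bytes) / sum(data_compressed_bytes), 2) AS ratio
	FROM system.parts
	WHERE active
	GROUP BY table;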

@helloboy Is this data from monitoring switches?

@liuqian1990 No, it is a network traffic analysis appliance.

Could I ask how SAMPLE BY works in your example?

@pleasure21cn It is for sampled queries: when the data volume is huge and the business can tolerate it, you query only a sampled fraction of the data.
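For example, with SAMPLE BY hash_index declared on the table, a query can be told to read only a fraction of the rows; a sketch against t_log_tcp_session from this thread (the filter value is made up):

	SELECT app_id, sum(byte_up + byte_down) AS bytes
	FROM t_log_tcp_session
	SAMPLE 0.1                      -- read roughly 10% of rows, chosen by hash_index
	WHERE create_time >= 1528300800 -- example timestamp, made up
	GROUP BY app_id
	ORDER BY bytes DESC
	LIMIT 10;

The aggregates are then computed over about a tenth of the data, so sums and counts need to be scaled up if you want full-table estimates.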
