Is there room to optimize a ClickHouse query that takes 30+ seconds?
Posted 4 months ago · by helloboy · 1344 views · in Q&A

About 50 million rows per day; the query below takes 30+ seconds:

```sql
SELECT
    link_id, src_ip, dst_ip,
    sum(session_num) AS session_num,
    sum(byte_up + byte_down) AS total_byte,
    sum(byte_up) AS n_byte_up,
    sum(byte_down) AS n_byte_down,
    sum(pkt_up + pkt_down) AS total_pkt,
    sum(pkt_up) AS n_pkt_up,
    sum(pkt_down) AS n_pkt_down,
    sum(tcp_syn_num) AS n_tcp_syn_num,
    sum(tcp_syn_ack_num) AS n_tcp_syn_ack_num,
    sum(tcp_rst_up) AS n_tcp_rst_up,
    sum(tcp_rst_down) AS n_tcp_rst_down,
    round(sum(ave_rtt_up), 2) AS rtt_up,
    round(sum(ave_rtt_down), 2) AS rtt_down,
    sum(tcp_retrans_up) AS tcp_reack_up,
    sum(tcp_retrans_down) AS tcp_reack_down,
    sum(tcp_0win_up) AS n_tcp_0win_up,
    sum(tcp_0win_down) AS n_tcp_0win_down,
    round(total_byte / 85886, 2) AS total_byte_s,
    round(n_byte_up / 85886, 2) AS byte_up_s,
    round(n_byte_down / 85886, 2) AS byte_down_s,
    round(total_pkt / 85886, 2) AS total_pkt_s,
    round(n_pkt_up / 85886, 2) AS pkt_up_s,
    round(n_pkt_down / 85886, 2) AS pkt_down_s
FROM aggr_log_app_ip_tcp
WHERE link_id = 0
  AND vlink_id = 0
  AND (start_time BETWEEN 1528554524 AND 1528640410)
  AND (create_time BETWEEN 20180609 AND 20180610)
GROUP BY link_id, src_ip, dst_ip
ORDER BY total_byte DESC
LIMIT 0, 50
```

Peak physical memory usage is over 7 GB. CPU state: [screenshot: QQ图片20180611142815.png]

6 Replies

Don't look only at how much physical memory ClickHouse itself is using; look at how much the OS is using for page cache — in other words, whether your data is already cached by the OS. If it is, that takes a lot of pressure off the IO layer.
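A quick way to check this on the server (standard Linux tools, a generic sketch, not ClickHouse-specific):

```shell
# RAM the kernel is using as page cache shows up in the "buff/cache"
# column -- that is the cache ClickHouse reads benefit from.
free -h

# Watch disk reads while the query runs: high values in the 'bi'
# (blocks in) column mean the data is NOT cached and is being read
# cold from disk.
vmstat 1 5
```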

Also, take a look at the primary key — performance may improve there.

That said, looking at your SQL, it already seems pretty reasonable.

Please post the complete CREATE TABLE statement so we can see the partitioning strategy and primary key design — those are critical for performance.

@kikanjuu

```c
[CLICK_DB_AGGR_NEW_APP_IP_TCP] = {
    .time_interval   = 3600 * 24,
    .data_save_time  = 3600 * 24 * 90,
    .table_name      = "aggr_log_app_ip_tcp",
    .create_db_table = "create table if not exists aggr_log_app_ip_tcp(
        link_id UInt8,
        vlink_id UInt8,
        src_ip UInt32,
        dst_ip UInt32,
        byte_up UInt64,
        byte_down UInt64,
        pkt_up UInt64,
        pkt_down UInt64,
        multicast_pkt_num UInt64,
        broadcast_pkt_num UInt64,
        tcp_syn_num UInt64,
        tcp_syn_ack_num UInt64,
        ave_rtt_up Double,
        ave_rtt_down Double,
        tcp_rst_up UInt64,
        tcp_rst_down UInt64,
        tcp_retrans_up UInt64,
        tcp_retrans_down UInt64,
        tcp_0win_up UInt64,
        tcp_0win_down UInt64,
        app_delay Double,
        session_num UInt64,
        start_time UInt32,
        create_time UInt32,
        insert_time DateTime default toDateTime(now()))
        ENGINE = MergeTree PARTITION BY create_time
        ORDER BY (link_id, vlink_id, src_ip, dst_ip)
        SETTINGS index_granularity = 16384;",
    .insert = "",
    .aggr = "insert into %s(link_id, vlink_id, src_ip, dst_ip, byte_up,
        byte_down, pkt_up, pkt_down, multicast_pkt_num, broadcast_pkt_num,
        tcp_syn_num, tcp_syn_ack_num, ave_rtt_up, ave_rtt_down,
        tcp_rst_up, tcp_rst_down, tcp_retrans_up, tcp_retrans_down,
        tcp_0win_up, tcp_0win_down, app_delay, session_num, start_time, create_time)
        select
            link_id, vlink_id, src_ip, dst_ip,
            sum(tmp_byte_up), sum(tmp_byte_down), sum(tmp_pkt_up), sum(tmp_pkt_down),
            sum(tmp_multicast_pkt_num), sum(tmp_broadcast_pkt_num),
            sum(tmp_tcp_syn_num), sum(tmp_tcp_syn_ack_num),
            avg(tmp_ave_rtt_up), avg(tmp_ave_rtt_down),
            sum(tmp_tcp_rst_up), sum(tmp_tcp_rst_down),
            sum(tmp_tcp_retrans_up), sum(tmp_tcp_retrans_down),
            sum(tmp_tcp_0win_up), sum(tmp_tcp_0win_down),
            avg(tmp_app_delay), sum(tmp_session_num), %d, %d
        from
        (
            select
                link_id, vlink_id, src_ip, dst_ip,
                sum(byte_up) as tmp_byte_up, sum(byte_down) as tmp_byte_down,
                sum(pkt_up) as tmp_pkt_up, sum(pkt_down) as tmp_pkt_down,
                sum(multicast_pkt_num) as tmp_multicast_pkt_num,
                sum(broadcast_pkt_num) as tmp_broadcast_pkt_num,
                sum(tcp_syn_num) as tmp_tcp_syn_num,
                sum(tcp_syn_ack_num) as tmp_tcp_syn_ack_num,
                avg(rtt_up) as tmp_ave_rtt_up, avg(rtt_down) as tmp_ave_rtt_down,
                sum(tcp_rst_up) as tmp_tcp_rst_up, sum(tcp_rst_down) as tmp_tcp_rst_down,
                sum(tcp_retrans_up) as tmp_tcp_retrans_up,
                sum(tcp_retrans_down) as tmp_tcp_retrans_down,
                sum(tcp_0win_up) as tmp_tcp_0win_up,
                sum(tcp_0win_down) as tmp_tcp_0win_down,
                avg(app_delay) as tmp_app_delay,
                count(*) as tmp_session_num
            from %s
            where create_time = %u and end_mseconds between %llu and %llu
            group by link_id, vlink_id, src_ip, dst_ip
        )
        group by link_id, vlink_id, src_ip, dst_ip",
},
```
Thanks!

(1) `PARTITION BY create_time` will produce a staggering number of partitions, which seriously hurts performance: every distinct `create_time` value gets its own partition. Someone has already hit this pitfall: http://www.clickhouse.com.cn/topic/5b1755409d28dfde2ddc60fb — I suggest partitioning by date instead, e.g. `PARTITION BY toDate(create_time)` (example syntax, not necessarily exactly right).
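One way to spell out that suggestion as DDL. Note that `create_time` here is a `UInt32` like `20180609`, so `toDate()` can't be applied to it directly; the `intDiv` trick below (collapsing to one partition per month) is an assumption by this sketch, not something from the thread, and the elided columns stand for the poster's original schema:

```sql
-- Sketch: repartition so partition count stays small and bounded.
CREATE TABLE IF NOT EXISTS aggr_log_app_ip_tcp_v2
(
    link_id UInt8,
    vlink_id UInt8,
    src_ip UInt32,
    dst_ip UInt32,
    -- ... remaining metric columns unchanged ...
    start_time UInt32,
    create_time UInt32
)
ENGINE = MergeTree
PARTITION BY intDiv(create_time, 100)  -- 20180609 -> 201806: one partition per month
ORDER BY (link_id, vlink_id, src_ip, dst_ip)
SETTINGS index_granularity = 16384;
```

With daily values collapsed to months, a 90-day retention window holds at most four partitions instead of ninety.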

(2) Try moving the columns used in your query's filter conditions into the primary key (the ORDER BY clause): start_time, create_time, src_ip.

(3) The column order in WHERE and GROUP BY should be aligned with the ORDER BY column order in the CREATE TABLE statement.

For example, suppose your query has `GROUP BY link_id, create_time, dst_ip` while your CREATE TABLE says `ORDER BY (link_id, vlink_id, src_ip, dst_ip)`.

Those two column orders don't match. You could change them as follows so that they share an uninterrupted common prefix, `link_id, dst_ip`: use `GROUP BY link_id, dst_ip, create_time` with `ORDER BY (link_id, dst_ip, vlink_id, src_ip)`.
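The alignment the reply describes, written out as a sketch (hypothetical query, not the poster's exact one — only the shared prefix matters):

```sql
-- In CREATE TABLE:
--   ORDER BY (link_id, dst_ip, vlink_id, src_ip)

SELECT
    link_id, dst_ip, create_time,
    sum(byte_up + byte_down) AS total_byte
FROM aggr_log_app_ip_tcp
WHERE link_id = 0                        -- filters on the leading key column
GROUP BY link_id, dst_ip, create_time    -- shares prefix (link_id, dst_ip) with the key
ORDER BY total_byte DESC
LIMIT 50;
```

When GROUP BY begins with a prefix of the table's sorting key, the rows for each group arrive contiguously, which lets the aggregation work on sorted runs instead of hashing the full result set.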

(4) I'd suggest formatting the CREATE TABLE statement you posted with Markdown. If you want good answers, make your question easy for everyone to read.

@kikanjuu Thanks for the valuable advice — I posted the code in a hurry and didn't pay attention to the formatting.

@jackpgao Hi, a question: have you ever seen ClickHouse memory grow without stopping? It was just over 1 GB at startup and is now past 11 GB.
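Before assuming a leak, it can help to break down where the memory actually sits — caches (mark cache, uncompressed cache) and background merges all hold RAM by design. A diagnostic sketch against ClickHouse's built-in system tables (needs a running server, so untested here):

```sql
-- Which caches and memory pools account for the resident size?
SELECT metric, value
FROM system.asynchronous_metrics
WHERE metric LIKE '%CacheBytes%' OR metric LIKE '%Memory%'
ORDER BY value DESC;
```

`MarkCacheBytes` and `UncompressedCacheBytes` in particular grow as data is read and are bounded by server settings, so steady growth toward a plateau is normal rather than a leak.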
