美文网首页
clickhouse distributed query 分布式

clickhouse distributed query 分布式

作者: helloGlobal | 来源:发表于2019-04-30 17:57 被阅读0次

记一次使用clickhouse遇到的问题

联查两张表,graphite_d 为分布式表, user 为普通表

   select count(*) from default.graphite_d where member in( select member from user where date='2019-04-26')

报错:

message: Received from XXX.XXX.XXX.XXX:9000. DB::Exception: Table default.user doesn't exist.. 并且会有调用栈打印出来

查询文档发现,当 graphite_d 为分布式表时,将会把如下查询发送到所有远程服务器

 select count(*) from default.graphite where member in ( select member from user where date='2019-04-26')

但是 因为user 表是普通表,只存在于一台机器上,所以其他的远程服务器在本地是无法查询到此表的

因此 user 需要成为分布式表

在每台服务器上创建 user的分布式表 user_d

CREATE TABLEuser_d (member Int64)

ENGINE = Distributed(clusterName,defaultDB, user, rand())

运行sql 如下:

 select count(*) from default.graphite_d where member in( select member from user_d where date='2019-04-26')

报错:

Received exception from server:

Code: 288. DB::Exception: Received from localhost:9000, 127.0.0.1. DB::Exception: Double-distributed IN/JOIN subqueries is denied (distributed_product_mode = 'deny'). You may rewrite query to use local tables in subqueries, or use GLOBAL keyword, or set distributed_product_mode to suitable value..

根据提示以及查询clickhouse的官方文档,解决方案如下

sql 语句改为:

select count(*) from default.graphite_d where member global in( select member from user_d where date='2019-04-26')

或者select count(*) from default.graphite_d where member  in( select member from default.user where date='2019-04-26')

参考文档:https://clickhouse.yandex/docs/zh/query_language/select/#select-distributed-subqueries(中文)

https://clickhouse.yandex/docs/en/query_language/select/#select-distributed-subqueries(英文)

相关文章

网友评论

      本文标题:clickhouse distributed query 分布式

      本文链接:https://www.haomeiwen.com/subject/tgavnqtx.html