Hive多维分析函数——With cube、Grouping sets、With rollup

avatar
作者
筋斗云
阅读量:0

有些指标涉及【多维度】的聚合,大的汇总维度,小的明细维度,需要精细化的下钻。

  • grouping sets: 多维度组合,组合维度自定义;
  • with cube: 多维度组合,程序自由组合,组合为各个维度的笛卡尔积;
  • with rollup:是 with cube的子集,以左侧维度为主,即不允许左侧为NULL,右侧为非NULL的情况出现

一、grouping sets

 0 数据准备

with test1 as      (select '2021-08' as month_date,'2021-08-11' as day_date,10 as pv       union all       select '2021-08' as month_date,'2021-08-10' as day_date,15 as pv       union all       select '2021-08' as month_date,'2021-08-08' as day_date,35 as pv       union all       select '2021-07' as month_date,'2021-07-08' as day_date,35 as pv       union all       select '2021-07' as month_date,'2021-07-06' as day_date,25 as pv       union all       select '2021-07' as month_date,'2021-07-03' as day_date,15 as pv)       select          month_date,         day_date,         sum(pv) as pv      from test1       group by month_date,day_date       grouping sets        (           (),           (month_date),           (month_date,day_date),           (day_date)         )       order by month_date,day_date; 

 1 结果分析

二、with cube

 0 数据准备

with test1 as (select '2021-08' as month_date,'2021-08-11' as day_date,10 as pv  union all  select '2021-08' as month_date,'2021-08-10' as day_date,15 as pv  union all  select '2021-08' as month_date,'2021-08-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-06' as day_date,25 as pv  union all  select '2021-07' as month_date,'2021-07-03' as day_date,15 as pv)    select     month_date,    day_date,    sum(pv) as pv  from test1  group by month_date,day_date  with cube  order by month_date,day_date;

 1 结果分析

2 总结

  • 从with cube和grouping sets的案例可以看出,两个结果是一样的;
  • with cube的维度组合(各个维度的笛卡尔积)就是groupingsets里面手动添加的维度,即为month_date,day_date两个维度的笛卡尔积。维度组合即为:()、 (month_date)、 (month_date,day_date)、 (day_date)

三、 with rollup

 0 数据准备

with test1 as (select '2021-08' as month_date,'2021-08-11' as day_date,10 as pv  union all  select '2021-08' as month_date,'2021-08-10' as day_date,15 as pv  union all  select '2021-08' as month_date,'2021-08-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-08' as day_date,35 as pv  union all  select '2021-07' as month_date,'2021-07-06' as day_date,25 as pv  union all  select '2021-07' as month_date,'2021-07-03' as day_date,15 as pv)  select    month_date,   day_date,   sum(pv) as pv from test1 group by month_date,day_date with rollup order by month_date,day_date;

 1 结果分析

 2 总结

   从结果上可以看出,with rollup 和with cube的区别是,少了day_date这个单独维度的聚合,因为with rollup是以左侧维度为主,当左侧month_date维度为NULL时,右侧day_date维度必须为NULL。

参考文章 :

https://zhuanlan.zhihu.com/p/631268351

https://blog.51cto.com/u_14555/6696007

广告一刻

为您即时展示最新活动产品广告消息,让您随时掌握产品活动新动态!