数据分析之hive开窗函数(二)

内容分享2小时前发布
0 0 0

FIRST_VALUE:取分组内排序后,截止到当前行,第一个值。
LAST_VALUE:取分组内排序后,截止到当前行,最后一个值。
LEAD(col,n,DEFAULT):用于统计窗口内往下第n行值。第一个参数为列名,第二个参数为往下第n行(可选,默认为1),第三个参数为默认值(当往下第n行为NULL时候,取默认值,如不指定,则为NULL)。
LAG(col,n,DEFAULT):与lead相反,用于统计窗口内往上第n行值。第一个参数为列名,第二个参数为往上第n行(可选,默认为1),第三个参数为默认值(当往上第n行为NULL时候,取默认值,如不指定,则为NULL)。


first_value 与 last_value

select username,user_type,

row_number() over(partition by user_type order by sales_volume) as row_num,

first_value(username) over (partition by user_type order by sales_volume desc) as max_sales_user,

first_value(username) over (partition by user_type order by sales_volume asc) as min_sales_user,

last_value(username) over (partition by user_type order by sales_volume desc) as curr_last_min_user,

last_value(username) over (partition by user_type order by sales_volume asc ) as curr_last_max_user

from test;

数据分析之hive开窗函数(二)

lead 与 lag

select username,product,

lead( product ) over(order by sales_volume) as default_after_one_line,

lag( product ) over(order by sales_volume) as default_before_one_line,

lead( product,2 ) over(order by sales_volume) as after_two_line,

lag( product,2, abc ) over(order by sales_volume) as before_two_line

from test;

数据分析之hive开窗函数(二)

© 版权声明

相关文章

暂无评论

none
暂无评论...