SQL: Find the difference between the latest two records, grouped by user
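Suppose I have a table with three columns: [user_id, created_at, text].
Let U be the set of users that have at least two records. How can I find the percentage of U for which the text of the two most recent records is identical?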
Answer
You can use window functions and aggregation:
-- ratio_same is a fraction between 0 and 1; multiply by 100 for a percentage
select avg(case when min_text = max_text then 1.0 else 0 end) as ratio_same
from (select user_id, min(text) as min_text, max(text) as max_text
      from (select t.*,
                   row_number() over (partition by user_id order by created_at desc) as seqnum
            from t
           ) t
      where seqnum <= 2
      group by user_id
      having count(*) = 2  -- make sure there are two records
     ) u
Another answer
Just a notepad scribble (untested):
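SELECT 100.0*SUM(samePrevText)/COUNT(*) as Perc
FROM
(
    SELECT user_id, created_at, text,
        row_number() over (partition by user_id order by created_at desc) as rn,
        count(*) over (partition by user_id) as cnt,
        -- with a descending sort, lead() returns the text of the next-older record
        case when text = lead(text) over (partition by user_id order by created_at desc) then 1 else 0 end as samePrevText
    FROM usertexts
) q
WHERE rn = 1      -- keep only each user's latest record
  AND cnt >= 2    -- keep only users with at least two records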
Another answer
It's not pretty, but it seems to do the trick:
SELECT SUM(LatestTwoRowsEqual * 100.0) / COUNT(DISTINCT user_id) AS UsersPercentage -- scaled by 100 so the result is a percentage
FROM (
SELECT user_id,
CASE
WHEN
ROW_NUMBER() OVER(
PARTITION BY user_id
ORDER BY created_at DESC
) <= 2 AND -- Only look at two latest rows per user_id
MAX(text) OVER(
PARTITION BY user_id
ORDER BY created_at DESC
ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) = text -- Check if values are the same
THEN 1
ELSE 0
END LatestTwoRowsEqual
FROM MyTable
WHERE user_id IN ( -- Only get users with at least two records
SELECT user_id
FROM MyTable
GROUP BY user_id
HAVING COUNT(*) > 1
)
) src
If your DBMS supports the LAG function, you can use that as well.
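As a rough illustration of the LAG variant, here is a minimal sketch that runs against an in-memory SQLite database from Python (SQLite has supported window functions since version 3.25). The table name usertexts, the sample rows, and the helper pct_same are all made up for this example and are not part of the original answers. The query assumes created_at is unique per user; with ties, which row counts as "latest" is not well defined.

```python
import sqlite3

# Hypothetical sample data: (user_id, created_at, text)
rows = [
    (1, "2024-01-01", "a"),
    (1, "2024-01-02", "b"),
    (1, "2024-01-03", "b"),  # user 1: latest two texts are equal
    (2, "2024-01-01", "x"),
    (2, "2024-01-02", "y"),  # user 2: latest two texts differ
    (3, "2024-01-01", "z"),  # user 3: only one record, so not in U
    (4, "2024-01-01", "p"),
    (4, "2024-01-02", "p"),  # user 4: latest two texts are equal
]

QUERY = """
SELECT 100.0 * SUM(CASE WHEN text = prev_text THEN 1 ELSE 0 END) / COUNT(*) AS pct_same
FROM (
    SELECT user_id,
           text,
           -- ascending sort: LAG() returns the text of the record just before this one
           LAG(text) OVER (PARTITION BY user_id ORDER BY created_at) AS prev_text,
           ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY created_at DESC) AS rn,
           COUNT(*) OVER (PARTITION BY user_id) AS cnt
    FROM usertexts
) q
WHERE rn = 1      -- keep only each user's latest record
  AND cnt >= 2    -- keep only users with at least two records
"""


def pct_same(rows):
    """Load rows into a throwaway table and return the percentage (or None if U is empty)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE usertexts (user_id INTEGER, created_at TEXT, text TEXT)")
    conn.executemany("INSERT INTO usertexts VALUES (?, ?, ?)", rows)
    (result,) = conn.execute(QUERY).fetchone()
    conn.close()
    return result


print(pct_same(rows))
```

In the sample data, U contains users 1, 2 and 4, and two of those three have identical latest texts, so the query reports roughly 66.7. Rows where text is NULL compare as "not equal" here, which may or may not be what you want.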