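Yes, the retention topic is not finished yet. The previous two posts covered the retention rate of each period's active users (daily, weekly, monthly) in later periods. There is another important metric: the retention rate of the users who are new in a given period. It shows the quality of the users acquired on different dates, and comparing how cohorts acquired at different times retain afterwards also helps decide when to run future acquisition campaigns.
The implementation is simple: on top of the earlier approach, first find each user's first login. Group by user and take the minimum timestamp, which gives each user's first login date. The SQL is as follows ↓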
SELECT
    user_id,
    DATE_FORMAT(MIN(time), "%Y-%m-%d") AS date
FROM
    liucun
GROUP BY
    user_id
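To try the grouping step without a MySQL server, here is a minimal Python sketch using the standard-library sqlite3 module. It is not the MySQL query from this post: SQLite's date() stands in for DATE_FORMAT, and the sample rows, the column name first_date and the function name first_login_dates are made up for illustration.

```python
import sqlite3


def first_login_dates(rows):
    """Return (user_id, first_login_date) pairs for (user_id, time) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE liucun (user_id INTEGER, time TEXT)")
    conn.executemany("INSERT INTO liucun VALUES (?, ?)", rows)
    # ISO timestamps sort correctly as text, so min(time) is the earliest login;
    # date() keeps only the YYYY-MM-DD part.
    result = conn.execute(
        "SELECT user_id, date(min(time)) AS first_date "
        "FROM liucun GROUP BY user_id ORDER BY user_id"
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: two users, rows deliberately out of order.
sample_rows = [
    (1, "2021-01-01 08:00:00"),
    (1, "2021-01-02 09:30:00"),
    (2, "2021-01-03 10:00:00"),
    (2, "2021-01-01 23:59:59"),
]

print(first_login_dates(sample_rows))
# [(1, '2021-01-01'), (2, '2021-01-01')]
```

Each user ends up with exactly one row, which is what makes this result usable as the cohort table in the later steps.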
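With that as the base, left join the login table liucun once more and compute how many days each login is from the user's first login date. That tells us the user was active on day N, which is all we need to compute N-day retention and the retention rate. The SQL is as follows ↓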
SELECT
    t1.*,
    DATE_FORMAT(lc1.time, "%Y-%m-%d") AS lcdate,
    DATEDIFF(DATE(lc1.time), DATE(t1.date)) daydiff
FROM
    (SELECT
        user_id,
        DATE_FORMAT(MIN(time), "%Y-%m-%d") AS date
    FROM
        liucun
    GROUP BY
        user_id) AS t1
LEFT JOIN liucun AS lc1 ON lc1.user_id = t1.user_id
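The day offset is taken between calendar dates, not between timestamps. The following Python sketch reproduces the logic of the join under that reading; the sample rows and the function name day_offsets are hypothetical, and the code only illustrates the arithmetic rather than replacing the SQL.

```python
from datetime import datetime


def day_offsets(rows):
    """Return (user_id, first_date, login_date, daydiff) for every login row."""
    parsed = [
        (user_id, datetime.strptime(ts, "%Y-%m-%d %H:%M:%S").date())
        for user_id, ts in rows
    ]
    # Equivalent of the t1 subquery: earliest login date per user.
    first = {}
    for user_id, day in parsed:
        if user_id not in first or day < first[user_id]:
            first[user_id] = day
    # Equivalent of the join: pair every login with the user's first date.
    return sorted(
        (user_id, first[user_id].isoformat(), day.isoformat(),
         (day - first[user_id]).days)
        for user_id, day in parsed
    )


# Hypothetical sample data.
sample_rows = [
    (1, "2021-01-01 23:50:00"),
    (1, "2021-01-02 00:10:00"),  # only 20 minutes later, but the next calendar day
    (2, "2021-01-01 12:00:00"),
    (2, "2021-01-07 12:00:00"),
]

for row in day_offsets(sample_rows):
    print(row)
# (1, '2021-01-01', '2021-01-01', 0)
# (1, '2021-01-01', '2021-01-02', 1)
# (2, '2021-01-01', '2021-01-01', 0)
# (2, '2021-01-01', '2021-01-07', 6)
```

User 1 shows why the date truncation matters: the second login counts as daydiff 1 even though less than a day has passed.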
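From here the idea is the same as before, and the daily retention rates follow directly. The SQL is below; see the previous two posts for the explanation. The output columns are the number of new users that day (当日新增户数), the number of those users active the next day (次日用户数), and the next-day, three-day and seven-day retention rates (次日留存率, 三日留存率, 七日留存率). The first login day is counted as day 1, so three-day retention uses daydiff=2 and seven-day retention uses daydiff=6.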
SELECT
    date,
    COUNT(DISTINCT user_id) 当日新增户数,
    COUNT(DISTINCT CASE WHEN daydiff=1 THEN user_id ELSE NULL END) 次日用户数,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN daydiff=1 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 次日留存率,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN daydiff=2 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 三日留存率,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN daydiff=6 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 七日留存率
FROM
    (SELECT
        t1.*,
        DATE_FORMAT(lc1.time, "%Y-%m-%d") AS lcdate,
        DATEDIFF(DATE(lc1.time), DATE(t1.date)) daydiff
    FROM
        (SELECT
            user_id,
            DATE_FORMAT(MIN(time), "%Y-%m-%d") AS date
        FROM
            liucun
        GROUP BY
            user_id) AS t1
    LEFT JOIN liucun AS lc1 ON lc1.user_id = t1.user_id) temp
GROUP BY
    date
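The core of the query above is a distinct count of users per cohort date, divided into the distinct count of users who show a given daydiff. As a cross-check of that arithmetic, here is a small Python sketch; the input tuples and the function name daily_retention are assumptions for illustration, and it returns plain numbers instead of the "%"-suffixed strings the SQL builds.

```python
from collections import defaultdict


def daily_retention(offsets, day):
    """Percent of each cohort that was active exactly `day` days after first login.

    offsets: iterable of (user_id, first_date, daydiff) tuples,
    one per login row, as produced by the join step.
    """
    cohort = defaultdict(set)    # first_date -> all new users of that date
    retained = defaultdict(set)  # first_date -> users seen at the given offset
    for user_id, first_date, daydiff in offsets:
        cohort[first_date].add(user_id)
        if daydiff == day:
            retained[first_date].add(user_id)
    # Sets play the role of COUNT(DISTINCT ...): repeated logins count once.
    return {
        d: round(len(retained[d]) / len(cohort[d]) * 100, 2)
        for d in sorted(cohort)
    }


# Hypothetical sample: three users start on 01-01, one on 01-02.
sample = [
    (1, "2021-01-01", 0), (1, "2021-01-01", 1),
    (2, "2021-01-01", 0), (2, "2021-01-01", 1), (2, "2021-01-01", 1),
    (3, "2021-01-01", 0),
    (4, "2021-01-02", 0), (4, "2021-01-02", 6),
]

print(daily_retention(sample, 1))
# {'2021-01-01': 66.67, '2021-01-02': 0.0}
print(daily_retention(sample, 6))
# {'2021-01-01': 0.0, '2021-01-02': 100.0}
```

User 2 logs in twice on the next day but is counted once, which is the reason the SQL uses COUNT(DISTINCT ...) rather than COUNT.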
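The monthly version follows the same idea as the previous post: join a helper date table. I won't explain it in detail here; see the previous post. The full SQL is as follows ↓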
SELECT
    月份,
    COUNT(DISTINCT user_id) 当月新增用户,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN mdiff=1 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 次月留存率,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN mdiff=2 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 两月留存率,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN mdiff=3 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 三月留存率
FROM
    (SELECT
        t1.*,
        DATE_FORMAT(t1.date, "%Y-%m") 月份,
        DATE_FORMAT(lc1.time, "%Y-%m-%d") AS lcdate,
        d1.monthnum m0,
        d2.monthnum m1,
        d2.monthnum-d1.monthnum mdiff
    FROM
        (SELECT
            user_id,
            DATE_FORMAT(MIN(time), "%Y-%m-%d") AS date
        FROM
            liucun
        GROUP BY
            user_id) AS t1
    LEFT JOIN liucun AS lc1 ON lc1.user_id = t1.user_id
    LEFT JOIN date AS d1 ON DATE(t1.date)=d1.日期
    LEFT JOIN date AS d2 ON DATE(lc1.time)=d2.日期) temp
GROUP BY
    月份
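The mdiff column only works if monthnum keeps increasing across year boundaries. The helper table's definition is in the previous post and is not repeated here, so the Python sketch below simply assumes a running index of year * 12 + month; month_index and month_diff are hypothetical names, and the real monthnum values may be numbered differently while giving the same differences.

```python
from datetime import date


def month_index(d):
    """Running month number (assumed stand-in for the helper table's monthnum)."""
    return d.year * 12 + d.month


def month_diff(first, later):
    """How many calendar months `later` falls after `first`."""
    return month_index(later) - month_index(first)


print(month_diff(date(2021, 1, 15), date(2021, 1, 31)))  # 0
print(month_diff(date(2020, 12, 31), date(2021, 1, 1)))  # 1
print(month_diff(date(2021, 1, 15), date(2021, 3, 1)))   # 2
```

A plain month-of-year number (1 to 12) would give 1 - 12 = -11 for the December to January case, which is why a running index is needed.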
Weekly retention works the same way. The SQL is as follows ↓
SELECT
    周次,
    COUNT(DISTINCT user_id) 当周新增用户,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN wdiff=1 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 次周留存率,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN wdiff=2 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 两周留存率,
    CONCAT(ROUND(COUNT(DISTINCT CASE WHEN wdiff=3 THEN user_id ELSE NULL END)/COUNT(DISTINCT user_id)*100,2),"%") 三周留存率
FROM
    (SELECT
        t1.*,
        d1.周次 周次,
        DATE_FORMAT(lc1.time, "%Y-%m-%d") AS lcdate,
        d2.weeknum-d1.weeknum wdiff
    FROM
        (SELECT
            user_id,
            DATE_FORMAT(MIN(time), "%Y-%m-%d") AS date
        FROM
            liucun
        GROUP BY
            user_id) AS t1
    LEFT JOIN liucun AS lc1 ON lc1.user_id = t1.user_id
    LEFT JOIN date AS d1 ON DATE(t1.date)=d1.日期
    LEFT JOIN date AS d2 ON DATE(lc1.time)=d2.日期) temp
GROUP BY
    周次
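As with months, wdiff relies on weeknum being a running week number rather than a week-of-year value. The sketch below shows one way such an index could be derived in Python, assuming weeks start on Monday; week_index and week_diff are hypothetical names, and the helper table from the previous post may define its weeks differently.

```python
from datetime import date


def week_index(d):
    """Running week number that increases by one every Monday (an assumption)."""
    # date.toordinal() counts days from 0001-01-01, which is a Monday,
    # so integer division by 7 yields Monday-aligned weeks.
    return (d.toordinal() - 1) // 7


def week_diff(first, later):
    """How many calendar weeks `later` falls after `first`."""
    return week_index(later) - week_index(first)


print(week_diff(date(2021, 1, 4), date(2021, 1, 10)))  # 0: Monday to Sunday, same week
print(week_diff(date(2021, 1, 3), date(2021, 1, 4)))   # 1: Sunday to the next Monday
print(week_diff(date(2021, 1, 4), date(2021, 1, 18)))  # 2
```

Like the daily case, this compares calendar buckets: a user who first logs in on Sunday and returns on Monday counts toward next-week retention.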