SQL distinct Worked Dates across Projects excluding Break Dates

【Posted】: 2021-08-29 07:42:49 【Question】:

Consider the following schema:

CREATE TABLE `Project Assignment`
    (`Employee` varchar(1), `Project Id` int, `Project Assignment Date` date, `Project Relieving Date` date)
;

INSERT INTO `Project Assignment`
    (`Employee`, `Project Id`, `Project Assignment Date`, `Project Relieving Date`)
VALUES
    ('A', 1, '2018-04-01', '2019-12-25'),
    ('A', 2, '2019-06-15', '2020-03-31'),
    ('A', 3, '2019-09-07', '2020-05-20'),
    ('A', 4, '2020-07-14', '2020-12-15')
;


CREATE TABLE `Break`
    (`Break Id` int, `Employee` varchar(1), `Project Id` int, `Break Start Date` date, `Break End Date` date)
;

INSERT INTO `Break`
    (`Break Id`, `Employee`, `Project Id`, `Break Start Date`, `Break End Date`)
VALUES
    (1, 'A', 1, '2018-09-01', '2018-09-30'),
    (2, 'A', 1, '2019-10-05', '2019-11-30'),
    (3, 'A', 2, '2019-10-15', '2019-11-15'),
    (4, 'A', 3, '2019-11-01', '2019-11-10'),
    (5, 'A', 2, '2020-01-01', '2020-01-10'),
    (6, 'A', 3, '2020-01-01', '2020-01-10')
;

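An employee can take one or more breaks in each project while assigned to it. Breaks do not overlap within a project, but they can overlap across projects.

We want the number of days on which the employee was assigned to at least one project, minus the days on which the employee was on break in every project assigned on that day.

I was able to derive the distinct days on which an employee was assigned to a project with the following query: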

SELECT merged.employee,
    SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
FROM (SELECT
        s1.employee, s1.`Project Assignment Date`,
        MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
    FROM `Project Assignment` s1
    INNER JOIN `Project Assignment` t1
        ON t1.employee = s1.employee
        AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
        AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
            WHERE t2.employee = s1.employee 
                AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date` 
                AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
    WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
        WHERE s2.employee = s1.employee
            AND s1.`Project Assignment Date` > s2.`Project Assignment Date` 
            AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
    GROUP BY s1.employee, s1.`Project Assignment Date`
    ORDER BY s1.`Project Assignment Date`) merged
GROUP BY merged.employee

Result:

| employee | assigned_days |
| -------- | ------------- |
| A        | 936           |
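
As a cross-check of the 936 figure, the same merge-then-sum idea can be sketched outside the database. This is only an illustrative Python sketch, not part of the original query: the names `ASSIGNMENTS`, `merge_periods` and `assigned_days` are made up here, the data is the sample for employee A copied from the schema above, and an open-ended (NULL) relieving date, which the query replaces with CURDATE(), is not handled.

```python
from datetime import date

# Sample assignments for employee A, copied from the schema above:
# (project id, assignment date, relieving date), both dates inclusive.
ASSIGNMENTS = [
    (1, date(2018, 4, 1), date(2019, 12, 25)),
    (2, date(2019, 6, 15), date(2020, 3, 31)),
    (3, date(2019, 9, 7), date(2020, 5, 20)),
    (4, date(2020, 7, 14), date(2020, 12, 15)),
]


def merge_periods(assignments):
    """Merge overlapping (inclusive) date ranges into disjoint periods."""
    merged = []
    for _, start, end in sorted(assignments, key=lambda a: a[1]):
        if merged and start <= merged[-1][1]:
            # Overlaps the current period: extend its end if needed.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def assigned_days(assignments):
    """Sum the inclusive length of every merged period."""
    # DATEDIFF(end, start) + 1 in the query corresponds to (end - start).days + 1.
    return sum((end - start).days + 1 for start, end in merge_periods(assignments))


print(assigned_days(ASSIGNMENTS))  # 936
```

With the sample data the four assignments collapse into two periods, 2018-04-01 to 2020-05-20 (781 days) and 2020-07-14 to 2020-12-15 (155 days).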

But I cannot think of a way to work out the days on which the employee was on break in all assigned projects.

Expected result:

+----------+---------------+------------+-------------+
| employee | assigned_days | break_days | worked_days |
+==========+===============+============+=============+
| A        | 936           | 50         | 886         |
+----------+---------------+------------+-------------+

MariaDB 10.3.29

Explanation of how break_days is worked out:

+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| Employee | Project | Break Start | Break End        | Days Considered | Remarks                                                                                                           |
+==========+=========+=============+==================+=================+===================================================================================================================+
| A        | 1       |  2018-09-01 |  2018-09-30      | 30              | Only one project assigned so consider whole break                                                                 |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A        | 1       |  2019-10-05 |  2019-11-30      | 10              | 3 Projects were assigned during these breaks. The common days of break fall between 2019-11-01 and 2019-11-10     |
+----------+---------+-------------+------------------+                 |                                                                                                                   |
| A        | 2       |  2019-10-15 |  2019-11-15      |                 |                                                                                                                   |
+----------+---------+-------------+------------------+                 |                                                                                                                   |
| A        | 3       |  2019-11-01 |  2019-11-10      |                 |                                                                                                                   |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
| A        | 2       |  2020-01-01 |  2020-01-10      | 10              | 2 Projects were assigned during this time and break in both projects                                              |
+----------+---------+-------------+------------------+                 |                                                                                                                   |
| A        | 3       |  2020-01-01 |  2020-01-10      |                 |                                                                                                                   |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+
|          |         |             | Total Break Days | 50              |                                                                                                                   |
+----------+---------+-------------+------------------+-----------------+-------------------------------------------------------------------------------------------------------------------+

DB-Fiddle link: https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/0

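【Comments】:

Edit the question and show what you have already tried. Explain why and where it fails. Be specific (error messages, unexpected results, etc.).

Thanks @Strawberry for the link, it was very useful. I had not realized I could make it easier for others to help me.

Please provide the desired result for the given data set.

@Strawberry: Done, along with an explanatory table.

What is your MySQL version?

【Answer 1】: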

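Use recursive CTEs to get all the dates on which each employee was assigned to a project and all the dates on which the employee was on break. Then, for each date in both cases, aggregate all the projects into a comma-separated list with GROUP_CONCAT(). If the two lists match for a date, that date is a break date.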

WITH RECURSIVE 
  working_dates AS (
    SELECT `Employee`, `Project Id`, `Project Assignment Date` AS date, `Project Relieving Date`
    FROM `Project Assignment`
    UNION ALL
    SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Project Relieving Date`
    FROM working_dates
    WHERE date < `Project Relieving Date`
  ),
  break_dates AS (
    SELECT `Employee`, `Project Id`, `Break Start Date` AS date, `Break End Date`
    FROM `Break`
    UNION ALL
    SELECT `Employee`, `Project Id`, date + INTERVAL 1 day, `Break End Date`
    FROM break_dates
    WHERE date < `Break End Date`
  ),
  working AS (
    SELECT `Employee`, date,
           GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
    FROM working_dates
    GROUP BY `Employee`, date 
  ),
  breaks AS (
    SELECT `Employee`, date,
           GROUP_CONCAT(`Project Id` ORDER BY `Project Id`) projects
    FROM break_dates
    GROUP BY `Employee`, date
  )
SELECT w.`Employee`,
       COUNT(*) assigned_days, 
       COUNT(b.date) AS break_days,
       COUNT(*) - COUNT(b.date) worked_days
FROM working w LEFT JOIN breaks b
ON w.`Employee` = b.`Employee` AND w.date = b.date AND w.projects = b.projects
GROUP BY w.`Employee`

See the demo.
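
The per-date comparison used by this query can also be illustrated outside SQL. The sketch below is a hypothetical Python restatement, not the answerer's code: sets of project ids stand in for the GROUP_CONCAT() lists, the names `projects_per_day` and `summarize` are invented for the example, and the data is the sample for employee A from the question.

```python
from collections import defaultdict
from datetime import date, timedelta

# Sample rows for employee A from the question: (project id, start, end), inclusive.
ASSIGNMENTS = [
    (1, date(2018, 4, 1), date(2019, 12, 25)),
    (2, date(2019, 6, 15), date(2020, 3, 31)),
    (3, date(2019, 9, 7), date(2020, 5, 20)),
    (4, date(2020, 7, 14), date(2020, 12, 15)),
]
BREAKS = [
    (1, date(2018, 9, 1), date(2018, 9, 30)),
    (1, date(2019, 10, 5), date(2019, 11, 30)),
    (2, date(2019, 10, 15), date(2019, 11, 15)),
    (3, date(2019, 11, 1), date(2019, 11, 10)),
    (2, date(2020, 1, 1), date(2020, 1, 10)),
    (3, date(2020, 1, 1), date(2020, 1, 10)),
]


def projects_per_day(ranges):
    """Map each calendar day to the set of project ids whose range covers it."""
    per_day = defaultdict(set)
    for project_id, start, end in ranges:
        for offset in range((end - start).days + 1):
            per_day[start + timedelta(days=offset)].add(project_id)
    return per_day


def summarize(assignments, breaks):
    """Return (assigned_days, break_days, worked_days)."""
    working = projects_per_day(assignments)
    on_break = projects_per_day(breaks)
    # A day counts as a break day only when the projects on break
    # are exactly the projects assigned on that day.
    break_days = sum(
        1 for day, projects in working.items() if on_break.get(day) == projects
    )
    assigned = len(working)
    return assigned, break_days, assigned - break_days


print(summarize(ASSIGNMENTS, BREAKS))  # (936, 50, 886)
```

Like the recursive CTEs, this expands every range into individual days, so the work grows with the total number of assigned days and not just with the number of rows.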

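【Comments】:

While it does give the result, the query took over 100 seconds on production data, with about 4.5k project assignments and 162 breaks, so it may not scale well. But your approach gave me a hint for an alternative. Will try it and report back. Thanks!

Marking your answer as accepted because it works (albeit slowly) and because I was able to derive a faster result (running in under 10 seconds) based on your approach. Thanks!

【Answer 2】:

After adding a Break Id column to the Break table, I was able to use the aggregation technique suggested by @forpass to derive the break days:

"Then, for each date in both cases, aggregate all the projects into a comma-separated list with GROUP_CONCAT()."

For each break, get the count and list of overlapping projects (using GROUP_CONCAT). Then join that result with Break again to find the count and list of overlapping breaks and their smallest common overlap (latest start and earliest end). Use ROW_NUMBER to eliminate duplicates.

Move the query for assigned days into another CTE and join it with the break CTE to get the desired result.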

WITH breaks_summary AS (
    SELECT `Employee`, SUM(break_days) break_days
    FROM (      
        SELECT b.`Employee`, DATEDIFF(b.end_date, b.start_date)+1 break_days, ROW_NUMBER() OVER (PARTITION BY b.break_ids) rn, overlapping_breaks, break_ids, projects_count
        FROM (
            SELECT b_p_cnt.`Employee`, b_p_cnt.`Project Id`, b_p_cnt.projects_count, 
            COUNT(b2.`Break Id`) overlapping_breaks, GROUP_CONCAT(b2.`Break Id`) break_ids, MAX(b2.start_date) start_date, MIN(b2.end_date) end_date
            FROM (
                SELECT b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date, GROUP_CONCAT(pa.`Project Id`) projects, count(pa.`Project Id`) projects_count
                FROM (
                    SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
                    FROM `Break` 
                    ) b1
                LEFT JOIN `Project Assignment` pa ON b1.`Employee` = pa.`Employee`
                    AND ((b1.start_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE()))
                        OR (b1.end_date BETWEEN pa.`Project Assignment Date` AND IFNULL(pa.`Project Relieving Date`,CURDATE())))
                GROUP BY b1.`Break Id`, b1.`Employee`, b1.`Project Id`, b1.start_date, b1.end_date) b_p_cnt
            LEFT JOIN (
                SELECT `Break Id`, `Employee`, `Project Id`, `Break Start Date` AS start_date, `Break End Date` AS end_date
                FROM `Break`
                ORDER BY `Break Id`) b2 ON b_p_cnt.`Employee` = b2.`Employee` 
                    AND ((b_p_cnt.start_date BETWEEN b2.start_date AND b2.end_date)
                        OR (b_p_cnt.end_date BETWEEN b2.start_date AND b2.end_date))
            GROUP BY b_p_cnt.`Break Id`, b_p_cnt.`Employee`, b_p_cnt.`Project Id`, 
                b_p_cnt.start_date, b_p_cnt.end_date, b_p_cnt.projects, b_p_cnt.projects_count
            HAVING count(b2.`Break Id`) = b_p_cnt.projects_count
            ORDER BY b_p_cnt.`Employee`, `Project Id`) b        
            ) breaks
    WHERE rn = 1
    GROUP BY `Employee`),   
assigned AS (
    SELECT merged.`Employee`, SUM(DATEDIFF(merged.EndDate,merged.`Project Assignment Date`)+1) assigned_days
            FROM (SELECT s1.`Employee`, s1.`Project Assignment Date`,
                    MIN(IFNULL(t1.`Project Relieving Date`,CURDATE())) AS EndDate
                FROM `Project Assignment` s1
                INNER JOIN `Project Assignment` t1 ON t1.`Employee` = s1.`Employee`
                    AND s1.`Project Assignment Date` <= IFNULL(t1.`Project Relieving Date`,CURDATE())
                    AND NOT EXISTS( SELECT * FROM `Project Assignment` t2
                        WHERE t2.`Employee` = s1.`Employee`
                            AND IFNULL(t1.`Project Relieving Date`,CURDATE()) >= t2.`Project Assignment Date` 
                            AND IFNULL(t1.`Project Relieving Date`,CURDATE()) < IFNULL(t2.`Project Relieving Date`,CURDATE()))
                WHERE NOT EXISTS( SELECT * FROM `Project Assignment` s2
                    WHERE s2.`Employee` = s1.`Employee`
                        AND s1.`Project Assignment Date` > s2.`Project Assignment Date` 
                        AND s1.`Project Assignment Date` <= IFNULL(s2.`Project Relieving Date`,CURDATE()))
                GROUP BY s1.`Employee`, s1.`Project Assignment Date`
                ORDER BY s1.`Project Assignment Date`) merged
        GROUP BY merged.`Employee`)
SELECT ad.`Employee`,
    ad.assigned_days,
    IFNULL(bs.break_days,0) break_days,
    (ad.assigned_days - IFNULL(bs.break_days,0)) worked_days
FROM assigned ad
LEFT JOIN breaks_summary bs ON ad.`Employee` = bs.`Employee`

DB-Fiddle updated with the query: https://www.db-fiddle.com/f/c8fMneAUkhb2P3rzjMtVZm/3
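
For comparison, the idea of working on whole intervals, with no expansion into single dates, can also be sketched as a sweep over range boundaries. This is a different technique from the query above, shown only as an illustrative Python sketch under stated assumptions: the function name `summarize_by_segments` is hypothetical, the data is the sample for employee A, it handles a single employee, and it has not been benchmarked against production-sized data.

```python
from datetime import date, timedelta

# Sample rows for employee A from the question: (project id, start, end), inclusive.
ASSIGNMENTS = [
    (1, date(2018, 4, 1), date(2019, 12, 25)),
    (2, date(2019, 6, 15), date(2020, 3, 31)),
    (3, date(2019, 9, 7), date(2020, 5, 20)),
    (4, date(2020, 7, 14), date(2020, 12, 15)),
]
BREAKS = [
    (1, date(2018, 9, 1), date(2018, 9, 30)),
    (1, date(2019, 10, 5), date(2019, 11, 30)),
    (2, date(2019, 10, 15), date(2019, 11, 15)),
    (3, date(2019, 11, 1), date(2019, 11, 10)),
    (2, date(2020, 1, 1), date(2020, 1, 10)),
    (3, date(2020, 1, 1), date(2020, 1, 10)),
]


def summarize_by_segments(assignments, breaks):
    """Return (assigned_days, break_days, worked_days) without enumerating days."""
    one_day = timedelta(days=1)
    # Every range start, and the day after every range end, is a point where
    # the set of assigned projects or active breaks can change.
    cuts = sorted({d for _, s, e in assignments + breaks for d in (s, e + one_day)})
    assigned = break_days = 0
    # Consecutive cut points form half-open segments [seg_start, seg_end);
    # the assigned and break sets are constant inside each segment.
    for seg_start, seg_end in zip(cuts, cuts[1:]):
        active = {p for p, s, e in assignments if s <= seg_start <= e}
        if not active:
            continue  # gap between assignments, counts for nothing
        length = (seg_end - seg_start).days
        assigned += length
        resting = {p for p, s, e in breaks if s <= seg_start <= e}
        # Break segment: every assigned project is on break.
        if active <= resting:
            break_days += length
    return assigned, break_days, assigned - break_days


print(summarize_by_segments(ASSIGNMENTS, BREAKS))  # (936, 50, 886)
```

The number of segments depends on the number of assignment and break rows, not on how long the ranges are.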

Thanks to everyone who contributed by improving the question and providing possible answers.
