Postgres: update table column with subquery

【Posted】: 2020-06-30 10:32:37

【Question】:

I have this table of GPS trajectory samples:

SELECT * FROM trajectories_splitted;
user_id |   session_id   |       timestamp        |    lat    |    lon     | alt 
---------+----------------+------------------------+-----------+------------+-----
       1 | 20081023025304 | 2008-10-23 02:53:04+01 | 39.984702 | 116.318417 | 492
       1 | 20081023025304 | 2008-10-23 02:53:10+01 | 39.984683 |  116.31845 | 492
       1 | 20081023025304 | 2008-10-23 02:53:15+01 | 39.984686 | 116.318417 | 492
       1 | 20081023025304 | 2008-10-23 02:53:20+01 | 39.984688 | 116.318385 | 492
       1 | 20081023025304 | 2008-10-23 02:53:25+01 | 39.984655 | 116.318263 | 492
       1 | 20081023025304 | 2008-10-23 02:53:30+01 | 39.984611 | 116.318026 | 493
       1 | 20081023025304 | 2008-10-23 02:53:35+01 | 39.984608 | 116.317761 | 493
       1 | 20081023025304 | 2008-10-23 02:53:40+01 | 39.984563 | 116.317517 | 496
       1 | 20081023025304 | 2008-10-23 02:53:45+01 | 39.984539 | 116.317294 | 500
       1 | 20081023025304 | 2008-10-23 02:53:50+01 | 39.984606 | 116.317065 | 505

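To make analysis easier, I added a column sampling_rate to track the GPS sampling interval between consecutive rows.

Now I want to set the new column to the difference between each row's timestamp and the previous row's: row[sampling_rate] = timestamp - LAG(timestamp)

So I ran: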

    UPDATE trajectories_splitted 
    SET sampling_rate=timestamp -LAG(timestamp) OVER (
        PARTITION BY user_id
        ORDER BY session_id
    )
    
    ERROR:  window functions are not allowed in UPDATE
    LINE 2:     SET sampling_rate=timestamp -LAG(timestamp) OVER (

Expected result:

 user_id |   session_id   |       timestamp        |    lat    |    lon     | alt | sampling_rate 
---------+----------------+------------------------+-----------+------------+-----+---------------
       1 | 20081023025304 | 2008-10-23 02:53:04+01 | 39.984702 | 116.318417 | 492 |              
       1 | 20081023025304 | 2008-10-23 02:53:10+01 | 39.984683 |  116.31845 | 492 | 6             
       1 | 20081023025304 | 2008-10-23 02:53:15+01 | 39.984686 | 116.318417 | 492 | 5           
       1 | 20081023025304 | 2008-10-23 02:53:20+01 | 39.984688 | 116.318385 | 492 | 5
       1 | 20081023025304 | 2008-10-23 02:53:25+01 | 39.984655 | 116.318263 | 492 | 5            
       1 | 20081023025304 | 2008-10-23 02:53:30+01 | 39.984611 | 116.318026 | 493 | 5             
       1 | 20081023025304 | 2008-10-23 02:53:35+01 | 39.984608 | 116.317761 | 493 | 5            
       1 | 20081023025304 | 2008-10-23 02:53:40+01 | 39.984563 | 116.317517 | 496 | 5             
       1 | 20081023025304 | 2008-10-23 02:53:45+01 | 39.984539 | 116.317294 | 500 | 5             
       1 | 20081023025304 | 2008-10-23 02:53:50+01 | 39.984606 | 116.317065 | 505 | 5
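
Window functions are permitted in a SELECT list, so the intended values can be previewed with a read-only query before touching the table. The following is a minimal sketch under stated assumptions: the alias sampling_interval is illustrative, and partitioning by (user_id, session_id) ordered by "timestamp" is a guess at the intended row order. The attempt above orders by session_id, which is constant within a session in the sample, so it would not determine which row counts as the previous one.

```sql
-- Read-only preview: LAG() is allowed in a SELECT list, unlike in UPDATE ... SET.
-- Assumption: "previous row" means the previous sample in the same session.
SELECT user_id,
       session_id,
       "timestamp",
       "timestamp" - LAG("timestamp") OVER (
           PARTITION BY user_id, session_id
           ORDER BY "timestamp"
       ) AS sampling_interval   -- illustrative alias, not a real column
FROM trajectories_splitted
ORDER BY user_id, session_id, "timestamp";
```

Subtracting two timestamp with time zone values yields an interval, and the first row of each partition gets NULL because it has no predecessor.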

Edit

Following the first answer below, Postgres reports this error:

ERROR:  column "sampling_rate" is of type timestamp with time zone but expression is of type interval
LINE 2:     set sampling_rate = ts.timestamp - tts.prev_timestamp

In case it helps, here is the table structure:

\d trajectories_splitted
                    Table "postgres.trajectories_splitted"
    Column     |           Type           | Collation | Nullable | Default 
---------------+--------------------------+-----------+----------+---------
 user_id       | integer                  |           |          | 
 session_id    | bigint                   |           | not null | 
 timestamp     | timestamp with time zone |           | not null | 
 lat           | double precision         |           | not null | 
 lon           | double precision         |           | not null | 
 alt           | double precision         |           |          | 
 sampling_rate | timestamp with time zone |           |          | 
Indexes:
    "trajectories_splitted_pkey" PRIMARY KEY, btree (session_id, "timestamp")
    "traj_splitted" btree (user_id, session_id, "timestamp")


【Answer 1】:

You need to compute the value in a subquery and join it back to the table:

update trajectories_splitted ts
    set sampling_rate = ts.timestamp - tts.prev_timestamp
    from (select ts.*,
                 lag(timestamp) over (partition by user_id, session_id order by timestamp) as prev_timestamp
          from trajectories_splitted ts
         ) tts
    where tts.user_id = ts.user_id and tts.session_id = ts.session_id and
          tts.timestamp = ts.timestamp;

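【Comments】:

Based on this answer, Postgres reports the "expression is of type interval" error, as shown in the edit to the question.

@arilwan . . . This answers the question you asked here. You need to fix the data type so that the column can store the data you want.

Thanks, I changed the column type to interval. Found this related answer: ***.com/questions/58490265/…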
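
The type fix mentioned in the comments can be sketched as follows. This is one possible approach, not necessarily the exact statements the asker ran. It assumes sampling_rate holds nothing worth keeping (it was only just added), so existing values are discarded. The CTE name lagged and the alias t are illustrative. The join uses the primary key (session_id, "timestamp") shown in the table structure; this is a deliberate variation, since user_id is nullable and an equality join on it would skip rows where it is NULL.

```sql
-- Assumption: sampling_rate contains no data to preserve.
-- No cast exists from timestamptz to interval, so USING supplies the new value.
ALTER TABLE trajectories_splitted
    ALTER COLUMN sampling_rate TYPE interval
    USING NULL::interval;

-- Compute the previous timestamp per session in a CTE, then join it back
-- to the target table on the primary key.
WITH lagged AS (
    SELECT session_id,
           "timestamp",
           LAG("timestamp") OVER (
               PARTITION BY user_id, session_id
               ORDER BY "timestamp"
           ) AS prev_timestamp
    FROM trajectories_splitted
)
UPDATE trajectories_splitted AS t
SET sampling_rate = t."timestamp" - lagged.prev_timestamp
FROM lagged
WHERE lagged.session_id = t.session_id
  AND lagged."timestamp" = t."timestamp";
```

If plain seconds are preferred over an interval, as in the expected result in the question, one option is to give the column a numeric type such as double precision and set it to EXTRACT(EPOCH FROM (t."timestamp" - lagged.prev_timestamp)) instead.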
