Nearest neighbor and distance between points and lines

Posted: 2020-01-26 15:02:39

Question:

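In Oracle Spatial, I have two tables (AVALREGULACAO and ATROCOADUTOR) that represent points and lines, respectively.

The two tables are structured as follows:

AVALREGULACAO (295 point records): IPID [NUMBER(10)], GEOMETRY [MDSYS.SDO_GEOMETRY]

ATROCOADUTOR (12536 line records): IPID [NUMBER(10)], GEOMETRY [MDSYS.SDO_GEOMETRY]

For each AVALREGULACAO point, I need to find the nearest ATROCOADUTOR neighbor and calculate the distance between them:

AVALREGULACAO_IPID | ATROCOADUTOR_IPID | DISTANCE

I tried two options.

1.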

SELECT /*+ ORDERED */ A.IPID, B.IPID, MIN(SDO_GEOM.SDO_DISTANCE(sdo_cs.make_2d(A.GEOMETRY), sdo_cs.make_2d(B.GEOMETRY), 0.005)) as DISTANCE
FROM AVALREGULACAO A, ATROCOADUTOR B
GROUP BY A.IPID, B.IPID;

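The calculation takes a very long time: it produces a huge output of 295 x 12536 = 3,698,120 possible combinations (a Cartesian product). In addition, the CSV output file cannot hold all of these records (1,048,576-row limit).

I only need the 295 records that correspond to the 295 AVALREGULACAO points.

2. I also tried adapting another query that uses the nearest neighbor (NN) operator: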

PROMPT IPID, nearest_IPID, distance  
  select /*+ ORDERED USE_NL(s,s2)*/
         s.IPID,
         s2.IPID as nearest_IPID,
         TO_CHAR(REPLACE(mdsys.sdo_geom.sdo_distance(sdo_cs.make_2d(s.GEOMETRY),sdo_cs.make_2d(s2.GEOMETRY),0.05), ',','.')) as distance
    from AVALREGULACAO s,
         ATROCOADUTOR s2
   where s2.IPID in (select IPID
                from AVALREGULACAO s3
               where sdo_nn(s3.GEOMETRY,s.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
                       and s3.IPID <> s.IPID
                       and rownum < 2)
 order by 1,2;

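This query takes so long that I had to kill the process before it finished.

I think I am missing the point of how to optimize/filter for the desired result.

Any hints on how to solve this efficiently would be much appreciated.

Thanks in advance, Pedro

PS: @Boneist, thank you very much for your input.

Unfortunately, I got an error after applying your query (I am still working out the semantics/syntax of the new constructs KEEP and dense_rank):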

SELECT a.ipid a_ipid,
       MIN(b.ipid) KEEP (dense_rank FIRST order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) b_ipid,
       MIN(sdo_geom.sdo_distance(sdo_cs.make_2d(a.geometry), sdo_cs.make_2d(b.geometry), 0.005)) AS distance
FROM   avalregulacao a
       INNER JOIN atrocoadutor b ON sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
GROUP  BY a.ipid;

Error:

Error starting at line : 1 in command -
SELECT a.ipid a_ipid,
MIN(b.ipid) KEEP (dense_rank FIRST order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) b_ipid,
MIN(sdo_geom.sdo_distance(sdo_cs.make_2d(a.geometry), sdo_cs.make_2d(b.geometry), 0.005)) AS distance 
FROM avalregulacao a 
INNER JOIN atrocoadutor b ON sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
GROUP  BY a.ipid
Error at Command Line : 2 Column : 45
Error report -
SQL Error: ORA-29907: foram encontradas etiquetas em duplicado em invocações primárias
29907. 00000 -  "found duplicate labels in primary invocations"
*Cause:    There are multiple primary invocations of operators with
           the same number as the label.
*Action:   Use distinct labels in primary invocations.

Answer 1:

I think you may be after something like this:

SELECT a.ipid a_ipid,
       MIN(b.ipid) KEEP (dense_rank FIRST order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) b_ipid,
       MIN(sdo_geom.sdo_distance(sdo_cs.make_2d(a.geometry), sdo_cs.make_2d(b.geometry), 0.005)) AS distance
FROM   avalregulacao a
       INNER JOIN atrocoadutor b ON sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) = 'TRUE'
GROUP  BY a.ipid;

This joins the two tables on the nearest neighbor function, which should reduce the number of rows returned.

MIN(b.ipid) KEEP (dense_rank first order by sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1)) returns only the lowest b.ipid value among the rows with the lowest difference.
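To make the KEEP (DENSE_RANK FIRST ...) semantics concrete, here is a minimal sketch on made-up data. The pairs rows and the dist column are hypothetical stand-ins for precomputed point/line distances. Ordering by a numeric distance, rather than by the SDO_NN result (which is only the string 'TRUE'), is an assumption of this sketch.

```sql
-- Hypothetical data: candidate (point, line) pairs with a precomputed distance.
WITH pairs AS (
  SELECT 1 AS a_ipid, 10 AS b_ipid, 7.5 AS dist FROM dual UNION ALL
  SELECT 1, 11, 2.0 FROM dual UNION ALL
  SELECT 1, 12, 2.0 FROM dual UNION ALL
  SELECT 2, 10, 4.0 FROM dual
)
SELECT a_ipid,
       -- Among the rows that rank first by dist (the smallest distance),
       -- keep the lowest b_ipid; MIN only breaks ties within that group.
       MIN(b_ipid) KEEP (DENSE_RANK FIRST ORDER BY dist) AS nearest_b_ipid,
       MIN(dist) AS distance
FROM   pairs
GROUP  BY a_ipid
ORDER  BY a_ipid;
-- a_ipid = 1 -> nearest_b_ipid = 11, distance = 2
-- a_ipid = 2 -> nearest_b_ipid = 10, distance = 4
```

For a_ipid = 1, lines 11 and 12 tie at the smallest distance, so MIN picks 11. Line 10 is never considered because it does not rank first.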

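(I think this query will work as is, but I have not been able to test it. You may have to do the join and select sdo_nn(a.GEOMETRY,b.GEOMETRY,'sdo_batch_size=10',1) as a column in a subquery, and then do the grouping in the outer query.)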
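The ORA-29907 error reported in the question is consistent with the query invoking SDO_NN twice with the same label (the trailing 1). One alternative, sketched below, calls SDO_NN once in the WHERE clause with sdo_num_res=1 and reads the distance through the ancillary SDO_NN_DISTANCE operator bound to that label. This is an untested sketch: it assumes a spatial index exists on ATROCOADUTOR.GEOMETRY, and the index name atrocoadutor_sidx used in the hint is hypothetical.

```sql
-- Untested sketch: one nearest ATROCOADUTOR line per AVALREGULACAO point.
-- Assumes a spatial index on atrocoadutor.geometry; the index name
-- atrocoadutor_sidx is hypothetical and must be replaced with the real one.
SELECT /*+ LEADING(a) USE_NL(a b) INDEX(b atrocoadutor_sidx) */
       a.ipid             AS avalregulacao_ipid,
       b.ipid             AS atrocoadutor_ipid,
       SDO_NN_DISTANCE(1) AS distance   -- distance from the SDO_NN call labelled 1
FROM   avalregulacao a,
       atrocoadutor  b
WHERE  SDO_NN(b.geometry,        -- indexed column that is searched
              a.geometry,        -- query geometry (one point per outer row)
              'sdo_num_res=1',   -- return only the single nearest neighbor
              1) = 'TRUE';       -- label referenced by SDO_NN_DISTANCE(1)
```

If this behaves as intended, it yields one row per point (295 rows here) instead of the full Cartesian product. The original queries wrap both geometries in sdo_cs.make_2d. Whether that conversion is still needed alongside SDO_NN depends on how the data and the index are defined, so it is left out here.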
