Geospatial search with timestamp using Lucene

【Posted】: 2018-07-28 15:59:49 【Question】:

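My task is to find taxi drivers near a given user location (similar to Grab/Lyft). I have driver locations (latitude, longitude) with timestamps. This data is pushed to my server from the drivers' phones every 2 minutes. When a user requests a ride, I need to find the nearest driver using this data. I am trying to use Lucene's geospatial search for this. I have indexed the driver data and can search by the user's latitude and longitude. I can also search with a given latitude/longitude combination and a distance parameter to get the nearest drivers. But I don't know how to add another parameter to the search query to specify the timestamp of the driver data. For example, I only want to find drivers who were near a given location at a specific timestamp. Can someone help me with this? Here is the code snippet I am using: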

package com.test.trials;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.spatial3d.Geo3DPoint;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.document.LatLonPoint;

public class LuceneTrial {

    private IndexSearcher searcher;

    public static void main(String[] args) throws IOException {
        LuceneTrial trial = new LuceneTrial();
        trial.generateData();
        trial.search();
    }

    private void generateData() throws IOException {
        Directory dir = FSDirectory.open(Paths.get("C:/temp/Lucene"));
        Analyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig iwc = new IndexWriterConfig(analyzer);
        iwc.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
        IndexWriter writer = new IndexWriter(dir, iwc);

        try (BufferedReader br = new BufferedReader(new FileReader(
                "C:/temp/Lucene/Drivers.csv"))) {
            String line;
            String[] fieldNames = new String[] { "Driver ID", "Latitude", "Longitude", "Time" };
            while ((line = br.readLine()) != null) {
                // process the line.
                String[] tags = line.split(",");
                Document doc = new Document();
                for (int i = 0; i < fieldNames.length; i++)
                    doc.add(new StoredField(fieldNames[i], tags[i]));

                // Add a latlon point to index
                try {
                    doc.add(new LatLonPoint("latlon", Double
                            .parseDouble(tags[1]), Double.parseDouble(tags[2])));
                    Geo3DPoint point = new Geo3DPoint("geo3d",
                            Double.parseDouble(tags[1]),
                            Double.parseDouble(tags[2]));
                    doc.add(point);
                } catch (Exception e) {
                    System.out.println("Skipped: " + line);
                }
                writer.addDocument(doc);
            }
        }
        searcher = new IndexSearcher(DirectoryReader.open(writer));
    }

    public void search() throws IOException {
        System.out
                .println("\nLatLonQuery around given point, 10km radius --------------------------------------");
        TopDocs docs = searcher.search(LatLonPoint.newDistanceQuery("latlon",
                6.9270790, 79.8612430, 10 * 1000), 20);
        for (ScoreDoc scoreDoc : docs.scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc);
            System.out.println(doc);
        }
    }
}

This is the sample data I am using:

Driver ID   Latitude    Longitude     Time
1           -6.081689   145.391881    7:01:17
2           -5.207083   145.7887      8:32:40
3           -5.826789   144.295861    8:40:49
4           -6.569828   146.726242    8:57:33
5           -9.443383   147.22005     6:14:26
6           -3.583828   143.669186    8:13:35
7           61.160517   -45.425978    8:58:24
8           64.190922   -51.678064    7:42:16
9           67.016969   -50.689325    6:52:20
10          76.531203   -68.703161    6:08:21
11          65.659994   -18.072703    7:57:45
12          65.283333   -14.401389    7:32:23
13          64.295556   -15.227222    8:20:26
14          65.952328   -17.425978    8:51:34
15          66.058056   -23.135278    8:33:43
16          63.985      -22.605556    7:39:35
17          65.555833   -23.965       7:20:54
18          64.13       -21.940556    7:37:48
19          66.133333   -18.916667    6:46:36
20          63.424303   -20.278875    7:15:12
21          46.485001   -84.509445    6:14:15
22          50.056389   -97.0325      7:12:15
23          44.639721   -63.499444    6:15:31
24          51.391944   -56.083056    7:15:50
25          49.082222   -125.7725     6:52:22

Can anyone tell me how to search for drivers based on both attributes, distance and time?

【Answer 1】:

You can use a BooleanQuery to solve your use case. Your search function might look like this:

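// Additional imports needed by this method
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;

public void search() throws IOException {
    System.out
            .println("\nLatLonQuery around given point, 10km radius --------------------------------------");
    // Search radius in km; the original snippet left dist undefined, 10 matches the message above
    double dist = 10;
    Query distQuery = LatLonPoint.newDistanceQuery("latlon", -6.08165, 145.8612430, dist * 1000);
    long startTime = 0; // adjust according to your needs
    long endTime = Long.MAX_VALUE; // adjust according to your needs
    // Matches only documents that were indexed with a LongPoint field named "timestamp"
    Query timeQuery = LongPoint.newRangeQuery("timestamp", startTime, endTime);

    // Both clauses are required, so a hit must satisfy the distance AND the time range
    BooleanQuery booleanQuery = new BooleanQuery.Builder()
        .add(distQuery, Occur.MUST)
        .add(timeQuery, Occur.MUST)
        .build();
    TopDocs docs = searcher.search(booleanQuery, 20);
    for (ScoreDoc scoreDoc : docs.scoreDocs) {
        Document doc = searcher.doc(scoreDoc.doc);
        System.out.println(doc);
    }
}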
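
The range query above only matches documents that carry an indexed "timestamp" point, while the code in the question writes Time as a StoredField only. The following is a minimal sketch of the indexing side, not the asker's actual implementation. It assumes Lucene 8.x or 9.x on the classpath (for ByteBuffersDirectory and searcher.doc), assumes Time is a time of day in H:mm:ss form as in the sample data, and stores it as seconds since midnight. The class name TimestampedDriverIndex and the helpers toSecondOfDay, driverDoc and findDrivers are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.LatLonPoint;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class TimestampedDriverIndex {

    // Assumption: Time is a time of day written as H:mm:ss, as in the sample data.
    private static final DateTimeFormatter TIME = DateTimeFormatter.ofPattern("H:mm:ss");

    static long toSecondOfDay(String time) {
        return LocalTime.parse(time, TIME).toSecondOfDay();
    }

    static Document driverDoc(String id, double lat, double lon, String time) {
        Document doc = new Document();
        doc.add(new StoredField("Driver ID", id));                 // retrievable, not searchable
        doc.add(new LatLonPoint("latlon", lat, lon));              // searchable by distance
        doc.add(new LongPoint("timestamp", toSecondOfDay(time)));  // searchable by range
        return doc;
    }

    // Indexes four rows of the sample data in memory, then returns the IDs of drivers
    // within radiusMeters of (lat, lon) whose timestamp lies in [from, to], inclusive.
    static List<String> findDrivers(double lat, double lon, double radiusMeters,
                                    String from, String to) throws IOException {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            writer.addDocument(driverDoc("1", -6.081689, 145.391881, "7:01:17"));
            writer.addDocument(driverDoc("2", -5.207083, 145.7887, "8:32:40"));
            writer.addDocument(driverDoc("3", -5.826789, 144.295861, "8:40:49"));
            writer.addDocument(driverDoc("7", 61.160517, -45.425978, "8:58:24"));

            try (DirectoryReader reader = DirectoryReader.open(writer)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                Query distQuery = LatLonPoint.newDistanceQuery("latlon", lat, lon, radiusMeters);
                Query timeQuery = LongPoint.newRangeQuery(
                        "timestamp", toSecondOfDay(from), toSecondOfDay(to));
                Query query = new BooleanQuery.Builder()
                        .add(distQuery, Occur.MUST)
                        .add(timeQuery, Occur.MUST)
                        .build();

                List<String> ids = new ArrayList<>();
                for (ScoreDoc hit : searcher.search(query, 20).scoreDocs) {
                    ids.add(searcher.doc(hit.doc).get("Driver ID"));
                }
                return ids;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // 200 km around driver 1's position, restricted to updates between 8:30 and 8:45
        System.out.println(findDrivers(-6.081689, 145.391881, 200_000, "8:30:00", "8:45:00"));
    }
}
```

In this sketch driver 1 sits exactly at the search point but is excluded because its last update (7:01:17) falls outside the window, which is the behaviour the question asks for. If relevance scores are not needed, Occur.FILTER could be used in place of Occur.MUST for both clauses.
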

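The startTime and endTime placeholders in the answer are left for the reader to adjust. Since the question says phones push a position every 2 minutes, one plausible choice is a window that ends at the ride request and reaches back by a maximum age. The sketch below uses only the JDK and the same seconds-since-midnight encoding assumed for the sample data; FreshnessWindow, window and the 2-minute maximum age are illustrative assumptions, not part of the original answer.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class FreshnessWindow {

    // Assumption: times are written as H:mm:ss, as in the sample data.
    private static final DateTimeFormatter TIME = DateTimeFormatter.ofPattern("H:mm:ss");

    // Returns {startTime, endTime} in seconds since midnight for a window that ends
    // at requestTime and reaches back by maxAge. The start is clamped to 0, so a
    // window that would cross midnight is truncated rather than wrapped.
    static long[] window(String requestTime, Duration maxAge) {
        long end = LocalTime.parse(requestTime, TIME).toSecondOfDay();
        long start = Math.max(0L, end - maxAge.getSeconds());
        return new long[] { start, end };
    }

    public static void main(String[] args) {
        long[] w = window("8:34:00", Duration.ofMinutes(2));
        System.out.println(w[0] + " .. " + w[1]); // prints "30720 .. 30840"
    }
}
```

The two values can be passed as the lower and upper bounds of LongPoint.newRangeQuery. Storing a full epoch timestamp instead of a time of day would avoid the midnight truncation, at the cost of needing a date for each row.
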
【Comments】:

Thank you so much! You saved my day!
