ElasticSearch index works from REST API, but not C# code

Posted: 2019-01-26 23:15:51

[Question]

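I'm trying to index data that contains a geo point in Elasticsearch. When I index via code, it fails. When I index via the REST endpoint, it succeeds. But I can't find the difference between the JSON I send via the REST endpoint and the JSON sent by the code.

Here is the code that configures the index (as a LINQPad program):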

// Namespaces used: Elasticsearch.Net, Nest, Newtonsoft.Json
async Task Main()
{
    var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
    var connectionSettings = new ConnectionSettings(pool)
        .DefaultMappingFor<DataEntity>(m => m.IndexName("data").TypeName("_doc"));

    var client = new ElasticClient(connectionSettings);

    // Create the index and map Location explicitly as a geo_point field.
    await client.CreateIndexAsync(
        "data",
        index => index.Mappings(mappings => mappings.Map<DataEntity>(mapping => mapping.AutoMap().Properties(
            properties => properties.GeoPoint(field => field.Name(x => x.Location))))));

//    var data = new DataEntity(new GeoLocationEntity(50, 30));
//
//    var json = client.RequestResponseSerializer.SerializeToString(data);
//    json.Dump("JSON");
//
//    var indexResult = await client.IndexDocumentAsync(data);
//    indexResult.DebugInformation.Dump("Debug Information");
}

public sealed class GeoLocationEntity
{
    [JsonConstructor]
    public GeoLocationEntity(
        double latitude,
        double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    [JsonProperty("lat")]
    public double Latitude { get; }

    [JsonProperty("lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    [JsonConstructor]
    public DataEntity(
        GeoLocationEntity location)
    {
        this.Location = location;
    }

    [JsonProperty("location")]
    public GeoLocationEntity Location { get; }
}

After running this, my mapping looks correct, since GET /data/_doc/_mapping returns:

{
  "data" : {
    "mappings" : {
      "_doc" : {
        "properties" : {
          "location" : {
            "type" : "geo_point"
          }
        }
      }
    }
  }
}

I can successfully add a document to the index through the developer console:

POST /data/_doc
{
  "location": {
    "lat": 88.59,
    "lon": -98.87
  }
}

The result:

{
  "_index" : "data",
  "_type" : "_doc",
  "_id" : "RqpyjGgBZ27KOduFRIxL",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

But when I uncomment the code in the LINQPad program above and run it, I get this error when indexing:

Invalid NEST response built from a unsuccessful low level call on POST: /data/_doc
# Audit trail of this API call:
 - [1] BadResponse: Node: http://localhost:9200/ Took: 00:00:00.0159927
# OriginalException: Elasticsearch.Net.ElasticsearchClientException: The remote server returned an error: (400) Bad Request.. Call: Status code 400 from: POST /data/_doc. ServerError: Type: mapper_parsing_exception Reason: "failed to parse" CausedBy: "Type: parse_exception Reason: "field must be either [lat], [lon] or [geohash]"" ---> System.Net.WebException: The remote server returned an error: (400) Bad Request.
   at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
   at Elasticsearch.Net.HttpWebRequestConnection.<>c__DisplayClass5_0`1.<RequestAsync>b__1(IAsyncResult r)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
   at System.Runtime.CompilerServices.TaskAwaiter.ValidateEnd(Task task)
   at Elasticsearch.Net.HttpWebRequestConnection.<RequestAsync>d__5`1.MoveNext()
   --- End of inner exception stack trace ---
# Request:
<Request stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>
# Response:
<Response stream not captured or already read to completion by serializer. Set DisableDirectStreaming() on ConnectionSettings to force it to be set on the response.>

The dumped JSON looks like this:

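{
  "location": {
    "latitude": 50.0,
    "longitude": 30.0
  }
}

So it matches the structure of the JSON that works from the developer console.

To work around this, I wrote a custom JsonConverter that serializes my GeoLocationEntity objects as a string in the format lat,lon: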

// Namespaces used: Newtonsoft.Json, Newtonsoft.Json.Linq
public sealed class GeoLocationConverter : JsonConverter
{
    public override bool CanConvert(Type objectType) =>
        objectType == typeof(GeoLocationEntity);

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        var token = JToken.Load(reader);

        if (!(token is JValue))
        {
            throw new JsonSerializationException("Token was not a primitive.");
        }

        // Expects a "lat,lon" string and splits it into its two components.
        var stringValue = (string)token;
        var split = stringValue.Split(',');
        var latitude = double.Parse(split[0]);
        var longitude = double.Parse(split[1]);

        return new GeoLocationEntity(latitude, longitude);
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var geoLocation = (GeoLocationEntity)value;

        if (geoLocation == null)
        {
            writer.WriteNull();
            return;
        }

        // Writes the geo point as a single "lat,lon" string.
        var geoLocationValue = $"{geoLocation.Latitude},{geoLocation.Longitude}";
        writer.WriteValue(geoLocationValue);
    }
}

Applying this JsonConverter to the serializer settings got me past the problem. However, I'd rather not have to work around it like this.

Can anyone tell me how to fix this properly?

[Answer 1]

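The 6.x Elasticsearch high-level client, NEST, internalized the Json.NET dependency by IL-merging the Json.NET assembly, converting all of its types to internal, and renaming them under Nest.*.

In practice, this means that the client has no direct dependency on Json.NET (read the release blog post to understand why we did this) and does not know about Json.NET types, including JsonPropertyAttribute or JsonConverter.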
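To illustrate why an attribute from another library has no effect, here is a minimal sketch of a name resolver that only honors the one attribute type it knows about. This is not NEST's actual implementation: KnownNameAttribute, ForeignNameAttribute, Point and ResolveName are hypothetical names invented for the illustration, and the camel-casing fallback is an assumption modelled on the JSON dumped in the question.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute that the illustrative resolver recognizes.
[AttributeUsage(AttributeTargets.Property)]
public sealed class KnownNameAttribute : Attribute
{
    public KnownNameAttribute(string name) => Name = name;
    public string Name { get; }
}

// Hypothetical attribute from "another library"; the resolver never looks for it.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ForeignNameAttribute : Attribute
{
    public ForeignNameAttribute(string name) => Name = name;
    public string Name { get; }
}

public sealed class Point
{
    [ForeignName("lat")]
    public double Latitude { get; set; }

    [KnownName("lon")]
    public double Longitude { get; set; }
}

public static class Program
{
    // Use the known attribute if present, otherwise camel-case the CLR property name.
    private static string ResolveName(PropertyInfo property)
    {
        var known = property.GetCustomAttribute<KnownNameAttribute>();
        if (known != null)
        {
            return known.Name;
        }

        return char.ToLowerInvariant(property.Name[0]) + property.Name.Substring(1);
    }

    public static void Main()
    {
        foreach (var property in typeof(Point).GetProperties())
        {
            Console.WriteLine($"{property.Name} -> {ResolveName(property)}");
        }
    }
}
```

In this sketch Latitude resolves to latitude because the foreign attribute is never inspected, while Longitude resolves to lon. That mirrors how [JsonProperty("lat")] in the question still produced a latitude field.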

There are several ways to solve this. First, the following setup may be helpful during development:

// Namespaces used: System, System.Text, Elasticsearch.Net, Nest
var defaultIndex = "default-index";
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));

var settings = new ConnectionSettings(pool)
    .DefaultMappingFor<DataEntity>(m => m
        .IndexName(defaultIndex)
        .TypeName("_doc")
    )
    .DisableDirectStreaming()
    .PrettyJson()
    .OnRequestCompleted(callDetails =>
    {
        // Log the request line and, when it was buffered, the request body.
        if (callDetails.RequestBodyInBytes != null)
        {
            Console.WriteLine(
                $"{callDetails.HttpMethod} {callDetails.Uri} \n" +
                $"{Encoding.UTF8.GetString(callDetails.RequestBodyInBytes)}");
        }
        else
        {
            Console.WriteLine($"{callDetails.HttpMethod} {callDetails.Uri}");
        }

        Console.WriteLine();

        // Log the status code and, when it was buffered, the response body.
        if (callDetails.ResponseBodyInBytes != null)
        {
            Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                     $"{Encoding.UTF8.GetString(callDetails.ResponseBodyInBytes)}\n" +
                     $"{new string('-', 30)}\n");
        }
        else
        {
            Console.WriteLine($"Status: {callDetails.HttpStatusCode}\n" +
                     $"{new string('-', 30)}\n");
        }
    });

var client = new ElasticClient(settings);

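This writes all requests and responses to the console, so you can see what the client sends to and receives from Elasticsearch. .DisableDirectStreaming() buffers the request and response bytes in memory to make them available to the delegate passed to .OnRequestCompleted(), so it is useful for development, but you probably don't want it in production because it comes at a performance cost.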

Now, the solutions:

1. Use PropertyNameAttribute

Instead of JsonPropertyAttribute, you can use PropertyNameAttribute to name the properties for serialization:

public sealed class GeoLocationEntity
{
    public GeoLocationEntity(
        double latitude,
        double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    [PropertyName("lat")]
    public double Latitude { get; }

    [PropertyName("lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    public DataEntity(
        GeoLocationEntity location)
    {
        this.Location = location;
    }

    [PropertyName("location")]
    public GeoLocationEntity Location { get; }
}

Usage:

if (client.IndexExists(defaultIndex).Exists)
    client.DeleteIndex(defaultIndex);


var createIndexResponse = client.CreateIndex(defaultIndex, c => c 
    .Mappings(m => m
        .Map<DataEntity>(mm => mm
            .AutoMap()
            .Properties(p => p
                .GeoPoint(g => g
                    .Name(n => n.Location)
                )
            )
        )
    )
);

var indexResponse = client.Index(
    new DataEntity(new GeoLocationEntity(88.59, -98.87)), 
    i => i.Refresh(Refresh.WaitFor)
);

var searchResponse = client.Search<DataEntity>(s => s
    .Query(q => q
        .MatchAll()
    )
);

PropertyNameAttribute behaves much like JsonPropertyAttribute does when you use Json.NET directly.

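2. Use DataMemberAttribute

In this case it works the same way as PropertyNameAttribute, and it is an option if you would prefer that your POCOs not be attributed with NEST types (although I'd argue that the POCOs are tied to Elasticsearch anyway, so tying them to .NET Elasticsearch types is probably not a problem).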
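As a minimal sketch of this option, the POCOs from the question could be attributed as follows, assuming the same lat, lon and location field names. The small Main method is only an illustration added here: it uses reflection to print the name declared on each property and does not talk to Elasticsearch.

```csharp
using System;
using System.Reflection;
using System.Runtime.Serialization;

public sealed class GeoLocationEntity
{
    public GeoLocationEntity(double latitude, double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    // DataMemberAttribute lives in System.Runtime.Serialization, not in NEST.
    [DataMember(Name = "lat")]
    public double Latitude { get; }

    [DataMember(Name = "lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    public DataEntity(GeoLocationEntity location)
    {
        this.Location = location;
    }

    [DataMember(Name = "location")]
    public GeoLocationEntity Location { get; }
}

public static class Program
{
    public static void Main()
    {
        // Illustration only: list the serialized name declared on each property.
        foreach (var type in new[] { typeof(DataEntity), typeof(GeoLocationEntity) })
        {
            foreach (var property in type.GetProperties())
            {
                var attribute = property.GetCustomAttribute<DataMemberAttribute>();
                Console.WriteLine($"{type.Name}.{property.Name} -> {attribute?.Name}");
            }
        }
    }
}
```

The index creation, indexing and search calls shown for option 1 should apply unchanged to these types, since only the attribute on the POCOs differs.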

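3. Use the GeoLocation type

You can replace the GeoLocationEntity type with NEST's GeoLocation type, which maps to the geo_point field datatype. This means one less POCO, and the correct mapping can be inferred from the property type: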

public sealed class DataEntity
{
    public DataEntity(
        GeoLocation location)
    {
        this.Location = location;
    }

    [DataMember(Name = "location")]
    public GeoLocation Location { get; }
}

// ---

if (client.IndexExists(defaultIndex).Exists)
    client.DeleteIndex(defaultIndex);

var createIndexResponse = client.CreateIndex(defaultIndex, c => c 
    .Mappings(m => m
        .Map<DataEntity>(mm => mm
            .AutoMap()
        )
    )
);

var indexResponse = client.Index(
    new DataEntity(new GeoLocation(88.59, -98.87)), 
    i => i.Refresh(Refresh.WaitFor)
);

var searchResponse = client.Search<DataEntity>(s => s
    .Query(q => q
        .MatchAll()
    )
);

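4. Hook up the JsonNetSerializer

NEST allows a custom serializer to be hooked up to take care of serializing your types. A separate NuGet package, NEST.JsonNetSerializer, lets you use Json.NET to serialize your types, with the serializer delegating back to the internal serializer for properties that are NEST types.

First, you need to pass the JsonNetSerializer to the ConnectionSettings constructor:

var settings = new ConnectionSettings(pool, JsonNetSerializer.Default);

Your original code will then work as expected, without the custom JsonConverter: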

public sealed class GeoLocationEntity
{
    public GeoLocationEntity(
        double latitude,
        double longitude)
    {
        this.Latitude = latitude;
        this.Longitude = longitude;
    }

    [JsonProperty("lat")]
    public double Latitude { get; }

    [JsonProperty("lon")]
    public double Longitude { get; }
}

public sealed class DataEntity
{
    public DataEntity(
        GeoLocationEntity location)
    {
        this.Location = location;
    }

    [JsonProperty("location")]
    public GeoLocationEntity Location { get; }
}


// ---

if (client.IndexExists(defaultIndex).Exists)
    client.DeleteIndex(defaultIndex);


var createIndexResponse = client.CreateIndex(defaultIndex, c => c 
    .Mappings(m => m
        .Map<DataEntity>(mm => mm
            .AutoMap()
            .Properties(p => p
                .GeoPoint(g => g
                    .Name(n => n.Location)
                )
            )
        )
    )
);

var indexResponse = client.Index(
    new DataEntity(new GeoLocationEntity(88.59, -98.87)), 
    i => i.Refresh(Refresh.WaitFor)
);

var searchResponse = client.Search<DataEntity>(s => s
    .Query(q => q
        .MatchAll()
    )
);

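I've listed this option last because, internally, handing serialization off to Json.NET in this way has a performance and allocation overhead. It's included to provide flexibility, but I'd recommend using it only when you really need to, for example to achieve custom serialization of a POCO whose serialized structure is unconventional. We're working on much faster serialization that will reduce this overhead in the future.

[Comments]

PropertyNameAttribute saved my day!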
