OpenShift Aggregated Logging: Parse Apache access log

Posted 2023-02-15

Published: 2017-08-11 10:53:48

Question:

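When using OpenShift Aggregated Logging, logs are fed into Elasticsearch just fine. However, each line logged by Apache ends up whole in the message field.

I would like to build queries in Kibana that can access the URL, the status code, and other fields individually. That requires parsing the Apache access log specifically.

How do I do that?

Here is a sample entry as seen in Kibana: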


{
  "_index": "42-steinbruchsteiner-staging.3af0bedd-eebc-11e6-af4b-005056a62fa6.2017.03.29",
  "_type": "fluentd",
  "_id": "AVsY3aSK190OXhxv4GIF",
  "_score": null,
  "_source": {
    "time": "2017-03-29T07:00:25.595959397Z",
    "docker_container_id": "9f4fa85a626d2f5197f0028c05e8e42271db7a4c674cc145204b67b6578f3378",
    "kubernetes_namespace_name": "42-steinbruchsteiner-staging",
    "kubernetes_pod_id": "56c61b65-0b0e-11e7-82e9-005056a62fa6",
    "kubernetes_pod_name": "php-app-3-weice",
    "kubernetes_container_name": "php-app",
    "kubernetes_labels_deployment": "php-app-3",
    "kubernetes_labels_deploymentconfig": "php-app",
    "kubernetes_labels_name": "php-app",
    "kubernetes_host": "itsrv1564.esrv.local",
    "kubernetes_namespace_id": "3af0bedd-eebc-11e6-af4b-005056a62fa6",
    "hostname": "itsrv1564.esrv.local",
    "message": "10.1.3.1 - - [29/Mar/2017:01:59:21 +0200] \"GET /kwf/status/health HTTP/1.1\" 200 2 \"-\" \"Go-http-client/1.1\"\n",
    "version": "1.3.0"
  },
  "fields": {
    "time": [
      1490770825595
    ]
  },
  "sort": [
    1490770825595
  ]
}
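
To make the goal concrete, here is a minimal sketch of the field split the question asks for, applied to the message value from the sample entry. It assumes the line follows Apache's "combined" log format; the class name AccessLogParser, the method parse, and the output field names are hypothetical and chosen only for illustration. It shows what a parser has to extract, not how the OpenShift logging stack is configured.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of OpenShift, fluentd, or Apache.
class AccessLogParser {
    // Assumed layout (Apache "combined" format):
    // host ident user [time] "method url protocol" status bytes "referer" "user-agent"
    private static final Pattern COMBINED = Pattern.compile(
        "^(\\S+) (\\S+) (\\S+) \\[([^\\]]+)\\] "
        + "\"(\\S+) (\\S+) (\\S+)\" (\\d{3}) (\\S+) "
        + "\"([^\"]*)\" \"([^\"]*)\"\\s*$");

    /** Returns the extracted fields, or an empty map if the line does not match. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = COMBINED.matcher(line);
        if (!m.matches()) {
            // Malformed request lines (for example a bare "-") land here.
            return fields;
        }
        fields.put("remote_host", m.group(1));
        fields.put("ident", m.group(2));
        fields.put("user", m.group(3));
        fields.put("time", m.group(4));
        fields.put("method", m.group(5));
        fields.put("url", m.group(6));
        fields.put("protocol", m.group(7));
        fields.put("status", m.group(8));
        fields.put("bytes", m.group(9));   // may be "-" when no body was sent
        fields.put("referer", m.group(10));
        fields.put("user_agent", m.group(11));
        return fields;
    }

    public static void main(String[] args) {
        // The message value from the sample entry, including its trailing newline.
        String line = "10.1.3.1 - - [29/Mar/2017:01:59:21 +0200] "
            + "\"GET /kwf/status/health HTTP/1.1\" 200 2 \"-\" \"Go-http-client/1.1\"\n";
        parse(line).forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

Each key printed by main corresponds to one field that could then be queried separately in Kibana, provided the split happens before the record is indexed.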

Comments:

"That requires parsing the Apache access log specifically. How do I do that?" Is this your question?

Answer 1:

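Disclaimer: I have not tested this in OpenShift, and I don't know which technology stack your microservices use.

This is how I do it for a Spring Boot application (with logback) deployed in Kubernetes.

1. Use the logstash encoder for logback (this writes the logs in JSON format, which is friendlier to the ELK stack)

I have a Gradle dependency to enable it: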

compile "net.logstash.logback:logstash-logback-encoder:3.5"

Then configure LogstashEncoder as the encoder of the appender in logback-spring.groovy or logback-spring.xml (or logback.xml).

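2. Use a filter or library to write the access log

For 2., use one of the following:

A. The "net.rakugakibox.springbootext:spring-boot-ext-logback-access:1.6" library

(this is what I am using)

It produces nicely formatted JSON, like this: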

  
{
   "@timestamp":"2017-03-29T09:43:09.536-05:00",
   "@version":1,
   "@message":"0:0:0:0:0:0:0:1 - - [2017-03-29T09:43:09.536-05:00] \"GET /orders/v1/items/42 HTTP/1.1\" 200 991",
   "@fields.method":"GET",
   "@fields.protocol":"HTTP/1.1",
   "@fields.status_code":200,
   "@fields.requested_url":"GET /orders/v1/items/42 HTTP/1.1",
   "@fields.requested_uri":"/orders/v1/items/42",
   "@fields.remote_host":"0:0:0:0:0:0:0:1",
   "@fields.HOSTNAME":"0:0:0:0:0:0:0:1",
   "@fields.content_length":991,
   "@fields.elapsed_time":48,
   "HOSTNAME":"ABCD"
}

or

B. Logback's TeeFilter

or

C. Spring's CommonsRequestLoggingFilter (not really tested)

Add a bean definition:

    @Bean
    public CommonsRequestLoggingFilter requestLoggingFilter() {
        // Include client info, query string, and request payload in each log message
        CommonsRequestLoggingFilter crlf = new CommonsRequestLoggingFilter();
        crlf.setIncludeClientInfo(true);
        crlf.setIncludeQueryString(true);
        crlf.setIncludePayload(true);
        return crlf;
    }

Then set the log level of org.springframework.web.filter.CommonsRequestLoggingFilter to DEBUG, which can be done by adding this to application.properties:

logging.level.org.springframework.web.filter.CommonsRequestLoggingFilter=DEBUG
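
Independent of which option is used, the aim of steps 1 and 2 is one JSON object per log line with the access fields as separate keys. The sketch below illustrates that output shape using only the JDK. The class JsonLineLogger and its methods are hypothetical and hand-rolled for illustration; in a real application the logstash encoder would produce the JSON, and the field names here are simply borrowed from the option A sample.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the logstash-logback-encoder API.
class JsonLineLogger {
    /** Escapes a value for use inside a JSON string literal. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Renders a flat map as a single-line JSON object, keeping insertion order. */
    static String toJsonLine(Map<String, Object> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append('"').append(escape(e.getKey())).append("\":");
            Object v = e.getValue();
            if (v instanceof Number || v instanceof Boolean) {
                sb.append(v);  // written unquoted; NaN and Infinity are not handled
            } else {
                sb.append('"').append(escape(String.valueOf(v))).append('"');
            }
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("@fields.method", "GET");
        fields.put("@fields.requested_uri", "/orders/v1/items/42");
        fields.put("@fields.status_code", 200);
        // One JSON object per line on stdout, where the container log collector picks it up
        System.out.println(toJsonLine(fields));
    }
}
```

Whether the collector then turns such a line into separate Elasticsearch fields depends on its configuration, which this sketch does not cover.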

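Comments:

Will fluentd automatically detect and parse JSON log messages?

It should (take that with a grain of salt, as I have not tested it myself). I don't see why fluentd would not respect structured logs.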
