Integrating Elasticsearch and the ik Analyzer into a Laravel Personal Blog

Posted johnson108178


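In an earlier post I described a personal blog built with Laravel 5.5 and Vue. The GitHub repository is https://github.com/Johnson19900110/phpJourney. I recently had some free time, so I decided to integrate Elasticsearch into it.

I'm rather lazy and write my posts on cnblogs (博客园), so my personal blog was never kept in sync. To fix that, I used a PHP crawler library, fabpot/goutte, to scrape my cnblogs posts into my own blog.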


The code is as follows:

<?php

namespace App\Libraries;

use App\Post;
use Goutte\Client;
use Symfony\Component\DomCrawler\Crawler;

class CnblogsPostSpider
{
    protected $client;

    protected $crawler;

    protected $urls = [];

    public function __construct(Client $client, $url)
    {
        $this->client = $client;
        // Fetch the post list page once; getUrls() reads the post links from it.
        $this->crawler = $client->request('GET', $url);
    }

    public function getUrls()
    {
        // Collect the link of every post on the list page.
        $urls = $this->crawler->filter('.postTitle > a')->each(function ($node) {
            return $node->attr('href');
        });

        foreach ($urls as $url) {
            $crawler = $this->client->request('GET', $url);

            $cnBlogId = $this->getCnBlogId($url);

            $post = new Post();
            if ($post->where('cnblogs_id', $cnBlogId)->count()) {
                // This post was crawled before: only update the view and comment counts.
                $post->where('cnblogs_id', $cnBlogId)->update([
                    'views'    => $this->getViews($crawler),
                    'comments' => $this->getComments($crawler),
                ]);
            } else {
                $post->insert([
                    'title'       => $this->getTitle($crawler),
                    'category_id' => 1,
                    'content'     => $this->getContent($crawler),
                    'user_id'     => 1,
                    'views'       => $this->getViews($crawler),
                    'comments'    => $this->getComments($crawler),
                    'cnblogs_id'  => $cnBlogId,
                    'cnblogs_url' => $url,
                    'created_at'  => $this->getCreatedAt($crawler),
                ]);
            }
        }
    }

    // Extract the numeric post ID from a URL ending in "<id>.html".
    public function getCnBlogId($url)
    {
        $url_arr = explode('/', $url);
        $last = array_pop($url_arr);
        $path_arr = explode('.', $last);
        return intval(array_shift($path_arr));
    }

    protected function getTitle(Crawler $crawler)
    {
        return trim($crawler->filter('.postTitle > a')->text());
    }

    protected function getContent(Crawler $crawler)
    {
        return trim($crawler->filter('#cnblogs_post_body')->text());
    }

    protected function getViews(Crawler $crawler)
    {
        return intval(trim($crawler->filter('#post_view_count')->text()));
    }

    protected function getComments(Crawler $crawler)
    {
        return intval($crawler->filter('#post_comment_count')->text());
    }

    protected function getCreatedAt(Crawler $crawler)
    {
        return trim($crawler->filter('#post-date')->text());
    }
}
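The spider decides between insert and update based on the numeric ID parsed from each post URL, so that parsing step is worth checking in isolation. Below is a minimal standalone sketch of it; the function name `extractCnblogsId` and the sample URL are illustrative and not part of the original project, and it assumes post URLs end in `<id>.html`.

```php
<?php

/**
 * Extract the numeric post ID from a cnblogs post URL.
 * Assumes the last path segment looks like "<id>.html".
 */
function extractCnblogsId(string $url): int
{
    // Keep only the path so a query string or fragment cannot leak into the ID.
    $path = (string) parse_url($url, PHP_URL_PATH);

    // "/example/p/1234567.html" -> "1234567.html" -> "1234567"
    $name = pathinfo(basename($path), PATHINFO_FILENAME);

    // intval() returns 0 when the segment is not numeric.
    return intval($name);
}

// Hypothetical URL, used only to show the shape of the input.
var_dump(extractCnblogsId('https://www.cnblogs.com/example/p/1234567.html')); // int(1234567)
```

Unlike the `explode`-based method in the class, this variant ignores a trailing query string; whether that matters depends on the links the list page actually emits.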

Next, integrate ES using Laravel Scout.

First, install the ES package:

 composer require tamayo/laravel-scout-elastic 

This package depends on Laravel Scout, so Scout is installed along with it.

Then publish the config and register the service providers.
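The post does not show this step. As a rough sketch based on the two packages' READMEs, the relevant fragments would look something like the following; the environment variable names and default values are assumptions to adapt to your setup.

```php
// Publish Scout's config file first (run in a shell):
//   php artisan vendor:publish --provider="Laravel\Scout\ScoutServiceProvider"

// config/app.php: register both providers.
'providers' => [
    // ...
    Laravel\Scout\ScoutServiceProvider::class,
    ScoutEngines\Elasticsearch\ElasticsearchProvider::class,
],

// config/scout.php: select the driver and describe the ES connection.
'driver' => env('SCOUT_DRIVER', 'elasticsearch'),

'elasticsearch' => [
    'index' => env('ELASTICSEARCH_INDEX', 'laravel'),
    'hosts' => [
        // No port here: the es:init command later in this post appends ":9200/" itself.
        env('ELASTICSEARCH_HOST', 'http://localhost'),
    ],
],
```

With Laravel 5.5 package auto-discovery, the Scout provider may already be registered, in which case only the Elasticsearch provider needs to be added by hand.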

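Now ES itself can be installed. We want the ik plugin for Chinese word segmentation, and working out how to install it by hand would cost a lot of effort.

I'm also new to ES, so let's simply use a ready-made distribution: https://github.com/medcl/elasticsearch-rtf

At the time of writing, this project ships Elasticsearch 5.1.1, with the ik plugin already installed.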

$ curl http://localhost:9200

{
  "name" : "Rkx3vzo",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "Ww9KIfqSRA-9qnmj1TcnHQ",
  "version" : {
    "number" : "5.1.1",
    "build_hash" : "5395e21",
    "build_date" : "2016-12-06T12:36:15.409Z",
    "build_snapshot" : false,
    "lucene_version" : "6.3.0"
  },
  "tagline" : "You Know, for Search"
}

If you see this response, ES is installed and running.
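The version check above does not confirm that the ik analyzer is loaded. One way to check is Elasticsearch's `_analyze` API. The sketch below is only an illustration: the helper names `buildAnalyzeBody` and `extractTokens` are hypothetical, and the commented usage assumes ES is listening on localhost:9200 and that Guzzle is installed.

```php
<?php

/**
 * Build the JSON body for a request to Elasticsearch's _analyze API.
 */
function buildAnalyzeBody(string $text, string $analyzer = 'ik_smart'): array
{
    return ['analyzer' => $analyzer, 'text' => $text];
}

/**
 * Pull the token strings out of a decoded _analyze response.
 * Returns an empty array when the response has no "tokens" key.
 */
function extractTokens(array $response): array
{
    return array_column($response['tokens'] ?? [], 'token');
}

// Usage with Guzzle (not executed here):
//
//   $client   = new GuzzleHttp\Client();
//   $response = $client->post('http://localhost:9200/_analyze', [
//       'json' => buildAnalyzeBody('中华人民共和国国歌'),
//   ]);
//   print_r(extractTokens(json_decode((string) $response->getBody(), true)));
```

If the request fails with an error about an unknown analyzer, the plugin is probably not installed in the running node.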

Now we can create an artisan command that sets up the ES index and template.

<?php

namespace App\Console\Commands;

use GuzzleHttp\Client;
use Illuminate\Console\Command;

class InitEs extends Command
{
    /**
     * The name and signature of the console command.
     *
     * @var string
     */
    protected $signature = 'es:init';

    /**
     * The console command description.
     *
     * @var string
     */
    protected $description = 'Init es to create index';

    /**
     * Create a new command instance.
     *
     * @return void
     */
    public function __construct()
    {
        parent::__construct();
    }

    /**
     * Execute the console command.
     *
     * @return mixed
     */
    public function handle()
    {
        $client = new Client();
        $this->createTemplate($client);
        $this->createIndex($client);
    }

    /**
     * Create an index template that maps every string field to
     * an ik_smart-analyzed text field with a keyword sub-field.
     */
    public function createTemplate(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . '_template/rtf';
        $client->put($url, [
            'json' => [
                'template' => '*',
                'settings' => [
                    'number_of_shards' => 1
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => true
                        ],
                        'dynamic_templates' => [
                            [
                                'strings' => [
                                    'match_mapping_type' => 'string',
                                    'mapping' => [
                                        'type' => 'text',
                                        'analyzer' => 'ik_smart',
                                        'ignore_above' => 256,
                                        'fields' => [
                                            'keyword' => [
                                                'type' => 'keyword'
                                            ]
                                        ]
                                    ]
                                ]
                            ]
                        ]
                    ]
                ]
            ]
        ]);
    }

    /**
     * Create the index named in config/scout.php.
     */
    public function createIndex(Client $client)
    {
        $url = config('scout.elasticsearch.hosts')[0] . ':9200/' . config('scout.elasticsearch.index');
        $client->put($url, [
            'json' => [
                'settings' => [
                    'refresh_interval' => '5s',
                    'number_of_shards' => 1,
                    'number_of_replicas' => 0,
                ],
                'mappings' => [
                    '_default_' => [
                        '_all' => [
                            'enabled' => false
                        ]
                    ]
                ]
            ]
        ]);
    }
}

tamayo/laravel-scout-elastic has no highlight support, so a small modification is needed: create an EsEngine class that extends ElasticsearchEngine and override a few methods.

<?php
/**
 * Created by PhpStorm.
 * User: johnson
 * Date: 2018/6/14
 * Time: 3:10 PM
 */

namespace App\Libraries;

use Laravel\Scout\Builder;
use ScoutEngines\Elasticsearch\ElasticsearchEngine;
use Illuminate\Database\Eloquent\Collection;

class EsEngine extends ElasticsearchEngine
{
    public function search(Builder $builder)
    {
        return $this->performSearch($builder, array_filter([
            'numericFilters' => $this->filters($builder),
            'size' => $builder->limit,
        ]));
    }

    protected function performSearch(Builder $builder, array $options = [])
    {
        $params = [
            'index' => $this->index,
            'type' => $builder->model->searchableAs(),
            'body' => [
                'query' => [
                    'bool' => [
                        'must' => [
                            [
                                'query_string' => [
                                    'query' => "*{$builder->query}*",
                                ]
                            ]
                        ]
                    ]
                ],
            ]
        ];

        /**
         * Apply the model's highlight configuration, if any.
         */
        if ($builder->model->searchSettings
            && isset($builder->model->searchSettings['attributesToHighlight'])
        ) {
            $attributes = $builder->model->searchSettings['attributesToHighlight'];
            foreach ($attributes as $attribute) {
                // An empty object means "highlight this field with default settings".
                $params['body']['highlight']['fields'][$attribute] = new \stdClass();
            }
        }

        if ($sort = $this->sort($builder)) {
            $params['body']['sort'] = $sort;
        }

        if (isset($options['from'])) {
            $params['body']['from'] = $options['from'];
        }

        if (isset($options['size'])) {
            $params['body']['size'] = $options['size'];
        }

        if (isset($options['numericFilters']) && count($options['numericFilters'])) {
            $params['body']['query']['bool']['must'] = array_merge(
                $params['body']['query']['bool']['must'],
                $options['numericFilters']
            );
        }

        return $this->elastic->search($params);
    }

    public function map($results, $model)
    {
        if ($results['hits']['total'] === 0) {
            return Collection::make();
        }

        $keys = collect($results['hits']['hits'])
            ->pluck('_id')->values()->all();

        $models = $model->whereIn(
            $model->getKeyName(), $keys
        )->get()->keyBy($model->getKeyName());

        return collect($results['hits']['hits'])->map(function ($hit) use ($model, $models) {
            $one = $models[$hit['_id']];

            /**
             * If the hit carries highlight data, attach it to the model.
             */
            if (isset($hit['highlight'])) {
                $one->highlight = $hit['highlight'];
            }

            return $one;
        });
    }
}
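The post does not show how Scout is told to use this class. One common approach, sketched here as an assumption rather than the author's actual code, is to register it as a custom driver through Scout's `EngineManager` in a service provider. The driver name `es` is arbitrary, and the constructor arguments assume the parent engine takes an Elasticsearch client and an index name.

```php
<?php

namespace App\Providers;

use App\Libraries\EsEngine;
use Elasticsearch\ClientBuilder;
use Illuminate\Support\ServiceProvider;
use Laravel\Scout\EngineManager;

class AppServiceProvider extends ServiceProvider
{
    public function boot()
    {
        // Register a custom Scout driver called "es" that builds our engine.
        resolve(EngineManager::class)->extend('es', function () {
            return new EsEngine(
                ClientBuilder::create()
                    ->setHosts(config('scout.elasticsearch.hosts'))
                    ->build(),
                config('scout.elasticsearch.index')
            );
        });
    }
}
```

With this in place, setting `SCOUT_DRIVER=es` in `.env` would route searches through EsEngine, and `php artisan scout:import "App\Post"` would index the existing posts.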

What we want to search here is blog posts, so add the following to the Post model:

    use Searchable;

    public $searchSettings = [
        'attributesToHighlight' => ['*'],
    ];

    public $highlight = [];

Then use Scout's search method when querying.

public function search(Request $request)
{
    $q = $request->get('q', false);

    $posts = [];
    if ($q !== false) {
        $posts = Post::search($q)->paginate();
    }

    return view('index', compact('posts', 'q'));
}

The returned models carry a highlight attribute, so the template can use it like this:


@if(isset($post->highlight['content']))
    @foreach($post->highlight['content'] as $item)
        ...{!! $item !!}...
    @endforeach
@else
    {{ empty($post->content) ? '...' : mb_substr($post->content, 0, 300) . '...' }}
@endif
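The template logic is easier to test when pulled into plain PHP. The function below is a sketch of the same decision, not code from the original project: the name `buildExcerpt` is hypothetical, and it needs the mbstring extension, as the template above does.

```php
<?php

/**
 * Build a search-result excerpt.
 * Uses the highlighted fragments for "content" when present; otherwise
 * falls back to the first $limit characters of the plain content.
 */
function buildExcerpt(array $highlight, string $content, int $limit = 300): string
{
    if (!empty($highlight['content'])) {
        // Fragments are emitted unescaped, like {!! $item !!} in the template,
        // so the highlight tags added by Elasticsearch survive.
        $parts = array_map(function ($fragment) {
            return '...' . $fragment . '...';
        }, $highlight['content']);

        return implode('', $parts);
    }

    if ($content === '') {
        return '...';
    }

    // The fallback is escaped, like {{ }} in the template.
    return htmlspecialchars(mb_substr($content, 0, $limit)) . '...';
}

echo buildExcerpt(['content' => ['uses <em>Laravel</em> Scout']], 'ignored'), PHP_EOL;
echo buildExcerpt([], 'Plain content without a match', 13), PHP_EOL;
```

Because the highlighted fragments are printed unescaped, any markup in the indexed content reaches the page as-is. The crawler above stores text only, but for other data sources it may be worth looking at the `encoder` option of Elasticsearch's highlighter.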

The final result looks like this:

[Image]








