ES IK Pinyin Plugin: Pitfalls and Fixes
Recently, while inserting documents into ES, I suddenly hit the following error: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards startOffset=0,endOffset=3,lastStartOffset=3 for field 'title.pinyin']]
Judging by the field name, the error appeared to be raised during IK pinyin analysis.
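To make the error message easier to read, here is a simplified Python model of the invariant it describes: each token's start offset must be non-negative, its end offset must not precede its start, and start offsets must never decrease from one token to the next. The function name `check_offsets` and the token stream are hypothetical; this is a sketch of the rule stated in the message, not the actual Lucene code.

```python
def check_offsets(tokens):
    """Raise ValueError if (start, end) offsets violate the rule in the error message.

    tokens: iterable of (start_offset, end_offset) pairs in emission order.
    """
    last_start = 0
    for start, end in tokens:
        if start < 0 or end < start or start < last_start:
            raise ValueError(
                "offsets must not go backwards "
                f"startOffset={start},endOffset={end},lastStartOffset={last_start}"
            )
        last_start = start


# Offsets that only move forward pass the check.
check_offsets([(0, 1), (1, 2), (3, 4)])

# Hypothetical stream: a token starting at 3 followed by one starting at 0
# reproduces the numbers from the reported error.
try:
    check_offsets([(3, 4), (0, 3)])
    message = ""
except ValueError as exc:
    message = str(exc)
```

A token filter that emits a token whose start offset is lower than that of the previous token breaks this rule, which matches the numbers in the message above.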
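Not every document failed on insert; only those whose title field contained special characters. Searching online suggested this was a bug in the pinyin plugin that could be fixed simply by upgrading the plugin.
However, I was using Alibaba Cloud's hosted ES service, which does not allow plugin upgrades, so that approach was not an option.
Another explanation was that setting the parameter "ignore_pinyin_offset" to false and then bulk-writing data to a pinyin-analyzed field triggers the "startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards" exception, so setting ignore_pinyin_offset to true should be enough.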
Configuration before the change:
Configuration after the change:
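The change described above can be sketched as follows, written as Python dictionaries that mirror the JSON settings body. Only the `ignore_pinyin_offset` flag comes from the text; the filter name `my_pinyin` is borrowed from later in this post, and every other option of the real filter is omitted, so treat this as an assumption about the shape of the config rather than the exact settings used.

```python
import json

# Hypothetical pinyin token filter before the change:
# only ignore_pinyin_offset is taken from the text.
filter_before = {
    "my_pinyin": {
        "type": "pinyin",
        "ignore_pinyin_offset": False,
    }
}

# The same filter with the flag flipped to true.
filter_after = {
    "my_pinyin": {**filter_before["my_pinyin"], "ignore_pinyin_offset": True}
}

# JSON fragment that would go under settings.analysis.filter.
print(json.dumps(filter_after, indent=2))
```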
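After the change I reindexed the documents, and the same error still appeared.
After two days of this, I realized that since this is a bug in this particular plugin version and Alibaba Cloud will not upgrade it, a workaround was the better route.
Early in the investigation I had found that the bug only shows up when the title contains special characters. So adding one more filter ahead of the pinyin analysis to strip those characters should make the problem go away.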
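So I first configured a filter:
Then I added this filter to the analyzer:
With this setup, before ik_pinyin_analyzer tokenizes the text, the regular expression in specialCharactersFilter strips out all special characters, and only then does my_pinyin produce the pinyin tokens.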
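A minimal sketch of such a setup, again as a Python dictionary mirroring the JSON settings. The names `specialCharactersFilter`, `ik_pinyin_analyzer` and `my_pinyin` come from the text. Everything else is an assumption: the filter is modelled as a `pattern_replace` char filter because it has to run before tokenization, the regular expression (keep ASCII letters, digits and common CJK ideographs) is a guess at what "special characters" means, and `ik_max_word` is assumed as the tokenizer.

```python
import re

# Assumed pattern: anything that is not an ASCII letter, a digit,
# or a CJK ideograph in U+4E00..U+9FA5 counts as a special character.
SPECIAL_CHARS = "[^A-Za-z0-9\u4e00-\u9fa5]"

settings = {
    "analysis": {
        "char_filter": {
            "specialCharactersFilter": {
                "type": "pattern_replace",
                "pattern": SPECIAL_CHARS,
                "replacement": "",
            }
        },
        # Options of the real pinyin filter are omitted here.
        "filter": {"my_pinyin": {"type": "pinyin"}},
        "analyzer": {
            "ik_pinyin_analyzer": {
                "type": "custom",
                "char_filter": ["specialCharactersFilter"],
                "tokenizer": "ik_max_word",  # assumed IK tokenizer
                "filter": ["my_pinyin"],
            }
        },
    }
}


def strip_special(text):
    """Local simulation of what the char filter does to a title."""
    return re.sub(SPECIAL_CHARS, "", text)


cleaned = strip_special("ES-IK(拼音)插件!")
print(cleaned)
```

Running the regular expression locally against a few problematic titles is a cheap way to check the pattern before reindexing.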
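After this change the error no longer occurred.
Note: the configuration above is specific to my setup. Adapt it to your own conventions and requirements rather than copying it; the key point is simply to create a new filter and place it ahead of pinyin.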
Migrating NHibernate from .NET Framework to .NET Standard (.NET Core 2.2): Pitfalls and Fixes
First, here is the session-creation code used on .NET Framework:
using System.Reflection;
using NHibernate;
using NHibernate.Cfg;
using NHibernate.Mapping.ByCode;

public class SessionBuilder
{
    private static ISessionFactory _sessionFactory = null;

    public SessionBuilder()
    {
        if (_sessionFactory == null)
        {
            // Create the ISessionFactory
            _sessionFactory = GetSessionFactory();
        }
    }

    /// <summary>
    /// Create the ISessionFactory
    /// </summary>
    /// <returns></returns>
    public static ISessionFactory GetSessionFactory()
    {
        //HibernatingRhinos.Profiler.Appender.NHibernate.NHibernateProfiler.Initialize();

        var mappers = new ModelMapper();
        mappers.AddMappings(Assembly.GetExecutingAssembly().GetExportedTypes());

        var cfg = new Configuration().Configure();
        cfg.AddDeserializedMapping(mappers.CompileMappingForAllExplicitlyAddedEntities(), "");

        return cfg.BuildSessionFactory();
    }

    /// <summary>
    /// Open an ISession
    /// </summary>
    /// <returns></returns>
    public static ISession GetSession()
    {
        if (_sessionFactory == null || _sessionFactory.IsClosed)
        {
            // Create the ISessionFactory
            _sessionFactory = GetSessionFactory();
        }

        // The listing as published shows no return statement here; opening a
        // session from the factory is assumed so that the method compiles.
        return _sessionFactory.OpenSession();
    }

    #region Open a new session
    public static ISession OpenSession()
    {
        return _sessionFactory.OpenSession();
    }
    #endregion
}
To talk to the database, NHibernate first has to be configured in web.config (the database is SQL Server):
<configuration>
  <configSections>
    <section name="hibernate-configuration" type="NHibernate.Cfg.ConfigurationSectionHandler, NHibernate" />
  </configSections>
  <!-- NHibernate configuration start -->
  <hibernate-configuration xmlns="urn:nhibernate-configuration-2.2">
    <session-factory>
      <property name="connection.provider">NHibernate.Connection.DriverConnectionProvider</property>
      <property name="connection.driver_class">NHibernate.Driver.SqlClientDriver</property>
      <property name="dialect">NHibernate.Dialect.MsSql2012Dialect</property>
      <property name="show_sql">false</property>
      <property name="connection.connection_string_name">ylsdai</property>
      <property name="adonet.batch_size">30</property>
      <property name="generate_statistics">false</property>
      <property name="format_sql">true</property>
      <property name="command_timeout">60</property>
      <property name="current_session_context_class">web</property>
      <!--<property name="cache.provider_class">NHibernate.Caches.SysCache2.SysCacheProvider,NHibernate.Caches.SysCache2</property>
      <property name="cache.default_expiration">120</property>
      <property name="cache.use_second_level_cache">true</property>
      <property name="cache.use_query_cache">true</property>-->
    </session-factory>
  </hibernate-configuration>
  <!-- NHibernate configuration end -->
  <connectionStrings>
    <!--test_db-->
    <add name="ylsdai" connectionString="data source=0.0.0.1,111;database=test_db;uid=sa;pwd=123456" providerName="System.Data.SqlClient" />
  </connectionStrings>
</configuration>
Mapping class:
using NHibernate.Mapping.ByCode;
using NHibernate.Mapping.ByCode.Conformist;

namespace ClassMapping
{
    #region CityMap
    public class CityMap : ClassMapping<City>
    {
        public CityMap()
        {
            SelectBeforeUpdate(true);
            DynamicUpdate(true);
            //Cache(p => p.Usage(CacheUsage.ReadWrite));
            Id(p => p.CityId, map => map.Generator(Generators.Native));
            Property(p => p.OldCityId);
            Property(p => p.ParentId);
            Property(p => p.CityName);
            Property(p => p.EnCityName);
            Property(p => p.CityImgUrl);
            Property(p => p.LatLng);
            Property(p => p.Keywords);
            Property(p => p.IsRecommend);
            Property(p => p.IsDepart);
            Property(p => p.AreaId);
            Property(p => p.CityContent);
        }
    }
    #endregion
}
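Once the entities are written, the rest can be implemented.
However, the migration to .NET Core ran into the following problems:
1. Creating a session the .NET Framework way no longer works.
2. The NHibernate settings in the config file can no longer be read.
3. As a consequence of problem 1, the mapping classes cannot be wired up either.
Fortunately, the open-source helper library Fluent NHibernate is available on .NET Core and solves the problems above, although using it brought a few pitfalls of its own:
1. Most documentation online assumes the mappings live in XML files. When the real project keeps them in .cs files, moving a large number of mappings from .cs to .xml becomes very tedious.
The eventual solution was to change the GetSessionFactory method used for session creation as follows: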
/// <summary>
/// Create the ISessionFactory
/// </summary>
/// <returns></returns>
public static ISessionFactory GetSessionFactory()
{
    // Despite the variable name, this is the loaded Assembly that holds the mapping classes
    var assemblyName = Assembly.Load("ClassMapping");

    // Local function that applies the settings formerly kept in web.config
    NHibernate.Cfg.Configuration setCfg(NHibernate.Cfg.Configuration c)
    {
        c.Properties.Add("show_sql", "true");
        c.Properties.Add("adonet.batch_size", "1000");
        c.Properties.Add("generate_statistics", "false");
        c.Properties.Add("format_sql", "true");
        c.Properties.Add("command_timeout", "60");
        c.Properties.Add("current_session_context_class", "web");
        return c;
    }

    return Fluently.Configure()
        .Database(FluentNHibernate.Cfg.Db.MsSqlConfiguration.MsSql2012
            .ConnectionString("Server=0.0.0.1,111;Database=test_db;Uid=sa;Pwd=123456;"))
        .Mappings(m => m.FluentMappings.AddFromAssembly(assemblyName))
        .ExposeConfiguration(c => new SchemaUpdate(c).Execute(true, false))
        .ExposeConfiguration(c => setCfg(c))
        //.ExposeConfiguration(f => f.SetInterceptor(new SqlStatementInterceptor()))
        .BuildSessionFactory();
}
The mapping classes are pulled in by
.Mappings(m => m.FluentMappings.AddFromAssembly(assemblyName))
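This line loads the mappings from the given assembly, so the mapping classes can live in a separate, newly created project named ClassMapping. The mapping classes themselves change as follows: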
using Entity;
using FluentNHibernate.Mapping;

namespace ClassMapping
{
    #region CityMap
    public class CityMap : ClassMap<City>
    {
        public CityMap()
        {
            Table("City");

            //SelectBeforeUpdate(true);
            //DynamicUpdate(true);
            //Cache(p => p.Usage(CacheUsage.ReadWrite));
            Id(p => p.CityId);
            Map(p => p.OldCityId);
            Map(p => p.ParentId);
            Map(p => p.CityName);
            Map(p => p.EnCityName);
            Map(p => p.CityImgUrl);
            Map(p => p.LatLng);
            Map(p => p.Keywords);
            Map(p => p.IsRecommend);
            Map(p => p.IsDepart);
            Map(p => p.AreaId);
            Map(p => p.CityContent);
        }
    }
    #endregion
}
The migration also involves moving the relationship mappings.
In the previous version they looked like this:
OneToOne(p => p.City, map => { map.Cascade(Cascade.All); map.Lazy(LazyRelation.Proxy); }); // one-to-one
ManyToOne(p => p.Province, map => map.Column("ProvinceId")); // many-to-one
Bag(p => p.Counties, map => { map.Key(k => k.Column("CountyId")); map.OrderBy(k => k.SortId); }, rel => rel.OneToMany()); // one-to-many
The corresponding .NET Core version is:
HasOne(p => p.City).Cascade.All().LazyLoad(); // one-to-one
//References<Province>(r => r.Province).Column("ProvinceId").ForeignKey("ProvinceId").Cascade.None(); // many-to-one; caused problems in actual use
HasMany(p => p.Counties).KeyColumn("CountyId").OrderBy("CountyId").LazyLoad(); // one-to-many
City entity:
[Serializable]
public class City : BaseEntity
{
    /// <summary>CityId</summary>
    public virtual int CityId { get; set; }

    /// <summary>Version</summary>
    public virtual int Version { get; set; }

    /// <summary>OldCityId</summary>
    public virtual int OldCityId { get; set; }

    /// <summary>ParentId</summary>
    public virtual int ParentId { get; set; }

    /// <summary>City name</summary>
    public virtual string CityName { get; set; }

    /// <summary>City name in English</summary>
    public virtual string EnCityName { get; set; }

    /// <summary>CityImgUrl</summary>
    public virtual string CityImgUrl { get; set; }

    /// <summary>Latitude and longitude</summary>
    public virtual string LatLng { get; set; }

    /// <summary>Keywords</summary>
    public virtual string Keywords { get; set; }

    /// <summary>Whether the city is recommended</summary>
    public virtual bool IsRecommend { get; set; }

    /// <summary>IsDepart</summary>
    public virtual bool IsDepart { get; set; }

    /// <summary>Flight area</summary>
    public virtual int AreaId { get; set; }

    /// <summary>City description</summary>
    public virtual string CityContent { get; set; }
}
Below are the methods that work with the session to make database calls:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using NHibernate;

/// <summary>
/// NHibernate operation helper
/// </summary>
/// <typeparam name="T"></typeparam>
public class DbHelper<T> where T : BaseEntity
{
    private readonly ISession _session = SessionBuilder.GetSession();
    //protected static readonly ILog Log = LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType);

    public DbHelper() { }

    public DbHelper(ISession session)
    {
        _session = session;
    }

    #region Get an entity
    /// <summary>Get an entity (first-level cache)</summary>
    public T Load(int id)
    {
        return _session.Load<T>(id);
    }
    #endregion

    #region Get an entity (cached)
    /// <summary>Get an entity (second-level cache)</summary>
    public T Get(int id)
    {
        return _session.Get<T>(id);
    }
    #endregion

    #region Evict an entity
    /// <summary>Evict an entity from the session</summary>
    public void Evict(object obj)
    {
        _session.Evict(obj);
    }
    #endregion

    /// <summary>Create a query from a SQL statement</summary>
    public ISQLQuery CreateSqlQuery(string sql)
    {
        return _session.CreateSQLQuery(sql);
    }

    /// <summary>Get a criteria query; the argument is passed to CreateCriteria as the entity name</summary>
    public ICriteria GetCriteria(string hql)
    {
        return _session.CreateCriteria(hql);
    }

    #region Get all rows
    /// <summary>Get all rows</summary>
    /// <param name="cacheable">Whether to cache the query</param>
    public IEnumerable<T> GetAll(bool cacheable)
    {
        var ic = _session.CreateCriteria(typeof(T));
        IEnumerable<T> list = ic.SetCacheable(cacheable).List<T>() ?? new List<T>();
        return list;
    }
    #endregion

    #region Insert or update
    /// <summary>Insert an entity and return its generated id</summary>
    public int Save(T entity)
    {
        var id = 0;
        var session = this._session;
        ITransaction tan = session.BeginTransaction();

        try
        {
            entity = session.Merge(entity);
        }
        catch (Exception e)
        {
            // A failed merge is deliberately ignored, as in the original
            //Log.DebugFormat($"Save124,Exception:{e.Message},entity:{JsonConvert.SerializeObject(entity)}");
        }

        try
        {
            tan.Begin();
            id = (int)session.Save(entity);
            session.Flush();
            tan.Commit();
        }
        catch (Exception e)
        {
            //Log.DebugFormat($"Save136,Exception:{e.Message},entity:{JsonConvert.SerializeObject(entity)}");
            tan.Rollback();
            throw;
        }

        return id;
    }
    #endregion

    #region Update
    /// <summary>Update an entity; returns 1 on success</summary>
    public int Update(T entity)
    {
        int result = 0;
        ITransaction tan = _session.BeginTransaction();
        var session = this._session;
        //entity = (T)_session.Merge(entity);
        try
        {
            entity = session.Merge(entity);
        }
        catch (Exception e)
        {
            // A failed merge is deliberately ignored, as in the original
            //Log.DebugFormat($"Update163,Exception:{e.Message},entity:{JsonConvert.SerializeObject(entity)}");
        }
        try
        {
            tan.Begin();
            _session.Update(entity);
            _session.Flush();
            tan.Commit();
            result++;
        }
        catch (Exception e)
        {
            //Log.DebugFormat($"Update175,Exception:{e.Message},entity:{JsonConvert.SerializeObject(entity)}");
            tan.Rollback();
            throw;
        }
        return result;
    }
    #endregion

    #region Delete one row
    /// <summary>Delete one row by id</summary>
    public int DeleteModelById(int id)
    {
        int result = 0;
        object item = Get(id);
        //ITransaction tan = _session.BeginTransaction();
        try
        {
            //tan.Begin();
            _session.Delete(item);
            _session.Flush();
            //tan.Commit();
            result++;
        }
        catch (Exception)
        {
            //tan.Rollback();
            throw;
        }

        return result;
    }
    #endregion

    #region Delete an entity object
    /// <summary>Delete an entity object</summary>
    public int DeleteModel(BaseEntity entity)
    {
        var result = 0;
        //ITransaction tan = _session.BeginTransaction();
        try
        {
            //tan.Begin();
            _session.Delete(entity);
            _session.Flush();
            //tan.Commit();
            result++;
        }
        catch (Exception)
        {
            //tan.Rollback();
            throw;
        }
        return result;
    }
    #endregion

    /// <summary>Delete using a SQL statement</summary>
    public void DeleteList(string sql)
    {
        // ExecuteUpdate is the call intended for DML statements (the original used UniqueResult)
        _session.CreateSQLQuery(sql).ExecuteUpdate();
    }

    /// <summary>Delete a list of entities</summary>
    public void DeleteList(IList<T> models)
    {
        foreach (var model in models)
        {
            DeleteModel(model);
        }
    }

    public bool IsExist(Expression<Func<T, bool>> keyWhere)
    {
        var ss = _session.QueryOver<T>().Where(keyWhere);
        return ss.RowCount() > 0;
    }

    #region GetQuery
    /// <summary>GetQuery</summary>
    public IQueryable<T> GetQuery()
    {
        try
        {
            return _session.Query<T>();
        }
        catch (Exception e)
        {
            // Fall back to a fresh session if the current one cannot be used
            var session = SessionBuilder.GetSession();
            return session.Query<T>();
        }
    }
    #endregion

    #region GetQueryOver
    /// <summary>GetQueryOver with a filter</summary>
    public IQueryOver<T, T> GetQueryOver(Expression<Func<T, bool>> keyWhere)
    {
        return _session.QueryOver<T>().Where(keyWhere);
    }
    #endregion

    #region GetQueryOver
    /// <summary>GetQueryOver</summary>
    public IQueryOver<T, T> GetQueryOver()
    {
        return _session.QueryOver<T>();
    }
    #endregion

    #region Query by HQL
    /// <summary>Create a query from an HQL string</summary>
    public IQuery GetQueryByHql(string strHql)
    {
        return _session.CreateQuery(strHql);
    }
    #endregion

    #region Query by SQL
    /// <summary>Create a query from a SQL string</summary>
    public IQuery GetQueryBySql(string strSql)
    {
        return _session.CreateSQLQuery(strSql);
    }
    #endregion
}