LightGBM: Dataset

Posted demo-deng


I recently used LightGBM's Dataset, so here are some notes:

1. Description:  class lightgbm.Dataset(data, label=None, reference=None, weight=None, group=None, init_score=None, silent=False, feature_name='auto', categorical_feature='auto', params=None, free_raw_data=True)

Bases: object

Dataset in LightGBM.

Construct Dataset.

Parameters:
  • data (string, numpy array, pandas DataFrame, scipy.sparse or list of numpy arrays) – Data source of Dataset. If string, it represents the path to a txt file.
  • label (list, numpy 1-D array, pandas one-column DataFrame/Series or None, optional (default=None)) – Label of the data.
  • reference (Dataset or None, optional (default=None)) – If this is a Dataset for validation, the training data should be used as reference.
  • weight (list, numpy 1-D array, pandas Series or None, optional (default=None)) – Weight for each instance.
  • group (list, numpy 1-D array, pandas Series or None, optional (default=None)) – Group/query size for Dataset.
  • init_score (list, numpy 1-D array, pandas Series or None, optional (default=None)) – Init score for Dataset.
  • silent (bool, optional (default=False)) – Whether to print messages during construction.
  • feature_name (list of strings or 'auto', optional (default="auto")) – Feature names. If 'auto' and data is a pandas DataFrame, the data column names are used.
  • categorical_feature (list of strings or int, or 'auto', optional (default="auto")) – Categorical features. If list of int, interpreted as indices. If list of strings, interpreted as feature names (need to specify feature_name as well). If 'auto' and data is a pandas DataFrame, pandas categorical columns are used. All values in categorical features should be less than the int32 max value (2147483647). All negative values in categorical features will be treated as missing values.
  • params (dict or None, optional (default=None)) – Other parameters.
  • free_raw_data (bool, optional (default=True)) – If True, raw data is freed after constructing the inner Dataset.

  The output is a Dataset object.

2. Usage:

  Following the description above, pass in your own data. Here I used pandas DataFrames for both data and label.

 
