RandomizedSearchCV OverflowError: cannot fit 'int' into an index-sized integer

Posted: 2019-12-15 10:36:02

I am trying to run RandomizedSearchCV on a neural network to identify the best parameters. I have created the model function and the parameter distributions, but I keep getting an OverflowError. How do I fix it?

I have gone back over the code but cannot see where the error is; I suspect it may be in how I define the randomized search.


# Model Definition
# (imports inferred from the code below)
import tensorflow as tf
import keras
from keras import backend as K
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV
from datetime import datetime

K.clear_session()

input_depth = features.shape[1]
output_depth = target.shape[1]

#np.random.seed(32)

def grid_search_model(layer_units_1, act_fn_1, layer_initialise_1, L1_ker_1, L2_ker_1, L1_act_1, bias_init_1, kernel_const_1, drop_1,
                      layer_units_2, act_fn_2, layer_initialise_2, L1_ker_2, L2_ker_2, L1_act_2, bias_init_2, kernel_const_2, drop_2,
                      layer_units_hidden, act_fn_hidden, layer_initialise_hidden, L1_ker_hidden, L2_ker_hidden, L1_act_hidden, bias_init_hidden, kernel_const_hidden, drop_hidden,
                      layer_initialise_output, L1_ker_output, L2_ker_output, L1_act_output, bias_init_output, kernel_const_output):
    model = Sequential()
    metric = Metrics()

    model.add(Dense(units = layer_units_1,
                    activation = act_fn_1,
                    kernel_initializer = layer_initialise_1,
                    kernel_regularizer = regularizers.l1_l2(l1 = L1_ker_1, l2 = L2_ker_1),
                    activity_regularizer = regularizers.l1(L1_act_1),
                    bias_initializer = tf.constant_initializer(value = bias_init_1),
                    kernel_constraint = kernel_const_1,
                    input_shape=(input_depth,),
                    name='hidden_layer1'))

    model.add(Dropout(drop_1))

    model.add(Dense(units = layer_units_2,
                    activation = act_fn_2,
                    kernel_initializer = layer_initialise_2,
                    kernel_regularizer = regularizers.l1_l2(l1 = L1_ker_2, l2 = L2_ker_2),
                    activity_regularizer = regularizers.l1(L1_act_2),
                    bias_initializer = tf.constant_initializer(value = bias_init_2),
                    kernel_constraint = kernel_const_2,
                    name='hidden_layer2'))

    model.add(Dropout(drop_2))

    for i in range(hidden_layer_no):  # hidden_layer_no: number of extra hidden layers, defined elsewhere
        model.add(Dense(units = layer_units_hidden,
                        activation = act_fn_hidden,
                        kernel_initializer = layer_initialise_hidden,
                        kernel_regularizer = regularizers.l1_l2(l1 = L1_ker_hidden, l2 = L2_ker_hidden),
                        activity_regularizer = regularizers.l1(L1_act_hidden),
                        bias_initializer = tf.constant_initializer(value = bias_init_hidden),
                        kernel_constraint = kernel_const_hidden))
        model.add(Dropout(drop_hidden))

    model.add(Dense(units = output_depth,
                    activation = 'softmax',
                    kernel_initializer = layer_initialise_output,
                    kernel_regularizer = regularizers.l1_l2(l1 = L1_ker_output, l2 = L2_ker_output),
                    activity_regularizer = regularizers.l1(L1_act_output),
                    bias_initializer = tf.constant_initializer(value = bias_init_output),
                    kernel_constraint = kernel_const_output,
                    name='output_layer'))

    adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0, amsgrad=True, clipvalue=0.5, clipnorm=1)

    model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])

    return model

# Parameter definition

a = input_depth - (round((input_depth-output_depth)/3))

hyperparameters = {'layer_units_1' : [input_depth, a, 10, 50, 100, 200, 1000],
                       'act_fn_1' : ['relu','sigmoid'],
                       'layer_initialise_1' : [None,
                                              keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                                              keras.initializers.glorot_uniform(seed=1),
                                              keras.initializers.he_uniform(seed=1)],
                       'L1_ker_1' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L2_ker_1' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L1_act_1' : [None,0.001,0.005,0.01,0.05,0.1],
                       'bias_init_1' : [0,0.001,0.005,0.01,0.05,0.1,0.5,1.0],
                       'kernel_const_1' : [None,
                                         keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                                         keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)],
                       'drop_1' : [0.2,0.4,0.5,0.8],
                       'layer_units_2' : [input_depth, a, 10, 50, 100, 200, 1000],
                       'act_fn_2' : ['relu','sigmoid'],
                       'layer_initialise_2' : [None,
                                              keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                                              keras.initializers.glorot_uniform(seed=1),
                                              keras.initializers.he_uniform(seed=1)],
                       'L1_ker_2' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L2_ker_2' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L1_act_2' : [None,0.001,0.005,0.01,0.05,0.1],
                       'bias_init_2' : [0,0.001,0.005,0.01,0.05,0.1,0.5,1.0],
                       'kernel_const_2' : [None,
                                         keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                                         keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)],
                       'drop_2' : [0.2,0.4,0.5,0.8],
                       'layer_units_hidden' : [input_depth, a, 10, 50, 100, 200, 1000],
                       'act_fn_hidden' : ['relu','sigmoid'],
                       'layer_initialise_hidden' : [None,
                                                   keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                                                   keras.initializers.glorot_uniform(seed=1),
                                                   keras.initializers.he_uniform(seed=1)],
                       'L1_ker_hidden' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L2_ker_hidden' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L1_act_hidden' : [None,0.001,0.005,0.01,0.05,0.1],
                       'bias_init_hidden' : [0,0.001,0.005,0.01,0.05,0.1,0.5,1.0],
                       'kernel_const_hidden' : [None,
                                         keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                                         keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)],
                       'drop_hidden' : [0.2,0.4,0.5,0.8],
                       'layer_initialise_output' : [None,
                                                   keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                                                   keras.initializers.glorot_uniform(seed=1),
                                                   keras.initializers.he_uniform(seed=1)],
                       'L1_ker_output' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L2_ker_output' : [None,0.001,0.005,0.01,0.05,0.1],
                       'L1_act_output' : [None,0.001,0.005,0.01,0.05,0.1],
                       'bias_init_output' : [0,0.001,0.005,0.01,0.05,0.1,0.5,1.0],
                       'kernel_const_output' : [None,
                                              keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                                              keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)]
                      }

# RandomizedSearchCV
metric = Metrics()
class_neural_network = KerasClassifier(build_fn=grid_search_model, epochs=200)
grid = RandomizedSearchCV(estimator=class_neural_network, param_distributions=hyperparameters, n_jobs = -1, pre_dispatch = 5, random_state = 42, return_train_score = True, verbose=10)
time = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
grid = grid.fit(X_train_rus, y_train_rus_1, callbacks=[metric])

I expected the search to run without problems. Instead, I get the following error message:

---------------------------------------------------------------------------
OverflowError                             Traceback (most recent call last)
<ipython-input-34-a4148e6688c1> in <module>()
      4 grid = RandomizedSearchCV(estimator=class_neural_network, param_distributions=hyperparameters, n_jobs = -1, pre_dispatch = 5, random_state = 42, return_train_score = True, verbose=10)
      5 time = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
----> 6 grid = grid.fit(X_train_rus, y_train_rus_1, callbacks=[metric])

/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
    720                 return results_container[0]
    721 
--> 722             self._run_search(evaluate_candidates)
    723 
    724         results = results_container[0]

/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in _run_search(self, evaluate_candidates)
   1513         evaluate_candidates(ParameterSampler(
   1514             self.param_distributions, self.n_iter,
-> 1515             random_state=self.random_state))

/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in evaluate_candidates(candidate_params)
    694 
    695             def evaluate_candidates(candidate_params):
--> 696                 candidate_params = list(candidate_params)
    697                 n_candidates = len(candidate_params)
    698 

/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in __iter__(self)
    261             # look up sampled parameter settings in parameter grid
    262             param_grid = ParameterGrid(self.param_distributions)
--> 263             grid_size = len(param_grid)
    264             n_iter = self.n_iter
    265 

OverflowError: cannot fit 'int' into an index-sized integer


Answer 1:

This happens because your parameter dict consists only of lists, e.g. you specify 'bias_init_1' : [0,0.001,0.005,0.01,0.05,0.1,0.5,1.0]. That means you have effectively specified a discrete parameter grid. The sklearn code tries to compute the size of that discrete grid, and since the size is the Cartesian product of all the parameter lists, the result is enormous, far too large for an integer. From what I gather, you get this error because when the parameters are specified as a grid, sklearn will try to access settings by index, so the total grid size must fit into an index-sized integer.
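To see the scale of the problem, multiply the lengths of the option lists for each distinct parameter in the posted dict (the sizes below are read off from the question's code):

```python
import sys

# Lengths of the option lists per parameter: units(7), activation(2),
# initializer(4), L1_ker(6), L2_ker(6), L1_act(6), bias(8), constraint(3),
# dropout(4) for each of the three hidden-layer blocks, plus the
# output-layer block's six lists.
list_sizes = [7, 2, 4, 6, 6, 6, 8, 3, 4] * 3 + [4, 6, 6, 6, 8, 3]

grid_size = 1
for n in list_sizes:
    grid_size *= n

print(grid_size)                 # ~3.2e22 grid points
print(grid_size > sys.maxsize)   # True on a 64-bit build: does not fit an index-sized int
```

Python's arbitrary-precision int holds this product fine; it is sklearn's use of it as a C-level index that overflows.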

When cross-validating with randomized search, it is better to specify a distribution for your parameters, like this:

import scipy.stats.distributions as dists

param_grid = dict(
    param1=dists.uniform(0, 1),        # continuous distribution
    param2=dists.randint(16, 512 + 1), # discrete distribution
    param3=['foo', 'bar'],             # specifying possible values directly
)

When the parameter space is specified this way, sklearn does not try to compute its size (it is technically infinite), so this should prevent the error.

Using continuous distributions for continuous variables also improves your effective coverage of the search space, so overall it is a better approach to randomized CV. Note also that, as in the example above, you can combine discrete parameters (such as param3) with continuous ones.
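As a sketch of how this applies to the question (the parameter names below mirror the question's dict, but the ranges are illustrative assumptions), you can check that sampling works without any grid-size computation by using ParameterSampler, the helper RandomizedSearchCV uses internally:

```python
from scipy.stats import randint, uniform
from sklearn.model_selection import ParameterSampler

# Distributions instead of exhaustive lists (ranges chosen for illustration):
param_distributions = {
    'layer_units_1': randint(10, 1001),   # integers in 10..1000
    'act_fn_1': ['relu', 'sigmoid'],      # categorical values stay a list
    'L1_ker_1': uniform(0.0, 0.1),        # floats in [0.0, 0.1]
    'drop_1': uniform(0.2, 0.6),          # floats in [0.2, 0.8]
}

# Because at least one value is a distribution (has .rvs), candidates are
# drawn independently and the size of the space is never computed.
samples = list(ParameterSampler(param_distributions, n_iter=5, random_state=42))
print(len(samples))  # 5 candidate settings
```

Passing the same dict as `param_distributions=` to RandomizedSearchCV draws candidates the same way.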

Comments:

Saved my day, @avivr! Thank you!
