RandomizedSearchCV OverflowError: cannot fit 'int' into an index-sized integer

Posted: 2019-12-15 10:36:02

Question: I am trying to run RandomizedSearchCV on a neural network to identify the best hyperparameters. I have created the model-building function and the parameter distributions, but I keep getting an OverflowError. How can I fix it?

I have gone back over the code but cannot pin down where the error is; my guess is that it has something to do with how I define the random search.
# Model definition
# (Imports assumed for a Keras 2.x / TF 1.x setup; Metrics, features and
# target are defined elsewhere in the notebook.)
import keras
import tensorflow as tf
from keras import backend as K
from keras import regularizers
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam

K.clear_session()
input_depth = features.shape[1]
output_depth = target.shape[1]
#np.random.seed(32)

def grid_search_model(layer_units_1, act_fn_1, layer_initialise_1, L1_ker_1, L2_ker_1, L1_act_1, bias_init_1, kernel_const_1, drop_1,
                      layer_units_2, act_fn_2, layer_initialise_2, L1_ker_2, L2_ker_2, L1_act_2, bias_init_2, kernel_const_2, drop_2,
                      layer_units_hidden, act_fn_hidden, layer_initialise_hidden, L1_ker_hidden, L2_ker_hidden, L1_act_hidden, bias_init_hidden, kernel_const_hidden, drop_hidden,
                      layer_initialise_output, L1_ker_output, L2_ker_output, L1_act_output, bias_init_output, kernel_const_output):
    model = Sequential()
    metric = Metrics()  # custom callback class defined elsewhere
    model.add(Dense(units=layer_units_1,
                    activation=act_fn_1,
                    kernel_initializer=layer_initialise_1,
                    kernel_regularizer=regularizers.l1_l2(l1=L1_ker_1, l2=L2_ker_1),
                    activity_regularizer=regularizers.l1(L1_act_1),
                    bias_initializer=tf.constant_initializer(value=bias_init_1),
                    kernel_constraint=kernel_const_1,
                    input_shape=(input_depth,),
                    name='hidden_layer1'))
    model.add(Dropout(drop_1))
    model.add(Dense(units=layer_units_2,
                    activation=act_fn_2,
                    kernel_initializer=layer_initialise_2,
                    kernel_regularizer=regularizers.l1_l2(l1=L1_ker_2, l2=L2_ker_2),
                    activity_regularizer=regularizers.l1(L1_act_2),
                    bias_initializer=tf.constant_initializer(value=bias_init_2),
                    kernel_constraint=kernel_const_2,
                    name='hidden_layer2'))
    model.add(Dropout(drop_2))
    for i in range(hidden_layer_no):  # hidden_layer_no comes from the enclosing scope
        model.add(Dense(units=layer_units_hidden,
                        activation=act_fn_hidden,
                        kernel_initializer=layer_initialise_hidden,
                        kernel_regularizer=regularizers.l1_l2(l1=L1_ker_hidden, l2=L2_ker_hidden),
                        activity_regularizer=regularizers.l1(L1_act_hidden),
                        bias_initializer=tf.constant_initializer(value=bias_init_hidden),
                        kernel_constraint=kernel_const_hidden))
        model.add(Dropout(drop_hidden))
    model.add(Dense(units=output_depth,
                    activation='softmax',
                    kernel_initializer=layer_initialise_output,
                    kernel_regularizer=regularizers.l1_l2(l1=L1_ker_output, l2=L2_ker_output),
                    activity_regularizer=regularizers.l1(L1_act_output),
                    bias_initializer=tf.constant_initializer(value=bias_init_output),
                    kernel_constraint=kernel_const_output,
                    name='output_layer'))
    adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0, amsgrad=True, clipvalue=0.5, clipnorm=1)
    model.compile(optimizer=adam, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
# Parameter definition
a = input_depth - round((input_depth - output_depth) / 3)
# NB: the keys must match the argument names of grid_search_model exactly.
hyperparameters = {
    'layer_units_1': [input_depth, a, 10, 50, 100, 200, 1000],
    'act_fn_1': ['relu', 'sigmoid'],
    'layer_initialise_1': [None,
                           keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                           keras.initializers.glorot_uniform(seed=1),
                           keras.initializers.he_uniform(seed=1)],
    'L1_ker_1': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L2_ker_1': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L1_act_1': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'bias_init_1': [0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0],
    'kernel_const_1': [None,
                       keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                       keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)],
    'drop_1': [0.2, 0.4, 0.5, 0.8],
    'layer_units_2': [input_depth, a, 10, 50, 100, 200, 1000],
    'act_fn_2': ['relu', 'sigmoid'],
    'layer_initialise_2': [None,
                           keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                           keras.initializers.glorot_uniform(seed=1),
                           keras.initializers.he_uniform(seed=1)],
    'L1_ker_2': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L2_ker_2': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L1_act_2': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'bias_init_2': [0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0],
    'kernel_const_2': [None,
                       keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                       keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)],
    'drop_2': [0.2, 0.4, 0.5, 0.8],
    'layer_units_hidden': [input_depth, a, 10, 50, 100, 200, 1000],
    'act_fn_hidden': ['relu', 'sigmoid'],
    'layer_initialise_hidden': [None,
                                keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                                keras.initializers.glorot_uniform(seed=1),
                                keras.initializers.he_uniform(seed=1)],
    'L1_ker_hidden': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L2_ker_hidden': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L1_act_hidden': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'bias_init_hidden': [0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0],
    'kernel_const_hidden': [None,
                            keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                            keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)],
    'drop_hidden': [0.2, 0.4, 0.5, 0.8],
    'layer_initialise_output': [None,
                                keras.initializers.RandomNormal(mean=0.0, stddev=input_depth**(-0.5), seed=1),
                                keras.initializers.glorot_uniform(seed=1),
                                keras.initializers.he_uniform(seed=1)],
    'L1_ker_output': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L2_ker_output': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'L1_act_output': [None, 0.001, 0.005, 0.01, 0.05, 0.1],
    'bias_init_output': [0, 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0],
    'kernel_const_output': [None,
                            keras.constraints.min_max_norm(min_value=-1.0, max_value=1.0, rate=1.0, axis=0),
                            keras.constraints.min_max_norm(min_value=0, max_value=1.0, rate=1.0, axis=0)]
}
# RandomizedSearchCV
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import RandomizedSearchCV
from datetime import datetime

metric = Metrics()
class_neural_network = KerasClassifier(build_fn=grid_search_model, epochs=200)
grid = RandomizedSearchCV(estimator=class_neural_network, param_distributions=hyperparameters,
                          n_jobs=-1, pre_dispatch=5, random_state=42,
                          return_train_score=True, verbose=10)
time = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
grid = grid.fit(X_train_rus, y_train_rus_1, callbacks=[metric])
I expected the search to run without any problems. Instead, I get the following error message:
---------------------------------------------------------------------------
OverflowError Traceback (most recent call last)
<ipython-input-34-a4148e6688c1> in <module>()
4 grid = RandomizedSearchCV(estimator=class_neural_network, param_distributions=hyperparameters, n_jobs = -1, pre_dispatch = 5, random_state = 42, return_train_score = True, verbose=10)
5 time = datetime.now().strftime("%Y-%m-%d-%H-%M-%S")
----> 6 grid = grid.fit(X_train_rus, y_train_rus_1, callbacks=[metric])
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
720 return results_container[0]
721
--> 722 self._run_search(evaluate_candidates)
723
724 results = results_container[0]
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in _run_search(self, evaluate_candidates)
1513 evaluate_candidates(ParameterSampler(
1514 self.param_distributions, self.n_iter,
-> 1515 random_state=self.random_state))
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in evaluate_candidates(candidate_params)
694
695 def evaluate_candidates(candidate_params):
--> 696 candidate_params = list(candidate_params)
697 n_candidates = len(candidate_params)
698
/anaconda/envs/py35/lib/python3.5/site-packages/sklearn/model_selection/_search.py in __iter__(self)
261 # look up sampled parameter settings in parameter grid
262 param_grid = ParameterGrid(self.param_distributions)
--> 263 grid_size = len(param_grid)
264 n_iter = self.n_iter
265
OverflowError: cannot fit 'int' into an index-sized integer
Answer 1:

This happens because your parameter dict consists only of lists — for example, you specify 'bias_init_1' : [0,0.001,0.005,0.01,0.05,0.1,0.5,1.0]. That means you have effectively specified a discrete parameter grid. The sklearn code then tries to compute the size of that discrete grid, and since the size is the Cartesian product of all the parameter lists, you end up with a number far too large for an integer. As far as I can tell, you get this error because when parameters are specified as a grid, sklearn will try to access them by index, so the total grid size has to fit into an index-sized integer.
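As a rough back-of-the-envelope check (the option counts below are read off the hyperparameters dict in the question), len(ParameterGrid(...)) is simply the product of all the list lengths, and here that product cannot fit into an index-sized integer:

    import sys
    from functools import reduce
    from operator import mul

    # Options per parameter: each of the three layer groups has
    # 7 unit counts, 2 activations, 4 initialisers, 6 L1_ker, 6 L2_ker,
    # 6 L1_act, 8 bias values, 3 constraints and 4 dropout rates;
    # the output group has no units/activation/dropout entries.
    list_lengths = [7, 2, 4, 6, 6, 6, 8, 3, 4] * 3 + [4, 6, 6, 6, 8, 3]

    grid_size = reduce(mul, list_lengths)
    print(grid_size)                # on the order of 1e22 combinations
    print(grid_size > sys.maxsize)  # True -> len() raises OverflowError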
When cross-validating with a random search, it is better to specify a distribution for each parameter, like this:
import scipy.stats.distributions as dists

param_grid = dict(
    param1=dists.uniform(0, 1),         # continuous distribution
    param2=dists.randint(16, 512 + 1),  # discrete distribution
    param3=['foo', 'bar'],              # specifying possible values directly
)
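Each of these distribution objects exposes an rvs() method, which is what RandomizedSearchCV calls to draw one value per iteration; you can sample them directly as a quick check (the printed values here are illustrative):

    print(param_grid['param1'].rvs(random_state=42))  # e.g. 0.3745...
    print(param_grid['param2'].rvs(random_state=42))  # an int in [16, 512]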
When the parameter grid is specified this way, sklearn does not try to compute its size (technically it is infinite), so the error should not occur. Using continuous distributions for continuous variables also improves how effectively you cover the search space, so overall this is a better approach to CV. Note also that, as in the example above, you can mix discrete parameters (such as param3) with continuous ones.
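Applied to your code, the change might look like the sketch below. Only a handful of keys are shown, the distribution bounds and the n_iter value are illustrative choices, and dropping the None options from the regularizer lists is an assumption you would need to revisit:

    import scipy.stats.distributions as dists

    hyperparameters = {
        'layer_units_1': dists.randint(10, 1000 + 1),  # was a 7-element list
        'act_fn_1': ['relu', 'sigmoid'],               # categorical: keep as a list
        'L1_ker_1': dists.uniform(0.0, 0.1),           # samples from [0.0, 0.1]
        'drop_1': dists.uniform(0.2, 0.6),             # samples from [0.2, 0.8]
        # ... the _2, _hidden and _output parameters follow the same pattern ...
    }

    grid = RandomizedSearchCV(estimator=class_neural_network,
                              param_distributions=hyperparameters,
                              n_iter=50,  # number of sampled settings (illustrative)
                              n_jobs=-1, random_state=42,
                              return_train_score=True, verbose=10)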
Comments:

Saved my day, @avivr! Thanks!