skopt's gp_minimize() function raises ValueError: array must not contain infs or NaNs

Posted: 2020-09-16 07:53:53

Question:

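I am currently using the skopt (scikit-optimize) package for hyperparameter tuning of a neural network (I am trying to minimize -1 * accuracy). It appears to run fine (and prints to the console successfully) before raising ValueError: array must not contain infs or NaNs.

What could be causing this? My data contains no infs or NaNs, and neither do my search parameter ranges. The neural network code is fairly long, so for brevity I will paste only the relevant parts.

Imports: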

import pandas as pd

import numpy as np
from skopt import gp_minimize
from skopt.utils import use_named_args
from skopt.space import Real, Categorical, Integer
from tensorflow.python.framework import ops
from sklearn.model_selection import train_test_split

import tensorflow
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv1D, Dropout, MaxPooling1D, Flatten

from tensorflow.keras import backend as K

Create the search parameters:

dim_num_filters_L1 = Integer(low=1, high=50, name='num_filters_L1')
#dim_kernel_size_L1 = Integer(low=1, high=70, name='kernel_size_L1')
dim_activation_L1 = Categorical(categories=['relu', 'linear', 'softmax'], name='activation_L1')
dim_num_filters_L2 = Integer(low=1, high=50, name='num_filters_L2')
#dim_kernel_size_L2 = Integer(low=1, high=70, name='kernel_size_L2')
dim_activation_L2 = Categorical(categories=['relu', 'linear', 'softmax'], name='activation_L2')
dim_num_dense_nodes = Integer(low=1, high=28, name='num_dense_nodes')
dim_activation_L3 = Categorical(categories=['relu', 'linear', 'softmax'], name='activation_L3')
dim_dropout_rate = Real(low = 0, high = 0.5, name = 'dropout_rate')
dim_learning_rate = Real(low=1e-4, high=1e-2, name='learning_rate')

dimensions = [dim_num_filters_L1,
              #dim_kernel_size_L1,
              dim_activation_L1,
              dim_num_filters_L2,
             #dim_kernel_size_L2,
              dim_activation_L2,
              dim_num_dense_nodes,
              dim_activation_L3,
              dim_dropout_rate,
              dim_learning_rate,
             ]

Create the function that builds each model to be tested:

def create_model(num_filters_L1, #kernel_size_L1, 
                 activation_L1, 
                 num_filters_L2, #kernel_size_L2, 
                 activation_L2,
                 num_dense_nodes, activation_L3,
                 dropout_rate,
                 learning_rate):

    input_shape = (X_train.shape[1], 1)
    model = Sequential()
    model.add(Conv1D(num_filters_L1, kernel_size = 40, activation = activation_L1, input_shape = input_shape))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Conv1D(num_filters_L2, kernel_size=20, activation=activation_L2))
    model.add(MaxPooling1D(pool_size=2))
    model.add(Flatten())
    model.add(Dense(num_dense_nodes, activation = activation_L3))
    model.add(Dropout(dropout_rate))
    model.add(Dense(y_train.shape[1], activation='linear'))
    adam = tensorflow.keras.optimizers.Adam(learning_rate = learning_rate)
    model.compile(optimizer=adam, loss='mean_squared_error', metrics=['accuracy'])

    return model

Define the fitness function:

@use_named_args(dimensions=dimensions)
def fitness(num_filters_L1, #kernel_size_L1, 
                 activation_L1, 
                 num_filters_L2, #kernel_size_L2, 
                 activation_L2,
                 num_dense_nodes, activation_L3,
                 dropout_rate,
                 learning_rate):

    model = create_model(num_filters_L1, #kernel_size_L1, 
                 activation_L1, 
                 num_filters_L2, #kernel_size_L2, 
                 activation_L2,
                 num_dense_nodes, activation_L3,
                 dropout_rate,
                 learning_rate)

    history_opt = model.fit(x=X_train,
                        y=y_train,
                        validation_data=(X_val,y_val), 
                        shuffle=True, 
                        verbose=2,
                        epochs=10
                        )

    # Evaluate on the held-out test set; index 1 is the 'accuracy' metric.
    accuracy_opt = model.evaluate(X_test, y_test)[1]

    # Print the accuracy:
    print("Experimental Model Accuracy: {0:.2%}".format(accuracy_opt))

    # Delete the Keras model with these hyper-parameters from memory:
    del model

    # Clear the Keras session; otherwise each new model is added to the same
    # TensorFlow graph every time a new set of hyper-parameters is tried.
    K.clear_session()
    ops.reset_default_graph()

    # The optimizer aims for the lowest score, so return negative accuracy:
    return -accuracy_opt  # or sum(RMSE)?

Run the hyperparameter search:

gp_result = gp_minimize(func=fitness,
                            dimensions=dimensions)

print("best accuracy was " + str(round(gp_result.fun *-100,2))+"%.")


Answer 1:

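Your activation function is not converging for one of the configurations sampled during an acquisition function call. I ran into this problem and fixed it by removing 'relu' from the search space.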
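If you would rather keep the search running while you investigate, one option is to map non-finite scores to a finite penalty before they reach gp_minimize. This assumes the error comes from fitness returning NaN or inf for some sampled configuration, which I have not verified against the question's data. The helper names to_minimizer_score and guard_non_finite are hypothetical and not part of skopt. A minimal sketch:

```python
import functools
import math


def to_minimizer_score(accuracy, penalty=0.0):
    """Turn an accuracy into a value to minimize that is never NaN or inf.

    `penalty` is an assumed fallback: 0.0 equals the score of a model with
    0% accuracy, the worst value that -accuracy can normally take.
    """
    if not math.isfinite(accuracy):
        return penalty
    return -accuracy


def guard_non_finite(penalty=0.0):
    """Decorator (hypothetical helper) that replaces a non-finite
    objective value with `penalty`."""
    def decorator(objective):
        @functools.wraps(objective)
        def wrapper(*args, **kwargs):
            value = objective(*args, **kwargs)
            return value if math.isfinite(value) else penalty
        return wrapper
    return decorator


# Stand-in objective for illustration only: it "diverges" (returns NaN)
# for large learning rates and returns a fixed score otherwise.
@guard_non_finite(penalty=0.0)
def toy_objective(learning_rate):
    return float("nan") if learning_rate > 0.5 else -0.9


print(to_minimizer_score(0.75))
print(toy_objective(0.9))
```

In the question's code this would amount to ending fitness with return to_minimizer_score(accuracy_opt). It hides the symptom rather than explaining why a configuration diverges, so it is still worth logging the hyper-parameters that triggered the penalty.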
