多目标非支配排序遗传算法-NSGA-II（二)

Posted 2022-06-09 流浪若相惜

tags:

篇首语：本文由小常识网(cha138.com)小编为大家整理，主要介绍了多目标非支配排序遗传算法-NSGA-II（二)相关的知识，希望对你有一定的参考价值。

序

在（一）中我介绍了GA、NSGA以及NSGA-II算法以及他们之间的关系，据完成（一）已经差不多10个月了，当初的愿望实现了吗，事到如今只好祭奠吗？任岁月风干了…，心中不由一首老男孩送给自己。面临毕业，最后一块还没有搞完，折腾了小半年，我又回来了，回到了最初的起点😂。从头开始，未来虽不可期，但我还是得不卑不亢的继续下去，心灵鸡汤要喝，万一梦想实现了呢，况且无心插柳柳都成荫了，我还是有心地去种树呢🙃。

废话不多说，有关NSGA-II算法的请参考第一篇文章：多目标非支配排序遗传算法-NSGA-II（一）。本文主要是继续上次我们提到的关于决策偏好的算法原理及实现，然后顺便再定个小计划：把NSGA-III搞一下。

文章目录

参考资料

首先，参考之前博客样式，我们先列出参考的文献、博客及源码。
文献：Reference Point Based Multi-Objective Optimization Using Evolutionary Algorithms
博客：多目标非支配排序遗传算法-NSGA-II（一)
源码：NGPM Song Lin

拥挤度选择缺陷 or 基于参考点选择方法提出背景

NSGA-II算法采用拥挤度选择策略，通过非支配等级以及拥挤度距离来实现对于Pareto解集的选择。这种选择方式通过二元锦标赛方法确实实现了两两解的对比，但是当我们回归现实时，却发现决策者的选择看起来并不像我们之前解释的那么简单，仅给予Pareto前沿并不是决策者所想要的，貌似仍然有一种“给你饭你吃”的赶脚。怎么办？最简单的方式是让决策者参与到决策中，这种参与不仅仅是选择方案，而且还应给予一定的需求建议。换句话说，制定者在设定问题，构建模型的时候，决策者会给予一些基于偏好或者经验性的指导意见，这些指导意见进而转化为一些目标点，我们这里称之为 “参考点”，通过这些参考点，然后再进行建模、求解、决策（邪性制图如下）。
可能有同学会说有交互式多目标Pareto解集也会给予基于偏好的解，但那种传统的偏好方法往往将多目标转换为单目标并给予一个偏好解，这种单偏好解的方法往往并不可靠，而本文所介绍的方法则是给予一组基于偏好的解集，从而提高决策的可靠性。接下来，我们详细了解下传统和新的参考方法的原理及区别。

传统参考点方法

相关学者对基于参考点的进化多目标算法（EMO）进行了较多探究，并形成了所谓的传统参考点方法。Wierzbicki 对参考点的定义如下：
The reference point approach in which the goal is to achieve a weakly, ε-properly or Pareto-optimal solution closest to a supplied reference point of aspiration level based on solving an achievement scalarizing problem.
对于多目标优化问题，通过设置权重将多目标转换为单目标问题，对于每个目标都给予目标函数方向的参考点向量，当获取的参考点与Pareto优化前沿点距离最小，即优化前沿点最接近于参考点时，那么获取的的优化点将是在考虑参考点的基础上获得的最终结果。其表达式如下：
$\\sum^M_i[w_i*(f_i(\\bf x)-\\overline z_i)] s.t. \\bf x∈S \\tag 1$ 其中, $\\bf \\overline z_i$ 是参考点, $f_i$ 是目标函数， $w_i$ 表示权重。
注：原文公式是

感觉是有问题的，因此做了修改。
为了让参考点方法有效，Wierzbicki 对该过程进行了如下操作：
1、先确定参考点 $\\overline z$ 的最近优化点 $z^{'}$
2、重新创建M个参考点
3、获取各个目标函数的最优点
$\\bf z_i=\\overline z+(z'-\\overline z)*e_i \\tag 2$
其中 $\\bf e_i$ 为方向向量，原理如图所示：

$\\bf z_A和z_B$ 是新生成的两个参考点，需要注意的是，若决策者对于结果不满意，上述过程将会重复进行。

因此，参考点提供了关于要关注的区域的更高层次的信息，而权值向量提供了关于要收敛帕累托最优前沿上哪个点的更详细的信息。

新参考点方法

那么这种传统参考点方式有什么缺陷，我想大家很容易就从公式中看出来。没错，一看到赋权这东西，搞多目标优化的人就很谨慎，这种主观赋权方式诟病很大，尽管是模型制定者或决策者给予了经验性的赋权，但这种多目标转单目标方式有时会因此产生较大的风险。此外，这种方式每次只能使用一个参考点，当决策者给予多个参考点时，这种方法就无能为力了。怎么办呢？

是的，主角上场了，EMO方法可以解决以上这些问题，本文介绍的EMO是基于NSGA-II方法之上的参考点方法，改方法的思路如下：

具体细节我先直接上原文，如下：

第一步比较好理解，意思就是求前沿点与参考点的距离；
第二步和第三步说实话，我看了半天没搞明白啥意思，最终还是通过代码才搞明白了这两步的具体含义。其实这两步是为了将与前沿点最近的参考点距离设置为其preDistance, 与此同时，为了种群多样性，在第三步首先随机选择一个前沿点作为新的的 “前沿参考点”，然后设置约束限制，凡是距离之间相近或者小于ε-clearing的前沿点，将与前沿参考点相近的那个preDistance增大，进而实现种群的多样性，避免局部收敛或者收敛过快的问题。

代码实现

源码实现过程，主要是在非支配排序和选择部分体现，上述思想过程主要是在非支配排序中实现，具体实现代码如下：

function [opt, pop] = calcPreferenceDistance(opt, pop, front)
% Function: [opt, pop] = calcPreferenceDistance(opt, pop, front)
% Description: Calculate the 'preference distance' used in R-NSGA-II.
% Return: 
%   opt : This structure may be modified only when opt.refUseNormDistance=='ever'.
%
%    Copyright 2011 by LSSSSWC
%    Revision: 1.1  Data: 2011-07-26
%*************************************************************************

%*************************************************************************
% 1. Initialization
%*************************************************************************
numObj = length( pop(1).obj );  % number of objectives

refPoints = opt.refPoints;
refWeight = opt.refWeight;      % weight factor of objectives
if(isempty(refWeight))
    refWeight = ones(1, numObj);
end
epsilon = opt.refEpsilon;
numRefPoint = size(refPoints, 1);

% Determine the normalized factors
bUseFrontMaxMin = false;    % bUseFrontMaxMin : If use the maximum and minimum value in the front as normalized factor.
if( strcmpi(opt.refUseNormDistance, 'ever') )
    % 1) Find possiable (not current population) maximum and minimum value
    %     of each objective.
    obj = vertcat(pop.obj);
    if( ~isfield(opt, 'refObjMax_tmp') )
        opt.refObjMax_tmp = max(obj);
        opt.refObjMin_tmp = min(obj);
    else
        objMax = max(obj);
        objMin = min(obj);
        for i = 1:numObj
            if(opt.refObjMax_tmp(i) < objMax(i))
                opt.refObjMax_tmp(i) = objMax(i);
            end
            if(opt.refObjMin_tmp(i) > objMin(i))
                opt.refObjMin_tmp(i) = objMin(i);
            end
        end
        clear objMax objMin
    end
    objMaxMin = opt.refObjMax_tmp - opt.refObjMin_tmp;
    clear obj
elseif( strcmpi(opt.refUseNormDistance, 'front') )
    % 2) Do not use normalized Euclidean distance.
    bUseFrontMaxMin = true;
elseif( strcmpi(opt.refUseNormDistance, 'no') )
    % 3) Do not use normalized Euclidean distance.
    objMaxMin = ones(1,numObj);
else
    % 3) Error
    error('NSGA2:ParamError', ...
        'No support parameter: options.refUseNormDistance="%s", only "yes" or "no" are supported',...
        opt.refUseNormDistance);
end


%*************************************************************************
% 2. Calculate preference distance pop(:).prefDistance
%*************************************************************************
for fid = 1:length(front)
    % Step1: Calculate the weighted Euclidean distance in each front
    idxFront = front(fid).f;            % idxFront : index of individuals in current front
    numInd = length(idxFront);          % numInd : number of individuals in current front
    popFront = pop(idxFront);           % popFront : individuals in front fid

    objFront = vertcat(popFront.obj);   % objFront : the whole objectives of all individuals

    if(bUseFrontMaxMin)
        objMaxMin = max(objFront) - min(objFront); % objMaxMin : the normalized factor in current front
    end

    % normDistance : weighted normalized Euclidean distance
    normDistance = calcWeightNormDistance(objFront, refPoints, objMaxMin, refWeight);
    
    
    % Step2: Assigned preference distance
    prefDistanceMat = zeros(numInd, numRefPoint);
    for ipt = 1:numRefPoint
        [~,ix] = sort(normDistance(:, ipt));%升序排序，选择距离小的
        prefDistanceMat(ix, ipt) = 1:numInd;%排序位置
    end
    prefDistance = min(prefDistanceMat, [], 2); % 2表示计算每行min 对于参考点距离的rank位置，rank的值当作参考距离
    clear ix

    
    % Step3: Epsilon clearing strategy
    idxRemain = 1:numInd;           % idxRemain : index of individuals which were not processed
    while(~isempty(idxRemain))
        % 1. Select one individual from remains
        objRemain = objFront( idxRemain, :);
        selIdx = randi( [1,length(idxRemain)] );
        selObj = objRemain(selIdx, :);

        % 2. Calc normalized Euclidean distance
        % distanceToSel : normalized Euclidean distance to the selected points
        distanceToSel = calcWeightNormDistance(objRemain, selObj, objMaxMin, refWeight);
        

        % 3. Process the individuals within a epsilon-neighborhood
        idx = find( distanceToSel <= epsilon );     % idx : index in idxRemain
        if(length(idx) == 1)    % the only individual is the selected one
            idxRemain(selIdx)=[];
        else
            for i=1:length(idx)
                if( idx(i)~=selIdx )
                    idInIdxRemain = idx(i);     % idx is the index in idxRemain vector
                    id = idxRemain(idInIdxRemain);
                    
                    % *Increase the preference distance to discourage the individuals
                    % to remain in the selection.
                    prefDistance(id) = prefDistance(id) + round(numInd/2);
                end
            end
            idxRemain(idx) = [];
        end
        
    end

    % Save the preference distance
    for i=1:numInd
        id = idxFront(i);
        pop(id).prefDistance = prefDistance(i);
    end
end

参考点距离计算：

function pop = calcCrowdingDistance(opt, pop, front)
% Function: pop = calcCrowdingDistance(opt, pop, front)
% Description: Calculate the 'crowding distance' used in the original NSGA-II.
% Syntax:
% Parameters: 
% Return: 
%
%    Copyright 2011 by LSSSSWC
%    Revision: 1.0  Data: 2011-07-11
%*************************************************************************

numObj = length( pop(1).obj );  % number of objectives
for fid = 1:length(front)
    idx = front(fid).f;
    frontPop = pop(idx);        % frontPop : individuals in front fid
    
    numInd = length(idx);       % nInd : number of individuals in current front
    
    obj = vertcat(frontPop.obj);
    obj = [obj, idx'];          % objctive values are sorted with individual ID
    % 获取每个rank下每个个体不同目标函数下的拥挤度距离，个体拥挤度为个目标函数拥挤度之和s
    for m = 1:numObj
        obj = sortrows(obj, m); %Sort the rows of A based on the values in the second column. When the specified column has repeated elements, 
                                %the corresponding rows maintain their original order.
        %对于每一个目标函数排序后的第一个和最后一个个体的距离设置为inf
        colIdx = numObj+1;
        pop( obj(1, colIdx) ).distance = Inf;         % the first one
        pop( obj(numInd, colIdx) ).distance = Inf;    % the last one
        
        minobj = obj(1, m);         % the minimum of objective m
        maxobj = obj(numInd, m);    % the maximum of objective m
        
        for i = 2:(numInd-1)
            id = obj(i, colIdx);
            pop(id).distance = pop(id).distance + (obj(i+1, m) - obj(i-1, m)) / (maxobj - minobj); %均一化
        end
    end
end

在选择过程中也做了简单的修改，距离改成了preDistance，修改代码如下：


function result = preferenceComp(guy1, guy2)
% Function: result = preferenceComp(guy1, guy2)
% Description: Preference operator used in R-NSGA-II
% Return: 
%   1 = guy1 is better than guy2
%   0 = other cases
%
%    Copyright 2011 by LSSSSWC
%    Revision: 1.0  Data: 2011-07-11
%*************************************************************************

if(  (guy1.rank  < guy2.rank) || ...
    ((guy1.rank == guy2.rank) && (guy1.prefDistance < guy2.prefDistance)) )
    result = 1;
else
    result = 0;
end

实现结果

可以看出，从图中可以找到参考点较近的Pareto 优化解。

需要注意的是，Pareto 前沿可以根据ε的值变化而变化，如下图，因此，在选择ε值时需要根据决策者的需求寻找合适的前沿。

我们不妨将ε理解为宽松度或者是一种期望水平误差，当ε越小时，也就是我们的期望水平要求误差越小，那么将会看到Pareto前沿点更加的逼近参考点。通过下图对比更容易理解这个意思，如下图所示
ε=0.01

ε=0.0001

开发者涨薪指南

48位大咖的思考法则、工作方式、逻辑体系

以上是关于多目标非支配排序遗传算法-NSGA-II（二)的主要内容，如果未能解决你的问题，请参考以下文章