Why is memset causing problem despite being used on built-in types? [closed]

Posted 2023-02-16


Asked: 2019-01-12 17:28:22

Question:

I am very new to C++ and was submitting a solution to this problem on Codeforces when I suddenly found that using memset() caused a Wrong answer on one of the test cases.

Here is the test case:

Input:
4 4
3 3 3 5
Participant's output
NO
Jury's answer
YES
1 2 3 4 
Checker comment
wrong answer Jury has the answer but the participant hasn't

The code is as follows:

#include<iostream>
using namespace std;


int check_if_painted[5010][5010];
int input_array[5010];
int main()
{
    int n,k;
    cin>>n>>k;

    int occurence_count[n];//Keeps track of the total no. of occurences of an element in the input_array.
    memset(occurence_count,0,sizeof(occurence_count));
    /*
    The following loop checks if the occurrence of a particular 
    element is not more than k. If the occurence>k the "NO" is printed and program ends.
    */
    for (int i = 0; i < n; ++i)
    {
        cin>>input_array[i];
        ++occurence_count[input_array[i]];
        if(occurence_count[input_array[i]]>k){
            cout<<"NO";
            return 0;
        }
    }
    cout<<"YES\n";


    /*
    The following loop uses the array check_if_painted as a counter to check if the particular 
    occurrence of an element has been painted with a colour from 1 to k or not. 
    If some previous occurrence of this particular element has been painted with f%k+1, 
    then f is incremented until we encounter any value(of `f%k+1`) from 1 to k that hasn't been 
    used yet to colour and then we colour this element with that value by printing it.
    */
    int f=0;
    /*
    f is a global value which increments to a very large value but f%k+1 is used 
    to restrict it within the 1 to k limit(both inclusive). So, we are able to check 
    if any previous occurrence of the current element has already been coloured with the value f%k+1 or not.  
    which essentially is 1 to k.
    */ 
    for(int i=0;i<n;++i){
        while(check_if_painted[input_array[i]][f%k+1]>0){
            ++f;
        }
        cout<<f%k+1<<" ";
        ++check_if_painted[input_array[i]][f%k+1];
        ++f;
    }
    return 0;
}

However, when I tried the code below, it was accepted.

#include<iostream>
using namespace std;


int check_if_painted[5010][5010];
int input_array[5010];
int occurence_count[5010];
int main()
{
    int n,k;
    cin>>n>>k;

    for (int i = 0; i < n; ++i)
    {
        cin>>input_array[i];
        ++occurence_count[input_array[i]];
        if(occurence_count[input_array[i]]>k){
            cout<<"NO";
            return 0;
        }
    }
    cout<<"YES\n";

    int f=0;

    for(int i=0;i<n;++i){
        while(check_if_painted[input_array[i]][f%k+1]>0){
            ++f;
        }
        cout<<f%k+1<<" ";
        ++check_if_painted[input_array[i]][f%k+1];
        ++f;
    }
    return 0;
}

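From this post on SO, I found that memset works well on built-in types. So why does it cause a problem in my case, where it is used on an array of int, which is a built-in type?

Also, I have read that std::fill() is a better alternative, and read in this post that memset is a dangerous function. Then why is it still in use?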

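Comments:

Technically, neither of them is "correct", because C++ does not allow declaring an array with a dynamic size. The size must be a constant expression.

Check what value sizeof(h) yields. I bet it's wrong.

The downvotes are probably because you dumped two large blocks of code with poor variable names, unclear semantics and no comments, forcing us to visually diff them and try to work out what memset has to do with it, instead of doing some debugging and producing a minimal reproducible example. There are more differences between the two than just the presence of memset. I really can't see how this question is useful to future visitors, sorry. (Just because you posted something "for learning purposes" doesn't mean you won't get downvoted!)

By the way, your typedef long long ll; makes your code unreadable, IMHO.

The comments are a big improvement (I know this is for an online judge that doesn't care what the code looks like), but as a rule of thumb: any time you find yourself needing to add a comment to explain what the code does, ask yourself "is there some way to improve the code so that it is self-describing?" The code itself should tell most of the story. Comments should be an add-on that provides extra context, assumptions, and other information that cannot be inferred from the code.

Answer 1: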

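This has nothing to do with memset. Your code goes out of the bounds of your array, plain and simple.

In your input case you have n = 4 and k = 4, so occurence_count is 4 elements long (its valid indexes go from 0 to 3 inclusive). Then, you do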

    cin>>input_array[i];
    ++occurence_count[input_array[i]];

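Given that the last value is 5, you end up executing ++occurence_count[5], which is outside the bounds of the array. This is undefined behavior; in your case it manifests as incrementing memory that does not belong to the array, which most likely does not start at 0 and so messes up the later check.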
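To make the out-of-bounds access visible rather than silently undefined, here is a minimal sketch that swaps the raw array for a std::vector and uses the bounds-checked at() accessor. The helper name index_fits is hypothetical and is not part of the original code; it only illustrates the indexing problem.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether `value` is a valid index into a
// counter array that was sized by the number of inputs, n.
bool index_fits(std::size_t n, int value) {
    std::vector<int> counts(n, 0);  // same length as occurence_count[n]
    try {
        // at() throws std::out_of_range instead of touching foreign memory.
        ++counts.at(static_cast<std::size_t>(value));
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```

With n = 4, index_fits(4, 3) succeeds, while index_fits(4, 5) reports failure, which is the access the failing test case triggers.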

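The problem does not show up in your second code snippet because there you made occurence_count 5010 elements large, and it is zero-initialized by default since it is a global variable.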
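The zero-initialization of globals can be checked directly. The sketch below uses hypothetical names (global_counts, global_counts_start_at_zero) and only relies on the rule that objects with static storage duration are zero-initialized before the program starts.

```cpp
#include <algorithm>
#include <iterator>

// Static storage duration: every element is zero before main() runs,
// so no memset or std::fill is needed for this array.
int global_counts[5010];

// Returns true when every element of the global array is still zero.
bool global_counts_start_at_zero() {
    return std::all_of(std::begin(global_counts), std::end(global_counts),
                       [](int c) { return c == 0; });
}
```

A local array declared without an initializer gets no such guarantee; writing `int local_counts[16] = {};` is one way to zero it explicitly.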

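Now, if you want to count the occurrences of array values, sizing the occurrences array to the number of elements is of course wrong: that is the count of numbers you are going to read (which is fine as the size of input_array), not the maximum value you can read. Given that the array element values range from 1 to 5000, the occurrences array must have either size 5001 (keeping the values as they are) or size 5000 (subtracting 1 from each value you read before using it as an index).

(In general, be careful: all indexes in the problem text are 1-based, while indexes in C++ are 0-based. If you reason with the problem's indexes and then use them as C indexes, you are likely to get errors, unless you oversize the arrays by one and ignore the element at index 0.)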
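The two sizing options described above can be sketched as follows. The constant kMaxValue and the function names are assumptions made for illustration; both functions assume every value lies between 1 and kMaxValue.

```cpp
#include <vector>

constexpr int kMaxValue = 5000;  // upper bound on values, taken from the problem limits

// Option 1: kMaxValue + 1 slots, indexed by the value itself; slot 0 stays unused.
std::vector<int> count_direct(const std::vector<int>& values) {
    std::vector<int> counts(kMaxValue + 1, 0);
    for (int v : values) {
        ++counts[v];
    }
    return counts;
}

// Option 2: kMaxValue slots, indexed by value - 1 to map 1-based values to 0-based slots.
std::vector<int> count_shifted(const std::vector<int>& values) {
    std::vector<int> counts(kMaxValue, 0);
    for (int v : values) {
        ++counts[v - 1];
    }
    return counts;
}
```

For the input 3 3 3 5, the first variant stores the count of 3 at index 3, while the second stores it at index 2.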

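Finally, a few additional points:

If you compile with enough warnings enabled, or with a recent enough compiler, it will rightly complain that memset is not declared or that it is being implicitly declared (with an incorrect prototype, by the way); you should #include <string.h> to use memset;

As @Nicol Bolas went to great lengths to explain in his answer, by declaring a local array whose size is known only at runtime (int occurence_count[n]) you are using a VLA (variable length array).

VLAs are not standard C++, so they are not well specified, several compilers do not support them, and they are problematic in many respects (mostly, you should not really allocate an unknown amount of data on the stack, which is generally small);

You should probably avoid them and use std::vector instead or, given that the problem gives you an upper bound for both colors and elements (5000), just static arrays.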
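Putting these points together, here is a sketch of the question's colouring loop rewritten with std::vector, so that neither a VLA nor memset is needed. The function name k_coloring and the constant kMaxValue are hypothetical; the sketch assumes k is at least 1, the input is non-empty, and every value lies between 1 and kMaxValue.

```cpp
#include <vector>

constexpr int kMaxValue = 5000;  // assumed upper bound on element values

// Returns one colour (1..k) per element, or an empty vector when some value
// occurs more than k times. Mirrors the loop from the question.
std::vector<int> k_coloring(const std::vector<int>& values, int k) {
    // Sized by the largest possible value, not by the number of elements.
    std::vector<int> occurrence_count(kMaxValue + 1, 0);
    for (int v : values) {
        if (++occurrence_count[v] > k) {
            return {};  // corresponds to printing "NO"
        }
    }

    // painted[v][c] becomes true once value v has been given colour c.
    std::vector<std::vector<bool>> painted(
        kMaxValue + 1, std::vector<bool>(k + 1, false));
    std::vector<int> colours;
    colours.reserve(values.size());

    int f = 0;  // running counter; f % k + 1 always lies in 1..k
    for (int v : values) {
        while (painted[v][f % k + 1]) {
            ++f;  // skip colours already used for this value
        }
        const int colour = f % k + 1;
        painted[v][colour] = true;
        colours.push_back(colour);
        ++f;
    }
    return colours;
}
```

The std::vector constructors fill every element with the given initial value, so the counters start at zero without any explicit memset or std::fill call.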

