Segmentation fault in CS50 Speller. Why?

Posted 2023-02-14

Asked: 2022-01-15 23:49:13

Question:

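I am working on CS50 pset5 Speller, but I keep getting a segmentation fault. Debug50 suggests that the problem is the line n->next = table[index]; in my implementation of the load function, line 110. I have tried revising it, but I cannot figure out why it fails. My code is below; can someone help me?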

// Implements a dictionary's functionality

#include <stdbool.h>
#include <strings.h>
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include "dictionary.h"

// Represents a node in a hash table
typedef struct node {
    char word[LENGTH + 1];
    struct node *next;
} node;

// Number of buckets in hash table
const unsigned int N = 150000;

// Nodes counter
int nodes_counter = 0;

// Hash table
node *table[N];

// Returns true if word is in dictionary, else false
bool check(const char *word)
{
    // TODO
    int hash_value = hash(word);
    node *cursor = malloc(sizeof(node));
    if (cursor != NULL)
    {
        cursor = table[hash_value];
    }

    if (strcasecmp(cursor->word, word) == 0) // If word is first item in linked list
    {
        return 0;
    }
    else // Iterate over the list by moving the cursor
    {
        while (cursor->next != NULL)
        {
            if (strcasecmp(cursor->word, word) == 0) // If word is found
            {
                return 0;
            }
            else
            {
                cursor = cursor->next;
            }
        }
    }
    return false;
}

// Hashes word to a number
unsigned int hash(const char *word)
{
    // Adaptation of FNV function, source https://www.programmingalgorithms.com/algorithm/fnv-hash/c/
    const unsigned int fnv_prime = 0x811C9DC5;
    unsigned int hash = 0;
    unsigned int i = 0;

    for (i = 0; i < strlen(word); i++)
    {
        hash *= fnv_prime;
        hash ^= (*word);
    }

    return hash;
}

// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
    // Open Dictionary File (argv[1] or dictionary?)
    FILE *file = fopen(dictionary, "r");
    if (file == NULL)
    {
        printf("Could not open file\n");
        return 1;
    }
    // Read until end of file word by word (store word to read in word = (part of node)?)

    char word[LENGTH + 1];

    while(fscanf(file, "%s", word) != EOF)
    {
        // For each word, create a new node
        node *n = malloc(sizeof(node));
        if (n != NULL)
        {
            strcpy(n->word, word);
            //Omitted to avoid segmentation fault n->next = NULL;
            nodes_counter++;
        }
        else
        {
            return 2;
        }

        // Call hash function (input: word --> output: int)
        int index = hash(word);

        // Insert Node into Hash Table
        n->next = table[index];
        table[index] = n;
    }
    return false;
}

// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
    // Return number of nodes created in Load
    if (nodes_counter > 0)
    {
        return nodes_counter;
    }

    return 0;
}

// Unloads dictionary from memory, returning true if successful, else false
bool unload(void)
{
    // TODO
    for (int i = 0; i < N; i++)
    {
        node *cursor = table[i];
        while (cursor->next != NULL)
        {
            node *tmp = cursor;
            cursor = cursor->next;
            free(tmp);
        }
    }
    return false;
}
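Comments on the question:

-fsanitize=address is good at debugging these.

In the check function, node *cursor = malloc(sizeof(node)); followed by cursor = table[hash_value]; is a memory leak. If table[hash_value] is NULL, all the remaining code in the function uses a NULL pointer. You do not need to allocate anything inside that function. Get the entry from the table and, if it is not NULL, check whether the word is found; otherwise return false.

There is also no guarantee that the hash function returns a value less than 150000, so you will access the array out of bounds. You need something like int hash_value = hash(word) % N; to force it into the correct range. You need to do this everywhere you use the return value of the hash function.

In your hash function, although you loop over the length of word with i, you never actually use i to index word, so you just use the first character of word over and over. Instead of hash ^= (*word);, I think you want hash ^= word[i];.

Answer 1: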

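There are multiple problems in your code:

- node *table[N]; can be defined as a global object only if N is a constant expression. N is defined as a const unsigned int, but that is not a constant expression in C (although it is in C++). Your program compiles only because the compiler accepts this as a non-portable extension. Use a macro or an enum instead.
- In check(), cursor is overwritten immediately after it is allocated. There is no need to allocate a node in this function.
- The hash() function should produce the same hash value for words that differ only in case.
- The hash() function uses only the first letter of word.
- The hash() function can return a hash value >= N.
- fscanf(file, "%s", word) should be protected against buffer overflow.
- In unload(), you do not check that cursor is non-null before dereferencing it.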
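The three hash-related points in the list can be illustrated in isolation. The following is only a minimal sketch, not the poster's code: the names NUM_BUCKETS, raw_hash and bucket_index are hypothetical, and the constants are the standard 32-bit FNV offset basis and prime rather than the value used in the question. It folds case, visits every character, and reduces the result into the valid index range.

```c
#include <ctype.h>
#include <stddef.h>

// Hypothetical bucket count for this sketch. An enum constant is a true
// constant expression in C, so it may size a file-scope array.
enum { NUM_BUCKETS = 150000 };

// FNV-1 style hash over every character of the word, ignoring case.
// 2166136261 is the 32-bit FNV offset basis and 16777619 the FNV prime.
unsigned int raw_hash(const char *word)
{
    unsigned int h = 2166136261u;
    for (size_t i = 0; word[i] != '\0'; i++)
    {
        h *= 16777619u;
        // Index with i so that each character contributes, and lower-case
        // it so that "Apple" and "apple" hash identically.
        h ^= (unsigned int) tolower((unsigned char) word[i]);
    }
    return h;
}

// Reduces the raw hash into [0, NUM_BUCKETS) so it is a safe array index.
unsigned int bucket_index(const char *word)
{
    return raw_hash(word) % NUM_BUCKETS;
}
```

Without the final modulo, the raw value can be far larger than the table size, and indexing the table with it reads or writes memory outside the array, which is consistent with a crash at a line such as n->next = table[index];.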

Here is a modified version:

// Implements a dictionary's functionality

#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

#include "dictionary.h"

// Represents a node in a hash table
typedef struct node {
    char word[LENGTH + 1];
    struct node *next;
} node;

// Number of buckets in hash table
enum { N = 150000 };

// Nodes counter
int nodes_counter = 0;

// Hash table
node *table[N];

// Returns true if word is in dictionary, else false
bool check(const char *word) {
    int hash_value = hash(word);

    // Iterate over the list by moving the cursor
    for (node *cursor = table[hash_value]; cursor; cursor = cursor->next) {
        if (strcasecmp(cursor->word, word) == 0) {
            // If word is found
            return true;
        }
    }
    // If word is not found
    return false;
}

// Hashes word to a number
unsigned int hash(const char *word) {
    // Adaptation of FNV function, source https://www.programmingalgorithms.com/algorithm/fnv-hash/c/
    unsigned int fnv_prime = 0x811C9DC5;
    unsigned int hash = 0;

    for (unsigned int i = 0; word[i] != '\0'; i++) {
        hash *= fnv_prime;
        hash ^= toupper((unsigned char)word[i]);
    }
    return hash % N;
}

// Loads dictionary into memory, returning true if successful, else false
// (the return type stays bool, as in the question's code, so that it matches the declaration in dictionary.h)
bool load(const char *dictionary) {
    // Open Dictionary File (argv[1] or dictionary?)
    FILE *file = fopen(dictionary, "r");
    if (file == NULL) {
        printf("Could not open file\n");
        return false;
    }
    // Read until end of file word by word (store word to read in word = (part of node)?)

    char word[LENGTH + 1];
    char format[10];
    // construct the conversion specifier to limit the word size
    //    read by fscanf()
    snprintf(format, sizeof format, "%%%ds", LENGTH);

    while (fscanf(file, format, word) == 1) {
        // For each word, create a new node
        node *n = malloc(sizeof(node));
        if (n == NULL) {
            fclose(file);
            return false;
        }
        strcpy(n->word, word);
        n->next = NULL;
        nodes_counter++;

        // Call hash function (input: word --> output: int)
        int index = hash(word);

        // Insert Node into Hash Table
        n->next = table[index];
        table[index] = n;
    }
    fclose(file);
    return true;
}

// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void) {
    // Return number of nodes created in Load
    return nodes_counter;
}

// Unloads dictionary from memory, returning true if successful, else false
bool unload(void) {
    for (int i = 0; i < N; i++) {
        node *cursor = table[i];
        table[i] = NULL;
        while (cursor != NULL) {
            node *tmp = cursor;
            cursor = cursor->next;
            free(tmp);
        }
    }
    return true;
}
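As a possible alternative to building the fscanf() format string at run time with snprintf(), the field width can be baked in at compile time through preprocessor stringification. This is a sketch under assumptions: MAX_WORD, STR_, STR and read_word are hypothetical names, and the trick only works when the length macro expands to a plain integer literal such as 45.

```c
#include <stdio.h>

// Assumed maximum word length for this sketch; it must be a plain
// integer literal for the stringification below to produce "45".
#define MAX_WORD 45

// Two-step stringification: STR(MAX_WORD) first expands MAX_WORD to 45,
// then STR_ turns that into the string literal "45".
#define STR_(x) #x
#define STR(x) STR_(x)

// Reads the next whitespace-delimited word from file, storing at most
// MAX_WORD characters plus the terminating '\0'.
// Returns 1 if a word was read, 0 at end of input or on a read error.
int read_word(FILE *file, char word[MAX_WORD + 1])
{
    // Adjacent string literals are concatenated, giving "%45s".
    return fscanf(file, "%" STR(MAX_WORD) "s", word) == 1;
}
```

With either approach, a token longer than the limit is not rejected; it is split across successive reads, so a caller that needs to detect overlong words would have to check for that separately.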
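Comments:

Wow, it seems I made quite a few mistakes. I went through all of your points; it is really nice that someone took the time to check and explain the errors. Thank you very much, I learned a lot from this.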
