Call to ioctl() with FIONREAD results in strange side-effects in apparent race condition

Posted: 2014-12-15 15:32:55

Question:

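I'm writing a parallel neural network simulator and recently ran into a problem in my code that completely confounds me (granted, I'm only an intermediate C++ programmer, so maybe I'm missing something obvious?). My code involves a "server" and many clients (workers), which take work from the server and return results to it. Here is the server part: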

#include <iostream>
#include <fstream>

#include <arpa/inet.h>
#include <sys/epoll.h>
#include <errno.h>

#include <sys/ioctl.h>
#include <unistd.h>   // sleep(), close()

#include <cctype>     // isdigit()
#include <cstdio>     // EOF

void advanceToNextInputValue(std::ifstream &trainingData, char &nextCharacter)
   {

      nextCharacter = trainingData.peek();
      while(nextCharacter != EOF && !isdigit(nextCharacter))
         {
sleep(1);
            trainingData.get();
sleep(1);
            nextCharacter = trainingData.peek();
         }
   }

int main()
   {
      // Create a socket,...
      int listenerSocketNum = socket(AF_INET, SOCK_STREAM, 0);

      // Name the socket,...
      sockaddr_in socketAddress;
      socklen_t socketAddressLength = sizeof(socketAddress);

      inet_pton(AF_INET, "127.0.0.1", &(socketAddress.sin_addr));
      socketAddress.sin_port = htons(9988);
      bind(listenerSocketNum, reinterpret_cast<sockaddr*>(&socketAddress), socketAddressLength);

      // Create a connection queue for worker processes waiting to connect to this server,...
      listen(listenerSocketNum, SOMAXCONN);


      int epollInstance = epoll_create(3); // Expected # of file descriptors to monitor

      // Allocate a buffer to store epoll events returned from the network layer
      epoll_event* networkEvents = new epoll_event[3];

      // Add the server listener socket to the list of file descriptors monitored by epoll,...
      networkEvents[0].data.fd = -1; // A marker returned with the event for easy identification of which worker process event belongs to
      networkEvents[0].events = EPOLLIN | EPOLLET; // epoll-IN- since we only expect incoming data on this socket (ie: connection requests from workers),...
                                                   // epoll-ET- indicates an Edge Triggered watch
      epoll_ctl(epollInstance, EPOLL_CTL_ADD, listenerSocketNum, &networkEvents[0]);


      char nextCharacter = 'A';
      std::ifstream trainingData;

      // General multi-purpose/multi-use variables,...
      long double v;
      signed char w;
      int x = 0;
      int y;

      while(1)
         {
            y = epoll_wait(epollInstance, networkEvents, 3, -1); // the -1 tells epoll_wait to block indefinitely

            while(y > 0)
               {
                  if(networkEvents[y-1].data.fd == -1) // We have a notification on the listener socket indicating a request for a new connection (and we expect only one for this testcase),...
                     {
                        x = accept(listenerSocketNum,reinterpret_cast<sockaddr*>(&socketAddress), &socketAddressLength);

                        networkEvents[y-1].data.fd = x; // Here we are just being lazy and re-using networkEvents[y-1] temporarily,...
                        networkEvents[y-1].events = EPOLLIN | EPOLLET;

                        // Add the socket for the new worker to the list of file descriptors monitored,...
                        epoll_ctl(epollInstance, EPOLL_CTL_ADD, x, &networkEvents[y-1]);

                        trainingData.open("/tmp/trainingData.txt");
                     }
                  else if(networkEvents[y-1].data.fd == x) // Worker is waiting to receive datapoints for calibration,...
                     {
                        std::cout << "nextCharacter before call to ioctl: " << nextCharacter << std::endl;
                        ioctl(networkEvents[y-1].data.fd, FIONREAD, &w);
                        std::cout << "nextCharacter after call to ioctl: " << nextCharacter << std::endl;

                        recv(networkEvents[y-1].data.fd, &v, sizeof(v), MSG_DONTWAIT); // Retrieve and discard the 'tickle' from worker

                        if(nextCharacter != EOF)
                           {
                              trainingData >> v;

                              send(networkEvents[y-1].data.fd, &v, sizeof(v), MSG_DONTWAIT);
                              advanceToNextInputValue(trainingData, nextCharacter);
                           }
                     }

                  y--;
               }
         }

      close(epollInstance);
      return 0;
   }

Here is the client part:

#include <arpa/inet.h>

int main()
   {
      int workerSocket = socket(AF_INET, SOCK_STREAM, 0);

      // Name the socket as agreed with the server:
      sockaddr_in serverSocketAddress;
      serverSocketAddress.sin_family = AF_INET;
      serverSocketAddress.sin_port = htons(9988);
      inet_pton(AF_INET, "127.0.0.1", &(serverSocketAddress.sin_addr));

      // Connect your socket to the server's socket:
      connect(workerSocket, reinterpret_cast<sockaddr*>(&serverSocketAddress), sizeof(serverSocketAddress));

      long double z;
      while(1)
         {
            send(workerSocket, &z, sizeof(z), MSG_DONTWAIT); // Send a dummy result/tickle to server,...
            recv(workerSocket, &z, sizeof(z), MSG_WAITALL);
         }
   }

The part of the code I'm having trouble with is the following (from the server):

std::cout << "nextCharacter before call to ioctl: " << nextCharacter << std::endl;
ioctl(networkEvents[y-1].data.fd, FIONREAD, &w);
std::cout << "nextCharacter after call to ioctl: " << nextCharacter << std::endl;

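Here (on my system, at least), under certain circumstances the call to ioctl basically wipes out the value of 'nextCharacter', and I have no idea how or why!

These are the results I expect to get: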

$ ./server.exe
nextCharacter before call to ioctl: A
nextCharacter after call to ioctl: A
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 9
nextCharacter after call to ioctl: 9
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl: 2
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl: 1
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl: 2
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl: ÿ

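(The lowercase 'y' with the umlaut is how the end-of-file value, EOF, prints once it is stored in a char.)

These are the results I actually get (note that we end up in an infinite loop, because the stopping condition depends on the value of nextCharacter, which gets wiped out, so it never stops):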

$ ./server.exe
nextCharacter before call to ioctl: A
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 9
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 1
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: 2
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl:
nextCharacter before call to ioctl: ÿ
nextCharacter after call to ioctl:
.
.
.

If I comment out either of the sleep statements in this section (in the server):

void advanceToNextInputValue(std::ifstream &trainingData, char &nextCharacter)
   {

      nextCharacter = trainingData.peek();
      while(nextCharacter != EOF && !isdigit(nextCharacter))
         {
sleep(1);
            trainingData.get();
sleep(1);
            nextCharacter = trainingData.peek();
         }
   }

then I get the results I expect to get.

Here is the makefile I'm using:

$ cat Makefile
all: server client

server: server.cpp
        g++ server.cpp -o server.exe -ansi -fno-elide-constructors -O3 -pedantic-errors -Wall -Wextra -Winit-self -Wold-style-cast -Woverloaded-virtual -Wuninitialized -Winit-self

client: client.cpp
        g++ client.cpp -o client.exe -ansi -fno-elide-constructors -O3 -pedantic-errors -Wall -Wextra -Winit-self -Wold-style-cast -Woverloaded-virtual -Wuninitialized -Winit-self

With a trainingData.txt file that looks like this:

$ cat trainingData.txt
15616.16993666375,15616.16993666375,9.28693983312753E20,24.99528974548316,16.91935342923897,16.91935342923897,1.386594632397968E6,2.567209162871251

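So did I discover a new bug, or am I just being stupid? :) Honestly, I can't see why calling ioctl with FIONREAD, which is supposed to tell me how many bytes are waiting to be read on the socket, should in any way affect the value of the variable 'nextCharacter'.

Note that this is a stripped-down version of the original program that still reproduces the problem (at least on my system), so keep in mind that some things in the code snippets above may not make sense :)

Terry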


Answer 1:

From man ioctl_list:

FIONREAD int *

That is, FIONREAD expects a pointer to an int, but you are passing a pointer to a signed char.

Solution: change your:

signed char w;

to:

int w;

Otherwise you will suffer undefined behavior.

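A likely explanation for what you are seeing is that the compiler placed the variables w and nextCharacter next to each other in memory, and the overflow of the former overwrites the value of the latter.

Comments: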

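It doesn't make much sense to call it. The number of bytes available can change quickly afterwards. The OP should just read until EAGAIN/EWOULDBLOCK occurs.

@EJP: Yes, I noticed that, but the question is specifically about the weird data corruption, not about the non-sense of the code.

@rodrigo: That was it! Thank you very much! Clearly I still have a lot to learn! :)

@EJP: Yes, I didn't want to read in this case because I didn't want to have to buffer (in this calibration phase the numbers returned are discarded anyway). I know the client is sending 16 bytes, and if I read fewer than 16 I would have to keep track of how many more bytes I need to read before moving on to the next step. It just seemed easier to call ioctl repeatedly until I have 16 bytes and then read it all at once into a long double :)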
