Another Take on How to Write a Webserver

Posted 2021-05-15 grassroot72

In the previous post I explained why I wrote the Maestro Webserver and described the HTTP message parser I wrote for it. In this post I describe how I use sockets and epoll.

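Socket programming forces a choice between blocking and non-blocking I/O. I chose non-blocking for efficiency, and since epoll is the mechanism on Linux that pairs best with non-blocking sockets, I used epoll.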

So how should epoll and the sockets be organized and managed?

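My reasoning was this: anything a client sends can only be processed once a socket connection has been established, so epoll and the sockets should serve to manage and process socket connections. That calls for a structure representing one socket connection. In my program this structure is called httpconn_t: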

typedef struct {
  int sockfd;
  int epfd;
  long stamp;

  PGconn *pgconn;
  rbtree_t *cache;
  rbtree_t *timers;
  rbtree_t *authdb;
  httpcfg_t *cfg;
} httpconn_t;

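Ignore the other members for now. At this stage, just focus on sockfd and epfd, which hold the socket file descriptor and the epoll file descriptor. The remaining members support features implemented later: the database connection, caching, HTTP keep-alive, and user authentication.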

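As far as I know, the Linux kernel implements epoll on top of a red-black tree, but from user space we can simply treat what epoll_wait() returns as an array of ready events to iterate over. So in my program I drive it from a do-while loop. Here is the loop: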

 1   do {
 2     int nevents = epoll_wait(epfd, events, MAXEVENTS, EPOLL_TIMEOUT);
 3     if (nevents == -1) {
 4       if (errno == EINTR) continue;
 5       perror("epoll_wait()");
 6     }
 7 
 8     if ((mstime() - loop_time) >= EPOLL_TIMEOUT) {
 9       /* expire the timers */
10       thpool_add_task(taskpool, httpconn_expire, timers);
11       /* expire the cache */
12       thpool_add_task(taskpool, httpcache_expire, cache);
13       loop_time = mstime();
14     }
15 
16     /* loop through events; skipped when epoll_wait() returned 0 or -1 */
17     int i = 0;
18     while (i < nevents) {
19       httpconn_t *conn = (httpconn_t *)events[i].data.ptr;
20       /* error case */
21       if ((events[i].events & EPOLLERR) || (events[i].events & EPOLLHUP)) {
22         if (errno == EAGAIN || errno == EINTR)
23           nsleep(10);
24         else {
25           D_PRINT("[EPOLL] errno = %d, ", errno);
26           perror("[ERR|HUP]");
27           break;
28         }
29       }
30       /* get input */
31       if (events[i].events & EPOLLIN) {
32         if (conn->sockfd == srvfd)
33           epsock_connect(srvfd, epfd, pgconn, cache, timers, authdb, cfg);
34         else {
35           /* client socket; read client data and process it */
36           thpool_add_task(taskpool, httpconn_task, conn);
37         }
38       }
39       i++;
40     }

Lines 17 to 40 are the key code for organizing epoll and the sockets. Line 31 says that when an EPOLLIN event arrives, input is available on the socket connection conn.

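The code that follows says: if conn's sockfd equals srvfd, the server socket file descriptor created beforehand, a new socket connection is established. Otherwise the connection is handed to the thread pool as the argument of the task function httpconn_task, which produces the HTTP response and returns it to the client (for example, a browser).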
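
The dispatch described above depends on each event carrying a pointer back to its connection structure. The following is a minimal sketch of that mechanism, not Maestro's actual code: conn_t is a cut-down stand-in for httpconn_t, and conn_watch and conn_next_ready are hypothetical helper names made up for this illustration.

```c
#include <stddef.h>
#include <sys/epoll.h>

/* Hypothetical, trimmed-down stand-in for httpconn_t. */
typedef struct {
  int sockfd;
  int epfd;
} conn_t;

/* Register conn->sockfd for input events and store the conn pointer in
   data.ptr, so that epoll_wait() hands the same pointer back later.
   Returns 0 on success, -1 on error. */
static int conn_watch(conn_t *conn)
{
  struct epoll_event ev = {0};
  ev.events = EPOLLIN;
  ev.data.ptr = conn;
  return epoll_ctl(conn->epfd, EPOLL_CTL_ADD, conn->sockfd, &ev);
}

/* Wait up to timeout_ms for one event and return the connection it
   belongs to, or NULL on timeout or error. */
static conn_t *conn_next_ready(int epfd, int timeout_ms)
{
  struct epoll_event ev;
  int n = epoll_wait(epfd, &ev, 1, timeout_ms);
  if (n <= 0) return NULL;
  return (conn_t *)ev.data.ptr;
}
```

If the listening socket is wrapped in a structure of the same type, the event loop can tell "new connection" from "client data" just by comparing the returned connection's sockfd with the server descriptor.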

Now look at epsock_connect from the code above, which establishes new socket connections:

 1 void epsock_connect(const int srvfd,
 2                     const int epfd,
 3                     PGconn *pgconn,
 4                     rbtree_t *cache,
 5                     rbtree_t *timers,
 6                     rbtree_t *authdb,
 7                     httpcfg_t *cfg)
 8 {
 9   struct sockaddr cliaddr;
10   socklen_t len_cliaddr = sizeof(struct sockaddr);
11 
12    /* server socket; accept connections */
13   for (;;) {
14     len_cliaddr = sizeof(cliaddr); /* value-result argument: reset before every accept() */
15     int clifd = accept(srvfd, &cliaddr, &len_cliaddr);
16     if (clifd == -1) {
17       if (errno == EINTR) continue;
18       if (errno == EAGAIN || errno == EWOULDBLOCK) {
19         /* we processed all of the connections */
20         break;
21       }
22       perror("accept()");
23       /* nothing to close: clifd is -1 here */
24       break;
25     }
26 
27     char *cli_ip = inet_ntoa(((struct sockaddr_in *)&cliaddr)->sin_addr);
28     D_PRINT("[CONN] client %s connected on socket %d\n", cli_ip, clifd);
29 
30     _set_nonblocking(clifd);
31 
32     httpconn_t *cliconn = httpconn_new(clifd, epfd,
33                                        pgconn, cache, timers, authdb,
34                                        cfg);
35     /* install the new timer */
36     pthread_mutex_lock(&timers->mutex);
37     rbtree_insert(timers, cliconn);
38     pthread_mutex_unlock(&timers->mutex);
39 
40     if (httpconn_epoll(cliconn, EPOLL_CTL_ADD) == -1) return;
41   }
42 }

Here, httpconn_new() on line 32 creates the new socket connection, and line 30 puts the connection into non-blocking mode.

For the full code, readers will need to look at my GitHub project, Maestro. The code is not heavily abstracted, so I trust everyone can follow it.

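One point worth mentioning: what the event loop treats as a unit is the httpconn_t structure, and all later processing has to be designed around that structure as a whole. The socket and epoll file descriptors must be considered together as a bundle; otherwise HTTP keep-alive becomes very hard to implement. At least, I had no idea where to start at the time.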
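
One way to see why the bundling helps: when both descriptors live in the connection structure, any code holding only the connection pointer (a worker thread, for instance) can re-register the socket for the next request. The sketch below is an assumption of mine and not taken from Maestro. conn_t and conn_epoll are hypothetical names, and the use of EPOLLONESHOT is my choice for the illustration, not something this post confirms.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical, trimmed-down stand-in for httpconn_t. */
typedef struct {
  int sockfd;
  int epfd;
} conn_t;

/* Apply op (EPOLL_CTL_ADD, EPOLL_CTL_MOD or EPOLL_CTL_DEL) to the
   connection's own socket on the connection's own epoll instance.
   Returns 0 on success, -1 on error. */
static int conn_epoll(conn_t *conn, int op, uint32_t events)
{
  struct epoll_event ev = {0};
  ev.events = events;
  ev.data.ptr = conn;
  return epoll_ctl(conn->epfd, op, conn->sockfd, &ev);
}
```

With EPOLLIN | EPOLLONESHOT, epoll reports the socket once and then disables it, so only one worker handles a given request. After the response has been sent, that worker can call conn_epoll(conn, EPOLL_CTL_MOD, EPOLLIN | EPOLLONESHOT) to wait for the next request on the same kept-alive connection.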

In the third post I will cover how the thread pool is used in the Webserver.
