thttpd and CGILua: Installation and Request Flow Analysis

Posted by LightSong@计海拾贝


Installation

Install thttpd by following these posts:

http://blog.csdn.net/21aspnet/article/details/7045845

http://blog.csdn.net/dragoncheng/article/details/5614559

 

thttpd configuration file:

The installation directory /usr/local/thttpd/ contains conf/, etc/, logs/, man/, sbin/, and www/.

/usr/local/bin# cat /usr/local/thttpd/conf/thttpd.conf

port=80
user=www
host=0.0.0.0
logfile=/usr/local/thttpd/logs/thttpd.log
pidfile=/usr/local/thttpd/logs/thttpd.pid
#throttles=/usr/local/thttpd/etc/throttle.conf
#urlpat=*.txt|*.mp3
#charset=utf-8
dir=/usr/local/thttpd/www
cgipat=/cgi-bin/*

 

CGILua is installed with LuaRocks. It depends on WSAPI at run time.

 

cgilua.cgi launcher:

/usr/local/bin# cat cgilua.cgi
#!/bin/sh

exec '/usr/bin/lua5.1' -e 'package.path="/root/.luarocks/share/lua/5.1/?.lua;/root/.luarocks/share/lua/5.1/?/init.lua;/usr/local/share/lua/5.1/?.lua;/usr/local/share/lua/5.1/?/init.lua;"..package.path; package.cpath="/root/.luarocks/lib/lua/5.1/?.so;/usr/local/lib/lua/5.1/?.so;"..package.cpath' -e 'local k,l,_=pcall(require,"luarocks.loader") _=k and l.add_context("cgilua","5.1.4-2")' '/usr/local/lib/luarocks/rocks/cgilua/5.1.4-2/bin/cgilua.cgi' "$@"
/usr/local/bin# cat /usr/local/lib/luarocks/rocks/cgilua/5.1.4-2/bin/cgilua.cgi
#!/usr/bin/env lua

-- CGILua (SAPI) launcher, extracts script to launch
-- either from the command line (use #!cgilua in the script)
-- or from SCRIPT_FILENAME/PATH_TRANSLATED
 
pcall(require, "luarocks.require")
 
local common = require "wsapi.common"
local cgi = require "wsapi.cgi"
 
local sapi = require "wsapi.sapi"
 
local arg_filename = (...)
 
local function sapi_loader(wsapi_env)
  common.normalize_paths(wsapi_env, arg_filename, "cgilua.cgi")
  return sapi.run(wsapi_env)
end
 
cgi.run(sapi_loader)

 

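Flow analysis

When a request matches the cgipat rule in the configuration, thttpd launches a CGI program.

The cgipat rule is:

cgipat=/cgi-bin/*

That is, requests for files whose path begins with /cgi-bin/. As the code below shows, the file must also be world-executable.

The corresponding thttpd code that launches the CGI subprocess:

    /* Is it world-executable and in the CGI area? */
    if ( hc->hs->cgi_pattern != (char*) 0 &&
         ( hc->sb.st_mode & S_IXOTH ) &&
         match( hc->hs->cgi_pattern, hc->expnfilename ) )
        return cgi( hc );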
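
The check above can be tried in isolation with a small sketch. This is not thttpd's code: is_cgi_request is a hypothetical helper, and POSIX fnmatch(3) stands in for thttpd's own match() function. The sketch also assumes that the pattern and the file path are both relative to the document root, with no leading slash.

```c
#include <fnmatch.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 when a file should be run as CGI.
 * pattern - CGI pattern without a leading slash, e.g. "cgi-bin/*"
 * path    - file path relative to the document root
 * mode    - st_mode as returned by stat(2)
 * fnmatch(3) is a stand-in here; thttpd uses its own match(). */
static int
is_cgi_request( const char* pattern, const char* path, mode_t mode )
{
    if ( pattern == (const char*) 0 )
        return 0;                       /* no CGI pattern configured */
    if ( ! ( mode & S_IXOTH ) )
        return 0;                       /* not world-executable */
    /* FNM_PATHNAME: '*' does not match '/', so only files directly
     * under cgi-bin/ match. */
    return fnmatch( pattern, path, FNM_PATHNAME ) == 0;
}
```

With these assumptions, is_cgi_request( "cgi-bin/*", "cgi-bin/index.lp", 0755 ) is true, while the same file with mode 0644, or a path such as css/doc.css, is served as a static file.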

 

How the CGI program is launched

Inside the cgi() function, thttpd forks a child process:

r = fork( );
if ( r < 0 )
    {
    syslog( LOG_ERR, "fork - %m" );
    httpd_send_err(
        hc, 500, err500title, "", err500form, hc->encodedurl );
    return -1;
    }
if ( r == 0 )
    {
    /* Child process. */
    sub_process = 1;
    httpd_unlisten( hc->hs );
    cgi_child( hc );
    }
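
The fork pattern can be sketched on its own as follows. run_in_child and child_work are hypothetical names, and the sketch waits for the child with waitpid(2) only to keep the example observable; the thttpd parent does not block like this and returns to serving other connections.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run fn() in a forked child and return its exit
 * status, or -1 on failure.  This sketch blocks until the child exits,
 * which the thttpd parent does not do. */
static int
run_in_child( int (*fn)( void ) )
{
    pid_t r;
    int status;

    r = fork();
    if ( r < 0 )
        return -1;              /* fork failed; thttpd sends a 500 here */
    if ( r == 0 )
    {
        /* Child process: do the work, then exit without returning
         * into the parent's code path. */
        _exit( fn() );
    }
    /* Parent process: r is the child's pid. */
    if ( waitpid( r, &status, 0 ) < 0 )
        return -1;
    return WIFEXITED( status ) ? WEXITSTATUS( status ) : -1;
}

/* Example child body (hypothetical). */
static int
child_work( void )
{
    return 7;
}
```

Calling run_in_child( child_work ) returns the child's exit status. The key point is the same as in thttpd: after fork() both processes run the same code, and only the return value tells them apart.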

 

cgi_child() is where the child process continues.

1. Prepare the environment variables:

/* Make the environment vector. */
envp = make_envp( hc );

/* Make the argument vector. */
argp = make_argp( hc );

The environment includes the CGI parameters:

static char**
make_envp( httpd_conn* hc )
    {
    static char* envp[50];
    int envn;
    char* cp;
    char buf[256];

    envn = 0;
    envp[envn++] = build_env( "PATH=%s", CGI_PATH );
#ifdef CGI_LD_LIBRARY_PATH
    envp[envn++] = build_env( "LD_LIBRARY_PATH=%s", CGI_LD_LIBRARY_PATH );
#endif /* CGI_LD_LIBRARY_PATH */
    envp[envn++] = build_env( "SERVER_SOFTWARE=%s", SERVER_SOFTWARE );
    if ( hc->hs->vhost && hc->hostname != (char*) 0 && hc->hostname[0] != '\0' )
        cp = hc->hostname;
    else if ( hc->hdrhost != (char*) 0 && hc->hdrhost[0] != '\0' )
        cp = hc->hdrhost;
    else if ( hc->reqhost != (char*) 0 && hc->reqhost[0] != '\0' )
        cp = hc->reqhost;
    else
        cp = hc->hs->server_hostname;
    if ( cp != (char*) 0 )
        envp[envn++] = build_env( "SERVER_NAME=%s", cp );
    envp[envn++] = "GATEWAY_INTERFACE=CGI/1.1";
    envp[envn++] = build_env("SERVER_PROTOCOL=%s", hc->protocol);
    (void) my_snprintf( buf, sizeof(buf), "%d", (int) hc->hs->port );
    envp[envn++] = build_env( "SERVER_PORT=%s", buf );
    envp[envn++] = build_env(
        "REQUEST_METHOD=%s", httpd_method_str( hc->method ) );
    if ( hc->pathinfo[0] != '\0' )
        {
        char* cp2;
        size_t l;
        envp[envn++] = build_env( "PATH_INFO=/%s", hc->pathinfo );
        l = strlen( hc->hs->cwd ) + strlen( hc->pathinfo ) + 1;
        cp2 = NEW( char, l );
        if ( cp2 != (char*) 0 )
            {
            (void) my_snprintf( cp2, l, "%s%s", hc->hs->cwd, hc->pathinfo );
            envp[envn++] = build_env( "PATH_TRANSLATED=%s", cp2 );
            }
        }
    envp[envn++] = build_env(
        "SCRIPT_NAME=/%s", strcmp( hc->origfilename, "." ) == 0 ?
        "" : hc->origfilename );
    if ( hc->query[0] != '\0')
        envp[envn++] = build_env( "QUERY_STRING=%s", hc->query );
    envp[envn++] = build_env(
        "REMOTE_ADDR=%s", httpd_ntoa( &hc->client_addr ) );
    if ( hc->referrer[0] != '\0' )
        {
        envp[envn++] = build_env( "HTTP_REFERER=%s", hc->referrer );
        envp[envn++] = build_env( "HTTP_REFERRER=%s", hc->referrer );
        }
    if ( hc->useragent[0] != '\0' )
        envp[envn++] = build_env( "HTTP_USER_AGENT=%s", hc->useragent );
    if ( hc->accept[0] != '\0' )
        envp[envn++] = build_env( "HTTP_ACCEPT=%s", hc->accept );
    if ( hc->accepte[0] != '\0' )
        envp[envn++] = build_env( "HTTP_ACCEPT_ENCODING=%s", hc->accepte );
    if ( hc->acceptl[0] != '\0' )
        envp[envn++] = build_env( "HTTP_ACCEPT_LANGUAGE=%s", hc->acceptl );
    if ( hc->cookie[0] != '\0' )
        envp[envn++] = build_env( "HTTP_COOKIE=%s", hc->cookie );
    if ( hc->contenttype[0] != '\0' )
        envp[envn++] = build_env( "CONTENT_TYPE=%s", hc->contenttype );
    if ( hc->hdrhost[0] != '\0' )
        envp[envn++] = build_env( "HTTP_HOST=%s", hc->hdrhost );
    if ( hc->contentlength != -1 )
        {
        (void) my_snprintf(
            buf, sizeof(buf), "%lu", (unsigned long) hc->contentlength );
        envp[envn++] = build_env( "CONTENT_LENGTH=%s", buf );
        }
    if ( hc->remoteuser[0] != '\0' )
        envp[envn++] = build_env( "REMOTE_USER=%s", hc->remoteuser );
    if ( hc->authorization[0] != '\0' )
        envp[envn++] = build_env( "AUTH_TYPE=%s", "Basic" );
        /* We only support Basic auth at the moment. */
    if ( getenv( "TZ" ) != (char*) 0 )
        envp[envn++] = build_env( "TZ=%s", getenv( "TZ" ) );
    envp[envn++] = build_env( "CGI_PATTERN=%s", hc->hs->cgi_pattern );

    envp[envn] = (char*) 0;
    return envp;
    }
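
The shape of the result is a NULL-terminated array of "NAME=value" strings. A minimal standalone sketch of building such a vector follows. env_add and ENV_MAX are hypothetical; this is not thttpd's build_env, and unlike the fixed 50-slot array above it checks capacity on every insert.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ENV_MAX 8   /* hypothetical capacity, including the NULL slot */

/* Hypothetical helper: append "name=value" to a NULL-terminated vector.
 * Returns 0 on success, -1 when the vector is full or malloc fails. */
static int
env_add( char* envp[], int* envn, const char* name, const char* value )
{
    size_t len;
    char* entry;

    if ( *envn >= ENV_MAX - 1 )
        return -1;                      /* keep room for the terminator */
    len = strlen( name ) + 1 + strlen( value ) + 1;
    entry = malloc( len );
    if ( entry == (char*) 0 )
        return -1;
    (void) snprintf( entry, len, "%s=%s", name, value );
    envp[(*envn)++] = entry;
    envp[*envn] = (char*) 0;            /* vector stays NULL-terminated */
    return 0;
}
```

A caller declares char* envp[ENV_MAX] and int envn = 0, then adds entries such as REQUEST_METHOD and SCRIPT_NAME; the resulting array can be passed directly as the third argument of execve().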

 

 

2. Make the connection fd the CGI program's standard input:

/* Otherwise, the request socket is stdin. */
if ( hc->conn_fd != STDIN_FILENO )
    (void) dup2( hc->conn_fd, STDIN_FILENO );

 

3. Make the connection fd the CGI program's standard output and standard error:

/* Otherwise, the request socket is stdout/stderr. */
if ( hc->conn_fd != STDOUT_FILENO )
    (void) dup2( hc->conn_fd, STDOUT_FILENO );
if ( hc->conn_fd != STDERR_FILENO )
    (void) dup2( hc->conn_fd, STDERR_FILENO );

 

4. Launch the actual CGI program, which replaces the process image of the forked child:

/* Run the program. */
(void) execve( binary, argp, envp );

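Note that the environment variables are injected into the new process, so the CGI program can read the CGI parameters,

including the name of the current script: SCRIPT_NAME.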
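
Steps 2 to 4 can be reproduced without a web server. In the sketch below a pipe plays the role of the connection socket, and /bin/sh plays the role of the CGI program; run_with_env is a hypothetical helper, and the sketch assumes /bin/sh exists on the system.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run /bin/sh with SCRIPT_NAME in its environment
 * and capture what it writes to stdout.  A pipe plays the role of the
 * connection socket.  Returns the number of bytes captured, or -1. */
static ssize_t
run_with_env( const char* script_name, char* out, size_t outsize )
{
    int fds[2];
    pid_t pid;
    ssize_t n, total = 0;
    char envbuf[256];
    char* envp[2];
    char* argp[] = { "sh", "-c", "echo \"$SCRIPT_NAME\"", (char*) 0 };

    if ( outsize == 0 || pipe( fds ) < 0 )
        return -1;
    (void) snprintf( envbuf, sizeof(envbuf), "SCRIPT_NAME=%s", script_name );
    envp[0] = envbuf;
    envp[1] = (char*) 0;

    pid = fork();
    if ( pid < 0 )
    {
        (void) close( fds[0] );
        (void) close( fds[1] );
        return -1;
    }
    if ( pid == 0 )
    {
        /* Child: the write end becomes stdout, then exec. */
        (void) close( fds[0] );
        if ( fds[1] != STDOUT_FILENO )
        {
            (void) dup2( fds[1], STDOUT_FILENO );
            (void) close( fds[1] );
        }
        (void) execve( "/bin/sh", argp, envp );
        _exit( 127 );           /* only reached if execve failed */
    }
    /* Parent: read everything the child wrote. */
    (void) close( fds[1] );
    while ( (size_t) total < outsize - 1 &&
            ( n = read( fds[0], out + total,
                        outsize - 1 - (size_t) total ) ) > 0 )
        total += n;
    out[total] = '\0';
    (void) close( fds[0] );
    (void) waitpid( pid, (int*) 0, 0 );
    return total;
}
```

The child never passes SCRIPT_NAME on a command line. It reaches the new program only through the envp argument of execve(), and the program's output reaches the parent only because stdout was redirected with dup2() before the exec.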

 

What execve does

From the exec(3) manual page:

The exec() family of functions replaces the current process image with a new process image. The functions described in this manual page are front-ends for execve(2). (See the manual page for execve(2) for further details about the replacement of the current process image.)

http://www.tutorialspoint.com/unix_system_calls/execve.htm

execve() executes the program pointed to by filename. filename must be either a binary executable, or a script starting with a line of the form "#! interpreter [arg]". In the latter case, the interpreter must be a valid pathname for an executable which is not itself a script, which will be invoked as interpreter [arg] filename.

In other words, if the file to execute is a script, the kernel starts the script's interpreter, which then runs the script.
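
To make the "#! interpreter [arg]" rule concrete, here is a small user-space sketch that splits such a first line. The kernel does this itself during execve(); parse_shebang is a hypothetical helper for illustration only. It treats everything after the interpreter as a single argument, which is how Linux behaves; other systems may split differently.

```c
#include <string.h>

/* Hypothetical helper: split a "#! interpreter [arg]" line.
 * Returns 1 and fills interp/arg (arg may be "") on success, 0 if the
 * line is not a shebang line or a buffer is too small. */
static int
parse_shebang( const char* line, char* interp, size_t isize,
               char* arg, size_t asize )
{
    size_t n;

    if ( isize == 0 || asize == 0 || strncmp( line, "#!", 2 ) != 0 )
        return 0;
    line += 2;
    line += strspn( line, " \t" );              /* optional blanks */
    n = strcspn( line, " \t\r\n" );             /* interpreter path */
    if ( n == 0 || n >= isize )
        return 0;
    memcpy( interp, line, n );
    interp[n] = '\0';
    line += n;
    line += strspn( line, " \t" );
    n = strcspn( line, "\r\n" );                /* rest is one argument */
    while ( n > 0 && ( line[n - 1] == ' ' || line[n - 1] == '\t' ) )
        --n;                                    /* trim trailing blanks */
    if ( n >= asize )
        return 0;
    memcpy( arg, line, n );
    arg[n] = '\0';
    return 1;
}
```

For a first line such as "#!/usr/bin/env cgilua.cgi" the interpreter is /usr/bin/env and the argument is cgilua.cgi, so the script ends up being run as /usr/bin/env cgilua.cgi followed by the script's file name.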

 

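Script contents

Take the sample script examples/index.lp from the cgilua source tree (Z:\cgilua-master\cgilua-master\examples\index.lp).

Its first line means: start the env program, which runs cgilua.cgi to process the script file.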

#!/usr/bin/env cgilua.cgi

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
    <title>Welcome to Kepler!</title>
    <link rel="stylesheet" href="css/doc.css" type="text/css"/>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
</head>

<body>

 

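As the man page excerpt below shows, env runs a program in a modified environment. Here it is used purely to launch the command (cgilua.cgi), which it looks up through PATH; no environment variables are set.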


NAME
       env - run a program in a modified environment

SYNOPSIS
       env [OPTION]... [-] [NAME=VALUE]... [COMMAND [ARG]...]

DESCRIPTION
       Set each NAME to VALUE in the environment and run COMMAND.

       Mandatory arguments to long options are mandatory for short options too.

 

cgilua.cgi

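The cgilua.cgi found on PATH is the shell wrapper in /usr/local/bin; the file that does the real work is /usr/local/lib/luarocks/rocks/cgilua/5.1.4-2/bin/cgilua.cgi. Both files are listed earlier, after the thttpd configuration.

The wrapper forwards its arguments with "$@", so the script path reaches the Lua launcher as arg_filename. The launcher depends on the wsapi.common, wsapi.cgi, and wsapi.sapi modules.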


 

 

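1. The wsapi.cgi module is the entry point. It gives access to the environment variables (through os.getenv), which end up in the wsapi_env table.

It also passes in io.stdin, io.stdout, and io.stderr. As described earlier, for a CGI process these are all the connection fd that thttpd handed over.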

local os = require"os"
local io = require"io"
local common = require"wsapi.common"

common.setmode()

local _M = {}

-- Runs an WSAPI application for this CGI request
function _M.run(app_run)
   common.run(app_run, { input = io.stdin, output = io.stdout,
     error = io.stderr, env = os.getenv })
end

return _M
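
The same contract holds for a CGI program in any language: request data arrives in environment variables and on stdin, and the response goes to stdout. A minimal C sketch of the output side follows. env_or and write_cgi_response are hypothetical names, and the plain-text response body is only an illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: getenv() with a fallback for unset variables. */
static const char*
env_or( const char* name, const char* fallback )
{
    const char* v = getenv( name );
    return v != (const char*) 0 ? v : fallback;
}

/* Write a minimal CGI response to `out` (stdout in a real CGI program)
 * that echoes two of the variables set by the server. */
static int
write_cgi_response( FILE* out )
{
    return fprintf( out,
        "Content-Type: text/plain\r\n"
        "\r\n"
        "method=%s\nscript=%s\n",
        env_or( "REQUEST_METHOD", "(unset)" ),
        env_or( "SCRIPT_NAME", "(unset)" ) );
}
```

A real CGI program would call write_cgi_response( stdout ) from main(). Taking the stream as a parameter mirrors what wsapi.cgi does when it passes io.stdout in a table instead of writing to it directly.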

 

2. The wsapi.sapi module implements the logic that starts CGILua:


local response = require "wsapi.response"

local _M = {}

function _M.run(wsapi_env)
  _G.CGILUA_APPS = _G.CGILUA_APPS or wsapi_env.DOCUMENT_ROOT .. "/cgilua"
  _G.CGILUA_CONF = _G.CGILUA_CONF or wsapi_env.DOCUMENT_ROOT .. "/cgilua"
  _G.CGILUA_TMP = _G.CGILUA_TMP or os.getenv("TMP") or os.getenv("TEMP") or "/tmp"
  _G.CGILUA_ISDIRECT = true

  local res = response.new()

  _G.SAPI = {
    Info =  {
      _COPYRIGHT = "Copyright (C) 2007 Kepler Project",
      _DESCRIPTION = "WSAPI SAPI implementation",
      _VERSION = "WSAPI SAPI 1.0",
      ispersistent = false,
    },
    Request = {
      servervariable = function (name) return wsapi_env[name] end,
      getpostdata = function (n) return wsapi_env.input:read(n) end
    },
    Response = {
      contenttype = function (header)
        res:content_type(header)
      end,
      errorlog = function (msg, errlevel)
        wsapi_env.error:write (msg)
      end,
      header = function (header, value)
        if res.headers[header] then
          if type(res.headers[header]) == "table" then
            table.insert(res.headers[header], value)
          else
            res.headers[header] = { res.headers[header], value }
          end
        else
          res.headers[header] = value
        end
      end,
      redirect = function (url)
        res.status = 302
        res.headers["Location"] = url
      end,
      write = function (...)
        res:write({...})
      end,
    },
  }
  local cgilua = require "cgilua"
  cgilua.main()
  return res:finish()
end

return _M

 

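At this point the call flow from thttpd to CGILua is clear.

Admittedly, CGI handles requests by launching child processes: every request starts its own CGI process, which exits when it is done.

This is inefficient. Static resources such as plain HTML, CSS, and images should not go through a CGI program at all.

Ways to address this:

1. Handle CGI scripts inside the thttpd process itself. (Is this the model OpenResty uses?)

2. Use FastCGI instead. (To be investigated next.)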
