ZMQ Source Code Analysis: Encoders and Decoders

Posted 2022-12-06 子曰帅


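ZMQ's encoders and decoders work together with stream_engine to send and receive data over the network. ZMTP 3.0 uses v2_decoder and v2_encoder for this, and this article analyzes only that version.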

Decoder

In zmq, the v1 and v2 decoders both inherit from decoder_base_t, while raw_decoder inherits directly from i_decoder:

    template <typename T> class decoder_base_t : public i_decoder
    {
    public:
        inline decoder_base_t (size_t bufsize_) :
            next (NULL),
            read_pos (NULL),
            to_read (0),
            bufsize (bufsize_)
        {
            buf = (unsigned char*) malloc (bufsize_);
            alloc_assert (buf);
        }

        //  The destructor doesn't have to be virtual. It is made virtual
        //  just to keep ICC and code checking tools from complaining.
        inline virtual ~decoder_base_t ()
        {
            free (buf);
        }

        //  Returns a buffer to be filled with binary data.
        inline void get_buffer (unsigned char **data_, size_t *size_)
        {
            //  If we are expected to read large message, we'll opt for zero-
            //  copy, i.e. we'll ask caller to fill the data directly to the
            //  message. Note that subsequent read(s) are non-blocking, thus
            //  each single read reads at most SO_RCVBUF bytes at once not
            //  depending on how large is the chunk returned from here.
            //  As a consequence, large messages being received won't block
            //  other engines running in the same I/O thread for excessive
            //  amounts of time.
            if (to_read >= bufsize) {
                *data_ = read_pos;
                *size_ = to_read;
                return;
            }

            *data_ = buf;
            *size_ = bufsize;
        }

        //  Processes the data in the buffer previously allocated using
        //  get_buffer function. size_ argument specifies number of bytes
        //  actually filled into the buffer. Function returns 1 when the
        //  whole message was decoded or 0 when more data is required.
        //  On error, -1 is returned and errno set accordingly.
        //  Number of bytes processed is returned in bytes_used_.
        inline int decode (const unsigned char *data_, size_t size_,
                           size_t &bytes_used_)
        {
            bytes_used_ = 0;

            //  In case of zero-copy simply adjust the pointers, no copying
            //  is required. Also, run the state machine in case all the data
            //  were processed.
            if (data_ == read_pos) {
                zmq_assert (size_ <= to_read);
                read_pos += size_;
                to_read -= size_;
                bytes_used_ = size_;

                while (!to_read) {
                    const int rc = (static_cast <T*> (this)->*next) ();
                    if (rc != 0)
                        return rc;
                }
                return 0;
            }

            while (bytes_used_ < size_) {
                //  Copy the data from buffer to the message.
                const size_t to_copy = std::min (to_read, size_ - bytes_used_);
                memcpy (read_pos, data_ + bytes_used_, to_copy);
                read_pos += to_copy;
                to_read -= to_copy;
                bytes_used_ += to_copy;
                //  Try to get more space in the message to fill in.
                //  If none is available, return.
                while (to_read == 0) {
                    const int rc = (static_cast <T*> (this)->*next) ();
                    if (rc != 0)
                        return rc;
                }
            }

            return 0;
        }

    protected:

        //  Prototype of state machine action. Action should return false if
        //  it is unable to push the data to the system.
        typedef int (T::*step_t) ();

        //  This function should be called from derived class to read data
        //  from the buffer and schedule next state machine action.
        inline void next_step (void *read_pos_, size_t to_read_, step_t next_)
        {
            read_pos = (unsigned char*) read_pos_;
            to_read = to_read_;
            next = next_;
        }

    private:

        //  Next step. If set to NULL, it means that associated data stream
        //  is dead. Note that there can be still data in the process in such
        //  case.
        step_t next;

        //  Where to store the read data.
        unsigned char *read_pos;

        //  How much data to read before taking next step.
        size_t to_read;

        //  The buffer for data to decode.
        size_t bufsize;
        unsigned char *buf;

        decoder_base_t (const decoder_base_t&);
        const decoder_base_t &operator = (const decoder_base_t&);
    };

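The decoder's next function pointer drives a state machine. Each state-machine step calls next_step to reset read_pos and to_read, which say where the next piece of data should be stored and how many bytes are needed. get_buffer returns a buffer into which data can be read, together with its size. For a small read (to_read < bufsize) it hands out the decoder's own buffer buf, whose size is bufsize. For a large read (to_read >= bufsize) it hands out read_pos itself, as set by the previous step, so the data is received directly into its destination and one copy is avoided. decode handles the same two cases: if the internal buffer was not used (data_ == read_pos), it only advances the pointers; otherwise it copies the data from the buffer to read_pos. Whenever to_read reaches 0, all data for the current state has been processed, so next is called to move to the next state and reset read_pos and to_read.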
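The buffer choice made by get_buffer can be isolated into a few lines. The following is only a minimal sketch, not libzmq code: recv_state_t, recv_target_t and choose_buffer are hypothetical names introduced for illustration, and the fields merely mirror buf, read_pos and to_read from decoder_base_t.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the decoder's receive-side state.
struct recv_state_t
{
    std::vector<unsigned char> buf;  // small internal staging buffer
    unsigned char *read_pos;         // where the current step wants its bytes
    std::size_t to_read;             // how many bytes the current step wants
};

// Hypothetical result type: where the next socket read should land.
struct recv_target_t
{
    unsigned char *data;
    std::size_t size;
    bool zero_copy;
};

inline recv_target_t choose_buffer (recv_state_t &s)
{
    assert (s.read_pos != NULL);

    // Large pending read: receive straight into the destination (for
    // example the message body), so no later memcpy is needed.
    if (s.to_read >= s.buf.size ())
        return recv_target_t {s.read_pos, s.to_read, true};

    // Small pending read: stage it in the internal buffer; the decode
    // step copies the bytes to read_pos afterwards.
    return recv_target_t {s.buf.data (), s.buf.size (), false};
}
```

With a 16-byte staging buffer, a pending 64-byte body would be received in place, while a pending 1-byte flags read would go through the staging buffer.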
Now let's look at the implementation of v2_decoder_t:

    //  Decoder for ZMTP/2.x framing protocol. Converts data stream into messages.
    class v2_decoder_t : public decoder_base_t <v2_decoder_t>
    {
    public:

        v2_decoder_t (size_t bufsize_, int64_t maxmsgsize_);
        virtual ~v2_decoder_t ();

        //  i_decoder interface.
        virtual msg_t *msg () { return &in_progress; }

    private:

        int flags_ready ();
        int one_byte_size_ready ();
        int eight_byte_size_ready ();
        int message_ready ();

        unsigned char tmpbuf [8];
        unsigned char msg_flags;
        msg_t in_progress;

        const int64_t maxmsgsize;

        v2_decoder_t (const v2_decoder_t&);
        void operator = (const v2_decoder_t&);
    };

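v2_decoder_t has four state-machine methods, one per state, plus an 8-byte scratch buffer (tmpbuf). in_progress is the message the decoder is currently working on; every msg the decoder produces is stored there. maxmsgsize is the upper limit on message length. The code below shows how the four states transition: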

zmq::v2_decoder_t::v2_decoder_t (size_t bufsize_, int64_t maxmsgsize_) :
    decoder_base_t <v2_decoder_t> (bufsize_),
    msg_flags (0),
    maxmsgsize (maxmsgsize_)
{
    int rc = in_progress.init ();
    errno_assert (rc == 0);

    //  At the beginning, read one byte and go to flags_ready state.
    next_step (tmpbuf, 1, &v2_decoder_t::flags_ready);
}

zmq::v2_decoder_t::~v2_decoder_t ()
{
    int rc = in_progress.close ();
    errno_assert (rc == 0);
}

int zmq::v2_decoder_t::flags_ready ()
{
    msg_flags = 0;
    if (tmpbuf [0] & v2_protocol_t::more_flag)
        msg_flags |= msg_t::more;
    if (tmpbuf [0] & v2_protocol_t::command_flag)
        msg_flags |= msg_t::command;

    //  The payload length is either one or eight bytes,
    //  depending on whether the 'large' bit is set.
    if (tmpbuf [0] & v2_protocol_t::large_flag)
        next_step (tmpbuf, 8, &v2_decoder_t::eight_byte_size_ready);
    else
        next_step (tmpbuf, 1, &v2_decoder_t::one_byte_size_ready);

    return 0;
}

int zmq::v2_decoder_t::one_byte_size_ready ()
{
    //  Message size must not exceed the maximum allowed size.
    if (maxmsgsize >= 0)
        if (unlikely (tmpbuf [0] > static_cast <uint64_t> (maxmsgsize))) {
            errno = EMSGSIZE;
            return -1;
        }

    //  in_progress is initialised at this point so in theory we should
    //  close it before calling zmq_msg_init_size, however, it's a 0-byte
    //  message and thus we can treat it as uninitialised...
    int rc = in_progress.init_size (tmpbuf [0]);
    if (unlikely (rc)) {
        errno_assert (errno == ENOMEM);
        rc = in_progress.init ();
        errno_assert (rc == 0);
        errno = ENOMEM;
        return -1;
    }

    in_progress.set_flags (msg_flags);
    next_step (in_progress.data (), in_progress.size (),
        &v2_decoder_t::message_ready);

    return 0;
}

int zmq::v2_decoder_t::eight_byte_size_ready ()
{
    //  The payload size is encoded as 64-bit unsigned integer.
    //  The most significant byte comes first.
    const uint64_t msg_size = get_uint64 (tmpbuf);

    //  Message size must not exceed the maximum allowed size.
    if (maxmsgsize >= 0)
        if (unlikely (msg_size > static_cast <uint64_t> (maxmsgsize))) {
            errno = EMSGSIZE;
            return -1;
        }

    //  Message size must fit into size_t data type.
    if (unlikely (msg_size != static_cast <size_t> (msg_size))) {
        errno = EMSGSIZE;
        return -1;
    }

    //  in_progress is initialised at this point so in theory we should
    //  close it before calling init_size, however, it's a 0-byte
    //  message and thus we can treat it as uninitialised.
    int rc = in_progress.init_size (static_cast <size_t> (msg_size));
    if (unlikely (rc)) {
        errno_assert (errno == ENOMEM);
        rc = in_progress.init ();
        errno_assert (rc == 0);
        errno = ENOMEM;
        return -1;
    }

    in_progress.set_flags (msg_flags);
    next_step (in_progress.data (), in_progress.size (),
        &v2_decoder_t::message_ready);

    return 0;
}

int zmq::v2_decoder_t::message_ready ()
{
    //  Message is completely read. Signal this to the caller
    //  and prepare to decode next message.
    next_step (tmpbuf, 1, &v2_decoder_t::flags_ready);
    return 1;
}

The constructor calls

next_step (tmpbuf, 1, &v2_decoder_t::flags_ready)

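which means: read one byte into tmpbuf next, and make flags_ready the next state-machine step. flags_ready checks whether the frame is a long message. If it is, the next eight bytes hold the message length; if not, the next single byte holds it. This is the frame format defined by ZMTP.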

    if (tmpbuf [0] & v2_protocol_t::large_flag)
        next_step (tmpbuf, 8, &v2_decoder_t::eight_byte_size_ready);
    else
        next_step (tmpbuf, 1, &v2_decoder_t::one_byte_size_ready);
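To make the frame layout concrete, here is a small sketch that parses a complete frame header in one go instead of through the state machine. It is illustrative only: frame_header_t and parse_header are hypothetical names, and the flag bit values (1, 2, 4) are my assumption about what v2_protocol_t defines, based on ZMTP framing; they are not shown in the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed flag bits of the first frame byte (bit 0 = more,
// bit 1 = long size, bit 2 = command).
const unsigned char more_flag = 1;
const unsigned char large_flag = 2;
const unsigned char command_flag = 4;

struct frame_header_t
{
    bool more;
    bool command;
    std::uint64_t body_size;
    std::size_t header_size;  // 2 for short frames, 9 for long frames
};

// Returns false if the buffer does not yet contain the whole header.
inline bool parse_header (const unsigned char *data_, std::size_t size_,
                          frame_header_t &out_)
{
    assert (data_ != NULL);
    if (size_ < 2)
        return false;

    const unsigned char flags = data_ [0];
    out_.more = (flags & more_flag) != 0;
    out_.command = (flags & command_flag) != 0;

    if (flags & large_flag) {
        if (size_ < 9)
            return false;
        // 64-bit length, most significant byte first.
        std::uint64_t value = 0;
        for (int i = 1; i <= 8; ++i)
            value = (value << 8) | data_ [i];
        out_.body_size = value;
        out_.header_size = 9;
    }
    else {
        out_.body_size = data_ [1];
        out_.header_size = 2;
    }
    return true;
}
```

The real decoder cannot assume the header arrives in one piece, which is why it splits the same logic across flags_ready and the two size states.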

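Taking a long message as the example, the next 8 bytes, which hold the length, are read into tmpbuf, after which the decoder enters the eight_byte_size_ready state. At that point the message length is known, so in_progress is initialized with that size, and the next step is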

next_step (in_progress.data (), in_progress.size (),&v2_decoder_t::message_ready)

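which means: read as many bytes as the length just obtained into in_progress, then go to message_ready. When message_ready is called, one complete msg has been received. message_ready resets the state machine to its initial state so the next msg can be read, and it returns 1 to signal that a complete message is available; the other states return 0 on success (or -1 on error).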
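The whole cycle can be reproduced in a toy decoder that accepts input in arbitrary chunks. This is a simplified sketch in the spirit of decoder_base_t and v2_decoder_t, not the libzmq implementation: toy_decoder_t and feed are hypothetical, the two size states are merged into one, the long-size bit is assumed to be 0x02, and there is no maxmsgsize check or zero-copy path.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

class toy_decoder_t
{
  public:
    toy_decoder_t () : size_len (1)
    {
        // Start by reading the one-byte flags field.
        next_step (tmpbuf, 1, &toy_decoder_t::flags_ready);
    }

    // Feed a chunk of bytes; completed message bodies are appended to out_.
    void feed (const unsigned char *data_, std::size_t size_,
               std::vector<std::string> &out_)
    {
        assert (data_ != NULL || size_ == 0);
        std::size_t used = 0;
        while (used < size_) {
            const std::size_t n = std::min (to_read, size_ - used);
            std::memcpy (read_pos, data_ + used, n);
            read_pos += n;
            to_read -= n;
            used += n;
            // A step may ask for zero bytes (empty body), hence the loop.
            while (to_read == 0)
                if ((this->*next) ())
                    out_.push_back (std::string (body.begin (), body.end ()));
        }
    }

  private:
    typedef bool (toy_decoder_t::*step_t) ();

    void next_step (void *pos_, std::size_t n_, step_t next_)
    {
        read_pos = static_cast<unsigned char *> (pos_);
        to_read = n_;
        next = next_;
    }

    bool flags_ready ()
    {
        size_len = (tmpbuf [0] & 0x02) ? 8 : 1;  // assumed long-size bit
        next_step (tmpbuf, size_len, &toy_decoder_t::size_ready);
        return false;
    }

    bool size_ready ()
    {
        std::uint64_t size = 0;
        for (std::size_t i = 0; i < size_len; ++i)
            size = (size << 8) | tmpbuf [i];
        body.resize (static_cast<std::size_t> (size));
        next_step (body.data (), body.size (), &toy_decoder_t::message_ready);
        return false;
    }

    bool message_ready ()
    {
        // Message complete; go back to the initial state.
        next_step (tmpbuf, 1, &toy_decoder_t::flags_ready);
        return true;
    }

    unsigned char tmpbuf [8];
    std::size_t size_len;
    std::vector<unsigned char> body;
    step_t next;
    unsigned char *read_pos;
    std::size_t to_read;
};
```

Because every step only states where the next bytes go and how many are needed, a frame split across two calls to feed is handled without any special casing.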
v2_decoder_t is mainly used in the in_event method of stream_engine:

void zmq::stream_engine_t::in_event ()
{
    zmq_assert (!io_error);

    //  If still handshaking, receive and process the greeting message.
    if (unlikely (handshaking))
        if (!handshake ())
            return;

    zmq_assert (decoder);

    //  If there has been an I/O error, stop polling.
    if (input_stopped) {
        rm_fd (handle);
        io_error = true;
        return;
    }

    //  If there's no data to process in the buffer...
    if (!insize) {

        //  Retrieve the buffer and read as much data as possible.
        //  Note that buffer can be arbitrarily large. However, we assume
        //  the underlying TCP layer has fixed buffer size and thus the
        //  number of bytes read will be always limited.
        size_t bufsize = 0;
        decoder->get_buffer (&inpos, &bufsize);

        const int rc = tcp_read (s, inpos, bufsize);
        if (rc == 0) {
            error (connection_error);
            return;
        }
        if (rc == -1) {
            if (errno != EAGAIN)
                error (connection_error);
            return;
        }

        //  Adjust input size
        insize = static_cast <size_t> (rc);
    }

    int rc = 0;
    size_t processed = 0;

    while (insize > 0) {
        rc = decoder->decode (inpos, insize, processed);
        zmq_assert (processed <= insize);
        inpos += processed;
        insize -= processed;
        if (rc == 0 || rc == -1)
            break;
        rc = (this->*process_msg) (decoder->msg ());
        if (rc == -1)
            break;
    }

    //  Tear down the connection if we have failed to decode input data
    //  or the session has rejected the message.
    if (rc == -1) {
        if (errno != EAGAIN) {
            error (protocol_error);
            return;
        }
        input_stopped = true;
        reset_pollin (handle);
    }

    session->flush ();
}

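If insize is 0, in_event calls get_buffer, which points inpos either at v2_decoder_t's own buffer or directly at the data of in_progress inside v2_decoder_t (when the pending read is at least as large as the decoder's buffer, 8192 bytes by default), and then calls tcp_read to read data. The while loop processes the data just read: whenever a complete message has been decoded, it is handed to process_msg; if the remaining data does not make up a full msg, the loop exits and waits for the next in_event call. If decoding or process_msg fails, the connection is torn down with protocol_error, unless errno is EAGAIN, in which case the engine only stops polling for input.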

Encoder

In zmq, the v1 and v2 encoders both inherit from encoder_base_t, while raw_encoder inherits directly from i_encoder:

    template <typename T> class encoder_base_t : public i_encoder
    {
    public:

        inline encoder_base_t (size_t bufsize_) :
            bufsize (bufsize_),
            in_progress (NULL)
        {
            buf = (unsigned char*) malloc (bufsize_);
            alloc_assert (buf);
        }

        //  The destructor doesn't have to be virtual. It is made virtual
        //  just to keep ICC and code checking tools from complaining.
        inline virtual ~encoder_base_t ()
        {
            free (buf);
        }

        //  The function returns a batch of binary data. The data
        //  are filled to a supplied buffer. If no buffer is supplied (data_
        //  points to NULL) encoder object will provide buffer of its own.
        inline size_t encode (unsigned char **data_, size_t size_)
        {
            unsigned char *buffer = !*data_ ? buf : *data_;
            size_t buffersize = !*data_ ? bufsize : size_;

            if (in_progress == NULL)
                return 0;

            size_t pos = 0;
            while (pos < buffersize) {

                //  If there are no more data to return, run the state machine.
                //  If there are still no data, return what we already have
                //  in the buffer.
                if (!to_write) {
                    if (new_msg_flag) {
                        int rc = in_progress->close ();
                        errno_assert (rc == 0);
                        rc = in_progress->init ();
                        errno_assert (rc == 0);
                        in_progress = NULL;
                        break;
                    }
                    (static_cast <T*> (this)->*next) ();
                }

                //  If there are no data in the buffer yet and we are able to
                //  fill whole buffer in a single go, let's use zero-copy.
                //  There's no disadvantage to it as we cannot stuck multiple
                //  messages into the buffer anyway. Note that subsequent
                //  write(s) are non-blocking, thus each single write writes
                //  at most SO_SNDBUF bytes at once not depending on how large
                //  is the chunk returned from here.
                //  As a consequence, large messages being sent won't block
                //  other engines running in the same I/O thread for excessive
                //  amounts of time.
                if (!pos && !*data_ && to_write >= buffersize) {
                    *data_ = write_pos;
                    pos = to_write;
                    write_pos = NULL;
                    to_write = 0;
                    return pos;
                }

                //  Copy data to the buffer. If the buffer is full, return.
                size_t to_copy = std::min (to_write, buffersize - pos);
                memcpy (buffer + pos, write_pos, to_copy);
                pos += to_copy;
                write_pos += to_copy;
                to_write -= to_copy;
            }

            *data_ = buffer;
            return pos;
        }

        void load_msg (msg_t *msg_)
        {
            zmq_assert (in_progress == NULL);
            in_progress = msg_;
            (static_cast <T*> (this)->*next) ();
        }

    protected:

        //  Prototype of state machine action.
        typedef void (T::*step_t) ();

        //  This function should be called from derived class to write the data
        //  to the buffer and schedule next state machine action.
        inline void next_step (void *write_pos_, size_t to_write_,
            step_t next_, bool new_msg_flag_)
        {
            write_pos = (unsigned char*) write_pos_;
            to_write = to_write_;
            next = next_;
            new_msg_flag = new_msg_flag_;
        }

    private:

        //  Where to get the data to write from.
        unsigned char *write_pos;

        //  How much data to write before next step should be executed.
        size_t to_write;

        //  Next step. If set to NULL, it means that associated data stream
        //  is dead.
        step_t next;

        bool new_msg_flag;

        //  The buffer for encoded data.
        size_t bufsize;
        unsigned char *buf;

        encoder_base_t (const encoder_base_t&);
        void operator = (const encoder_base_t&);

    protected:

        msg_t *in_progress;
    };

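encoder_base_t is slightly more complex than decoder_base_t, but it is also built on a state machine. Its most important method is encode. Before analyzing encode, let's look at how encoder_base_t is used; its main user is out_event in stream_engine: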

void zmq::stream_engine_t::out_event ()
{
    zmq_assert (!io_error);

    //  If write buffer is empty, try to read new data from the encoder.
    if (!outsize) {

        //  Even when we stop polling as soon as there is no
        //  data to send, the poller may invoke out_event one
        //  more time due to 'speculative write' optimisation.
        if (unlikely (encoder == NULL)) {
            zmq_assert (handshaking);
            return;
        }

        outpos = NULL;
        outsize = encoder->encode (&outpos, 0);

        while (outsize < out_batch_size) {
            if ((this->*next_msg) (&tx_msg) == -1)
                break;
            encoder->load_msg (&tx_msg);
            unsigned char *bufptr = outpos + outsize;
            size_t n = encoder->encode (&bufptr, out_batch_size - outsize);
            zmq_assert (n > 0);
            if (outpos == NULL)
                outpos = bufptr;
            outsize += n;
        }

        //  If there is no data to send, stop polling for output.
        if (outsize == 0) {
            output_stopped = true;
            reset_pollout (handle);
            return;
        }
    }

    //  If there are any data to write in write buffer, write as much as
    //  possible to the socket. Note that amount of data to write can be
    //  arbitrarily large. However, we assume that underlying TCP layer has
    //  limited transmission buffer and thus the actual number of bytes
    //  written should be reasonably modest.
    const int nbytes = tcp_write (s, outpos, outsize);

    //  IO error has occurred. We stop waiting for output events.
    //  The engine is not terminated until we detect input error;
    //  this is necessary to prevent losing incoming messages.
    if (nbytes == -1) {
        reset_pollout (handle);
        return;
    }

    outpos += nbytes;
    outsize -= nbytes;

    //  If we are still handshaking and there are no data
    //  to send, stop polling for output.
    if (unlikely (handshaking))
        if (outsize == 0)
            reset_pollout (handle);
}

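Each call first checks whether outsize is 0; if it is, all previous data has been sent. Inside the if statement, it first calls

        outpos = NULL;
        outsize = encoder->encode (&outpos, 0);

which points outpos at the encoder's buffer. It then keeps fetching msgs to send via next_msg, calls the encoder's load_msg to hand each new msg to the encoder, and finally calls

 size_t n = encoder->encode (&bufptr, out_batch_size - outsize);

to write the msg just loaded into the buffer. encode does not necessarily handle a whole message; if there is not enough room, it encodes only part of it. Once the buffer is full or there is no new msg to write, tcp_write is called. This design lets a single tcp_write send several msgs, which reduces system calls and improves efficiency. If a msg was only partially encoded, then the next time the if statement is entered,

outsize = encoder->encode (&outpos, 0);

continues encoding the remaining data.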
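The batching idea, including the carry-over of a partially written msg into the next batch, can be sketched independently of the encoder. The code below is a deliberately simplified model and not how stream_engine is implemented: batcher_t and next_batch are hypothetical, frames are assumed to be already encoded, and everything is copied (the real encoder can skip the copy for large messages).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

typedef std::vector<unsigned char> bytes_t;

// Hypothetical batcher: pending holds already-encoded frames and offset
// records how much of the front frame has been copied so far.
struct batcher_t
{
    std::deque<bytes_t> pending;
    std::size_t offset = 0;

    // Fill one batch of at most batch_size_ bytes. A frame that does not
    // fit is split; its remainder goes into the following batch.
    bytes_t next_batch (std::size_t batch_size_)
    {
        assert (batch_size_ > 0);
        bytes_t batch;
        while (batch.size () < batch_size_ && !pending.empty ()) {
            const bytes_t &front = pending.front ();
            const std::size_t n = std::min (front.size () - offset,
                                            batch_size_ - batch.size ());
            batch.insert (batch.end (), front.begin () + offset,
                          front.begin () + offset + n);
            offset += n;
            if (offset == front.size ()) {
                // Front frame fully consumed; move on to the next one.
                pending.pop_front ();
                offset = 0;
            }
        }
        return batch;
    }
};
```

Each returned batch corresponds to what one tcp_write call would be given, so several small frames share a single system call.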

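Having seen how stream_engine uses the encoder, we can return to encode itself. On each call it points buffer either at the encoder's own buffer or at the pointer passed in, and then writes data into it. It first checks whether to_write is 0; if so, it either releases the finished msg and returns (when new_msg_flag is set) or runs the state machine. There is also a copy-avoidance optimization here: when to_write is at least as large as the buffer, the *data_ passed in is NULL, and pos is 0 (nothing has been placed in the buffer yet, so the data cannot get mixed up), the output pointer is set to point directly at the msg's data. This does not introduce a thread-safety problem either.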
Compared with v2_decoder, the v2_encoder state machine is simpler, with only two states:

zmq::v2_encoder_t::v2_encoder_t (size_t bufsize_) :
    encoder_base_t <v2_encoder_t> (bufsize_)
{
    //  Write 0 bytes to the batch and go to message_ready state.
    next_step (NULL, 0, &v2_encoder_t::message_ready, true);
}

zmq::v2_encoder_t::~v2_encoder_t ()
{
}

void zmq::v2_encoder_t::message_ready ()
{
    //  Encode flags.
    unsigned char &protocol_flags = tmpbuf [0];
    protocol_flags = 0;
    if (in_progress->flags () & msg_t::more)
        protocol_flags |= v2_protocol_t::more_flag;
    if (in_progress->size () > 255)
        protocol_flags |= v2_protocol_t::large_flag;
    if (in_progress->flags () & msg_t::command)
        protocol_flags |= v2_protocol_t::command_flag;

    //  Encode the message length. For messages less than 256 bytes,
    //  the length is encoded as 8-bit unsigned integer. For larger
    //  messages, 64-bit unsigned integer in network byte order is used.
    const size_t size = in_progress->size ();
    if (unlikely (size > 255)) {
        put_uint64 (tmpbuf + 1, size);
        next_step (tmpbuf, 9, &v2_encoder_t::size_ready, false);
    }
    else {
        tmpbuf [1] = static_cast <uint8_t> (size);
        next_step (tmpbuf, 2, &v2_encoder_t::size_ready, false);
    }
}

void zmq::v2_encoder_t::size_ready ()
{
    //  Write message body into the buffer.
    next_step (in_progress->data (), in_progress->size (),
        &v2_encoder_t::message_ready, true);
}
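Taken together, the two states emit a header (flags plus length) followed by the body. The function below is a rough, non-incremental equivalent, meant only to show the resulting bytes: encode_frame is a hypothetical helper, and the flag bit values 0x01, 0x02 and 0x04 are assumed rather than taken from the code above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: builds flags + size + body for a single frame.
inline std::vector<unsigned char> encode_frame (const unsigned char *body_,
                                                std::size_t size_,
                                                bool more_, bool command_)
{
    assert (body_ != NULL || size_ == 0);

    unsigned char flags = 0;
    if (more_)
        flags |= 0x01;  // assumed 'more' bit
    if (size_ > 255)
        flags |= 0x02;  // assumed 'large' bit
    if (command_)
        flags |= 0x04;  // assumed 'command' bit

    std::vector<unsigned char> out;
    out.push_back (flags);
    if (size_ > 255) {
        // 64-bit length in network byte order (most significant first).
        const std::uint64_t n = size_;
        for (int shift = 56; shift >= 0; shift -= 8)
            out.push_back (static_cast<unsigned char> ((n >> shift) & 0xff));
    }
    else
        out.push_back (static_cast<unsigned char> (size_));

    out.insert (out.end (), body_, body_ + size_);
    return out;
}
```

A 2-byte body thus yields a 4-byte frame, while a 300-byte body yields 309 bytes because the length field grows from one byte to eight.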

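That is how the v2 encoder and decoder work.
Besides the v1 and v2 codecs, zmq also provides raw_decoder and raw_encoder. They are fairly simple, so they are not analyzed here.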
