How to boost::serialize into a sqlite::blob?

【Posted】: 2013-12-05 20:21:23

【Question】:

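I am working on a scientific project that requires several program capabilities. After looking around for available tools, I decided to work with the Boost libraries, which give me features that the C++ standard library does not provide, such as date/time management.

My project is a set of command-line tools that process a pile of data from an old, homemade, plain-text-file-based database: import, conversion, analysis, reporting.

I have now reached the point where I need persistence, so I added boost::serialization, which I found really useful. I am able to store and restore "medium" datasets (not so big, but not so small either); they are roughly (7000,48,15,10)-datasets.

I also use the SQLite C API to store and manage command defaults, output settings and variable meta-information (units, scale, limits).

Something crossed my mind: serializing into a blob field instead of into separate files. There may be drawbacks I have not seen yet (there always are), but I think it could be an elegant solution that suits my needs.

I am able to text-serialize into a std::string, so I could do it that way without difficulty, because the result only contains ordinary characters. But I would like to binary-serialize into a blob.

How should I proceed in order to use standard streams when filling in my INSERT query?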

【Answer 1】:

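Ha. I had never used the sqlite3 C API before, and I had never written an output streambuf implementation. But seeing how I will probably be using sqlite3 in a C++ codebase in the future, I thought I would spend some time with

the sqlite3 function list http://www.sqlite.org/c3ref/funclist.html

cppreference http://en.cppreference.com/w/cpp/io/basic_streambuf

It turns out you can open a blob field for incremental I/O (sqlite3_blob_open). However, although you can read and write the BLOB, you cannot change its size (except through a separate UPDATE statement).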

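So the steps of my demo became:

    1. insert a record into a table, binding a "zero-blob" of a certain (fixed) size
    2. open the blob field in the newly inserted record
    3. wrap the blob handle in a custom blob_buf object that derives from std::basic_streambuf<> and can be used with std::ostream to write to that blob
    4. serialize some data into the ostream
    5. flush
    6. destruct/cleanup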

It works :)
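Step 3 is the least familiar part, so here is a minimal, standard-library-only sketch of the same idea. It is not the blob_buf from this answer: `fixed_region_buf` and `write_fits` are hypothetical names, and a `std::vector<char>` of fixed size stands in for the zero-blob so the example runs without SQLite. It shows how `overflow()` and `sync()` drain a small put area into a region that cannot grow, and how running out of room surfaces as a failed stream.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>
#include <string>
#include <vector>

// Hypothetical stand-in for a blob-backed streambuf: writes go into a
// fixed-size byte region that can be overwritten but never grown.
class fixed_region_buf : public std::streambuf {
    std::vector<char>& region_;  // not owned; plays the role of the zero-blob
    std::size_t offset_ = 0;     // next write position inside the region
    char chunk_[8];              // small put area, drained chunk by chunk

    // Copy the pending put area into the region.
    // Returns false if some bytes did not fit and were dropped.
    bool drain() {
        const std::size_t pending = static_cast<std::size_t>(pptr() - pbase());
        const std::size_t room = region_.size() - offset_;
        const std::size_t n = std::min(pending, room);
        std::copy(pbase(), pbase() + n, region_.begin() + offset_);
        offset_ += n;
        setp(chunk_, chunk_ + sizeof chunk_);  // put area is empty again
        return n == pending;
    }

protected:
    // Called when the put area is full: drain it, then store `c`.
    int_type overflow(int_type c) override {
        if (!drain()) return traits_type::eof();  // signal failure
        if (!traits_type::eq_int_type(c, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(c);
            pbump(1);
        }
        return traits_type::not_eof(c);
    }

    // Called by std::ostream::flush(): 0 on success, -1 on failure.
    int sync() override { return drain() ? 0 : -1; }

public:
    explicit fixed_region_buf(std::vector<char>& region) : region_(region) {
        setp(chunk_, chunk_ + sizeof chunk_);
    }
};

// Write `text` through an ostream into a region of `capacity` bytes.
// `stored` receives the region contents; the result says whether all fit.
bool write_fits(const std::string& text, std::size_t capacity, std::string& stored) {
    std::vector<char> region(capacity, '\0');
    fixed_region_buf buf(region);
    std::ostream out(&buf);
    out << text;
    out.flush();
    stored.assign(region.begin(), region.end());
    return out.good();
}
```

With a 16-byte region, `write_fits("hello world", 16, stored)` succeeds and leaves the text at the start of the region; with a 4-byte region only the first four bytes are kept and the stream ends up in a failed state, which is the same truncation behaviour the blob version has to deal with.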

The code in main:

int main()
{
    sqlite3 *db = NULL;
    int rc = sqlite3_open_v2("test.sqlite3", &db, SQLITE_OPEN_READWRITE, NULL);
    if (rc != SQLITE_OK) {
        std::cerr << "database open failed: " << sqlite3_errmsg(db) << "\n";
        exit(255);
    }

    // 1. insert a record into a table, binding a "zero-blob" of a certain (fixed) size
    sqlite3_int64 inserted = InsertRecord(db);

    {
        // 2. open the blob field in the newly inserted record
        // 3. wrap the blob handle in a custom `blob_buf` object that derives from `std::basic_streambuf<>` and can be used with `std::ostream` to write to that blob
        blob_buf buf(OpenBlobByRowId(db, inserted));
        std::ostream writer(&buf); // this stream now writes to the blob!

        // 4. serialize some data into the `ostream`
        auto payload = CanBeSerialized { "hello world", { 1, 2, 3.4, 1e7, -42.42 } };

        boost::archive::text_oarchive oa(writer);
        oa << payload;

#if 0   // used for testing with larger data
        std::ifstream ifs("test.cpp");
        writer << ifs.rdbuf();
#endif

        // 5. flush
        writer.flush();

        // 6. destruct/cleanup 
    }

    sqlite3_close(db);
    // ==7653== HEAP SUMMARY:
    // ==7653==     in use at exit: 0 bytes in 0 blocks
    // ==7653==   total heap usage: 227 allocs, 227 frees, 123,540 bytes allocated
    // ==7653== 
    // ==7653== All heap blocks were freed -- no leaks are possible
}

You will recognize the steps outlined above.

To test it, assume you create a new sqlite database:

sqlite3 test.sqlite3 <<< "CREATE TABLE DEMO(ID INTEGER PRIMARY KEY AUTOINCREMENT, FILE BLOB);"

Now, once you have run the program, you can query it:

sqlite3 test.sqlite3 <<< "SELECT * FROM DEMO;"
1|22 serialization::archive 10 0 0 11 hello world 5 0 1 2 3.3999999999999999 10000000 -42.420000000000002

If you enable the test code (which writes more data than the blob size allows), you will see the blob being truncated:

contents truncated at 256 bytes
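Since the blob size is fixed when the row is inserted, one possible way to avoid this truncation is to measure the serialized size first and reserve exactly that many bytes. The sketch below is an addition, not part of the original answer: `counting_buf` and `measure` are hypothetical names, and it uses only the standard library. It is a streambuf that discards everything written to it and just counts the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <streambuf>

// Hypothetical helper: a streambuf that stores nothing and only counts
// how many bytes were written through it.
class counting_buf : public std::streambuf {
    std::size_t count_ = 0;

protected:
    // Single characters arrive here because there is no put area.
    int_type overflow(int_type c) override {
        if (!traits_type::eq_int_type(c, traits_type::eof())) ++count_;
        return traits_type::not_eof(c);
    }

    // Bulk writes arrive here; report all of them as consumed.
    std::streamsize xsputn(const char*, std::streamsize n) override {
        count_ += static_cast<std::size_t>(n);
        return n;
    }

public:
    std::size_t count() const { return count_; }
};

// Run `write(std::ostream&)` against a counting stream and return the
// number of bytes it produced.
template <typename Writer>
std::size_t measure(Writer&& write) {
    counting_buf buf;
    std::ostream out(&buf);
    write(out);
    out.flush();
    return buf.count();
}
```

In the program from this answer, the callable passed to `measure` would construct the boost archive on the given stream and serialize the payload; the resulting byte count could then be passed to sqlite3_bind_zeroblob in place of MAX_BLOB_SIZE. The trade-off is that the payload is serialized twice.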

Full program

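#include <sqlite3.h>
#include <cstdlib>   // exit
#include <iterator>  // std::distance
#include <string>
#include <vector>
#include <iostream>
#include <ostream>
#include <streambuf>
#include <fstream>
#include <boost/serialization/vector.hpp>
#include <boost/archive/text_oarchive.hpp>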

// Output-only streambuf that writes into an open sqlite3 blob handle.
// The blob has a fixed size; data beyond that size is truncated.
template<typename CharT, typename TraitsT = std::char_traits<CharT> >
class basic_blob_buf : public std::basic_streambuf<CharT, TraitsT> 
{
    sqlite3_blob* _blob; // owned
    int max_blob_size;

    typedef std::basic_streambuf<CharT, TraitsT> base_type;
    enum { BUFSIZE = 10 }; // Block size - tuning?
    char buf[BUFSIZE+1/*for the overflow character*/];

    size_t cur_offset;
    std::ostream debug;

    // no copying
    basic_blob_buf(basic_blob_buf const&)             = delete;
    basic_blob_buf& operator= (basic_blob_buf const&) = delete;
public:
    basic_blob_buf(sqlite3_blob* blob, int max_size = -1) 
        : _blob(blob), 
        max_blob_size(max_size), 
        buf {0}, 
        cur_offset(0),
        // debug(std::cerr.rdbuf()) // or just use `nullptr` to suppress debug output
        debug(nullptr)
    {
        debug.setf(std::ios::unitbuf);
        if (max_blob_size == -1) {
            // ask sqlite for the size of the blob that was opened
            max_blob_size = sqlite3_blob_bytes(_blob);
            debug << "max_blob_size detected: " << max_blob_size << "\n";
        }
        this->setp(buf, buf + BUFSIZE);
    }

    // Called when the put area is full (or from sync() with eof):
    // writes the buffered bytes to the blob at the current offset.
    int overflow (int c = base_type::traits_type::eof())
    {
        auto putpointer = this->pptr();
        if (c!=base_type::traits_type::eof())
        {
            // add the character - even though pptr might be epptr
            *putpointer++ = c;
        }

        if (cur_offset >= size_t(max_blob_size))
            return base_type::traits_type::eof(); // signal failure

        size_t n = std::distance(this->pbase(), putpointer);
        debug << "Overflow " << n << " bytes at " << cur_offset << "\n";
        if (cur_offset+n > size_t(max_blob_size))
        {
            std::cerr << "contents truncated at " << max_blob_size << " bytes\n";
            n = size_t(max_blob_size) - cur_offset;
        }

        if (SQLITE_OK != sqlite3_blob_write(_blob, this->pbase(), n, cur_offset))
        {
            debug << "sqlite3_blob_write reported an error\n";
            return base_type::traits_type::eof(); // signal failure
        }

        cur_offset += n;

        if (this->pptr() > (this->pbase() + n))
        {
            debug << "pending data has not been written";
            return base_type::traits_type::eof(); // signal failure
        }

        // reset buffer
        this->setp(buf, buf + BUFSIZE);

        return base_type::traits_type::not_eof(c);
    }

    // Flush pending data; std::streambuf expects 0 on success, -1 on failure
    int sync()
    {
        return base_type::traits_type::eof() != overflow() ? 0 : -1;
    }

    ~basic_blob_buf() { 
        sqlite3_blob_close(_blob);
    }
};

typedef basic_blob_buf<char> blob_buf;

struct CanBeSerialized
{
    std::string sometext;
    std::vector<double> a_vector;

    template<class Archive>
    void serialize(Archive & ar, const unsigned int version)
    {
        ar & boost::serialization::make_nvp("sometext", sometext);
        ar & boost::serialization::make_nvp("a_vector", a_vector);
    }
};

#define MAX_BLOB_SIZE 256

sqlite3_int64 InsertRecord(sqlite3* db)
{
    sqlite3_stmt *stmt = NULL;
    int rc = sqlite3_prepare_v2(db, "INSERT INTO DEMO(ID, FILE) VALUES(NULL, ?)", -1, &stmt, NULL);

    if (rc != SQLITE_OK) {
        std::cerr << "prepare failed: " << sqlite3_errmsg(db) << "\n";
        exit(255);
    } else {
        // reserve MAX_BLOB_SIZE zero bytes; the blob cannot grow afterwards
        rc = sqlite3_bind_zeroblob(stmt, 1, MAX_BLOB_SIZE);
        if (rc != SQLITE_OK) {
            std::cerr << "bind_zeroblob failed: " << sqlite3_errmsg(db) << "\n";
            exit(255);
        }
        rc = sqlite3_step(stmt);
        if (rc != SQLITE_DONE)
        {
            std::cerr << "execution failed: " << sqlite3_errmsg(db) << "\n";
            exit(255);
        }
    }
    rc = sqlite3_finalize(stmt);
    if (rc != SQLITE_OK)
    {
        std::cerr << "finalize stmt failed: " << sqlite3_errmsg(db) << "\n";
        exit(255);
    }

    return sqlite3_last_insert_rowid(db);
}

sqlite3_blob* OpenBlobByRowId(sqlite3* db, sqlite3_int64 rowid)
{
    sqlite3_blob* pBlob = NULL;
    int rc = sqlite3_blob_open(db, "main", "DEMO", "FILE", rowid, 1/*rw*/, &pBlob);

    if (rc != SQLITE_OK) {
        std::cerr << "blob_open failed: " << sqlite3_errmsg(db) << "\n";
        exit(255);
    }
    return pBlob;
}

int main()
{
    sqlite3 *db = NULL;
    int rc = sqlite3_open_v2("test.sqlite3", &db, SQLITE_OPEN_READWRITE, NULL);
    if (rc != SQLITE_OK) {
        std::cerr << "database open failed: " << sqlite3_errmsg(db) << "\n";
        exit(255);
    }

    // 1. insert a record into a table, binding a "zero-blob" of a certain (fixed) size
    sqlite3_int64 inserted = InsertRecord(db);

    {
        // 2. open the blob field in the newly inserted record
        // 3. wrap the blob handle in a custom `blob_buf` object that derives from `std::basic_streambuf<>` and can be used with `std::ostream` to write to that blob
        blob_buf buf(OpenBlobByRowId(db, inserted));
        std::ostream writer(&buf); // this stream now writes to the blob!

        // 4. serialize some data into the `ostream`
        auto payload = CanBeSerialized { "hello world", { 1, 2, 3.4, 1e7, -42.42 } };

        boost::archive::text_oarchive oa(writer);
        oa << payload;

#if 0   // used for testing with larger data
        std::ifstream ifs("test.cpp");
        writer << ifs.rdbuf();
#endif

        // 5. flush
        writer.flush();

        // 6. destruct/cleanup 
    }

    sqlite3_close(db);
}

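PS. I have kept the error handling... very crude. You will probably want to introduce a helper function that checks sqlite3 error codes and translates them into exceptions. :)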
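A minimal sketch of what such a helper might look like follows. It is an assumption about one reasonable design, not code from the original answer: `sqlite_error` and `check_rc` are hypothetical names, and the error text is passed in as a plain string so the sketch compiles without sqlite3.h. The default expected value of 0 corresponds to SQLITE_OK.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception type that keeps the raw sqlite3 result code.
class sqlite_error : public std::runtime_error {
    int code_;

public:
    sqlite_error(int code, const std::string& message)
        : std::runtime_error(message), code_(code) {}

    int code() const noexcept { return code_; }
};

// Throw sqlite_error unless `rc` equals `expected`.
// `context` names the failed operation, `detail` is the library's error
// text (in a real program, the result of sqlite3_errmsg(db)).
inline void check_rc(int rc, const std::string& context, const char* detail,
                     int expected = 0 /* SQLITE_OK */) {
    if (rc == expected) return;
    throw sqlite_error(rc, context + " failed (code " + std::to_string(rc) +
                               "): " + (detail ? detail : "unknown error"));
}
```

Each `if (rc != ...) { std::cerr << ...; exit(255); }` block in the program could then shrink to a single call, for example `check_rc(sqlite3_step(stmt), "execution", sqlite3_errmsg(db), SQLITE_DONE);`, with one try/catch in main deciding how to report the failure.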

【Comments】:

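Your reply simply highlights everything I had missed: (i) there is no way to build a classic query string that contains raw binary data (how could I miss that, it is obvious), (ii) a blob has a predefined size (STL containers make you forget that you have to care about where the data actually lives), (iii) there are dedicated functions in the SQLite C API for interacting with blob fields (I did not know where to start, which is closely related to (i)). I have read your references and will implement this over the weekend. Your answer helped me a lot; thank you for replying to my first Stack Overflow post.

I ran the tests this morning and it works like a charm, out of the box. Thanks again; I will now start implementing it in my interface.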
