HDF5Cpp Extending Compound Dataset Hyperslab problem


【Title】HDF5Cpp Extending Compound Dataset Hyperslab problem 【Posted】2020-10-22 15:35:09 【Question】:

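I have a simple program that extends a dataset of a struct made of two integers. Writing the first struct seems to work. However, when I write the second struct to the extended space, it overwrites the first struct's data, and the extended part contains only garbage. Does anyone know where I went wrong? I think it is in the setup of the memory space. Any help is greatly appreciated.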

'''
#include <iostream>
#include "hdf5.h"   // HDF5 C API
#include "H5Cpp.h"  // HDF5 C++ API (H5::H5File, H5::DataSet)

// FILE and DATASET are used below but are not defined in the posted snippet;
// they are assumed to be string macros defined elsewhere in the program.

typedef struct s1_t {
    int    a;
    int    b;
} s1_t;


int main(void)
{
    hid_t           file, space, dset, dcpl, filetype;    /* Handles */
    herr_t          status;

    s1_t s1;

    s1.a = 19;
    s1.b = 67;

    //Create a new file using the default properties.
    file = H5Fcreate(FILE, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    //Create compound datatype
    filetype = H5Tcreate(H5T_COMPOUND, sizeof(s1_t));
    H5Tinsert(filetype, "a", HOFFSET(s1_t, a), H5T_NATIVE_INT);
    H5Tinsert(filetype, "b", HOFFSET(s1_t, b), H5T_NATIVE_INT);

    const hsize_t ndims = 1;
    const hsize_t ncols = 1;

    hsize_t dims[ndims] = { 1 };
    hsize_t max_dims[ndims] = { H5S_UNLIMITED };
    hid_t file_space = H5Screate_simple(ndims, dims, max_dims);
    std::cout << "- Dataspace created" << std::endl;

    hid_t plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_layout(plist, H5D_CHUNKED);
    hsize_t chunk_dims[ndims] = { 1 };
    H5Pset_chunk(plist, ndims, chunk_dims);
    std::cout << "- Property list created" << std::endl;

    //Create the unlimited dataset.
    dset = H5Dcreate(file, DATASET, filetype, file_space, H5P_DEFAULT, plist, H5P_DEFAULT);
    std::cout << "- Dataset 'dset1' created " << dset << std::endl;

    status = H5Dwrite(dset, filetype, H5S_ALL, file_space, H5P_DEFAULT, &s1);

    //read back the data, extend the dataset,
    //and write new data to the extended portions.

    //Open file and get the dataset
    H5::H5File* file2 = new H5::H5File(FILE, H5F_ACC_RDWR, H5P_DEFAULT);

    H5::DataSet* dataset = new H5::DataSet(file2->openDataSet(DATASET));

    //new data to add to the dataset
    s1_t s2;
    s2.a = 98;
    s2.b = 55;

    dims[0] = 2;
    //dims[1] = ncols;
    hid_t mem_space = H5Screate_simple(ndims, dims, NULL);
    std::cout << "- Memory dataspace created" << std::endl;
    H5Dset_extent(dset, dims);
    std::cout << "- Dataset extended" << std::endl;

    file_space = H5Dget_space(dset);
    hsize_t start[2] = { 0, 0 };      // Start of hyperslab
    hsize_t count[2] = { 2, ncols };  // Block count

    H5Sselect_hyperslab(file_space, H5S_SELECT_SET, start, NULL, count, NULL);
    std::cout << "- First hyperslab selected" << std::endl;
    H5Dwrite(dset, filetype, mem_space, file_space, H5P_DEFAULT, &s2);
    std::cout << "- First buffer written" << std::endl;

    return 0;
}
'''


【Answer 1】:

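If you are not bound to a specific library, take a look at HDFql, as it can relieve you of HDF5's low-level details.

Your use case could be solved with this library in C++ as follows (i.e. create an extendable dataset and fill it row by row, without erasing the data already stored):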

#include <cstddef>   // offsetof
#include <cstdio>    // snprintf
#include <cstdlib>   // EXIT_SUCCESS
#include "HDFql.hpp"

typedef struct s1_t
{
    int a;
    int b;
} s1_t;

int main(int argc, char *argv[])
{
    // declare variables
    s1_t s1;
    char script[1024];
    int number;

    // create an HDF5 file named 'example.h5' and use (i.e. open) it
    HDFql::execute("CREATE AND USE FILE example.h5");

    // create a dataset named 'dset' of data type compound composed of two members named 'a' and 'b'. The dataset starts
    // with one row and can grow (i.e. be extended) in an unlimited fashion
    // (offsetof yields a size_t, so it is cast to int to match the %d conversion)
    snprintf(script, sizeof(script), "CREATE DATASET dset AS COMPOUND(a AS INT OFFSET %d, b AS INT OFFSET %d)(UNLIMITED)", static_cast<int>(offsetof(s1_t, a)), static_cast<int>(offsetof(s1_t, b)));
    HDFql::execute(script);

    // register variable 's1' for subsequent use (by HDFql)
    number = HDFql::variableRegister(&s1);

    // populate variable 's1' with certain values
    s1.a = 19;
    s1.b = 67;

    // insert (i.e. write) data from variable 's1' into the last row of dataset 'dset' (thanks to a point selection)
    snprintf(script, sizeof(script), "INSERT INTO dset(-1) VALUES FROM MEMORY %d", number);
    HDFql::execute(script);

    // alter (i.e. change) dimension of dataset 'dset' to +1 (i.e. add a new row at the end of 'dset')
    HDFql::execute("ALTER DIMENSION dset TO +1");

    // populate variable 's1' with certain values
    s1.a = 98;
    s1.b = 55;

    // insert (i.e. write) data from variable 's1' into the last row of dataset 'dset' (thanks to a point selection)
    snprintf(script, sizeof(script), "INSERT INTO dset(-1) VALUES FROM MEMORY %d", number);
    HDFql::execute(script);

    return EXIT_SUCCESS;
}

For more information, see the HDFql reference manual and the examples that show how to use it.
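
For readers who want to stay with the plain HDF5 C API, it may help to look at the selection arithmetic in the question's code. There the memory dataspace is created after `dims[0]` is set to 2, so it describes two elements although the buffer `&s2` holds only one, and the file hyperslab starts at 0 with a count of 2. That combination would explain both symptoms: row 0 receives `s2`, and row 1 is filled from whatever lies in memory after `s2`. The sketch below only models the bookkeeping, using a `std::vector` as a stand-in for the dataset on disk; `AppendPlan`, `plan_append` and `write_rows` are hypothetical names introduced for illustration and are not part of HDF5.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct s1_t {
    int a;
    int b;
};

// Hypothetical helper type (not part of HDF5): the numbers needed to append
// n_new records to a 1-D dataset that currently holds old_rows records.
struct AppendPlan {
    std::size_t new_extent;  // dims[0] to pass to H5Dset_extent
    std::size_t start;       // start[0] of the file hyperslab
    std::size_t count;       // count[0] of the file hyperslab, and also
                             // dims[0] of the memory dataspace
};

AppendPlan plan_append(std::size_t old_rows, std::size_t n_new) {
    return AppendPlan{old_rows + n_new, old_rows, n_new};
}

// Stand-in for H5Dset_extent followed by H5Dwrite: `dataset` models the rows
// stored in the file, and `buf` must hold exactly plan.count records.
void write_rows(std::vector<s1_t>& dataset, const AppendPlan& plan, const s1_t* buf) {
    assert(plan.start + plan.count <= plan.new_extent);
    dataset.resize(plan.new_extent);            // grow the dataset
    for (std::size_t i = 0; i < plan.count; ++i) {
        dataset[plan.start + i] = buf[i];       // only the new rows are written
    }
}
```

With the values from the question, `plan_append(1, 1)` gives a new extent of 2, a start of 1 and a count of 1. In the HDF5 program that would correspond to creating the memory dataspace with a single element and selecting `start = {1}`, `count = {1}` in the file dataspace returned by `H5Dget_space` before calling `H5Dwrite` with `&s2`.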
