HDF5: How to append data to a dataset (extensible array)
【Posted】: 2014-07-19 00:58:20 【Question】: Following this tutorial, I am trying to extend my HDF5 dataset. The code is below, but the data is not written to the dataset correctly: the dataset ends up with the right final size but contains only zeros. The only difference from the tutorial is that I have to use dynamic arrays. Any ideas?
#include <hdf5.h>
#include <cstdint>

#define RANK 1

int main()
{
    hsize_t dims[1], max_dims[1], newdims[1], chunk_dims[1], offset[1];
    hid_t file, file_space, plist, dataset, mem_space;
    int32_t *buffer1, *buffer2;

    file = H5Fcreate("test.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);

    // Create dataspace with initial dim = 0 and final = UNLIMITED
    dims[0] = 0;
    max_dims[0] = H5S_UNLIMITED;
    file_space = H5Screate_simple(RANK, dims, max_dims);

    // Create dataset creation property list to have chunks
    plist = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_layout(plist, H5D_CHUNKED);
    chunk_dims[0] = 2;
    H5Pset_chunk(plist, RANK, chunk_dims);

    // Create the dataset
    dataset = H5Dcreate(file, "Test", H5T_NATIVE_INT32, file_space, H5P_DEFAULT, plist, H5P_DEFAULT);
    H5Pclose(plist);
    H5Sclose(file_space);

    //## FIRST BUFFER
    int length = 6;
    buffer1 = new int32_t[length];
    for (int i = 0; i < length; i++)
        buffer1[i] = i;

    // Extend the dataset by getting previous size and adding current length
    file_space = H5Dget_space(dataset);
    H5Sget_simple_extent_dims(file_space, dims, NULL);
    newdims[0] = dims[0] + length;
    H5Dset_extent(dataset, newdims);

    // Select hyperslab on the file dataset
    offset[0] = dims[0];
    dims[0] = length;
    H5Sselect_hyperslab(file_space, H5S_SELECT_SET, offset, NULL, dims, NULL);

    // Dataspace for buffer in memory
    mem_space = H5Screate_simple(RANK, dims, NULL);

    // Append buffer to dataset
    H5Dwrite(dataset, H5T_NATIVE_INT32, mem_space, file_space, H5P_DEFAULT, buffer1);
    H5Sclose(file_space);
    H5Sclose(mem_space);

    //## SECOND BUFFER
    length = 4;
    buffer2 = new int32_t[length];
    for (int i = 0; i < length; i++)
        buffer2[i] = i;

    // Extend the dataset by getting previous size and adding current length
    file_space = H5Dget_space(dataset);
    H5Sget_simple_extent_dims(file_space, dims, NULL);
    newdims[0] = dims[0] + length;
    H5Dset_extent(dataset, newdims);

    // Select hyperslab on the file dataset
    offset[0] = dims[0];
    dims[0] = length;
    H5Sselect_hyperslab(file_space, H5S_SELECT_SET, offset, NULL, dims, NULL);

    // Dataspace for buffer in memory
    mem_space = H5Screate_simple(RANK, dims, NULL);

    // Append buffer to dataset
    H5Dwrite(dataset, H5T_NATIVE_INT32, mem_space, file_space, H5P_DEFAULT, buffer2);
    H5Sclose(file_space);
    H5Sclose(mem_space);

    H5Dclose(dataset);
    H5Fclose(file);
    delete[] buffer1;
    delete[] buffer2;
    return 0;
}
【Question comments】:
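【Answer 1】: I have found a solution to my problem. It has nothing to do with the dynamic arrays. The problem is that after calling H5Sget_simple_extent_dims the dataspace id is somehow no longer valid (I don't understand why... a bug?), and you need to get it again before reusing it, for example before selecting the hyperslab: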
// Select hyperslab on the file dataset
offset[0] = dims[0];
dims[0] = length;
H5Sclose(file_space); // --ADDED-- CLOSE THE PREVIOUSLY OPENED DATASPACE
file_space = H5Dget_space(dataset); // --ADDED-- REOPEN
H5Sselect_hyperslab(file_space, H5S_SELECT_SET, offset, NULL, dims, NULL);
【Comments】:
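According to the documentation (I have not used the C API), H5Dget_space makes a copy of the dataset's dataspace. H5Dset_extent works on the original dataset, not on the copy. You should H5Sclose(file_space) before getting the new one.
@mpez0 In my original post I did close it before reopening it for the second buffer, but it still did not work. The fact is that you have to reopen it again after calling H5Sget_simple_extent_dims, and of course, as you say, we need to close it before reopening (I added this line to my answer).
This is a really important note that should definitely be highlighted better in the documentation.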