How do I write a list/array of custom types into an HDF5 file?

Posted: 2015-12-13 05:24:52

【Question】:

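I need to write some data to an HDF5 file using C#. I'm not bound to any particular library, but the only accessible one I could find is HDF5DotNet (http://hdf5.net/). I have a simple class holding some data (initialized with default values here for easy testing; normally the data is read from sensors):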

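public class mData
{
    public double temp = 123.456789;
    public double humid = 223.456789;
    public int chamberId = 5;
}

How do I write an array or a list of these objects into an HDF5 file? HDF5 seems too capable for this to be a special case, but I can't find any proper documentation. Here is what I tried, following an example: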

List<mData> mdl = new List<mData>();
mdl.Add(new mData()); // create some dummy data
mdl.Add(new mData());
mdl.Add(new mData());
string filename = "test.h5";

const int DATA_ARRAY_LENGTH = 3;
const int RANK = 1; // <- unsure about this one

H5FileId fileId = H5F.create(filename, H5F.CreateMode.ACC_TRUNC);
H5GroupId groupId = H5G.create(fileId, "/myGroup");

long[] dims = new long[RANK]; // <- unsure, extrapolated from example
dims[0] = DATA_ARRAY_LENGTH;

H5DataSpaceId spaceId = H5S.create_simple(RANK, dims);
H5DataTypeId typeId = H5T.copy(H5T.H5Type.NATIVE_DOUBLE); // <- NATIVE_DOUBLE is definitely wrong but I don't know which type to use in this case

int typeSize = H5T.getSize(typeId);
H5DataSetId dataSetId = H5D.create(fileId, "/myDataset", typeId, spaceId);

H5D.write(dataSetId, new H5DataTypeId(H5T.H5Type.NATIVE_DOUBLE), 
  new H5Array<double>(mdl)); // <- NATIVE_DOUBLE and H5Array<double> are definitely wrong but again unsure which type to use in both cases

H5G.close(groupId);
H5F.close(fileId);

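Any help or pointers to other libraries would be greatly appreciated!

【Comments】:

I would use a compound data type. Here is an example.

The example is C++ and does not seem to port directly to C#, since it involves a lot of memory management and a different library. Is HDF5 really that inaccessible to C# people?

【Answer 1】: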

One way to (easily) write a list/array of custom (i.e. compound) types into an HDF5 dataset from C# is to use HDFql. In HDFql, it can be done like this:

// use HDFql namespace (make sure it can be found by the C# compiler)
using AS.HDFql;
using System;
using System.Runtime.InteropServices;

public class Test
{
    // Pack=1 removes padding so the struct's memory layout matches the compound members declared below
    [StructLayout(LayoutKind.Sequential, Pack=1)]
    struct Sensor
    {
        public double temp;
        public double humid;
        public int chamberId;
    }

    public static void Main(string []args)
    {
        // declare variable 'data'
        Sensor []data = new Sensor[3];

        // create an HDF5 file named 'test.h5' and use (i.e. open) it
        HDFql.Execute("create and use file test.h5");

        // create a dataset named 'myDataset' of data type compound with three members (temp, humid and chamberId)
        HDFql.Execute("create dataset myDataset as compound(temp as double, humid as double, chamberId as int)(3)");

        // populate variable 'data' with values
        data[0].temp = 10;
        data[0].humid = 12.1;
        data[0].chamberId = 14;
        data[1].temp = 20;
        data[1].humid = 22.5;
        data[1].chamberId = 27;
        data[2].temp = 30;
        data[2].humid = 32.3;
        data[2].chamberId = 39;

        // insert (i.e. write) content of variable 'data' into dataset 'myDataset'
        HDFql.Execute("insert into myDataset values from memory " + HDFql.VariableRegister(data));
    }
}

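A side note on the Pack=1 attribute used in this answer: the compound declared as (double, double, int) presumably expects 20 contiguous bytes per element, so the struct handed over from memory should have no trailing padding. The following is only a minimal sketch to check that with the standard Marshal.SizeOf call; the names LayoutCheck, PackedSensor and DefaultSensor are made up for illustration and are not part of HDFql or of the answer.

```csharp
using System;
using System.Runtime.InteropServices;

public static class LayoutCheck
{
    // Hypothetical copy of the answer's struct, packed to 1-byte boundaries.
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    public struct PackedSensor
    {
        public double temp;
        public double humid;
        public int chamberId;
    }

    // Same fields, but with the runtime's default packing.
    [StructLayout(LayoutKind.Sequential)]
    public struct DefaultSensor
    {
        public double temp;
        public double humid;
        public int chamberId;
    }

    // Marshalled size in bytes: 8 + 8 + 4 = 20, no padding.
    public static int PackedSize()
    {
        return Marshal.SizeOf(typeof(PackedSensor));
    }

    // Marshalled size with default packing; may include trailing padding.
    public static int DefaultSize()
    {
        return Marshal.SizeOf(typeof(DefaultSensor));
    }
}
```

Printing LayoutCheck.PackedSize() and LayoutCheck.DefaultSize() side by side shows whether the default layout adds padding on a given platform. If it does, an array of the unpacked struct would no longer line up element by element with a 20-byte compound.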
【Comments】:

【Answer 2】:

Using H5T.H5Type.STD_REF_OBJ instead of H5T.H5Type.NATIVE_DOUBLE works.

This test code,

static void Main(string[] args)
{
    SampleH5Modified sh5 = new SampleH5Modified("TestFile01.h5", 5);
    sh5.Run();
}

produces this output:

Creating H5 file TestFile01.h5...
H5 file TestFile01.h5 created successfully!
Reading H5 file TestFile01.h5...
chamberId=1 humid=100 temp=80
chamberId=2 humid=101 temp=81
chamberId=3 humid=102 temp=82
chamberId=4 humid=103 temp=83
chamberId=5 humid=104 temp=84
Processing complete!

using System;
using System.Collections.Generic;
using HDF5DotNet;

public class SampleH5Modified
{
    private string filename;
    private int count;
    const string dataSetName = "/SampleDataSet";

    public SampleH5Modified(string filename, int count)
    {
        this.filename = filename;
        this.count = count;
    }

    public void Run()
    {
        List<mData> data1 = new List<mData>();

        // Assumes mData has a (temp, humid, chamberId) constructor; the answer does not show it.
        for (int i = 0; i < count; i++)
            data1.Add(new mData(i + 80, i + 100, i + 1));

        WriteData(data1);

        mData[] data2 = ReadData();

        foreach (mData d in data2)
            Console.WriteLine("chamberId={0} humid={1} temp={2}", d.chamberId, d.humid, d.temp);

        Console.WriteLine("Processing complete!");
    }

    private void WriteData(List<mData> data)
    {
        Console.WriteLine("Creating H5 file {0}...", filename);

        // One-dimensional dataset with 'count' elements
        const int RANK = 1;
        long[] dims = new long[RANK];
        dims[0] = count;

        H5FileId fileId = H5F.create(filename, H5F.CreateMode.ACC_TRUNC);

        H5DataSpaceId spaceId = H5S.create_simple(RANK, dims);

        H5DataTypeId typeId = H5T.copy(H5T.H5Type.STD_REF_OBJ);

        int typeSize = H5T.getSize(typeId);

        H5DataSetId dataSetId = H5D.create(fileId, dataSetName, typeId, spaceId);

        H5D.write(dataSetId, new H5DataTypeId(H5T.H5Type.STD_REF_OBJ), new H5Array<mData>(data.ToArray()));

        H5D.close(dataSetId);
        H5F.close(fileId);

        Console.WriteLine("H5 file {0} created successfully!", filename);
    }

    private mData[] ReadData()
    {
        Console.WriteLine("Reading H5 file {0}...", filename);

        H5FileId fileId = H5F.open(filename, H5F.OpenMode.ACC_RDONLY);

        H5DataSetId dataSetId = H5D.open(fileId, dataSetName);

        mData[] readDataBack = new mData[count];

        H5D.read(dataSetId, new H5DataTypeId(H5T.H5Type.STD_REF_OBJ), new H5Array<mData>(readDataBack));

        H5D.close(dataSetId);
        H5F.close(fileId);

        return readDataBack;
    }
}
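【Comments】:

Hi jstreet, I ran your example, and when I open the file in HDFView I only get a table of 5 rows x 1 column containing very large numbers. I think it only stores references to the data in memory, which are lost once the program stops running. Is there a way to store the actual data in the file?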
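The observation in this comment fits how the two types differ in size: as far as I know, an HDF5 object reference is a small address-sized token (8 bytes), which cannot hold a record made of two doubles and an int. Storing the actual values calls for a compound type, as the first comment under the question suggested. In the HDF5 C API that is built with H5Tcreate(H5T_COMPOUND, size) followed by one H5Tinsert(type, name, offset, member_type) call per member. Whether and how HDF5DotNet wraps these calls should be checked against its own documentation. The sketch below does not touch HDF5 at all; it only computes the size and member offsets such calls would need, using a hypothetical SensorRecord struct that mirrors the question's mData class.

```csharp
using System;
using System.Runtime.InteropServices;

public static class CompoundLayout
{
    // Hypothetical blittable mirror of the question's mData class.
    // A struct with sequential layout is used because the marshaller
    // cannot report sizes or offsets for a class with automatic layout.
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
    public struct SensorRecord
    {
        public double temp;
        public double humid;
        public int chamberId;
    }

    // Byte offset of a named field inside the marshalled record.
    public static int OffsetOf(string fieldName)
    {
        return Marshal.OffsetOf(typeof(SensorRecord), fieldName).ToInt32();
    }

    // Total size in bytes of one marshalled record.
    public static int RecordSize()
    {
        return Marshal.SizeOf(typeof(SensorRecord));
    }
}
```

With this layout, temp sits at offset 0, humid at 8 and chamberId at 16, for a record size of 20 bytes. Those are the numbers a compound type definition would carry, one (name, offset, type) triple per member.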
