Converting a nested collection into a flat Data Table

Posted: 2020-12-01 16:23:25

Question:

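I have a nested collection of values (IDs and names) stored as XML. I want to convert this XML into a single data table while preserving the parent-child relationships.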

<Apartment id="1A" name="Apartment 1A">
    <ApartmentComponent id="300" name="Living Room" />
    <ApartmentComponent id="301" name="Bathroom">
        <ApartmentComponent id="2698" name="Tub" />
        <ApartmentComponent id="8204" name="Sink" />
    </ApartmentComponent>
</Apartment>
<Apartment id="2A" name="Apartment 2A">
    <ApartmentComponent id="302" name="Dining Room">
        <ApartmentComponent id="2635" name="Table" />
        <ApartmentComponent id="2746" name="Cabinet" />
    </ApartmentComponent>
    <ApartmentComponent id="301" name="Bathroom">
        <ApartmentComponent id="8204" name="Sink">
            <ApartmentComponent id="56352" name="Drain Plug" />
        </ApartmentComponent>
    </ApartmentComponent>
</Apartment>

The resulting table would look like this:

+-------+--------------+-----------+--------------+
|   ID  |     Value    | Parent ID | Parent Value |
+-------+--------------+-----------+--------------+
|   1A  | Apartment 1A |           |              |
+-------+--------------+-----------+--------------+
|  300  |  Living Room |     1A    | Apartment 1A |
+-------+--------------+-----------+--------------+
|  301  |   Bathroom   |     1A    | Apartment 1A |
+-------+--------------+-----------+--------------+
|  2698 |      Tub     |    301    |   Bathroom   |
+-------+--------------+-----------+--------------+
|  8204 |     Sink     |    301    |   Bathroom   |
+-------+--------------+-----------+--------------+
|   2A  | Apartment 2A |           |              |
+-------+--------------+-----------+--------------+
|  302  |  Dining Room |     2A    | Apartment 2A |
+-------+--------------+-----------+--------------+
|  2635 |     Table    |    302    |  Dining Room |
+-------+--------------+-----------+--------------+
|  2746 |    Cabinet   |    302    |  Dining Room |
+-------+--------------+-----------+--------------+
|  301  |   Bathroom   |     2A    | Apartment 2A |
+-------+--------------+-----------+--------------+
|  8204 |     Sink     |    301    |   Bathroom   |
+-------+--------------+-----------+--------------+
| 56352 |  Drain Plug  |    8204   |     Sink     |
+-------+--------------+-----------+--------------+

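I would normally start by parsing the XML with XDocument, which gives me a collection I can query with LINQ. However, I am not sure how to turn the resulting collection into a flat table structure. Do I need a recursive function?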

Comments:

Take a look at Descendants().

Answer 1:

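As madreflection commented above, XContainer.Descendants() will do this without the need for recursion. Or rather, it handles the recursion for you, all the way down to the drain plug and beyond.

Assuming there is a <root> element wrapped around this snippet, so that XDocument will parse it normally: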

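// Visits every element below <root> in document order.
foreach (var element in xdoc.Root.Descendants())
{
    (string, string, string, string) values = (
        element.Attribute("id").Value,
        element.Attribute("name").Value,
        // <root> has no id/name attributes, so these are null for top-level apartments.
        element.Parent?.Attribute("id")?.Value,
        element.Parent?.Attribute("name")?.Value
    );
    Console.WriteLine(values);
}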
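
To end up with an actual data table rather than console output, the same loop can fill a System.Data.DataTable. The following is a minimal sketch under the same assumption of a wrapping <root> element; the class name ApartmentFlattener, the method name ToFlatTable, and the table name are made up for illustration, and the column names are taken from the table in the question.

```csharp
using System;
using System.Data;
using System.Xml.Linq;

public static class ApartmentFlattener
{
    // Builds a four-column DataTable with one row per element below the root.
    public static DataTable ToFlatTable(XDocument xdoc)
    {
        var table = new DataTable("ApartmentComponents");
        table.Columns.Add("ID", typeof(string));
        table.Columns.Add("Value", typeof(string));
        table.Columns.Add("Parent ID", typeof(string));
        table.Columns.Add("Parent Value", typeof(string));

        // Descendants() yields elements in document order, so each parent
        // row is added before the rows of its children.
        foreach (XElement element in xdoc.Root.Descendants())
        {
            // Casting an XAttribute to string returns null when the attribute
            // is missing; a null value leaves the column at its default (DBNull).
            table.Rows.Add(
                (string)element.Attribute("id"),
                (string)element.Attribute("name"),
                (string)element.Parent?.Attribute("id"),
                (string)element.Parent?.Attribute("name"));
        }

        return table;
    }
}
```

Calling ApartmentFlattener.ToFlatTable(XDocument.Load("XMLFile1.xml")) would be one way to use it. Note that the sample data repeats IDs 301 and 8204 under different parents, so the ID column on its own is not suitable as a primary key.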

Answer 2:

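You can also take this approach to get the desired result.

Stage 1: Deserialization

Class objects

Note: to understand the ChildItemCollection class, there is a good article on serializing XML while keeping the parent-child relationship intact. You can find it here: C# Parent/child relationship and XML serialization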

// Requires: using System.Xml.Serialization;
// ChildItemCollection<,> and IChildItem<> come from the linked article and are not defined here.
[XmlRoot("root")]
public class Apartment
{
    public Apartment()
    {
        this.Children = new ChildItemCollection<Apartment, ApartmentComponent>(this);
    }

    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlElement("ApartmentComponent")]
    public ChildItemCollection<Apartment, ApartmentComponent> Children { get; private set; }
}

public class ApartmentComponent : IChildItem<Apartment>, IChildItem<ApartmentComponent>
{
    public ApartmentComponent()
    {
        this.Children = new ChildItemCollection<ApartmentComponent, ApartmentComponent>(this);
    }

    [XmlAttribute("id")]
    public string Id { get; set; }

    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlElement("ApartmentComponent")]
    public ChildItemCollection<ApartmentComponent, ApartmentComponent> Children { get; private set; }

    // Set when this component sits directly under an Apartment.
    [XmlIgnore]
    public Apartment ParentApartment { get; internal set; }

    Apartment IChildItem<Apartment>.Parent
    {
        get
        {
            return this.ParentApartment;
        }
        set
        {
            this.ParentApartment = value;
        }
    }

    // Set when this component is nested inside another component.
    [XmlIgnore]
    public ApartmentComponent ParentComponent { get; internal set; }

    ApartmentComponent IChildItem<ApartmentComponent>.Parent
    {
        get
        {
            return this.ParentComponent;
        }
        set
        {
            this.ParentComponent = value;
        }
    }
}

Implementation

XmlSerializer serializer = new XmlSerializer(typeof(List<Apartment>), new XmlRootAttribute("root"));
using (FileStream fileStream = new FileStream("XMLFile1.xml", FileMode.Open))
{
    var result = (List<Apartment>)serializer.Deserialize(fileStream);
}

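Stage 2: Flattening into flat data

This part is more involved, and it is the approach I was able to come up with. Hopefully there are other, better options.

Class objects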

public class ResultItem
{
    public string Id { get; set; }
    public string Value { get; set; }
    public string ParentId { get; set; }
    public string ParentValue { get; set; }
}

Implementation

XmlSerializer serializer = new XmlSerializer(typeof(List<Apartment>), new XmlRootAttribute("root"));
using (FileStream fileStream = new FileStream("XMLFile1.xml", FileMode.Open))
{
    var result = (List<Apartment>)serializer.Deserialize(fileStream);

    // Start with one row per apartment; apartments have no parent.
    var items = result.Select(item => new ResultItem
    {
        Id = item.Id,
        Value = item.Name
    }).ToList();

    // Then append the rows for every component beneath each apartment.
    result.ForEach(apartment =>
    {
        apartment.Children.ToList().ForEach(component =>
        {
            setItemResult(component, items);
        });
    });

    dataGridView1.DataSource = items;
}

private void setItemResult(ApartmentComponent apartmentComponent, List<ResultItem> items)
{
    // Recurse first, so descendants are added to the list before this component.
    if (apartmentComponent.Children.Count > 0)
    {
        apartmentComponent.Children.ToList().ForEach(component =>
        {
            setItemResult(component, items);
        });
    }

    var item = new ResultItem
    {
        Id = apartmentComponent.Id,
        Value = apartmentComponent.Name
    };

    var parentApartment = apartmentComponent.ParentApartment;
    var parentComponent = apartmentComponent.ParentComponent;

    // The parent is either an apartment or another component.
    if (parentApartment != null)
    {
        item.ParentId = parentApartment.Id;
        item.ParentValue = parentApartment.Name;
    }

    if (parentComponent != null)
    {
        item.ParentId = parentComponent.Id;
        item.ParentValue = parentComponent.Name;
    }

    items.Add(item);
}
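
If the rows should come out in the same order as the table in the question (each parent before its children), the recursion can emit the current node first and pass the parent down as an argument, which also removes the need for back-references on the objects. This is only a sketch: Node, FlatRow, and TreeFlattener are hypothetical stand-ins invented for illustration, not the Apartment, ApartmentComponent, or ChildItemCollection types used above.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for Apartment / ApartmentComponent.
public class Node
{
    public string Id { get; set; }
    public string Name { get; set; }
    public List<Node> Children { get; } = new List<Node>();
}

// Hypothetical row type with the same four fields as ResultItem.
public class FlatRow
{
    public string Id { get; set; }
    public string Value { get; set; }
    public string ParentId { get; set; }
    public string ParentValue { get; set; }
}

public static class TreeFlattener
{
    // Pre-order traversal: yield the node itself, then recurse into its children.
    public static IEnumerable<FlatRow> Flatten(Node node, Node parent = null)
    {
        yield return new FlatRow
        {
            Id = node.Id,
            Value = node.Name,
            ParentId = parent?.Id,       // null for a top-level node
            ParentValue = parent?.Name
        };

        foreach (Node child in node.Children)
        {
            foreach (FlatRow row in Flatten(child, node))
            {
                yield return row;
            }
        }
    }
}
```

Calling Flatten once per top-level node and concatenating the results (for example with SelectMany) would give one flat sequence that can be bound to a grid or copied into a DataTable.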
    

Happy coding, cheers!
