C# LINQ full outer join on repetitive values

【Posted】: 2018-05-30 08:37:20 【Question】:

I have two IQueryable collections of this type:

public class Property
{
    public string Name { get; set; }
}

Collection 1, with the following Name values:

A  
A  
A  
B  

Collection 2, with the following Name values:

A  
B  
B

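What I want is a third collection in which the Name values of collections 1 and 2 are paired where they match, and paired with null (empty) where they do not, like this: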

Result Collection:  

A     A
A     null  
A     null  
B     B  
null  B

How can I achieve this with C# and LINQ?

【Comments】:

Please try the answers.

【Answer 1】:

I think the best option is simply to use a loop:

        var listA = new List<Property>
        {
            new Property { Name = "A" },
            new Property { Name = "A" },
            new Property { Name = "A" },
            new Property { Name = "B" }
        };
        var listB = new List<Property>
        {
            new Property { Name = "A" },
            new Property { Name = "B" },
            new Property { Name = "B" }
        };
        var joinedList = new List<JoinedProperty>();
        // First pass: one row per listA item, compared with the listB item at the same index.
        for (int i = 0; i < listA.Count; i++)
        {
            var property = new JoinedProperty
            {
                AName = listA[i].Name,
                BName = null
            };
            if (listB.Count < i + 1)
            {
                // listB has no item at this index, so this row is skipped rather than added
                continue;
            }
            if (listA[i].Name == listB[i].Name)
            {
                property.BName = listA[i].Name;
            }
            joinedList.Add(property);
        }
        // Second pass: one row per listB item, compared with the listA item at the same index.
        for (int i = 0; i < listB.Count; i++)
        {
            var property = new JoinedProperty
            {
                AName = null,
                BName = listB[i].Name
            };
            if (listA.Count < i + 1)
            {
                // listA has no item at this index, so this row is skipped rather than added
                continue;
            }
            if (listB[i].Name == listA[i].Name)
            {
                property.AName = listB[i].Name;
            }
            joinedList.Add(property);
        }

        public class JoinedProperty
        {
             public string AName { get; set; }
             public string BName { get; set; }
        }

Also, I think your sample output is missing one element:

null B

Output:

A     A
A     null  
A     null  
B     B  
null  B
null  B

【Comments】:

【Answer 2】:
public class Property
{
    public string Name { get; set; }
}

var list1 = new List<Property>
{
    new Property { Name = "A" },
    new Property { Name = "A" },
    new Property { Name = "A" },
    new Property { Name = "B" }
};

var list2 = new List<Property>
{
    new Property { Name = "A" },
    new Property { Name = "B" },
    new Property { Name = "B" }
};

var r = new List<string>();
int x1 = 0, x2 = 0;
int count1 = list1.Count, count2 = list2.Count;

while (true)
{
    // Both lists are exhausted: done.
    if (x1 == count1 && x2 == count2) break;

    if (x1 < count1 && x2 == count2)
    {
        // Only list1 has items left.
        r.Add($"{list1[x1].Name}\tNULL");
        ++x1;
    }
    else if (x1 == count1 && x2 < count2)
    {
        // Only list2 has items left.
        r.Add($"NULL\t{list2[x2].Name}");
        ++x2;
    }
    else
    {
        if (list1[x1].Name == list2[x2].Name)
        {
            // Match: emit the pair and advance both positions.
            r.Add($"{list1[x1].Name}\t{list2[x2].Name}");
            ++x1; ++x2;
        }
        else
        {
            // No match: emit the list1 item alone and advance only list1.
            r.Add($"{list1[x1].Name}\tNULL");
            ++x1;
        }
    }
}
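Explanation

The idea is to track a position in each list and decide, at every step, which position to advance. The loop exits once every position in both lists has been visited.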
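The position-tracking idea can also be sketched so that it returns typed rows instead of formatted strings. This is only an illustration: the class name `PositionalJoin`, the method `Join`, and the use of plain strings in place of `Property` are assumptions made for the sketch, and it presumes both inputs are ordered the same way (for example, sorted by name), as in the question's sample.

```csharp
using System;
using System.Collections.Generic;

public static class PositionalJoin
{
    // Hypothetical helper: walks both lists with one index each.
    // On a match both indexes advance; otherwise only the left index advances
    // and the left item is emitted without a partner.
    public static List<(string Left, string Right)> Join(
        IReadOnlyList<string> left, IReadOnlyList<string> right)
    {
        var rows = new List<(string Left, string Right)>();
        int i = 0, j = 0;
        while (i < left.Count || j < right.Count)
        {
            if (j == right.Count)
                rows.Add((left[i++], null));        // right side exhausted
            else if (i == left.Count)
                rows.Add((null, right[j++]));       // left side exhausted
            else if (left[i] == right[j])
                rows.Add((left[i++], right[j++]));  // match: advance both
            else
                rows.Add((left[i++], null));        // mismatch: advance left only
        }
        return rows;
    }
}
```

Calling `PositionalJoin.Join(new[] { "A", "A", "A", "B" }, new[] { "A", "B", "B" })` walks the question's sample data. If the inputs are not ordered consistently, a right-hand item may be reported as unmatched even though a partner exists later in the left list.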

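【Comments】:

If you run it, you get an ArgumentOutOfRangeException.

@DaveBarnett I put the results directly into a formatted string, but the OP could instead build a collection of a class that holds the joined result. For the code itself it doesn't really matter :)

【Answer 3】: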
using System;
using System.Collections.Generic;
using System.Linq;

namespace Testing
{
    public class Property
    {
        public string Name { get; set; }

        // Two properties are equal when their names are equal.
        public override bool Equals(object obj)
        {
            var item = obj as Property;

            if (item == null)
            {
                return false;
            }
            return item.Name == Name;
        }

        public override int GetHashCode()
        {
            return Name.GetHashCode();
        }
    }

    public class JoinedProperty
    {
        public Property Name1 { get; set; }
        public Property Name2 { get; set; }

        public override string ToString()
        {
            return (Name1 == null ? "" : Name1.Name)
                + (Name2 == null ? "" : Name2.Name);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var list1 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "B" }
            };

            var list2 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "B" },
                new Property { Name = "B" }
            };

            // Distinct names across both lists (relies on the Equals/GetHashCode overrides).
            var allLetters = list1.Union(list2).Distinct().ToList();

            var result = new List<JoinedProperty>();

            foreach (var letter in allLetters)
            {
                var list1Count = list1.Count(l => l.Name == letter.Name);
                var list2Count = list2.Count(l => l.Name == letter.Name);

                // As many matched pairs as the smaller of the two counts.
                var matchCount = Math.Min(list1Count, list2Count);

                addValuesToResult(result, letter, letter, matchCount);

                // The surplus on either side is emitted with null on the other side.
                var difference = list1Count - list2Count;

                if (difference > 0)
                {
                    addValuesToResult(result, letter, null, difference);
                }
                else
                {
                    difference = difference * -1;
                    addValuesToResult(result, null, letter, difference);
                }
            }
            foreach (var res in result)
            {
                Console.WriteLine(res.ToString());
            }
            Console.ReadLine();
        }

        private static void addValuesToResult(List<JoinedProperty> result, Property letter1, Property letter2, int count)
        {
            for (int i = 0; i < count; i++)
            {
                result.Add(new JoinedProperty
                {
                    Name1 = letter1,
                    Name2 = letter2
                });
            }
        }
    }
}

Running this gives the result:

AA
A
A
BB
B

The contents of the result list are what you are after.

Edit: updated my answer to use the specified Property class.
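The same counting idea can also be written mostly in LINQ by numbering the duplicates of each name and pairing rows that share both name and occurrence number. The following is a minimal sketch under stated assumptions: `OccurrenceJoin`, `Number`, and `FullOuterJoin` are hypothetical names, plain strings stand in for `Property`, and the sequences are assumed to be in memory (LINQ to Objects), not translated by an IQueryable provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OccurrenceJoin
{
    // Tags each name with its occurrence number within its own list: (A,0), (A,1), ...
    private static IEnumerable<(string Name, int Occurrence)> Number(IEnumerable<string> names) =>
        names.GroupBy(n => n)
             .SelectMany(g => g.Select((n, index) => (Name: n, Occurrence: index)));

    public static List<(string Left, string Right)> FullOuterJoin(
        IEnumerable<string> left, IEnumerable<string> right)
    {
        var l = Number(left).ToList();
        var r = Number(right).ToList();
        var leftKeys = new HashSet<(string, int)>(l);
        var rightKeys = new HashSet<(string, int)>(r);

        // Every left row, paired with the same-numbered right row when one exists.
        var fromLeft = l.Select(x => (Left: x.Name, Right: rightKeys.Contains(x) ? x.Name : null));

        // Right rows whose (name, occurrence) has no counterpart on the left.
        var rightOnly = r.Where(x => !leftKeys.Contains(x))
                         .Select(x => (Left: (string)null, Right: x.Name));

        return fromLeft.Concat(rightOnly).ToList();
    }
}
```

Because the rows are grouped by name, the output follows the order in which each name first appears, with all left-side rows listed before the unmatched right-side rows. Like the loop-based answer, it does not depend on the order of the elements in the input collections.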

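【Comments】:

Dave, that solution works for any order of the elements (A and B) in the input collections. Thanks a lot!

See my other answer for a more general solution.

【Answer 4】: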

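There seemed to be a lot of interest in this question, so I tried to come up with a more general solution. I took inspiration from this link: https://www.codeproject.com/Articles/488643/LinQ-Extended-Joins.

I created a FullOuterJoin extension method that does what the OP asked for. I am not sure FullOuterJoin is the right name for it.

I have solved the OP's problem using my extension method.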

using System;
using System.Collections.Generic;
using System.Linq;

namespace Testing
{
    public class Property
    {
        public string Name { get; set; }
    }

    public class JoinedProperty
    {
        public Property Name1 { get; set; }
        public Property Name2 { get; set; }

        public override string ToString()
        {
            return (Name1 == null ? "" : Name1.Name)
                + (Name2 == null ? "" : Name2.Name);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var list1 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "B" }
            };

            var list2 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "B" },
                new Property { Name = "B" }
            };

            var result = list1.FullOuterJoin(
                list2,
                p1 => p1.Name,
                p2 => p2.Name,
                (p1, p2) => new JoinedProperty
                {
                    Name1 = p1,
                    Name2 = p2
                }).ToList();

            foreach (var res in result)
            {
                Console.WriteLine(res.ToString());
            }
            Console.ReadLine();
        }
    }

    public static class MyExtensions
    {
        public static IEnumerable<TResult>
            FullOuterJoin<TSource, TInner, TKey, TResult>(this IEnumerable<TSource> source,
                                IEnumerable<TInner> inner,
                                Func<TSource, TKey> pk,
                                Func<TInner, TKey> fk,
                                Func<TSource, TInner, TResult> result)
            where TSource : class where TInner : class
        {
            // Every source item as (item, null), followed by every inner item as (null, item).
            var fullList = source.Select(s => new Tuple<TSource, TInner>(s, null))
                .Concat(inner.Select(i => new Tuple<TSource, TInner>(null, i)));

            var joinedList = new List<Tuple<TSource, TInner>>();

            foreach (var item in fullList)
            {
                // Look for a half-filled row with the same key on the opposite side.
                var matchingItem = joinedList.FirstOrDefault
                    (
                        i => matches(i, item, pk, fk)
                    );

                if (matchingItem != null)
                {
                    joinedList.Remove(matchingItem);
                    joinedList.Add(combinedMatchingItems(item, matchingItem));
                }
                else
                {
                    joinedList.Add(item);
                }
            }
            return joinedList.Select(jl => result(jl.Item1, jl.Item2)).ToList();
        }

        // Merges two half-filled rows into one row that has both sides set.
        private static Tuple<TSource, TInner> combinedMatchingItems<TSource, TInner>(Tuple<TSource, TInner> item1, Tuple<TSource, TInner> item2)
            where TSource : class
            where TInner : class
        {
            if (item1.Item1 == null && item2.Item2 == null && item1.Item2 != null && item2.Item1 != null)
            {
                return new Tuple<TSource, TInner>(item2.Item1, item1.Item2);
            }

            if (item1.Item2 == null && item2.Item1 == null && item1.Item1 != null && item2.Item2 != null)
            {
                return new Tuple<TSource, TInner>(item1.Item1, item2.Item2);
            }

            throw new InvalidOperationException("2 items cannot be combined");
        }

        // True when one row holds only a source item, the other only an inner item,
        // and their keys are equal.
        public static bool matches<TSource, TInner, TKey>(Tuple<TSource, TInner> item1, Tuple<TSource, TInner> item2, Func<TSource, TKey> pk, Func<TInner, TKey> fk)
            where TSource : class
            where TInner : class
        {
            if (item1.Item1 != null && item1.Item2 == null && item2.Item2 != null && item2.Item1 == null && pk(item1.Item1).Equals(fk(item2.Item2)))
            {
                return true;
            }

            if (item1.Item2 != null && item1.Item1 == null && item2.Item1 != null && item2.Item2 == null && fk(item1.Item2).Equals(pk(item2.Item1)))
            {
                return true;
            }

            return false;
        }
    }
}

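【Comments】:

【Answer 5】:

You asked for a LINQ function. There isn't one, but you can extend LINQ so that this works for any two sequences where you want this trick.

All you have to do is write an IEnumerable extension function similar to all the other LINQ functions.

See Extension Methods Demystified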

using System;
using System.Collections.Generic;

public static class MyEnumerableExtensions
{
    public static IEnumerable<System.Tuple<T, T>> EqualityZip<T>(this IEnumerable<T> sourceA,
        IEnumerable<T> sourceB)
        where T : class   // needed so that null can stand for "no partner"
    {
        // TODO: check for parameters null

        using (var enumeratorA = sourceA.GetEnumerator())
        using (var enumeratorB = sourceB.GetEnumerator())
        {
            // enumerate as long as we have elements in A and in B:
            bool aAvailable = enumeratorA.MoveNext();
            bool bAvailable = enumeratorB.MoveNext();
            while (aAvailable && bAvailable)
            {   // we have an A element and a B element
                T a = enumeratorA.Current;
                T b = enumeratorB.Current;

                // compare the two elements using the type's own equality:
                if (EqualityComparer<T>.Default.Equals(a, b))
                {   // equal: return tuple (a, b) and move both to the next element
                    yield return Tuple.Create(a, b);
                    aAvailable = enumeratorA.MoveNext();
                    bAvailable = enumeratorB.MoveNext();
                }
                else
                {   // not equal: return (a, null) and move only A,
                    // so that b is not dropped and can still be matched later
                    yield return Tuple.Create(a, (T)null);
                    aAvailable = enumeratorA.MoveNext();
                }
            }
            // now either we are out of A or out of B

            while (aAvailable)
            {   // we still have A but no B, return (A, null)
                yield return Tuple.Create(enumeratorA.Current, (T)null);
                aAvailable = enumeratorA.MoveNext();
            }
            while (bAvailable)
            {   // we don't have A, but there are still B, return (null, B)
                yield return Tuple.Create((T)null, enumeratorB.Current);
                bAvailable = enumeratorB.MoveNext();
            }
        }
    }
}

Usage:

var sequenceA = ...
var sequenceB = ...
var result = sequenceA.EqualityZip(sequenceB);

TODO: improve the function so that it can compare two different classes, with key selectors that pick the comparison key for A and for B, and an IEqualityComparer:

public static IEnumerable<Tuple<TA, TB>> EqualityZip<TA, TB, TKey>(
    this IEnumerable<TA> sourceA,   // the first sequence
    IEnumerable<TB> sourceB,        // the second sequence
    Func<TA, TKey> keySelectorA,    // the property of sourceA to take
    Func<TB, TKey> keySelectorB,    // the property of sourceB to take
    IEqualityComparer<TKey> comparer)
{
    // TODO: ArgumentNullException if arguments null
    if (comparer == null) comparer = EqualityComparer<TKey>.Default;

    var enumeratorA = sourceA.GetEnumerator();
    var enumeratorB = sourceB.GetEnumerator();

    // enumerate as long as we have elements in A and in B:
    bool aAvailable = enumeratorA.MoveNext();
    bool bAvailable = enumeratorB.MoveNext();
    while (aAvailable && bAvailable)
    {   // we have an A element and a B element
        TKey keyA = keySelectorA(enumeratorA.Current);
        TKey keyB = keySelectorB(enumeratorB.Current);
        if (comparer.Equals(keyA, keyB))
        {
            yield return Tuple.Create(enumeratorA.Current, enumeratorB.Current);
        }
        else

And so on.
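One possible way to finish the key-selector version is sketched below. It is not the answer author's final code: the class name `EqualityZipExtensions`, the optional `comparer` parameter, and the `class` constraints are assumptions made so the sketch compiles on its own. On a mismatch it advances only the first sequence, so an unmatched element of the second sequence is kept for a later comparison.

```csharp
using System;
using System.Collections.Generic;

public static class EqualityZipExtensions
{
    // Hypothetical completion: pairs elements whose keys are equal, in sequence order.
    public static IEnumerable<Tuple<TA, TB>> EqualityZip<TA, TB, TKey>(
        this IEnumerable<TA> sourceA,
        IEnumerable<TB> sourceB,
        Func<TA, TKey> keySelectorA,
        Func<TB, TKey> keySelectorB,
        IEqualityComparer<TKey> comparer = null)
        where TA : class
        where TB : class
    {
        comparer = comparer ?? EqualityComparer<TKey>.Default;

        using (var enumeratorA = sourceA.GetEnumerator())
        using (var enumeratorB = sourceB.GetEnumerator())
        {
            bool aAvailable = enumeratorA.MoveNext();
            bool bAvailable = enumeratorB.MoveNext();

            while (aAvailable || bAvailable)
            {
                if (!bAvailable)
                {
                    // Only A has elements left: (a, null).
                    yield return Tuple.Create(enumeratorA.Current, (TB)null);
                    aAvailable = enumeratorA.MoveNext();
                }
                else if (!aAvailable)
                {
                    // Only B has elements left: (null, b).
                    yield return Tuple.Create((TA)null, enumeratorB.Current);
                    bAvailable = enumeratorB.MoveNext();
                }
                else if (comparer.Equals(keySelectorA(enumeratorA.Current),
                                         keySelectorB(enumeratorB.Current)))
                {
                    // Keys match: (a, b), advance both.
                    yield return Tuple.Create(enumeratorA.Current, enumeratorB.Current);
                    aAvailable = enumeratorA.MoveNext();
                    bAvailable = enumeratorB.MoveNext();
                }
                else
                {
                    // Keys differ: (a, null), advance A only.
                    yield return Tuple.Create(enumeratorA.Current, (TB)null);
                    aAvailable = enumeratorA.MoveNext();
                }
            }
        }
    }
}
```

With the question's types this could be called as `list1.EqualityZip(list2, p => p.Name, p => p.Name)`. As with any positional comparison, both sequences are assumed to be ordered by the same key.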

【Comments】:
