C# Linq full outer join on repetitive values
[Posted] 2018-05-30 08:37:20
[Question] I have two IQueryable collections of this type:
public class Property
{
    public string Name { get; set; }
}
Collection 1, with the following Name values:
A
A
A
B
Collection 2, with the following Name values:
A
B
B
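What I want is a third collection in which the Name values of collections 1 and 2 are paired when they match, and paired with null (empty) when they do not, like this: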
Result Collection:
A A
A null
A null
B B
null B
How can I achieve this with C# and LINQ?
[Comments]
Please try the answers.
[Answer 1] I think the best option is simply to use a loop:
var listA = new List<Property>
{
    new Property { Name = "A" },
    new Property { Name = "A" },
    new Property { Name = "A" },
    new Property { Name = "B" }
};
var listB = new List<Property>
{
    new Property { Name = "A" },
    new Property { Name = "B" },
    new Property { Name = "B" }
};
var joinedList = new List<JoinedProperty>();
// Walk listA. Elements are compared with listB by index, not matched by value.
for (int i = 0; i < listA.Count; i++)
{
    var property = new JoinedProperty
    {
        AName = listA[i].Name,
        BName = null
    };
    // listB has no element at this index: the item is skipped entirely.
    if (listB.Count < i + 1)
        continue;
    if (listA[i].Name == listB[i].Name)
        property.BName = listA[i].Name;
    joinedList.Add(property);
}
// Same walk over listB, again comparing by index.
for (int i = 0; i < listB.Count; i++)
{
    var property = new JoinedProperty
    {
        AName = null,
        BName = listB[i].Name
    };
    if (listA.Count < i + 1)
        continue;
    if (listB[i].Name == listA[i].Name)
        property.AName = listB[i].Name;
    joinedList.Add(property);
}

public class JoinedProperty
{
    public string AName { get; set; }
    public string BName { get; set; }
}
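Also, I think your sample output is missing one element:
null B
Output (as reported in this answer; since the loops above compare elements by index, tracing them on these inputs yields A A, A null, A null, A A, null B, null B instead):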
A A
A null
A null
B B
null B
null B
[Comments]
[Answer 2]
public class Property
{
    public string Name { get; set; }
}
var list1 = new List<Property>
{
    new Property { Name = "A" },
    new Property { Name = "A" },
    new Property { Name = "A" },
    new Property { Name = "B" }
};
var list2 = new List<Property>
{
    new Property { Name = "A" },
    new Property { Name = "B" },
    new Property { Name = "B" }
};
var r = new List<string>();
int x1 = 0, x2 = 0;   // current positions in list1 and list2
int count1 = list1.Count, count2 = list2.Count;
while (true)
{
    if (x1 == count1 && x2 == count2) break;
    if (x1 < count1 && x2 == count2)
    {
        // list2 is exhausted: the rest of list1 has no partner
        r.Add($"{list1[x1].Name}\tNULL");
        ++x1;
    }
    else if (x1 == count1 && x2 < count2)
    {
        // list1 is exhausted: the rest of list2 has no partner
        r.Add($"NULL\t{list2[x2].Name}");
        ++x2;
    }
    else
    {
        if (list1[x1].Name == list2[x2].Name)
        {
            r.Add($"{list1[x1].Name}\t{list2[x2].Name}");
            ++x1; ++x2;
        }
        else
        {
            // mismatch: only the list1 position advances
            r.Add($"{list1[x1].Name}\tNULL");
            ++x1;
        }
    }
}
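Explanation
The idea is to track a position (index) in each list and decide, at every step, which position to advance. The loop exits once every position in both lists has been visited.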
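The position-tracking idea can also return typed pairs instead of formatted strings. The following is only a sketch: `SortedMergeJoin.Join` is a hypothetical helper, not part of the original answer. It assumes both inputs are already sorted by name in ordinal order, and it adds an ordinal comparison (not in the original loop) so that a name present only in the second list is emitted at its sorted position.

```csharp
using System;
using System.Collections.Generic;

public static class SortedMergeJoin
{
    // Hypothetical helper: walks two name lists with one index each.
    // Assumption: both lists are sorted ascending in ordinal order.
    public static List<(string Left, string Right)> Join(
        IReadOnlyList<string> left, IReadOnlyList<string> right)
    {
        var result = new List<(string Left, string Right)>();
        int i = 0, j = 0;
        while (i < left.Count || j < right.Count)
        {
            if (j == right.Count)
            {
                // right is exhausted: remaining left items have no partner
                result.Add((left[i++], null));
            }
            else if (i == left.Count)
            {
                // left is exhausted: remaining right items have no partner
                result.Add((null, right[j++]));
            }
            else
            {
                int cmp = string.CompareOrdinal(left[i], right[j]);
                if (cmp == 0)
                    result.Add((left[i++], right[j++]));   // match: advance both
                else if (cmp < 0)
                    result.Add((left[i++], null));         // left name sorts first
                else
                    result.Add((null, right[j++]));        // right name sorts first
            }
        }
        return result;
    }
}
```

For the question's data, `SortedMergeJoin.Join(new[] { "A", "A", "A", "B" }, new[] { "A", "B", "B" })` gives the five pairs (A, A), (A, null), (A, null), (B, B), (null, B).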
[Comments]
If you run it, you get an ArgumentOutOfRangeException.
@DaveBarnett I put the results directly into formatted strings, but the OP could instead build a collection of a class that holds the joined result. For the code, it doesn't really matter :)
[Answer 3]
using System;
using System.Collections.Generic;
using System.Linq;

namespace Testing
{
    public class Property
    {
        public string Name { get; set; }

        // Two Property instances are equal when their names are equal.
        public override bool Equals(object obj)
        {
            var item = obj as Property;
            if (item == null)
                return false;
            return item.Name == Name;
        }

        public override int GetHashCode()
        {
            return Name.GetHashCode();
        }
    }

    public class JoinedProperty
    {
        public Property Name1 { get; set; }
        public Property Name2 { get; set; }

        public override string ToString()
        {
            return (Name1 == null ? "" : Name1.Name)
                + (Name2 == null ? "" : Name2.Name);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var list1 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "B" }
            };
            var list2 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "B" },
                new Property { Name = "B" }
            };
            // One representative per distinct name (relies on Equals/GetHashCode above).
            var allLetters = list1.Union(list2).Distinct().ToList();
            var result = new List<JoinedProperty>();
            foreach (var letter in allLetters)
            {
                var list1Count = list1.Count(l => l.Name == letter.Name);
                var list2Count = list2.Count(l => l.Name == letter.Name);
                // Occurrences present in both lists become matched pairs.
                var matchCount = Math.Min(list1Count, list2Count);
                addValuesToResult(result, letter, letter, matchCount);
                // The surplus on either side is paired with null.
                var difference = list1Count - list2Count;
                if (difference > 0)
                {
                    addValuesToResult(result, letter, null, difference);
                }
                else
                {
                    difference = difference * -1;
                    addValuesToResult(result, null, letter, difference);
                }
            }
            foreach (var res in result)
                Console.WriteLine(res.ToString());
            Console.ReadLine();
        }

        private static void addValuesToResult(List<JoinedProperty> result, Property letter1, Property letter2, int count)
        {
            for (int i = 0; i < count; i++)
            {
                result.Add(new JoinedProperty
                {
                    Name1 = letter1,
                    Name2 = letter2
                });
            }
        }
    }
}
Run this and you get the following result:
AA
A
A
BB
B
The contents of the result list are what you are after.
Edit: updated my answer to use the specified Property type.
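The counting logic in this answer can also be condensed into a single LINQ query. The sketch below is an illustration, not the answer's own code: `OccurrenceJoin.Join` is a hypothetical name, and it works on plain name strings for brevity. For each distinct name it emits max(count in left, count in right) rows, filling the shorter side with null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OccurrenceJoin
{
    // Hypothetical helper: pairs the n-th occurrence of each name on both sides.
    public static List<(string Left, string Right)> Join(
        IEnumerable<string> left, IEnumerable<string> right)
    {
        // A lookup returns an empty sequence for a missing key, so Count() is 0 there.
        var leftLookup = left.ToLookup(name => name);
        var rightLookup = right.ToLookup(name => name);
        var names = leftLookup.Select(g => g.Key)
            .Concat(rightLookup.Select(g => g.Key))
            .Distinct();

        return names.SelectMany(name =>
        {
            int leftCount = leftLookup[name].Count();
            int rightCount = rightLookup[name].Count();
            // Row n carries the name on a side only while that side still has occurrences.
            return Enumerable.Range(0, Math.Max(leftCount, rightCount))
                .Select(n => (n < leftCount ? name : null,
                              n < rightCount ? name : null));
        }).ToList();
    }
}
```

With the question's data this produces five rows: one (A, A), two (A, null), one (B, B) and one (null, B). Like the loop version, it does not depend on the order of the input elements.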
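[Comments]
Dave, this solution works for any order of the elements (A and B) in the input collections. Thanks a lot!
See my other answer for a more general solution.
[Answer 4] There seemed to be a lot of interest in this question, so I tried to come up with a more general solution. I took inspiration from this link: https://www.codeproject.com/Articles/488643/LinQ-Extended-Joins
I created a FullOuterJoin extension method that does what the OP asked for. I am not sure FullOuterJoin is the right name for it.
I have solved the OP's problem with my extension method.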
using System;
using System.Collections.Generic;
using System.Linq;

namespace Testing
{
    public class Property
    {
        public string Name { get; set; }
    }

    public class JoinedProperty
    {
        public Property Name1 { get; set; }
        public Property Name2 { get; set; }

        public override string ToString()
        {
            return (Name1 == null ? "" : Name1.Name)
                + (Name2 == null ? "" : Name2.Name);
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            var list1 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "A" },
                new Property { Name = "B" }
            };
            var list2 = new List<Property>
            {
                new Property { Name = "A" },
                new Property { Name = "B" },
                new Property { Name = "B" }
            };
            var result = list1.FullOuterJoin(
                list2,
                p1 => p1.Name,
                p2 => p2.Name,
                (p1, p2) => new JoinedProperty
                {
                    Name1 = p1,
                    Name2 = p2
                }).ToList();
            foreach (var res in result)
                Console.WriteLine(res.ToString());
            Console.ReadLine();
        }
    }

    public static class MyExtensions
    {
        public static IEnumerable<TResult>
            FullOuterJoin<TSource, TInner, TKey, TResult>(this IEnumerable<TSource> source,
                IEnumerable<TInner> inner,
                Func<TSource, TKey> pk,
                Func<TInner, TKey> fk,
                Func<TSource, TInner, TResult> result)
            where TSource : class where TInner : class
        {
            // Every item starts as a half-filled tuple: (source, null) or (null, inner).
            var fullList = source.Select(s => new Tuple<TSource, TInner>(s, null))
                .Concat(inner.Select(i => new Tuple<TSource, TInner>(null, i)));
            var joinedList = new List<Tuple<TSource, TInner>>();
            foreach (var item in fullList)
            {
                // Look for an unpaired tuple from the other side with the same key.
                var matchingItem = joinedList.FirstOrDefault
                (
                    i => matches(i, item, pk, fk)
                );
                if (matchingItem != null)
                {
                    joinedList.Remove(matchingItem);
                    joinedList.Add(combinedMatchingItems(item, matchingItem));
                }
                else
                {
                    joinedList.Add(item);
                }
            }
            return joinedList.Select(jl => result(jl.Item1, jl.Item2)).ToList();
        }

        private static Tuple<TSource, TInner> combinedMatchingItems<TSource, TInner>(Tuple<TSource, TInner> item1, Tuple<TSource, TInner> item2)
            where TSource : class
            where TInner : class
        {
            if (item1.Item1 == null && item2.Item2 == null && item1.Item2 != null && item2.Item1 != null)
                return new Tuple<TSource, TInner>(item2.Item1, item1.Item2);
            if (item1.Item2 == null && item2.Item1 == null && item1.Item1 != null && item2.Item2 != null)
                return new Tuple<TSource, TInner>(item1.Item1, item2.Item2);
            throw new InvalidOperationException("2 items cannot be combined");
        }

        public static bool matches<TSource, TInner, TKey>(Tuple<TSource, TInner> item1, Tuple<TSource, TInner> item2, Func<TSource, TKey> pk, Func<TInner, TKey> fk)
            where TSource : class
            where TInner : class
        {
            // True only when one tuple holds a source item, the other an inner item, and the keys are equal.
            if (item1.Item1 != null && item1.Item2 == null && item2.Item2 != null && item2.Item1 == null && pk(item1.Item1).Equals(fk(item2.Item2)))
                return true;
            if (item1.Item2 != null && item1.Item1 == null && item2.Item1 != null && item2.Item2 == null && fk(item1.Item2).Equals(pk(item2.Item1)))
                return true;
            return false;
        }
    }
}
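The extension method above searches `joinedList` again for every item, so its cost grows roughly with the product of the two input sizes. A possible alternative, sketched here with a different technique (a dictionary of queues keyed by the join key), pairs items in a single pass over each input. `KeyedPairing.PairByKey` is a hypothetical name and not part of the original answer. The sketch assumes keys are non-null and usable as dictionary keys, and it makes no promise about the order in which unmatched right-hand items come out.

```csharp
using System;
using System.Collections.Generic;

public static class KeyedPairing
{
    // Hypothetical helper: pairs each left item with one unused right item of the same key.
    public static List<TResult> PairByKey<TLeft, TRight, TKey, TResult>(
        IEnumerable<TLeft> left,
        IEnumerable<TRight> right,
        Func<TLeft, TKey> leftKey,
        Func<TRight, TKey> rightKey,
        Func<TLeft, TRight, TResult> resultSelector)
        where TLeft : class
        where TRight : class
    {
        // Queue the right-hand items per key so each one is used at most once.
        var pending = new Dictionary<TKey, Queue<TRight>>();
        foreach (var item in right)
        {
            var key = rightKey(item);
            if (!pending.TryGetValue(key, out var queue))
            {
                queue = new Queue<TRight>();
                pending[key] = queue;
            }
            queue.Enqueue(item);
        }

        var results = new List<TResult>();
        // Each left item takes the next unused right item with the same key, or null.
        foreach (var item in left)
        {
            TRight match = null;
            if (pending.TryGetValue(leftKey(item), out var candidates) && candidates.Count > 0)
                match = candidates.Dequeue();
            results.Add(resultSelector(item, match));
        }

        // Whatever is still queued never found a left partner.
        foreach (var leftovers in pending.Values)
            foreach (var item in leftovers)
                results.Add(resultSelector(null, item));

        return results;
    }
}
```

Called with the name sequences A, A, A, B and A, B, B and identity key selectors, this yields the rows (A, A), (A, null), (A, null), (B, B) followed by (null, B).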
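[Comments]
[Answer 5] You asked for a LINQ function. There isn't one, but you can extend LINQ so that this trick works for any two sequences.
All you have to do is write an IEnumerable extension function, similar to all the other LINQ functions.
See Extension Methods Demystified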
public static class MyEnumerableExtensions
{
    public static IEnumerable<System.Tuple<T, T>> EqualityZip<T>(this IEnumerable<T> sourceA,
        IEnumerable<T> sourceB)
        where T : class   // required so that null can stand for "no partner"
    {
        // TODO: check for parameters null
        // Elements are compared with the default equality comparer for T.
        var comparer = EqualityComparer<T>.Default;
        using (var enumeratorA = sourceA.GetEnumerator())
        using (var enumeratorB = sourceB.GetEnumerator())
        {
            // enumerate as long as we have elements in A and in B:
            bool aAvailable = enumeratorA.MoveNext();
            bool bAvailable = enumeratorB.MoveNext();
            while (aAvailable && bAvailable)
            {
                // we have an A element and a B element
                T a = enumeratorA.Current;
                T b = enumeratorB.Current;
                // compare the two elements:
                if (comparer.Equals(a, b))
                {
                    // equal: return tuple (a, b) and move both sequences on
                    yield return Tuple.Create(a, b);
                    aAvailable = enumeratorA.MoveNext();
                    bAvailable = enumeratorB.MoveNext();
                }
                else
                {
                    // not equal: return (a, null) and keep b for the next a
                    yield return Tuple.Create(a, (T)null);
                    aAvailable = enumeratorA.MoveNext();
                }
            }
            // now either we are out of A or out of B
            while (aAvailable)
            {
                // we still have A but no B, return (A, null)
                T A = enumeratorA.Current;
                yield return Tuple.Create(A, (T)null);
                aAvailable = enumeratorA.MoveNext();
            }
            while (bAvailable)
            {
                // we don't have A, but there are still B, return (null, B)
                T B = enumeratorB.Current;
                yield return Tuple.Create((T)null, B);
                bAvailable = enumeratorB.MoveNext();
            }
        }
    }
}
Usage:
var sequenceA = ...
var sequenceB = ...
var result = sequenceA.EqualityZip(sequenceB);
TODO: improve the function so that it can compare two different classes, with key selectors for A and B that pick the keys to compare, and an IEqualityComparer:
public static IEnumerable<Tuple<TA, TB>> EqualityZip<TA, TB, TKey>(
    this IEnumerable<TA> sourceA,    // the first sequence
    IEnumerable<TB> sourceB,         // the second sequence
    Func<TA, TKey> keySelectorA,     // the property of sourceA to take
    Func<TB, TKey> keySelectorB,     // the property of sourceB to take
    IEqualityComparer<TKey> comparer)
{
    // TODO: ArgumentNullException if arguments null
    if (comparer == null) comparer = EqualityComparer<TKey>.Default;
    var enumeratorA = sourceA.GetEnumerator();
    var enumeratorB = sourceB.GetEnumerator();
    // enumerate as long as we have elements in A and in B:
    bool aAvailable = enumeratorA.MoveNext();
    bool bAvailable = enumeratorB.MoveNext();
    while (aAvailable && bAvailable)
    {
        // we have an A element and a B element
        TKey keyA = keySelectorA(enumeratorA.Current);
        TKey keyB = keySelectorB(enumeratorB.Current);
        if (comparer.Equals(keyA, keyB))
            yield return Tuple.Create(enumeratorA.Current, enumeratorB.Current);
        else
and so on.
[Comments]