What's your favorite LINQ to Objects operator which is not built-in? [closed]

Posted 2023-03-31


[Posted]: 2010-09-05 09:49:20

[Question]:

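With extension methods, we can write handy LINQ operators which solve generic problems.

I would like to hear which methods or overloads you are missing in the System.Linq namespace, and how you implemented them.

Clean and elegant implementations, possibly using existing methods, are preferred.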

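[Question comments]:

It looks like most of your current implementations opt for less overhead over cleanliness and elegance, but for me personally that makes them more useful.

Being able to collapse all code blocks would be really useful on this page ☺

A lot of these are at extensionmethod.net, with VB and C# examples.

This question should be locked, like ***.com/q/101268/344286

[Answer 1]: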

Append & Prepend

(These have been added to .NET since this answer was written.)

/// <summary>Adds a single element to the end of an IEnumerable.</summary>
/// <typeparam name="T">Type of enumerable to return.</typeparam>
/// <returns>IEnumerable containing all the input elements, followed by the
/// specified additional element.</returns>
public static IEnumerable<T> Append<T>(this IEnumerable<T> source, T element)
{
    if (source == null)
        throw new ArgumentNullException("source");
    return concatIterator(element, source, false);
}

/// <summary>Adds a single element to the start of an IEnumerable.</summary>
/// <typeparam name="T">Type of enumerable to return.</typeparam>
/// <returns>IEnumerable containing the specified additional element, followed by
/// all the input elements.</returns>
public static IEnumerable<T> Prepend<T>(this IEnumerable<T> tail, T head)
{
    if (tail == null)
        throw new ArgumentNullException("tail");
    return concatIterator(head, tail, true);
}

private static IEnumerable<T> concatIterator<T>(T extraElement,
    IEnumerable<T> source, bool insertAtStart)
{
    if (insertAtStart)
        yield return extraElement;
    foreach (var e in source)
        yield return e;
    if (!insertAtStart)
        yield return extraElement;
}
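
One commenter in this thread points out that the same behavior can be had from built-in operators alone. A minimal sketch of that idea follows; the names `AppendOne` and `PrependOne` are hypothetical, chosen here only to avoid clashing with the `Append` and `Prepend` methods that later shipped in .NET.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConcatSketch
{
    // Hypothetical name: yields every element of source, then element.
    public static IEnumerable<T> AppendOne<T>(this IEnumerable<T> source, T element)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        return source.Concat(Enumerable.Repeat(element, 1));
    }

    // Hypothetical name: yields head, then every element of tail.
    public static IEnumerable<T> PrependOne<T>(this IEnumerable<T> tail, T head)
    {
        if (tail == null)
            throw new ArgumentNullException("tail");
        return Enumerable.Repeat(head, 1).Concat(tail);
    }
}
```

With this in scope, `new[] { 1, 2 }.AppendOne(3)` enumerates 1, 2, 3 lazily, just like the iterator version.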

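[Discussion]:

You can go one step further by adding an enum that dictates where the value is placed, and then adding "between". I had the need to inject a value in between all the values in a collection, kind of like String.Join, only with generics.

@Lasse V. Karlsen: I've posted an InsertBetween method as a separate answer.

You can shorten the `Append` implementation to a one-liner: return source.Concat(Enumerable.Repeat(element, 1));.

Append and Prepend can also be implemented with AsEnumerable: head.AsEnumerable().Concat(source) / source.Concat(element.AsEnumerable())

Good one, +1, but I'd change it from T to params T[] so that you can append one or more items to the end.

[Answer 2]: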

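I'm surprised no one has mentioned the MoreLINQ project yet. It was started by Jon Skeet and has gained some developers along the way. From the project's page:

LINQ to Objects is missing a few desirable features.

This project will enhance LINQ to Objects with extra methods, in a manner which keeps to the spirit of LINQ.

Take a look at the Operators Overview wiki page for a list of implemented operators.

It is certainly a good way to learn from some clean and elegant source code.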

[Discussion]:

[Answer 3]:

Each

Nothing for the purists, but it is useful!

public static void Each<T>(this IEnumerable<T> items, Action<T> action)
{
    foreach (var i in items)
        action(i);
}

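[Discussion]:

Parallel.ForEach will do the same and is capable of parallel execution. Isn't it?

An overload taking a func instead of an action and yield-returning the results would be obvious too.

@Nappy: That one is called Select, and it is built in.

It is part of System.Interactive.dll of the Reactive Extensions for .NET (Rx), where it is called Do: "Invokes the action for its side effects on each value in the sequence."

@Nappy: Do is not the equivalent of the method in the example; it would have to be followed by Run(), which also has an overload that takes an Action. The latter would be the equivalent of the example.

[Answer 4]:

ToQueue & ToStack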

/// <summary>Creates a <see cref="Queue&lt;T&gt;"/> from an enumerable
/// collection.</summary>
public static Queue<T> ToQueue<T>(this IEnumerable<T> source)
{
    if (source == null)
        throw new ArgumentNullException("source");
    return new Queue<T>(source);
}

/// <summary>Creates a <see cref="Stack&lt;T&gt;"/> from an enumerable
/// collection.</summary>
public static Stack<T> ToStack<T>(this IEnumerable<T> source)
{
    if (source == null)
        throw new ArgumentNullException("source");
    return new Stack<T>(source);
}

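[Discussion]:

What's wrong with var myQueue = new Queue<ObjectType>(myObj);? For just one line, this isn't really a worthwhile extension...

@ck: You could apply the same logic to the built-in extension ToList(), and these extensions also complement the ToArray() extension nicely. I prefer the fluent var myQueue = a.SelectMany(...).Where(...).OrderBy(...).ToQueue() to the more traditional syntax.

@Martin (& Timwi): I can see the point when chaining together a large number of operators; it is much tidier. +1.

@cjk The biggest advantage I see is not having to specify the type parameter. I don't want to write <ObjectType> there if the compiler can infer it.

[Answer 5]:

IsEmpty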

public static bool IsEmpty<T>(this IEnumerable<T> source)
{
    return !source.Any();
}

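[Discussion]:

+1 I have no idea why this was downvoted. To me source.IsEmpty() is much more readable than !source.Any(). I always try to avoid the ! operator, since in my opinion it is easy to skip over when quickly scanning code.

None is more analogous to Any than IsEmpty is.

[Answer 6]:

In and NotIn

C# equivalents of two other well-known SQL constructs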

/// <summary>
/// Determines if the source value is contained in the list of possible values.
/// </summary>
/// <typeparam name="T">The type of the objects</typeparam>
/// <param name="value">The source value</param>
/// <param name="values">The list of possible values</param>
/// <returns>
///     <c>true</c> if the source value matches at least one of the possible values; otherwise, <c>false</c>.
/// </returns>
public static bool In<T>(this T value, params T[] values)
{
    if (values == null)
        return false;

    if (values.Contains<T>(value))
        return true;

    return false;
}


/// <summary>
/// Determines if the source value is contained in the list of possible values.
/// </summary>
/// <typeparam name="T">The type of the objects</typeparam>
/// <param name="value">The source value</param>
/// <param name="values">The list of possible values</param>
/// <returns>
///     <c>true</c> if the source value matches at least one of the possible values; otherwise, <c>false</c>.
/// </returns>
public static bool In<T>(this T value, IEnumerable<T> values)
{
    if (values == null)
        return false;

    if (values.Contains<T>(value))
        return true;

    return false;
}


/// <summary>
/// Determines if the source value is not contained in the list of possible values.
/// </summary>
/// <typeparam name="T">The type of the objects</typeparam>
/// <param name="value">The source value</param>
/// <param name="values">The list of possible values</param>
/// <returns>
///     <c>false</c> if the source value matches at least one of the possible values; otherwise, <c>true</c>.
/// </returns>
public static bool NotIn<T>(this T value, params T[] values)
{
    return In(value, values) == false;
}


/// <summary>
/// Determines if the source value is not contained in the list of possible values.
/// </summary>
/// <typeparam name="T">The type of the objects</typeparam>
/// <param name="value">The source value</param>
/// <param name="values">The list of possible values</param>
/// <returns>
///     <c>false</c> if the source value matches at least one of the possible values; otherwise, <c>true</c>.
/// </returns>
public static bool NotIn<T>(this T value, IEnumerable<T> values)
{
    return In(value, values) == false;
}
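
Commenters on this answer suggest two simplifications: let the params overload forward to the IEnumerable overload, and reduce the body to a single Contains call. A rough sketch of that shape, assuming the original "null list means false" behavior is kept (the class name `InSketch` is hypothetical):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InSketch
{
    // Single place where the membership test lives.
    public static bool In<T>(this T value, IEnumerable<T> values)
    {
        return values != null && values.Contains(value);
    }

    // The params array is itself an IEnumerable<T>; the explicit cast
    // selects the overload above instead of recursing into this one.
    public static bool In<T>(this T value, params T[] values)
    {
        return value.In((IEnumerable<T>)values);
    }

    public static bool NotIn<T>(this T value, IEnumerable<T> values)
    {
        return !value.In(values);
    }

    public static bool NotIn<T>(this T value, params T[] values)
    {
        return !value.In((IEnumerable<T>)values);
    }
}
```

Whether a null list should return false or throw is a design choice the thread leaves open; throwing ArgumentNullException would be the stricter alternative.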

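[Discussion]:

I think it should throw an exception instead of if (values == null) return false;. Silently swallowing error conditions is never good.

It depends on how you look at it. The fact is that an element is never contained in an empty list of values.

One thing, which helps DRY but deepens the call stack: a params array, being an array, is an IEnumerable, so your params overload can simply call the IEnumerable overload.

return values.Contains(value); would be enough.

[Answer 7]: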

AsIEnumerable

/// <summary>
/// Returns a sequence containing one element.
/// </summary>
public static IEnumerable<T> AsIEnumerable<T>(this T obj)
{
    yield return obj;
}

Usage:

var nums = new[] { 12, 20, 6 };
var numsWith5Prepended = 5.AsIEnumerable().Concat(nums);

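[Discussion]:

I tend to write EnumerableEx.Return(5).Concat(nums) instead of bloating the IntelliSense of every object.

I prefer using Append and Prepend for this.

Purely for performance, I'd suggest return new T[] { obj }; instead. That way, the compiler doesn't have to construct a whole state machine class just to yield one value.

I find this implementation dangerous. What would you expect from new[] { 1, 2, 3, 4 }.AsIEnumerable()? I would expect 1,2,3,4, not [1,2,3,4].

@larsm: That is why most libraries use EnumerableEx.Return(new[] { 1, 2, 3, 4 }). You are right, "As" implies that some casting is going on, and because an array already implements IEnumerable, you expect nothing to change.

[Answer 8]:

JoinString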

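Basically the same as string.Join, but:

with the ability to use it on any collection, not just a collection of strings (it calls ToString on every element);

with the ability to add a prefix and a suffix to every string;

as an extension method. I find string.Join annoying because it is static, meaning that in a chain of operations it is lexically not in the correct order.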

/// <summary>
/// Turns all elements in the enumerable to strings and joins them using the
/// specified string as the separator and the specified prefix and suffix for
/// each string.
/// <example>
///   <code>
///     var a = (new[] { "Paris", "London", "Tokyo" }).JoinString(", ", "[", "]");
///     // a contains "[Paris], [London], [Tokyo]"
///   </code>
/// </example>
/// </summary>
public static string JoinString<T>(this IEnumerable<T> values,
    string separator = null, string prefix = null, string suffix = null)
{
    if (values == null)
        throw new ArgumentNullException("values");

    using (var enumerator = values.GetEnumerator())
    {
        if (!enumerator.MoveNext())
            return "";
        StringBuilder sb = new StringBuilder();
        sb.Append(prefix).Append(enumerator.Current.ToString()).Append(suffix);
        while (enumerator.MoveNext())
            sb.Append(separator).Append(prefix)
              .Append(enumerator.Current.ToString()).Append(suffix);
        return sb.ToString();
    }
}
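
Since .NET 4, string.Join accepts an IEnumerable<string>, so the same result can be sketched as a thin wrapper over the built-in method. This is an illustration rather than a replacement; the name `JoinWith` is hypothetical, and unlike the version above it renders a null element as an empty string instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinSketch
{
    // Hypothetical name: decorates each element with prefix and suffix,
    // then lets string.Join place the separators.
    public static string JoinWith<T>(this IEnumerable<T> values,
        string separator = null, string prefix = null, string suffix = null)
    {
        if (values == null)
            throw new ArgumentNullException("values");
        return string.Join(separator, values.Select(v => prefix + v + suffix));
    }
}
```

The wrapper keeps the extension-method call order the answer asks for, while string.Join handles the empty-sequence case and the buffer management.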

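[Discussion]:

Very useful, although personally I'd take out all the formatting code and just keep the ability to join an IEnumerable<string> with a separator. You can always project the data to IEnumerable<string> before calling this method.

@Timwi: A few minor points: You don't need to check for null before appending to the StringBuilder. You need to dispose the enumerator. You can get rid of the Append calls above the while loop and replace the loop with a do-while. Finally, unless you want to avoid paying the cost of creating a StringBuilder, you don't need to treat the first element as a special case: new StringBuilder().ToString() will return string.Empty.

String.Join in .NET 4 takes an IEnumerable.

Does this do anything we don't have in the Aggregate operator? It's an unusual one to use, but it can definitely be used for joining a list of objects.

@Kirk Broadhurst: Well, for one, Aggregate would be slower if you concatenate strings at each stage instead of using a StringBuilder. But for two, even if I wanted to use Aggregate for this, I would still wrap it into a JoinString method with this signature, because it makes the code using it much clearer. And once I have that, I might as well write the faster method using a StringBuilder.

[Answer 9]:

Order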

/// <summary>Sorts the elements of a sequence in ascending order.</summary>
public static IEnumerable<T> Order<T>(this IEnumerable<T> source)
{
    return source.OrderBy(x => x);
}

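[Discussion]:

I would rather call this method "Sort".

@Steven: "Sort" would cause ambiguity with List<T>.Sort()

It won't, because the C# compiler will always choose an instance method over an extension method.

@Steven: True, but it would still be ambiguous to anyone reading the code. The distinction matters because List<T>.Sort sorts in place.

Wouldn't this need a generic type constraint requiring T to implement IComparable?

[Answer 10]:

Shuffle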

public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> items)
{
    var random = new Random();
    return items.OrderBy(x => random.Next());
}

EDIT: There seem to be several issues with the above implementation. Here is an improved version based on @LukeH's code and comments from @ck and @Strilanc.

private static Random _rand = new Random();
public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
{
    var items = source == null ? new T[] { } : source.ToArray();
    var count = items.Length;
    while (count > 0)
    {
        int toReturn = _rand.Next(0, count);
        yield return items[toReturn];
        items[toReturn] = items[count - 1];
        count--;
    }
}
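
One commenter in this thread proposes passing the random source in as a parameter, which removes the hidden shared state and makes results reproducible with a seeded Random. A minimal sketch of that variant, using the same buffer-and-swap selection as the improved version above; the name `ShuffleWith` is hypothetical:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ShuffleSketch
{
    // Hypothetical name: the caller owns the Random instance.
    public static IEnumerable<T> ShuffleWith<T>(this IEnumerable<T> source, Random random)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (random == null)
            throw new ArgumentNullException("random");
        return ShuffleIterator(source, random);
    }

    private static IEnumerable<T> ShuffleIterator<T>(IEnumerable<T> source, Random random)
    {
        // Buffer once, then repeatedly pick a random remaining slot and
        // overwrite it with the last remaining element.
        T[] items = source.ToArray();
        for (int count = items.Length; count > 0; count--)
        {
            int pick = random.Next(count);
            yield return items[pick];
            items[pick] = items[count - 1];
        }
    }
}
```

Argument checks sit in a non-iterator wrapper so they run at call time rather than on first enumeration. System.Random is not thread-safe, so callers sharing one instance across threads would need their own synchronization.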

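[Discussion]:

I'd suggest renaming this to Randomize or Shuffle.

It might be better to use a static Random object.

Your implementation is wrong. Worse than that, it is subtly wrong. First, it has a hidden global dependency: the source of randomness (even worse, the source you chose will give the same shuffles if quickly called multiple times!). Second, the algorithm used is bad. Not only is it asymptotically slower than Fisher-Yates, it isn't uniform (elements assigned the same key keep their relative order).

I would probably add the random source as a parameter. This has two advantages: you do not create multiple random sources at startup, or at least it is the developer's responsibility to initialize them correctly; and, if applied consistently, it is a good indicator that the method returns different/random results each time.

@Nappy Unfortunately most sources of randomness in .NET don't share the same external interface. Contrast Math.Random with implementations of System.Security.Cryptography.RandomNumberGenerator. You can write adapters and/or accept a simple Func<int>, but someone has to do some work to simplify how the method gets its PRNs.

[Answer 11]:

Loop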

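Here's a kind of cool one I just thought of. (If I'm only just thinking of it, maybe it's not that useful? But I thought of it because I have a use for it.) Loop through a sequence repeatedly to generate an infinite sequence. This accomplishes something similar to what Enumerable.Range and Enumerable.Repeat give you, except that it can be used for an arbitrary (unlike Range) sequence (unlike Repeat):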

public static IEnumerable<T> Loop<T>(this IEnumerable<T> source)
{
    while (true)
    {
        foreach (T item in source)
        {
            yield return item;
        }
    }
}

Usage:

var numbers = new[] { 1, 2, 3 };
var looped = numbers.Loop();

foreach (int x in looped.Take(10))
{
    Console.WriteLine(x);
}

Output:

1 2 3 1 2 3 1 2 3 1

Note: I suppose you could also accomplish this with something like:

var looped = Enumerable.Repeat(numbers, int.MaxValue).SelectMany(seq => seq);

...but I think Loop is clearer.

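[Discussion]:

This implementation has the interesting property that it creates a new enumerator each time through, so it could give unexpected results.

@Gabe: There's really no way to do it without creating a new enumerator each time. I mean, you could call Reset on the enumerator, but I'm pretty sure that's unsupported in about 90% of IEnumerator<T> implementations. So I guess you could get "unexpected" results, but only if you are expecting something that isn't feasible. In other words: for any ordered sequence (such as a T[], List<T>, Queue<T>, etc.), the order will be stable; for any unordered sequence, you shouldn't expect it to be (in my opinion).

@Gabe: An alternative, I suppose, would be to accept an optional bool parameter specifying whether the method should cache the order of the first enumeration and then loop over that.

Dan: If you need the same results from the loop every time, you could just use x.Memoize().Loop(), but of course that means you need a Memoize function first. I agree that an optional bool would be fine, or a separate method like LoopSame or LoopDeterministic.

[Answer 12]:

MinElement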

Min only returns the minimum value produced by the specified expression, not the original element that gave this minimum value.

/// <summary>Returns the first element from the input sequence for which the
/// value selector returns the smallest value.</summary>
public static T MinElement<T, TValue>(this IEnumerable<T> source,
        Func<T, TValue> valueSelector) where TValue : IComparable<TValue>
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (valueSelector == null)
        throw new ArgumentNullException("valueSelector");
    using (var enumerator = source.GetEnumerator())
    {
        if (!enumerator.MoveNext())
            throw new InvalidOperationException("source contains no elements.");
        T minElem = enumerator.Current;
        TValue minValue = valueSelector(minElem);
        while (enumerator.MoveNext())
        {
            TValue value = valueSelector(enumerator.Current);
            if (value.CompareTo(minValue) < 0)
            {
                minValue = value;
                minElem = enumerator.Current;
            }
        }
        return minElem;
    }
}
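
To make the contract concrete, the same result can be sketched with the built-in Aggregate operator. This is only an illustration under the assumption that calling the selector more than once per element is acceptable (the enumerator version above calls it exactly once); the name `MinElementViaAggregate` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MinSketch
{
    // Hypothetical name: keeps the current best element and replaces it only
    // when a strictly smaller key appears, so the first minimum wins on ties.
    public static T MinElementViaAggregate<T, TValue>(this IEnumerable<T> source,
        Func<T, TValue> valueSelector) where TValue : IComparable<TValue>
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (valueSelector == null)
            throw new ArgumentNullException("valueSelector");
        // Aggregate without a seed throws InvalidOperationException on an empty source.
        return source.Aggregate((best, next) =>
            valueSelector(next).CompareTo(valueSelector(best)) < 0 ? next : best);
    }
}
```

For example, `new[] { "pear", "fig", "apple" }.MinElementViaAggregate(s => s.Length)` picks the shortest string rather than the shortest length.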

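[Discussion]:

It would be better to have valueSelector return an IComparable, or to change valueSelector to something like Func<T, T, bool> lessThan, so that things like strings or decimals can be compared.

[Answer 13]:

IndexOf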

/// <summary>
/// Returns the index of the first element in this <paramref name="source"/>
/// satisfying the specified <paramref name="condition"/>. If no such elements
/// are found, returns -1.
/// </summary>
public static int IndexOf<T>(this IEnumerable<T> source, Func<T, bool> condition)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (condition == null)
        throw new ArgumentNullException("condition");
    int index = 0;
    foreach (var v in source)
    {
        if (condition(v))
            return index;
        index++;
    }
    return -1;
}

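[Discussion]:

You should call this FindIndex to match the methods on List<T> and Array that do the same thing. I'd also consider checking whether source is one of those types that already implements it, and calling the native FindIndex function (although that won't make much difference to performance, since you don't have the overload that takes a starting index).

[Answer 14]:

Chunks

Returns chunks of a specific size. x.Chunks(2) of 1,2,3,4,5 will return two arrays, with 1,2 and 3,4. x.Chunks(2,true) will return 1,2, 3,4 and 5.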

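public static IEnumerable<T[]> Chunks<T>(this IEnumerable<T> xs, int size, bool returnRest = false)
{
    var curr = new T[size];

    int i = 0;

    foreach (var x in xs)
    {
        if (i == size)
        {
            yield return curr;
            i = 0;
            curr = new T[size];
        }

        curr[i++] = x;
    }

    // A final chunk that filled up completely is always returned;
    // a partial one is returned only when returnRest is set.
    if (i == size)
        yield return curr;
    else if (returnRest && i > 0)
        yield return curr.Take(i).ToArray();
}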
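
For comparison, chunking can also be sketched with built-in operators only, by grouping elements on their index divided by the chunk size. This is a swapped-in technique, not the author's: it always includes the trailing partial chunk (like returnRest = true), and GroupBy buffers the whole input before yielding anything. The name `ChunksViaGroupBy` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkSketch
{
    // Hypothetical name: elements with index 0..size-1 get key 0,
    // size..2*size-1 get key 1, and so on.
    public static IEnumerable<T[]> ChunksViaGroupBy<T>(this IEnumerable<T> source, int size)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (size <= 0)
            throw new ArgumentOutOfRangeException("size");
        return source
            .Select((item, index) => new { item, index })
            .GroupBy(pair => pair.index / size)
            .Select(group => group.Select(pair => pair.item).ToArray());
    }
}
```

The streaming implementation above stays preferable for long or infinite sequences, since it only ever holds one chunk in memory.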

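[Discussion]:

@Timwi Thanks for pointing that out. I usually use a version that returns lists, but I changed it to return arrays, which I think is the more common choice. Correcting it now :)

Another name for this is Buffer.

Why are you returning an array? I would prefer IEnumerable<IEnumerable<T>>.

@Nappy: It is an IEnumerable<IEnumerable<T>>.

@Nappy A pure IEnumerable version would be slower (at least for small chunks). But as mentioned, I have used many versions of the above. And you can always use the array as an IEnumerable if you like, since arrays implement IEnumerable.

[Answer 15]: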

ToHashSet

public static HashSet<T> ToHashSet<T>(this IEnumerable<T> items)
{
    return new HashSet<T>(items);
}

[Discussion]:

I would have it return an ISet and call it ToSet. That hides the internal implementation better.

[Answer 16]:

FirstOrDefault with a default value specified

/// <summary>
/// Returns the first element of a sequence, or a default value if the
/// sequence contains no elements.
/// </summary>
/// <typeparam name="T">The type of the elements of
/// <paramref name="source"/>.</typeparam>
/// <param name="source">The <see cref="IEnumerable&lt;T&gt;"/> to return
/// the first element of.</param>
/// <param name="default">The default value to return if the sequence contains
/// no elements.</param>
/// <returns><paramref name="default"/> if <paramref name="source"/> is empty;
/// otherwise, the first element in <paramref name="source"/>.</returns>
public static T FirstOrDefault<T>(this IEnumerable<T> source, T @default)
{
    if (source == null)
        throw new ArgumentNullException("source");
    using (var e = source.GetEnumerator())
    {
        if (!e.MoveNext())
            return @default;
        return e.Current;
    }
}
    


/// <summary>
/// Returns the first element of a sequence, or a default value if the sequence
/// contains no elements.
/// </summary>
/// <typeparam name="T">The type of the elements of
/// <paramref name="source"/>.</typeparam>
/// <param name="source">The <see cref="IEnumerable&lt;T&gt;"/> to return
/// the first element of.</param>
/// <param name="predicate">A function to test each element for a
/// condition.</param>
/// <param name="default">The default value to return if the sequence contains
/// no elements.</param>
/// <returns><paramref name="default"/> if <paramref name="source"/> is empty
/// or if no element passes the test specified by <paramref name="predicate"/>;
/// otherwise, the first element in <paramref name="source"/> that passes
/// the test specified by <paramref name="predicate"/>.</returns>
public static T FirstOrDefault<T>(this IEnumerable<T> source,
    Func<T, bool> predicate, T @default)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (predicate == null)
        throw new ArgumentNullException("predicate");
    using (var e = source.GetEnumerator())
    {
        while (true)
        {
            if (!e.MoveNext())
                return @default;
            if (predicate(e.Current))
                return e.Current;
        }
    }
}
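
The same semantics can be sketched with the built-in DefaultIfEmpty operator, which substitutes a fallback element when the sequence is empty. The name `FirstOr` is borrowed from a suggestion in this answer's comments; everything else here is an illustration, not the author's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstOrSketch
{
    // First element, or fallback when source has no elements.
    public static T FirstOr<T>(this IEnumerable<T> source, T fallback)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        return source.DefaultIfEmpty(fallback).First();
    }

    // First element matching predicate, or fallback when none matches.
    public static T FirstOr<T>(this IEnumerable<T> source,
        Func<T, bool> predicate, T fallback)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (predicate == null)
            throw new ArgumentNullException("predicate");
        return source.Where(predicate).DefaultIfEmpty(fallback).First();
    }
}
```

Because First stops after one element, both overloads remain lazy and read no more of the source than the enumerator-based versions do.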

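[Discussion]:

It just occurred to me that this could be called FirstOr, as in val = list.FirstOr(defaultVal).

I prefer the name FirstOrFallback.

[Answer 17]:

InsertBetween

Inserts an element in between every pair of consecutive elements.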

/// <summary>Inserts the specified item in between each element in the input
/// collection.</summary>
/// <param name="source">The input collection.</param>
/// <param name="extraElement">The element to insert between each consecutive
/// pair of elements in the input collection.</param>
/// <returns>A collection containing the original collection with the extra
/// element inserted. For example, new[] { 1, 2, 3 }.InsertBetween(0) returns
/// { 1, 0, 2, 0, 3 }.</returns>
public static IEnumerable<T> InsertBetween<T>(
    this IEnumerable<T> source, T extraElement)
{
    return source.SelectMany(val => new[] { extraElement, val }).Skip(1);
}

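[Discussion]:

Although this doesn't look too complicated, I can't see a use case for it. Can you name some? Thanks.

[Answer 18]:

EmptyIfNull

This is a controversial one; I am sure many purists will object to an "instance method" succeeding on null.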

/// <summary>
/// Returns an IEnumerable<T> as is, or an empty IEnumerable<T> if it is null
/// </summary>
public static IEnumerable<T> EmptyIfNull<T>(this IEnumerable<T> source)
{
    return source ?? Enumerable.Empty<T>();
}

Usage:

foreach (var item in myEnumerable.EmptyIfNull())
{
    Console.WriteLine(item);
}

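[Discussion]:

[Answer 19]:

Parse

This one involves a custom delegate (I could have used an IParser<T> interface instead, but I went with a delegate as it was simpler), which is used to parse a sequence of strings into a sequence of values, skipping the elements where parsing fails.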

public delegate bool TryParser<T>(string text, out T value);

public static IEnumerable<T> Parse<T>(this IEnumerable<string> source,
                                      TryParser<T> parser)
{
    source.ThrowIfNull("source");
    parser.ThrowIfNull("parser");

    foreach (string str in source)
    {
        T value;
        if (parser(str, out value))
        {
            yield return value;
        }
    }
}

Usage:

var strings = new[] { "1", "2", "H3llo", "4", "five", "6", "se7en" };
var numbers = strings.Parse<int>(int.TryParse);

foreach (int x in numbers)
{
    Console.WriteLine(x);
}

Output:

1 2 4 6

Naming is tricky for this one. I'm not sure whether Parse is the best option (it is simple, at least), or whether something like ParseWhereValid would be better.

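[Discussion]:

TryParse or ParseWhereValid would fit best, imo. ;)

@Nappy: Yeah, I like TryParse; my only concern is that someone might expect it to return a bool and populate an out IEnumerable<T> argument instead (returning true only if every item was parsed). Maybe ParseWhereValid is best.

[Answer 20]:

ZipMerge

This is my version of Zip, which works like a real zipper. It does not project two values into one, but returns a combined IEnumerable. Overloads that skip the right and/or left tail are possible.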

public static IEnumerable<TSource> ZipMerge<TSource>(
        this IEnumerable<TSource> first,
        IEnumerable<TSource> second)
{
    using (var secondEnumerator = second.GetEnumerator())
    {
        foreach (var item in first)
        {
            yield return item;

            if (secondEnumerator.MoveNext())
                yield return secondEnumerator.Current;
        }

        while (secondEnumerator.MoveNext())
            yield return secondEnumerator.Current;
    }
}

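[Discussion]:

Useful, but it should perhaps be called something other than the built-in Zip? (I know the parameters are enough to distinguish them, but in the interest of code readability...)

@Timwi Any suggestions for another name? Maybe ZipMerge?

Wouldn't this skip the first element of second, or am I misreading the code?

@Aistina: I think you have a misunderstanding of MoveNext. The first call tells you whether the enumerator is empty or not. Current then contains the first element.

@realbart That's not true: "If MoveNext passes the end of the collection, the enumerator is positioned after the last element in the collection and MoveNext returns false. When the enumerator is at this position, subsequent calls to MoveNext also return false until Reset is called." See msdn.microsoft.com/library/…

[Answer 21]:

RandomSample

Here's a simple function that is useful if you have a medium-to-large set of data (say, over 100 items) and you want to eyeball just a random sampling of it.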

public static IEnumerable<T> RandomSample<T>(this IEnumerable<T> source,
                                             double percentage)
{
    source.ThrowIfNull("source");

    var r = new Random();
    return source.Where(x => (r.NextDouble() * 100.0) < percentage);
}

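Usage:

List<DataPoint> data = GetData();

// Sample roughly 3% of the data
var sample = data.RandomSample(3.0);

// Verify results were correct for this sample
foreach (DataPoint point in sample)
{
    Console.WriteLine("{0} => {1}", point, DoCalculation(point));
}

Notes:

Not really appropriate for tiny collections, as the number of returned items is probabilistic (it could easily return zero on a small sequence).

Not really appropriate for huge collections or database queries, as it involves enumerating every item in the sequence.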
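
A commenter on this answer suggests asking for a fixed number of random elements instead of a percentage, by ordering on a random key and taking the first few. A minimal sketch of that idea follows; the name `RandomTake` and the explicit Random parameter are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SampleSketch
{
    // Hypothetical name: sorts by a random key, then keeps `count` items.
    // OrderBy buffers and sorts the whole input, so this costs O(n log n).
    public static IEnumerable<T> RandomTake<T>(this IEnumerable<T> source,
        int count, Random random)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (random == null)
            throw new ArgumentNullException("random");
        return source.OrderBy(item => random.NextDouble()).Take(count);
    }
}
```

Unlike RandomSample, this returns exactly `count` items whenever the source has at least that many, at the price of sorting the entire sequence.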

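[Discussion]:

Interesting, though generally it's more useful to ask for X random elements than to say "give me approximately X% of the elements at random". To do that, you should do something like this: source.OrderBy(r.NextDouble()).Take(x);

[Answer 22]:

AssertCount

Efficiently determines whether an IEnumerable<T> contains at least / exactly / at most a certain number of elements.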

public enum CountAssertion
{
    AtLeast,
    Exact,
    AtMost
}

/// <summary>
/// Asserts that the number of items in a sequence matching a specified predicate satisfies a specified CountAssertion.
/// </summary>
public static bool AssertCount<T>(this IEnumerable<T> source, int countToAssert, CountAssertion assertion, Func<T, bool> predicate)
{
    if (source == null)
        throw new ArgumentNullException("source");

    if (predicate == null)
        throw new ArgumentNullException("predicate");

    return source.Where(predicate).AssertCount(countToAssert, assertion);
}


/// <summary>
/// Asserts that the number of elements in a sequence satisfies a specified CountAssertion.
/// </summary>
public static bool AssertCount<T>(this IEnumerable<T> source, int countToAssert, CountAssertion assertion)
{
    if (source == null)
        throw new ArgumentNullException("source");

    if (countToAssert < 0)
        throw new ArgumentOutOfRangeException("countToAssert");

    switch (assertion)
    {
        case CountAssertion.AtLeast:
            return AssertCountAtLeast(source, GetFastCount(source), countToAssert);

        case CountAssertion.Exact:
            return AssertCountExact(source, GetFastCount(source), countToAssert);

        case CountAssertion.AtMost:
            return AssertCountAtMost(source, GetFastCount(source), countToAssert);

        default:
            throw new ArgumentException("Unknown CountAssertion.", "assertion");
    }
}



private static int? GetFastCount<T>(IEnumerable<T> source)
{
    var genericCollection = source as ICollection<T>;
    if (genericCollection != null)
        return genericCollection.Count;

    var collection = source as ICollection;
    if (collection != null)
        return collection.Count;

    return null;
}


private static bool AssertCountAtMost<T>(IEnumerable<T> source, int? fastCount, int countToAssert)
{
    if (fastCount.HasValue)
        return fastCount.Value <= countToAssert;

    int countSoFar = 0;

    foreach (var item in source)
    {
        if (++countSoFar > countToAssert) return false;
    }

    return true;
}


private static bool AssertCountExact<T>(IEnumerable<T> source, int? fastCount, int countToAssert)
{
    if (fastCount.HasValue)
        return fastCount.Value == countToAssert;

    int countSoFar = 0;

    foreach (var item in source)
    {
        if (++countSoFar > countToAssert) return false;
    }

    return countSoFar == countToAssert;
}


private static bool AssertCountAtLeast<T>(IEnumerable<T> source, int? fastCount, int countToAssert)
{
    if (countToAssert == 0)
        return true;

    if (fastCount.HasValue)
        return fastCount.Value >= countToAssert;

    int countSoFar = 0;

    foreach (var item in source)
    {
        if (++countSoFar >= countToAssert) return true;
    }

    return false;
}

Usage:

var nums = new[] { 45, -4, 35, -12, 46, -98, 11 };
bool hasAtLeast3Positive = nums.AssertCount(3, CountAssertion.AtLeast, i => i > 0); //true
bool hasAtMost1Negative = nums.AssertCount(1, CountAssertion.AtMost, i => i < 0); //false
bool hasExactly2Negative = nums.AssertCount(2, CountAssertion.Exact, i => i < 0); //false

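[Discussion]:

@Timwi: Unfortunately, if Enumerable.Count<T>() can't determine the count that way, it will enumerate the entire sequence. That would defeat the point of this extension, which can fail fast and pass fast.

You should include something like CountAtMost, so that I can change my action based on whether there are no, one, or several elements without having to create 3 enumerators or count all the elements.

@Gabe: Sorry, I didn't quite understand that. There is a CountAssertion.AtMost. Or am I missing something?

By the way, I don't like the Assertion nomenclature, because to me the word Assert implies that an exception is thrown when the condition is false.

Ani: I would probably have 3 different methods: CountIs, CountIsAtMost, and CountIsAtLeast.

[Answer 23]:

Window

Enumerates arrays ("windows") of length size containing the most recent values. { 0, 1, 2, 3 } becomes { [0, 1], [1, 2], [2, 3] }.

I am using this, for example, to draw a line graph by connecting pairs of points.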

public static IEnumerable<TSource[]> Window<TSource>(
    this IEnumerable<TSource> source)
{
    return source.Window(2);
}

public static IEnumerable<TSource[]> Window<TSource>(
    this IEnumerable<TSource> source, int size)
{
    if (size <= 0)
        throw new ArgumentOutOfRangeException("size");

    return source.Skip(size).WindowHelper(size, source.Take(size));
}

private static IEnumerable<TSource[]> WindowHelper<TSource>(
    this IEnumerable<TSource> source, int size, IEnumerable<TSource> init)
{
    Queue<TSource> q = new Queue<TSource>(init);

    yield return q.ToArray();

    foreach (var value in source)
    {
        q.Dequeue();
        q.Enqueue(value);
        yield return q.ToArray();
    }
}
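
For the common size-2 case (connecting consecutive points), a similar result can be sketched with the built-in Zip operator by pairing the sequence with itself shifted by one. This is an illustration with a hypothetical name, `Pairwise`; like the code above it enumerates the source twice, but for a source with fewer than two elements it yields nothing, whereas Window appears to yield one short array.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WindowSketch
{
    // Hypothetical name: pairs each element with its successor.
    public static IEnumerable<T[]> Pairwise<T>(this IEnumerable<T> source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        return source.Zip(source.Skip(1), (current, next) => new[] { current, next });
    }
}
```

Applied to `new[] { 0, 1, 2, 3 }`, this enumerates [0, 1], [1, 2], [2, 3], matching the example in the answer.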

[Discussion]:

[Answer 24]:

One, Two, MoreThanOne, AtLeast, AnyAtAll

public static bool One<T>(this IEnumerable<T> enumerable)
{
    using (var enumerator = enumerable.GetEnumerator())
        return enumerator.MoveNext() && !enumerator.MoveNext();
}

public static bool Two<T>(this IEnumerable<T> enumerable)
{
    using (var enumerator = enumerable.GetEnumerator())
        return enumerator.MoveNext() && enumerator.MoveNext() && !enumerator.MoveNext();
}

public static bool MoreThanOne<T>(this IEnumerable<T> enumerable)
{
    return enumerable.Skip(1).Any();
}

public static bool AtLeast<T>(this IEnumerable<T> enumerable, int count)
{
    using (var enumerator = enumerable.GetEnumerator())
        for (var i = 0; i < count; i++)
            if (!enumerator.MoveNext())
                return false;
    return true;
}

public static bool AnyAtAll<T>(this IEnumerable<T> enumerable)
{
    return enumerable != null && enumerable.Any();
}

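[Discussion]:

Don't forget to wrap the enumerators in using statements.

I have taken the liberty of removing ToEnumerable, because it didn't seem to be related to the others and it had already been posted in another answer anyway.

Well, Timwi, if you had examined the implementation of ToEnumerable more closely, you would have found that they were not the same. My version takes an arbitrary set and turns it into an enumerable. That is not the same function someone else posted. But sure... go ahead and take liberties.

+1 I like these methods. I have similar methods defined in my own utils library. IMO AnyAtAll would be easier to understand if you inverted it and called it IsNullOrEmpty.

I have a very similar "One" function in my own extension method library. It could also be useful to overload these to accept projections, just like Any, First, etc. do.

[Answer 25]:

SkipLast & TakeLast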

/// <summary>
/// Enumerates the items of this collection, skipping the last
/// <paramref name="count"/> items. Note that the memory usage of this method
/// is proportional to <paramref name="count"/>, but the source collection is
/// only enumerated once, and in a lazy fashion. Also, enumerating the first
/// item will take longer than enumerating subsequent items.
/// </summary>
public static IEnumerable<T> SkipLast<T>(this IEnumerable<T> source, int count)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (count < 0)
        throw new ArgumentOutOfRangeException("count",
            "count cannot be negative.");
    if (count == 0)
        return source;
    return skipLastIterator(source, count);
}

private static IEnumerable<T> skipLastIterator<T>(IEnumerable<T> source,
    int count)
{
    var queue = new T[count];
    int headtail = 0; // tail while we're still collecting, both head & tail
                      // afterwards because the queue becomes completely full
    int collected = 0;

    foreach (var item in source)
    {
        if (collected < count)
        {
            queue[headtail] = item;
            headtail++;
            collected++;
        }
        else
        {
            if (headtail == count) headtail = 0;
            yield return queue[headtail];
            queue[headtail] = item;
            headtail++;
        }
    }
}


/// <summary>
/// Returns a collection containing only the last <paramref name="count"/>
/// items of the input collection. This method enumerates the entire
/// collection to the end once before returning. Note also that the memory
/// usage of this method is proportional to <paramref name="count"/>.
/// </summary>
public static IEnumerable<T> TakeLast<T>(this IEnumerable<T> source, int count)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (count < 0)
        throw new ArgumentOutOfRangeException("count",
            "count cannot be negative.");
    if (count == 0)
        return new T[0];

    var queue = new Queue<T>(count + 1);
    foreach (var item in source)
    {
        if (queue.Count == count)
            queue.Dequeue();
        queue.Enqueue(item);
    }
    return queue.AsEnumerable();
}
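
The comment on this answer observes that the TakeLast loop can serve as SkipLast if each dequeued item is yielded instead of discarded. A rough sketch of that variation, trading the hand-rolled ring buffer for a Queue<T>; the name `SkipLastViaQueue` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipLastSketch
{
    // Hypothetical name: validates eagerly, then defers to the iterator.
    public static IEnumerable<T> SkipLastViaQueue<T>(this IEnumerable<T> source, int count)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (count < 0)
            throw new ArgumentOutOfRangeException("count", "count cannot be negative.");
        if (count == 0)
            return source;
        return SkipLastIterator(source, count);
    }

    private static IEnumerable<T> SkipLastIterator<T>(IEnumerable<T> source, int count)
    {
        // The queue always holds the most recent `count` items; an item is
        // released only once `count` newer items have been seen after it.
        var queue = new Queue<T>(count + 1);
        foreach (var item in source)
        {
            if (queue.Count == count)
                yield return queue.Dequeue();
            queue.Enqueue(item);
        }
    }
}
```

Whatever is left in the queue when the source ends is exactly the skipped tail, so memory use stays proportional to `count`.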

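[Discussion]:

I believe your implementation of TakeLast could be used for SkipLast, with yield return queue.Dequeue();.

[Answer 26]:

Duplicates

Used in conjunction with a method like Ani's AssertCount (I use one called CountAtLeast), it becomes very easy to find the elements in a sequence that appear more than once: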

public static IEnumerable<T> Duplicates<T, TKey>(this IEnumerable<T> source,
    Func<T, TKey> keySelector = null, IEqualityComparer<TKey> comparer = null)
{
    source.ThrowIfNull("source");
    // Default selector is the identity; the cast succeeds only when each
    // element is actually a TKey.
    keySelector = keySelector ?? new Func<T, TKey>(x => (TKey)(object)x);
    comparer = comparer ?? EqualityComparer<TKey>.Default;

    return source.GroupBy(keySelector, comparer)
        .Where(g => g.CountAtLeast(2))
        .SelectMany(g => g);
}
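
As the discussion on this answer notes, the CountAtLeast(2) check can be written with built-in operators as Skip(1).Any(). A self-contained sketch along those lines, with a required key selector to keep it short; the name `DuplicatesBy` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicatesSketch
{
    // Hypothetical name: returns every element whose key occurs more than once.
    public static IEnumerable<T> DuplicatesBy<T, TKey>(this IEnumerable<T> source,
        Func<T, TKey> keySelector)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (keySelector == null)
            throw new ArgumentNullException("keySelector");
        return source.GroupBy(keySelector)
            .Where(group => group.Skip(1).Any()) // at least two members
            .SelectMany(group => group);
    }
}
```

Groups come back in order of first appearance, so `new[] { 1, 2, 2, 3, 3, 3 }.DuplicatesBy(x => x)` enumerates 2, 2, 3, 3, 3.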

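[Discussion]:

I think you could write g.CountAtLeast(2) in "built-in" LINQ as g.Skip(1).Any().

@Timwi: That's exactly how I wrote it ;) Several of the extension methods I use are really just very thin wrappers around functionality that could already be written concisely (another example: SkipNulls(), which is just Where(x => x != null)). I use them not because they do all that much extra, but because I find they make code that much more readable (and wrapping a couple of method calls into one where that suits code reuse isn't so bad, anyway).

Dan: SkipNulls<T>() is really just OfType<T>().

@Gabe: You could make it OfType<T> or Where<T>; either way it's just a trivial wrapper. My point was just that the name SkipNulls makes it a bit more purposeful.

[Answer 27]:

WhereIf

Optional Where clause on IEnumerable and IQueryable. Avoids if statements when building predicates and lambdas for a query. Useful when you don't know at compile time whether a filter should apply.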

public static IEnumerable<TSource> WhereIf<TSource>(
            this IEnumerable<TSource> source, bool condition,
            Func<TSource, bool> predicate)
{
    return condition ? source.Where(predicate) : source;
}

Usage:

var custs = Customers.WhereIf(someBool, x=>x.EyeColor=="Green");

LINQ WhereIf at ExtensionMethod.NET, borrowed from Andrew's blog.

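[Discussion]:

Interesting. You could use this to link checkboxes to filters in an AJAX-ish search results page; mySearchResults.WhereIf(chkShowOnlyUnapproved.Checked, x=>!x.IsApproved)

[Answer 28]:

ToList and ToDictionary with Initial Capacity

ToList and ToDictionary overloads that expose the underlying collection classes' initial capacity. Occasionally useful when the source length is known or bounded.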

public static List<TSource> ToList<TSource>(
    this IEnumerable<TSource> source,
    int capacity)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    var list = new List<TSource>(capacity);
    list.AddRange(source);
    return list;
}

public static Dictionary<TKey, TSource> ToDictionary<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    int capacity,
    IEqualityComparer<TKey> comparer = null)
{
    return source.ToDictionary<TSource, TKey, TSource>(
                  keySelector, x => x, capacity, comparer);
}

public static Dictionary<TKey, TElement> ToDictionary<TSource, TKey, TElement>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    Func<TSource, TElement> elementSelector,
    int capacity,
    IEqualityComparer<TKey> comparer = null)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (keySelector == null)
    {
        throw new ArgumentNullException("keySelector");
    }
    if (elementSelector == null)
    {
        throw new ArgumentNullException("elementSelector");
    }
    var dictionary = new Dictionary<TKey, TElement>(capacity, comparer);
    foreach (TSource local in source)
    {
        dictionary.Add(keySelector(local), elementSelector(local));
    }
    return dictionary;
}

[Discussion]:

[Answer 29]:

CountUpTo

static int CountUpTo<T>(this IEnumerable<T> source, int maxCount)
{
    if (maxCount == 0)
        return 0;

    var genericCollection = source as ICollection<T>;
    if (genericCollection != null)
        return Math.Min(maxCount, genericCollection.Count);

    var collection = source as ICollection;
    if (collection != null)
        return Math.Min(maxCount, collection.Count);

    int count = 0;
    foreach (T item in source)
        if (++count >= maxCount)
            break;
    return count;
}
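
The comment on this answer asks whether this is the same as Take(maxCount).Count(). For non-negative maxCount the results agree, as the sketch below illustrates; the hand-written version mainly adds the ICollection shortcut, which Take may or may not provide depending on the runtime. The name `CountUpToViaTake` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountSketch
{
    // Hypothetical name: counts elements but stops after maxCount of them.
    // Assumes maxCount >= 0.
    public static int CountUpToViaTake<T>(this IEnumerable<T> source, int maxCount)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        return source.Take(maxCount).Count();
    }
}
```

Because Take is lazy, enumeration stops as soon as maxCount elements have been seen, so this also works on very long sequences.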

[Discussion]:

This is semantically the same as collection.Take(maxCount).Count(), right?

[Answer 30]:

Coalesce

public static T Coalesce<T>(this IEnumerable<T> items)
{
    return items.Where(x => x != null && !x.Equals(default(T))).FirstOrDefault();
    // return items.OfType<T>().FirstOrDefault(); // Gabe's take
}

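[Discussion]:

Since you're actually using it, let me simplify it for you.

No, thanks. Since your version only works with object types (not DateTime, etc.), and your usage of OfType isn't typical, since most people know it as a casting method and not a non-null filter, and since it doesn't add anything in terms of readability or performance, I'll stick with my original.

But OfType does work with DateTime, doesn't it? Also, I do think of OfType as a non-null filter method; how else would it work? Maybe it's just me and Gabe...

OfType works with value types, sure. But Gabe's usage changed the Coalesce method so that it doesn't work the way I want it to for value types. As far as OfType goes, I think of it more for heterogeneous or polymorphic collections that you want to filter by type (msdn.microsoft.com/en-us/library/bb360913.aspx). The MSDN article doesn't even mention filtering out nulls.

Made a classic mistake when converting from VB to C#... Nothing is not equivalent to null... had to modify the C# implementation.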
