Alphanumeric sorting using LINQ

【Posted】: 2011-07-02 21:43:22 【Question】:

I have a string[] in which every element ends with a numeric value.

string[] partNumbers = new string[]
{
    "ABC10", "ABC1", "ABC2", "ABC11", "ABC10", "AB1", "AB2", "Ab11"
};

I am trying to sort this array with LINQ as follows, but I am not getting the result I expect.

var result = partNumbers.OrderBy(x => x);

Actual result:

AB1 Ab11 AB2 ABC1 ABC10 ABC10 ABC11 ABC2

Expected result:

AB1 AB2 AB11 ..

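【Comments】:

A helpful article comparing alphanumeric sorting (the expected result) with ASCII sorting (the actual result).

【Answer 1】:

This happens because the default ordering for strings is standard lexicographic (dictionary) ordering. Strings are compared character by character from left to right, so ABC11 sorts before ABC2: the comparison is decided at the fourth character, where '1' precedes '2'.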
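As a small illustration of that point (a sketch added for clarity, not part of the original answer), the snippet below compares the two strings with string.CompareOrdinal. Ordinal comparison is used here only to keep the sketch culture-independent; the class and method names are made up for the example.

```csharp
using System;

public static class LexicographicDemo
{
    // True when a sorts before b under plain character-by-character (ordinal) comparison.
    public static bool SortsBefore(string a, string b) => string.CompareOrdinal(a, b) < 0;

    public static void Main()
    {
        // '1' (U+0031) precedes '2' (U+0032) at index 3, so the comparison is decided there
        // and the remaining characters of "ABC11" are never looked at.
        Console.WriteLine(SortsBefore("ABC11", "ABC2")); // True

        // Numerically 2 < 11, which is the order the question asks for.
        Console.WriteLine(2 < 11); // True
    }
}
```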

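To get the order you want, pad the numeric portions of the key in your OrderBy clause, for example:

 var result = partNumbers.OrderBy(x => PadNumbers(x));

where PadNumbers could be defined as:

// Requires: using System.Text.RegularExpressions;
public static string PadNumbers(string input)
{
    return Regex.Replace(input, "[0-9]+", match => match.Value.PadLeft(10, '0'));
}

This pads every number (run of digits) in the input string with leading zeros, so that OrderBy sees: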

ABC0000000010
ABC0000000001
...
AB0000000011

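The padding is applied only to the key used for comparison. The original strings (without padding) are preserved in the result.

Note that this approach assumes a maximum number of digits for the numbers in the input (10 in this example); longer digit runs would no longer sort correctly.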
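Putting the pieces of this answer together, a minimal end-to-end sketch might look like the following. The class name PaddingSortDemo and the SortParts helper are hypothetical, and StringComparer.OrdinalIgnoreCase is an assumption made here so the output does not depend on the current culture (the answer itself uses the default comparer).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PaddingSortDemo
{
    // Pads every run of digits to a fixed width so that plain string comparison
    // orders the numbers by value. The width of 10 is the answer's assumption.
    public static string PadNumbers(string input) =>
        Regex.Replace(input, "[0-9]+", match => match.Value.PadLeft(10, '0'));

    // Sorts by the padded key; the original, unpadded strings are what gets returned.
    public static string[] SortParts(string[] parts) =>
        parts.OrderBy(p => PadNumbers(p), StringComparer.OrdinalIgnoreCase).ToArray();

    public static void Main()
    {
        string[] partNumbers = { "ABC10", "ABC1", "ABC2", "ABC11", "ABC10", "AB1", "AB2", "Ab11" };
        Console.WriteLine(string.Join(" ", SortParts(partNumbers)));
        // Prints: AB1 AB2 Ab11 ABC1 ABC2 ABC10 ABC10 ABC11
    }
}
```

Because OrderBy is a stable sort, the two "ABC10" entries keep their relative order.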

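【Comments】:

@geek: There is no predefined function with that name. I suggest using a regular expression or something similar to implement a function with the behavior I described. The function name was only meant for illustration.

I went ahead and added a simple function to do the padding.

This saved me time.

Finally, a short solution. Works great, thanks a lot!

Works perfectly.

【Answer 2】:

If you want to sort a list of objects by a particular property using LINQ and a custom comparer (such as Dave Koelle's), you can do something like this: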

...

items = items.OrderBy(x => x.property, new AlphanumComparator()).ToList();

...

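You will also have to change Dave's class to implement System.Collections.Generic.IComparer<object> instead of the non-generic IComparer, so the class signature becomes:

...

public class AlphanumComparator : System.Collections.Generic.IComparer<object>
{
    ...
}

Personally, I prefer James McCormack's implementation because it implements IDisposable, although my benchmarks show that it is slightly slower.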
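To make the OrderBy-with-comparer pattern concrete, here is a self-contained sketch. It is not Dave Koelle's or James McCormack's code: NaturalStringComparer, Part, and ComparerDemo are hypothetical names, and the comparer is a deliberately small one that implements IComparer<string> directly, compares ASCII digit runs by value, and ignores case elsewhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type standing in for the answer's "items" with a string property.
public sealed class Part
{
    public string Code { get; set; }
}

// Hypothetical minimal comparer: digit runs are compared by numeric value,
// all other characters one by one, ignoring case.
public sealed class NaturalStringComparer : IComparer<string>
{
    private static bool IsAsciiDigit(char c) => c >= '0' && c <= '9';

    public int Compare(string x, string y)
    {
        if (ReferenceEquals(x, y)) return 0;
        if (x == null) return -1;
        if (y == null) return 1;

        int i = 0, j = 0;
        while (i < x.Length && j < y.Length)
        {
            if (IsAsciiDigit(x[i]) && IsAsciiDigit(y[j]))
            {
                int startX = i, startY = j;
                while (i < x.Length && IsAsciiDigit(x[i])) i++;
                while (j < y.Length && IsAsciiDigit(y[j])) j++;

                // Drop leading zeros; a longer run is then the larger number, and
                // runs of equal length compare correctly as ordinal text.
                string runX = x.Substring(startX, i - startX).TrimStart('0');
                string runY = y.Substring(startY, j - startY).TrimStart('0');
                int cmp = runX.Length != runY.Length
                    ? runX.Length.CompareTo(runY.Length)
                    : string.CompareOrdinal(runX, runY);
                if (cmp != 0) return cmp;
            }
            else
            {
                int cmp = char.ToUpperInvariant(x[i]).CompareTo(char.ToUpperInvariant(y[j]));
                if (cmp != 0) return cmp;
                i++;
                j++;
            }
        }
        // Whichever string still has characters left sorts last.
        return (x.Length - i).CompareTo(y.Length - j);
    }
}

public static class ComparerDemo
{
    // Sorts items by their Code property using the custom comparer.
    public static List<string> SortCodes(IEnumerable<Part> items) =>
        items.OrderBy(p => p.Code, new NaturalStringComparer())
             .Select(p => p.Code)
             .ToList();
}
```

Note that this sketch treats "AB1" and "ab1" (and "AB01" and "AB1") as equal, so such entries simply keep their input order.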

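【Answer 3】:

You can get fast, good results with P/Invoke:

// Requires: using System.Collections.Generic; using System.Runtime.InteropServices;
class AlphanumericComparer : IComparer<string>
{
    // StrCmpLogicalW lives in shlwapi.dll, so this comparer is Windows-only.
    [DllImport("shlwapi.dll", CharSet = CharSet.Unicode)]
    static extern int StrCmpLogicalW(string s1, string s2);

    public int Compare(string x, string y) => StrCmpLogicalW(x, y);
}

You can use it in the same way as the AlphanumComparatorFast shown in another answer.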

【Answer 4】:

You can P/Invoke StrCmpLogicalW (a Windows function) to do this. See: Natural Sort Order in C#

【Answer 5】:
// Requires: using System; using System.Collections; using System.Collections.Generic;
public class AlphanumComparatorFast : IComparer
{
    // Splits a string into alternating runs of digits and non-digits. An apostrophe
    // also starts a new run; characters that are neither letters nor digits are dropped.
    List<string> GetList(string s1)
    {
        List<string> SB1 = new List<string>();
        string st1 = "";
        // Guard against empty input, where s1[0] would throw.
        bool flag = s1.Length > 0 && char.IsDigit(s1[0]);
        foreach (char c in s1)
        {
            if (flag != char.IsDigit(c) || c == '\'')
            {
                if (st1 != "")
                    SB1.Add(st1);
                st1 = "";
                flag = char.IsDigit(c);
            }
            if (char.IsDigit(c))
            {
                st1 += c;
            }
            if (char.IsLetter(c))
            {
                st1 += c;
            }
        }
        SB1.Add(st1);
        return SB1;
    }

    public int Compare(object x, object y)
    {
        string s1 = x as string;
        if (s1 == null)
        {
            return 0;
        }
        string s2 = y as string;
        if (s2 == null)
        {
            return 0;
        }
        if (s1 == s2)
        {
            return 0;
        }

        // Split both strings into runs and pad the shorter list with empty runs.
        List<string> str1 = GetList(s1);
        List<string> str2 = GetList(s2);
        while (str1.Count != str2.Count)
        {
            if (str1.Count < str2.Count)
            {
                str1.Add("");
            }
            else
            {
                str2.Add("");
            }
        }
        int x1 = 0; int res = 0; int x2 = 0;
        // s1Status / s2Status == true: the current run parsed as a non-zero integer.
        // A run that fails to parse, or that parses to 0, is treated as text.
        bool s1Status = false; bool s2Status = false;
        int result = 0;
        for (int i = 0; i < str1.Count && i < str2.Count; i++)
        {
            int.TryParse(str1[i], out res);
            if (res == 0)
            {
                s1Status = false;
            }
            else
            {
                x1 = res;
                s1Status = true;
            }

            int.TryParse(str2[i], out res);
            if (res == 0)
            {
                s2Status = false;
            }
            else
            {
                x2 = res;
                s2Status = true;
            }
            // Compare the current pair of runs.
            if (!s2Status && !s1Status)    // both are strings
            {
                result = str1[i].CompareTo(str2[i]);
            }
            else if (s2Status && s1Status) // both are integers
            {
                if (x1 == x2)
                {
                    // Same value: the run with more leading zeros (longer text) sorts first.
                    if (str1[i].Length < str2[i].Length)
                    {
                        result = 1;
                    }
                    else if (str1[i].Length > str2[i].Length)
                        result = -1;
                    else
                        result = 0;
                }
                else
                {
                    int st1ZeroCount = str1[i].Trim().Length - str1[i].TrimStart('0').Length;
                    int st2ZeroCount = str2[i].Trim().Length - str2[i].TrimStart('0').Length;
                    if (st1ZeroCount > st2ZeroCount)
                        result = -1;
                    else if (st1ZeroCount < st2ZeroCount)
                        result = 1;
                    else
                        result = x1.CompareTo(x2);
                }
            }
            else
            {
                result = str1[i].CompareTo(str2[i]);
            }
            if (result != 0)
            {
                break;
            }
        }
        return result;
    }
}

Usage of this class:

List<string> marks = new List<string>();
marks.Add("M'00Z1");
marks.Add("M'0A27");
marks.Add("M'00Z0");
marks.Add("0000A27");
marks.Add("100Z0");

string[] Markings = marks.ToArray();

Array.Sort(Markings, new AlphanumComparatorFast());

【Answer 6】:

It looks like it is doing a lexicographic sort regardless of whether the characters are lowercase or uppercase.

You could try using some custom expression in that lambda to do this.

【Answer 7】:

There is no natural way to do this in .NET, but have a look at this blog post on natural sorting.

You could put it into an extension method and use that instead of OrderBy.
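The linked blog post is not reproduced here, so as a rough idea of what such an extension method could look like, the sketch below wraps the zero-padding technique from an earlier answer. The name OrderByNatural, the padWidth parameter, and the use of StringComparer.OrdinalIgnoreCase are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class NaturalOrderExtensions
{
    // Hypothetical extension method: orders by a string key whose digit runs are
    // zero-padded to padWidth, so that ordinary string comparison sorts numbers by value.
    public static IOrderedEnumerable<T> OrderByNatural<T>(
        this IEnumerable<T> source, Func<T, string> keySelector, int padWidth = 10)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));

        return source.OrderBy(
            item => Regex.Replace(keySelector(item) ?? string.Empty, "[0-9]+",
                                  m => m.Value.PadLeft(padWidth, '0')),
            StringComparer.OrdinalIgnoreCase);
    }
}
```

A call then reads like the built-in operator, for example partNumbers.OrderByNatural(x => x). Returning IOrderedEnumerable<T> keeps ThenBy available to the caller.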

【Answer 8】:

For those who prefer a generic approach, here is how to adapt Dave Koelle's AlphanumComparator.

Step one (I renamed the class to the unabbreviated form and gave it a TCompareType generic type parameter):

 public class AlphanumericComparator<TCompareType> : IComparer<TCompareType>

The next adjustment is to import the following namespace:

using System.Collections.Generic;

We change the signature of the Compare method from object to TCompareType:

    public int Compare(TCompareType x, TCompareType y)
    {
        // .... no further modifications
    }

Now we can specify the right type for the AlphanumericComparator when we use it. (I think it should really be called AlphanumericComparer.)

Example usage from my code:

   if (result.SearchResults.Any())
   {
       result.SearchResults = result.SearchResults.OrderBy(item => item.Code, new AlphanumericComparator<string>()).ToList();
   }

Now you have an alphanumeric comparator (comparer) that accepts a generic type argument and can be used with different types.

Here is also an extension method that uses the comparer (it must be declared inside a static class):

        /// <summary>
        /// Returns an ordered collection by key selector (property expression) using the alphanumeric comparer
        /// </summary>
        /// <typeparam name="T">The item type in the IEnumerable</typeparam>
        /// <typeparam name="TKey">The type of the key selector (property to order by)</typeparam>
        /// <param name="coll">The source IEnumerable</param>
        /// <param name="keySelector">The key selector, use a member expression in a lambda expression</param>
        /// <returns>The source items ordered by the selected key</returns>
        public static IEnumerable<T> OrderByMember<T, TKey>(this IEnumerable<T> coll, Func<T, TKey> keySelector)
        {
            // Uses the AlphanumericComparator<TCompareType> class declared above.
            var result = coll.OrderBy(keySelector, new AlphanumericComparator<TKey>());
            return result;
        }

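【Comments】:

RE: "(I think it should really be called AlphanumericComparer)" Interestingly, Comparator is a correct name, since a comparator is a mechanism or device for comparing one object with another. It sounds odd at first, but it is the right term.

【Answer 9】:

Since the number of leading characters is variable, a regular expression helps: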

var re = new  Regex(@"\d+$"); // finds the consecutive digits at the end of the string
var result = partNumbers.OrderBy(x => int.Parse(re.Match(x).Value));

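If there is a fixed number of prefix characters, you can use the Substring method to extract from the relevant character onward:

// parses the string as a number starting from the 5th character
var result = partNumbers.OrderBy(x => int.Parse(x.Substring(4)));

If the numbers may contain a decimal separator or thousands separators, the regular expression needs to allow those characters as well:

var re = new Regex(@"[\d,]*\.?\d+$");
var result = partNumbers.OrderBy(x => double.Parse(re.Match(x).Value));

If the string returned by the regular expression or by Substring might not be parseable by int.Parse / double.Parse, use the corresponding TryParse variant:

var re = new Regex(@"\d+$"); // finds the consecutive digits at the end of the string
var result = partNumbers.OrderBy(x =>
{
    int? parsed = null;
    if (int.TryParse(re.Match(x).Value, out var temp))
    {
        parsed = temp;
    }
    return parsed; // null keys (no parseable number) sort first
});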
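The snippets above order by the trailing number alone. If the text before the number should take precedence, as in the question's expected output, one possible adjustment is to split each string into a prefix and a number and sort by both. The following is only a sketch under assumptions: the class name, the case-insensitive ordinal prefix comparison, and the choice of -1 for strings without a trailing number are all illustrative.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PrefixThenNumberDemo
{
    // Matches the run of ASCII digits at the end of the string, if any.
    private static readonly Regex TrailingDigits = new Regex("[0-9]+$");

    public static string[] Sort(string[] parts) =>
        parts
            .Select(p =>
            {
                Match m = TrailingDigits.Match(p);
                string prefix = m.Success ? p.Substring(0, m.Index) : p;
                long number = -1; // assumption: entries without a number sort first within their prefix
                if (m.Success && long.TryParse(m.Value, out long parsed))
                {
                    number = parsed;
                }
                return new { Original = p, Prefix = prefix, Number = number };
            })
            .OrderBy(e => e.Prefix, StringComparer.OrdinalIgnoreCase) // text part first
            .ThenBy(e => e.Number)                                    // then the numeric value
            .Select(e => e.Original)
            .ToArray();
}
```

For the array from the question this yields AB1 AB2 Ab11 ABC1 ABC2 ABC10 ABC10 ABC11. It only handles a single trailing number, so strings with digits in the middle still need a chunk-based comparer.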

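【Comments】:

I like the general idea, but I don't think it handles the scenario well (though I think it could with some tweaks). If I read the code correctly, this sorts only by the numeric part of the string, whereas the OP wants to sort by both the character part and the numeric part.

【Answer 10】:

Extending @Nathan's answer here.

// The length of the longest string is a safe upper bound for the longest digit run.
var maxStringLength = partNumbers.Max(x => x.Length);
var result = partNumbers.OrderBy(x => PadNumbers(x, maxStringLength));

The padding width passed to the PadNumbers function is then dynamic.

public static string PadNumbers(string input, int maxStringLength)
{
    return Regex.Replace(input, "[0-9]+", match => match.Value.PadLeft(maxStringLength, '0'));
}

【Answer 11】:

It looks like the link to Dave Koelle's code is dead. I got the latest version from the WebArchive.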

/*
 * The Alphanum Algorithm is an improved sorting algorithm for strings
 * containing numbers.  Instead of sorting numbers in ASCII order like
 * a standard sort, this algorithm sorts numbers in numeric order.
 *
 * The Alphanum Algorithm is discussed at http://www.DaveKoelle.com
 *
 * Based on the Java implementation of Dave Koelle's Alphanum algorithm.
 * Contributed by Jonathan Ruckwood <jonathan.ruckwood@gmail.com>
 *
 * Adapted by Dominik Hurnaus <dominik.hurnaus@gmail.com> to
 *   - correctly sort words where one word starts with another word
 *   - have slightly better performance
 *
 * Released under the MIT License - https://opensource.org/licenses/MIT
 *
 * Permission is hereby granted, free of charge, to any person obtaining
 * a copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included
 * in all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM,
 * DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR
 * OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
 * USE OR OTHER DEALINGS IN THE SOFTWARE.
 *
 */
using System;
using System.Collections;
using System.Text;

/*
 * Please compare against the latest Java version at http://www.DaveKoelle.com
 * to see the most recent modifications
 */
namespace AlphanumComparator
{
    public class AlphanumComparator : IComparer
    {
        private enum ChunkType { Alphanumeric, Numeric };
        private bool InChunk(char ch, char otherCh)
        {
            ChunkType type = ChunkType.Alphanumeric;

            if (char.IsDigit(otherCh))
            {
                type = ChunkType.Numeric;
            }

            if ((type == ChunkType.Alphanumeric && char.IsDigit(ch))
                || (type == ChunkType.Numeric && !char.IsDigit(ch)))
            {
                return false;
            }

            return true;
        }

        public int Compare(object x, object y)
        {
            String s1 = x as string;
            String s2 = y as string;
            if (s1 == null || s2 == null)
            {
                return 0;
            }

            int thisMarker = 0, thisNumericChunk = 0;
            int thatMarker = 0, thatNumericChunk = 0;

            while ((thisMarker < s1.Length) || (thatMarker < s2.Length))
            {
                if (thisMarker >= s1.Length)
                {
                    return -1;
                }
                else if (thatMarker >= s2.Length)
                {
                    return 1;
                }
                char thisCh = s1[thisMarker];
                char thatCh = s2[thatMarker];

                StringBuilder thisChunk = new StringBuilder();
                StringBuilder thatChunk = new StringBuilder();

                while ((thisMarker < s1.Length) && (thisChunk.Length == 0 || InChunk(thisCh, thisChunk[0])))
                {
                    thisChunk.Append(thisCh);
                    thisMarker++;

                    if (thisMarker < s1.Length)
                    {
                        thisCh = s1[thisMarker];
                    }
                }

                while ((thatMarker < s2.Length) && (thatChunk.Length == 0 || InChunk(thatCh, thatChunk[0])))
                {
                    thatChunk.Append(thatCh);
                    thatMarker++;

                    if (thatMarker < s2.Length)
                    {
                        thatCh = s2[thatMarker];
                    }
                }

                int result = 0;
                // If both chunks contain numeric characters, sort them numerically
                if (char.IsDigit(thisChunk[0]) && char.IsDigit(thatChunk[0]))
                {
                    thisNumericChunk = Convert.ToInt32(thisChunk.ToString());
                    thatNumericChunk = Convert.ToInt32(thatChunk.ToString());

                    if (thisNumericChunk < thatNumericChunk)
                    {
                        result = -1;
                    }

                    if (thisNumericChunk > thatNumericChunk)
                    {
                        result = 1;
                    }
                }
                else
                {
                    result = thisChunk.ToString().CompareTo(thatChunk.ToString());
                }

                if (result != 0)
                {
                    return result;
                }
            }

            return 0;
        }
    }
}

【Answer 12】:

I don't know how to do this in LINQ, but maybe you would like this approach:

Array.Sort(partNumbers, new AlphanumComparatorFast());

// Display the results

foreach (string h in partNumbers)
{
    Console.WriteLine(h);
}
