如何在 C# 中计算“五的中位数”?

Posted

技术标签:

【中文标题】如何在 C# 中计算“五的中位数”?【英文标题】:How do I calculate the "median of five" in C#? 【发布时间】:2010-10-03 14:48:24 【问题描述】:

5 的中位数有时被用作算法设计的练习,并且已知仅使用 6 次比较即可计算。

在 C# 中实现这个“使用 6 次比较的 5 的中位数” 的最佳方法是什么?我所有的尝试似乎都导致了笨拙的代码:(我需要漂亮且可读的代码,同时仍然只使用 6 次比较。

public double medianOfFive(double a, double b, double c, double d, double e)
    //
    // return median
    //
    return c;

注意:我想我也应该在这里提供“算法”:

我发现自己无法像 Azereal 在他的论坛帖子中那样清楚地解释算法。所以我会在这里参考他的帖子。来自http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1061827085

好吧,我被一次性提出了这个问题 我的任务,我转向这个 论坛寻求帮助,但没有帮助。 我最终找到了方法。

    使用前 4 个元素开始合并排序,并对每一对(2 比较)

    比较每对中较低的两个,并从 可能性(3 个比较)

    将预留的第 5 个数字与没有配对的数字相加并进行比较 两者(4 个比较)

    比较两个新对中最低的两个并消除较低的一对 (5 次比较)

    比较一个本身和最后一对中较低的和较低的 数字是中位数

    可能的中位数在 亲子关系

(54321)

5:4 3:2 2 次比较

(4

4:2 3 次比较

2(4

1:3 4 次比较

2(4

4:1 5 次比较

1,2(4

4:3 6 次比较

1,2(3)4,5

三是中位数

这是我编写的 C++ 代码,用于求中位数为 5。不要介意它的尴尬:

double StageGenerator::MedianOfFive(double n1, double n2, double n3, double n4, double n5)
    double *a = &n1, *b = &n2, *c = &n3, *d = &n4, *e = &n5;
    double *tmp;

    // makes a < b and b < d
    if(*b < *a)
        tmp = a; a = b; b = tmp;
    

    if(*d < *c)
        tmp = c; c = d; d = tmp;
    

    // eleminate the lowest
    if(*c < *a)
        tmp = b; b = d; d = tmp; 
        c = a;
    

    // gets e in
    a = e;

    // makes a < b and b < d
    if(*b < *a)
        tmp = a; a = b; b = tmp;
    

    // eliminate another lowest
    // remaing: a,b,d
    if(*a < *c)
        tmp = b; b = d; d = tmp; 
        a = c;
    

    if(*d < *a)
        return *d;
    else
        return *a;

 

它应该更紧凑,不是吗?


正如@pablito 在他的回答中指出的那样,内置的List.Sort() 无法满足此要求,因为它最多使用 13 次比较:]

【问题讨论】:

嗯..这是某种家庭作业吗? 没有。如果你真的想看的话,我有我自己的 C++ 版本 :) 我很好奇,如果可读性如此重要,那么为什么要保持仅使用 6 个比较的约束呢? ;) 这种噱头通常不符合可读性。 是否有理由不只使用可排序列表? @Vojislav 这就是让它很难做到的原因。我看到这个 med of 5 的“伪代码”使用 6 次比较,但从来没有看到任何语言的良好实现代码。如果 SO 人能给出一个很好的实现,那就太好了。 【参考方案1】:

我发现这篇文章很有趣,作为一个练习,我创建了这个只做 6 次比较,没有别的:

static double MedianOfFive(double a, double b, double c, double d, double e)

    return b < a ? d < c ? b < d ? a < e ? a < d ? e < d ? e : d
                                                 : c < a ? c : a
                                         : e < d ? a < d ? a : d
                                                 : c < e ? c : e
                                 : c < e ? b < c ? a < c ? a : c
                                                 : e < b ? e : b
                                         : b < e ? a < e ? a : e
                                                 : c < b ? c : b
                         : b < c ? a < e ? a < c ? e < c ? e : c
                                                 : d < a ? d : a
                                         : e < c ? a < c ? a : c
                                                 : d < e ? d : e
                                 : d < e ? b < d ? a < d ? a : d
                                                 : e < b ? e : b
                                         : b < e ? a < e ? a : e
                                                 : d < b ? d : b
                 : d < c ? a < d ? b < e ? b < d ? e < d ? e : d
                                                 : c < b ? c : b
                                         : e < d ? b < d ? b : d
                                                 : c < e ? c : e
                                 : c < e ? a < c ? b < c ? b : c
                                                 : e < a ? e : a
                                         : a < e ? b < e ? b : e
                                                 : c < a ? c : a
                         : a < c ? b < e ? b < c ? e < c ? e : c
                                                 : d < b ? d : b
                                         : e < c ? b < c ? b : c
                                                 : d < e ? d : e
                                 : d < e ? a < d ? b < d ? b : d
                                                 : e < a ? e : a
                                         : a < e ? b < e ? b : e
                                                 : d < a ? d : a;

【讨论】:

酷!很高兴你发现这个问题很有用:) 非常好,但编译时不一定是最快的。无论如何,为了记录,对于 unsigned char 数据类型,SSE implementation(用于图像处理)结果证明在 i7 处理器上快了大约 60 倍。【参考方案2】:

这基本上只是从您的 C++ 示例中排除了交换和排序代码:

private static void Swap(ref double a, ref double b) 
    double t = a;
    a = b;
    b = t;


private static void Sort(ref double a, ref double b) 
    if (a > b) 
        double t = a;
        a = b;
        b = t;
    


private static double MedianOfFive(double a, double b, double c, double d, double e)
    // makes a < b and c < d
    Sort(ref a, ref b);
    Sort(ref c, ref d);

    // eleminate the lowest
    if (c < a) 
        Swap(ref b, ref d);
        c = a;
    

    // gets e in
    a = e;

    // makes a < b
    Sort(ref a, ref b);

    // eliminate another lowest
    // remaing: a,b,d
    if (a < c) 
        Swap(ref b, ref d);
        a = c;
    

    return Math.Min(d, a);

【讨论】:

清晰性的显着提升,如果在紧密循环中使用,任何有价值的编译器都会内联 Swap 和 Sort(这是我能看到实现如此具体的东西的唯一原因)。 我发现您的代码是此处最紧凑的代码,并将其标记为已接受的答案。其他代码也很有帮助,我都投了赞成票:) 这段代码是正确的,但是注释// makes a &lt; b and b &lt; d是错误的;如果您在排序后检查,您有时会看到b &gt; d。例如,当您调用 MedianOfFive(1, 2, 3, 4, 5) 时会发生这种情况,并且通常只要 e 是五个参数中最大的一个。 @Jason,很好。我只是从 OP 的代码中复制了 cmets,没有检查它们是否有意义。【参考方案3】:

谢谢。我知道你的帖子已经很老了,但它对我的问题很有用。

我需要一种方法来计算 5 个 SSE/AVX 寄存器的中位数(4 个浮点数/8 个浮点数,或 2 个双精度数/4 个双精度数):

没有任何条件跳转

仅适用于最小/最大指令

如果 min/max 函数是为带有条件跳转的标量寄存器编程的,那么我的代码在比较方面并不是最优的。 但是如果min/max函数用对应的CPU指令编码,我的代码是非常有效的,因为CPU在执行的时候没有条件跳转。

    template<class V> 
    inline V median(const V &a, const V &b, const V &c)
    
      return max(min(a,b),min(c,max(a,b))); 
     

    template<class V> 
    inline V median(const V &a, const V &b, const V &c, const V &d, const V &e)
    
      V f=max(min(a,b),min(c,d)); // discards lowest from first 4
      V g=min(max(a,b),max(c,d)); // discards biggest from first 4
      return median(e,f,g);
     

【讨论】:

正是我想要的!我被困在如何用 min max 计算三个值的中位数,太糟糕了,它需要 4 次操作! @Antonio: Fastest way of finding the middle value of a triple? 这是median5(),它使用最小/最大 CPU 指令(结合了这个和 @Gyorgy Szekely code)。测试median5() in Python(以防万一) @jfs 太棒了!对我来说看起来不错,您应该在此处发布答案。我已经编写了显式矢量化代码,但无法发布代码。【参考方案4】:

这里有一个有趣的话题:

http://www.ocf.berkeley.edu/~wwu/cgi-bin/yabb/YaBB.cgi?board=riddles_cs;action=display;num=1061827085

引自话题:

    将数字放入数组中。

    使用三个比较并随机排列数字,以便 a[1]

    如果a[3] > a[2],那么问题就相当简单了。如果 a[2]

    所以 a[3] a[4],则解是 a[3] 和 a[5] 中较小的一个。否则,解为 a[2] 和 a[4] 中较小的一个。

【讨论】:

这看起来很棒。我找到的解决方案也来自同一个线程。 C# 中的可靠实现也可以:D【参考方案5】:

这很丑陋,可以使用一些重构,但它明确地遍历所有比较和交换,以便您可以看到发生了什么。

public double medianOfFive(double a, double b, double c, double d, double e)
    double median;
    // sort a and b
    if(a > b) // comparison # 1
    
        double temp = a;
        a = b;
        b = temp;
    

    // sort c and d
    if(c > d)  // comparison # 2
    
        double temp = c;
        c = d;
        d = temp;
    

    // replace the lower of a and c with e
    // because the lowest of the first four cannot be the median
    if(a < c) // comparison # 3
    
        a = e;
        // re-sort a and b
        if(a > b) // comparison # 4
        
            double temp = a;
            a = b;
            b = temp;
        
    
    else
    
        c = e;
        // re-sort c and d
        if(c > d)  // comparison # 4
        
            double temp = c;
            c = d;
            d = temp;
        
    

    // eliminate a or c, because the lowest
    // of the remaining four can't be the median either
    if(a < c) // comparison #5
    
         if(b < c) // comparison #6
         
              median = c;
         
         else
         
              median = b;
         
    
    else
    
         if(d < a) // comparison #6
         
              median = a;
         
         else
         
              median = d;
         
    
    return median;

【讨论】:

【参考方案6】:

只是为了检查有多少比较:

    class MyComparable : IComparable


    public static int NumberOfComparisons = 0;

    public int NumPart  get; set; 

    #region IComparable Members

    public int CompareTo(object obj)
    
        NumberOfComparisons++; //I know, not thread safe but just for the sample
        MyComparable mc = obj as MyComparable;
        if (mc == null)
            return -1;
        else
            return NumPart.CompareTo(mc.NumPart);
    

    #endregion


class Program

    static void Main(string[] args)
    
        List<MyComparable> list = new List<MyComparable>();
        list.Add(new MyComparable()  NumPart = 5 );
        list.Add(new MyComparable()  NumPart = 4 );
        list.Add(new MyComparable()  NumPart = 3 );
        list.Add(new MyComparable()  NumPart = 2 );
        list.Add(new MyComparable()  NumPart = 1 );
        list.Sort();


        Console.WriteLine(MyComparable.NumberOfComparisons);
    

结果是 13。

【讨论】:

【参考方案7】:

为了完整起见,问题是sorting network 的一个具体案例,Knuth (Art of Computer Programming, vol 3) 对此进行了详细介绍。关于这个主题的classic paper by K.E. Batcher很简短,值得一读。

【讨论】:

【参考方案8】:

应该这样做

private Double medianofFive(double[] input)

    Double temp;
if (input[0] > input[1])//#1 - sort First and Second

    temp = input[0];
    input[0] = input[1];
    input[1] = temp;

if (input[2] > input[3])//#2 sort Third and Fourth

    temp = input[2];
    input[2] = input[3];
    input[3] = temp;


// replace the smaller of first and third with 5th, then sort
int smallerIndex = input[0] < input[2] ? 0 : 2;//#3
input[smallerIndex] = input[4];

//sort the new pair
if(input[smallerIndex]>input[smallerIndex+1])//#4

    temp = input[smallerIndex];
    input[smallerIndex] = input[smallerIndex+1];
    input[smallerIndex+1] = temp;


//compare the two smaller numbers
// then compare the smaller of the two's partner with larger of the two
// the smaller of THOSE two is the median
if (input[2] > input[0])
//#5

    temp = input[2] > input[1] ? input[1] : input[2];//#6

else

    temp = input[0] > input[3] ? input[3] : input[0];//#6

    return temp;

【讨论】:

不知道怎么回事,但预览中看起来不错,但在主页上被截断了 已修复。使用降价编辑器,而不是你自己的降价形式。【参考方案9】:

这是其他答案的一些变体,与 6 次比较相比,它平均提高了 3.33% 到 66.67%,并且将 5 个元素围绕其中值完全划分,作为奖励,无需额外费用。

可以在平均 5.8 次比较(所有排列的平均值)中找到 5 个不同元素的中位数,方法是使用 median-of-3 和快速选择,使用 median-of-3 从 3 个样本中选择枢轴5个元素中。 Median-of-3 对这三个元素进行分区,在对其余 2 个元素进行分区时,无需将其与枢轴进行重新比较。到目前为止,这是围绕三个中间元素之一划分 5 个元素的 4-5 次比较(3 的中位数不能是 5 的最小值或最大值)。可能需要最多 3 次比较才能将 5 个元素围绕其中位数进行划分(严格来说,这比仅仅找到中位数需要更多的工作),总共进行 4 到 7 次比较,(如前所述)平均超过 5.8 5 个不同元素的所有可能排列(如果元素不不同,则比较少)。请注意,这与通常的 always-6-comparisons 解决方案不同,因为少数不同输入的情况可能需要多达 7 次比较,但另一方面,不同输入的大多数排列需要不超过 6 次,并且通常更少;此外,为非不同输入保存比较是相当容易的(如果所有输入都相等,则只需要 2 次比较;使用通常的 6 比较方法在输入不不同时保存比较的代码变得相当复杂(尝试一下!),没有它,即使所有输入都相等,它仍然需要进行 6 次比较)。

除中位数以外的订单统计数据可以通过这种方式找到:平均而言,第二小或第二大的统计数据略多一些(5.81666...比较),当然也可以通过仅找到最小值或最大值4 次比较。

基于此,根据要求,这里有一个注释很多的函数,它使用可变参数比较函数返回指向 5 个元素的中位数的指针。它是用 C 语言编写的,但它应该在 quadrathorpe-y 异常或 sea ploose ploose 中同样有效。请注意,这仅返回指向中值元素的指针;它不会对元素进行分区(实际上它不会移动任何元素)。

/* a virtual swap, preserving both pointers for later use */
#define VSWAP(ma,mb,mt) mt=ma,ma=mb,mb=mt
/* a virtual swap only preserving the first pointer */
#define VSWAP2(ma,mb,munused) ma=mb
/* virtual rotation to the left; reverse the first 3 args for right rotation */
#define ROTATE(ma,mb,mc,mt) (mt)=(ma),(ma)=(mb),(mb)=(mc),(mc)=(mt)


/* median of 5 elements, no data movement, minimum (average 5.8) comparisons */
/* This implementation minimizes the number of comparisons made (min. 2) when
   elements compare equal.
*/
/* As no elements are moved, the elements are of course not partitioned around
   the element pointed to by the returned pointer (this differs from selection
   of the median via quickselect).
*/
/* Results are biased toward the middle: pc, then pb or pd, last pa or pe. */
/* The implementation is based on a simulation of quickselect using partition3
   to select a pivot from the middle three elements, with partitioning by
   skipping over the 3 partitioned elements.  For distinct inputs, it uses on
   average 5.8 comparisons (averaged over all permutations of 5 distinct
   elements); fewer for inputs with equal elements.  That's an improvement of
   about 3% over the usual 6-comparison implementation of median-of-5.
*/
void *fmed5(void *pa, void *pb, void *pc, void *pd, void *pe,
    int(*compar)(const void *,const void *))

    void *t;
    int c, d;
    /* First compare the 3 middle elements to determine their relative
       ordering; pb will point to the minimum, pc to the median, and pd to
       the maximum of the three.
    */
    /* Ternary median-of-3, biased toward middle element if possible, else
       first element.  Average 8/3 comparisons for 3 elements (distinct values)
       = 0.889 comparisons/element
    */
    c=compar(pb,pc); /* 1 */
    if (0<c)  /* *pb,*pc out-of-order: *pb>*pc */
        /* Before doing anything about pb,pc, compare *pc,*pd. */
        d=compar(pc,pd); /* 2a */
        if (0>d)  /* *pc<*pd: strictly in order */
            /* But *pb might be either larger than or smaller than (or equal
               to) *pd, so they may (i.e. unless it's known from the earlier
               comparison of original *pc and *pd that *pb is larger than
               both) have to be compared,  Possibilities:
               *pc<*pb<=*pd (virtual swap of pb,pc corrects relationship)
               *pc<*pd<*pb (virtual rotation to the left corrects it)
            */
            c=compar(pb,pd); /* 3a (if needed) */
            if (0<c)  /* *pc < *pd < *pb */
                ROTATE(pb,pc,pd,t);
             else  /* *pc < *pb <= *pd */
                VSWAP(pb,pc,t);
            
         else  /* *pd==*pc<*pb or reversed ordering: *pd<*pc<*pb */
            VSWAP(pb,pd,t); /* one (virtual) swap takes care of it */
         /* *pc:*pd comparisons results if-else */
        /* Note that if pc,pd compare equal, pc remains as the chosen
           median (biased toward the middle element).
        */
     else if (0==c)  /* *pb,*pc compare equal */
        /* Either pb or pc can be taken as the median; bias here is towards
           pc, which is already in the middle position. But pd might be
           the minimum of the three or the maximum (or it may also be equal
           to both pb and pc).
        */
        d=compar(pb,pd); /* 2b */
        if (0<d)  /* pb,pd are out-of-order */
            VSWAP(pb,pd,t);
        
     else  /* *pb,*pc in strict order: *pb < *pc; how about *pc,*pd? */
        d=compar(pc,pd); /* 2c */
        if (0<d)  /* *pc,*pd are strictly out-of-order: *pd < *pc */
            /* But *pb might be either larger than or smaller than (or equal
               to) *pd, so they may (i.e. unless it's known from the earlier
               comparison of original *pc and *pd that *pb is larger than
               both) have to be compared,  Possibilities:
               *pb<=*pd<*pc (virtual swap of pc,pd corrects relationship)
               *pd<*pb<*pc (virtual rotation to the right corrects it)
            */
            c=compar(pb,pd); /* 3b (if needed) */
            if (0<c)  /* *pd < *pb < *pc */
                ROTATE(pd,pc,pb,t);
             else  /* *pc < *pb <= *pd */
                VSWAP(pc,pd,t);
            
         /* *pc:*pd comparisons results if-else */
        /* Note that if pc,pd compare equal, pc remains as the chosen
           median (biased toward the middle element).
        */
     /* *pb:*pc comparisons results if-else chain */
    /* Now pb points to the minimum of pb,pc,pd; pc to the median, and pd
       to the maximum.
    */
    /* Special case: if all three middle elements compare equal (0==c==d),
       any one can be returned as the median of 5, as it's impossible for
       either of the other two elements to be the median (unless of course
       one or both of them also compares equal to pb,pc,pd, in which case
       returning any of pb,pc,pd is still correct).  Nothing more needs to
       be done in that case.
    */
    if ((0!=c)||(0!=d))  /* Not all three of pb,pc,pd compare equal */
        int e;
        /* Compare pa and pe to pc. */
        e=compar(pa,pc); /* 3c or 4a (iff needed) */
        /* If three (or more) of the four elements so far compared are
           equal, any of those equal-comparing elements can be chhosen as
           the median of 5.  If all five elements were arranged in order,
           one of the three equal-comparing elements would necessarily be
           in the middle (at most both of the remaining elements might be
           either larger or smaller than the equal elements).  So if pa
           compares equal to pc and pc also compared equal to pb or to pd,
           nothing more need be done; pc can be considered as the median of
           five.
        */
        if ((0!=e)||(0!=c)||(0!=d))  /* no three elements compare equal */
            int f;
            f=compar(pe,pc); /* 4b or 5a (iff needed) */
            /* Check again for three equal comparisons to avoid doing any
               unnecessary additional work.
            */
            if (
                (0!=f) /* also not equal; still not three */
              ||
                ( /* 0==f && */
                 ((0!=c)&&(0!=d)) /* at most e and f; not three */
               ||
                 ((0!=c)&&(0!=e)) /* at most d and f; not three */
               ||
                 ((0!=d)&&(0!=e)) /* at most c and f; not three */
                )
            ) 
                /* Possibilites are that:
                     one of *pa,*pe is less than (or equal to) *pc and one
                       is greater than (or equal to) *pc; *pc is the median
                       of five.
                     *pa and *pe are both less than *pc; the median of five
                       is then the maximum of *pa,*pb,*pe (*pc and *pd are
                       at least as large as those three).  The ordering of
                       those 3 elements has not been established, and it
                       will take two additional comparisons to do so.
                     *pa and *pe are both greater than *pc; the median of
                       five is the minimum of *pa,*pd,*pe (neither *pb nor
                       *pc can be larger than any of those three).
                */
                if ((0<e)&&(0<f))  /* *pa,*pe both > *pc; median of five is
                                       the minimum of *pa,*pe,*pd
                                    */
                    /* Bias towards *pd (originally closest of these three
                       to the middle.  Neither *pa nor *pe has yet been
                       compared to *pd.
                    */
                    e=compar(pa,pe); /* 5b or 6a (iff needed) */
                    if (0<e)  /* *pe is smaller */
                        f=compar(pd,pe); /* 6b or 7a (iff needed) */
                        if (0<f)  /* *pe is smaller */
                            VSWAP2(pc,pe,t);
                         else  /* *pd is smaller or *pd==*pe */
                            VSWAP2(pc,pd,t);
                        
                     else  /* *pa==*pe or *pa is smaller */
                        f=compar(pd,pa); /* 6c or 7b (iff needed) */
                        if (0<f)  /* *pa is smaller */
                            VSWAP2(pc,pa,t);
                         else  /* *pd is smaller or *pd==*pa */
                            VSWAP2(pc,pd,t);
                        
                    
                 else if ((0>e)&&(0>f))  /* *pa,*pe both < *pc; median of
                                       five is the maximum of *pa,*pb,*pe
                                    */
                    /* Bias towards *pb (originally closest of these three
                       to the middle.  Neither *pa nor *pe has yet been
                       compared to *pb.
                    */
                    e=compar(pa,pe); /* 5c or 6d (iff needed) */
                    if (0<e)  /* *pa is larger */
                        f=compar(pa,pb); /* 6e or 7c (iff needed) */
                        if (0<f)  /* *pa is larger */
                            VSWAP2(pc,pa,t);
                         else  /* *pb is larger or *pa==*pb */
                            VSWAP2(pc,pb,t);
                        
                     else  /* *pe is larger or *pa==*pe */
                        f=compar(pe,pb); /* 6f or 7d (iff needed) */
                        if (0<f)  /* *pe is larger */
                            VSWAP2(pc,pe,t);
                         else  /* *pb is larger or *pe==*pb */
                            VSWAP2(pc,pb,t);
                        
                     /* *pa:*pe results if-else */
                 /* median of five: minimum or maximum of three if-else */
             /* not three equal elements among five */
         /* not three equal elements among four */
     /* not three equal elements among three */
    return pc;

【讨论】:

【参考方案10】:

有趣的是 MSDN 示例中有多少比较...

public static double Median(this IEnumerable<double> source) 
        if (source.Count() == 0)  throw new InvalidOperationException("Cannot compute median for an empty set.");

        var sortedList = from number in source
                         orderby number
                         select number;

        int itemIndex = (int)sortedList.Count() / 2;

        if (sortedList.Count() % 2 == 0) 
            // Even number of items.
            return (sortedList.ElementAt(itemIndex) + sortedList.ElementAt(itemIndex - 1)) / 2;  else 
            // Odd number of items.
            return sortedList.ElementAt(itemIndex); 
    

【讨论】:

以上是关于如何在 C# 中计算“五的中位数”?的主要内容,如果未能解决你的问题,请参考以下文章

如何计算一个数字的总位数?

分位数如何计算?

C#中的decimal怎么保留两位小数

如何在 java 中使用 Apache spark 计算中位数和众数?

如何计算 SQLite 中值的中位数?

如何计算 Google BigQuery 中多列的中位数?