How can I obtain all the possible combinations of a subset?
【Title】: How can I obtain all the possible combination of a subset?
【Posted】: 2012-11-25 18:51:13
【Question】: Consider this List<string>:
List<string> data = new List<string>();
data.Add("Text1");
data.Add("Text2");
data.Add("Text3");
data.Add("Text4");
My problem is: how can I get every combination of the list's subsets? Something like this:
#Subset Dimension 4
Text1;Text2;Text3;Text4
#Subset Dimension 3
Text1;Text2;Text3;
Text1;Text2;Text4;
Text1;Text3;Text4;
Text2;Text3;Text4;
#Subset Dimension 2
Text1;Text2;
Text1;Text3;
Text1;Text4;
Text2;Text3;
Text2;Text4;
#Subset Dimension 1
Text1;
Text2;
Text3;
Text4;
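I came up with a decent solution that is worth sharing here.
【Comments】:
Pedantic, but without including Subset Dimension 1 you don't have every subset of the list.
【Answer 1】:
Similar logic to Abaco's answer, different implementation...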
foreach (var ss in data.SubSets_LB())
    Console.WriteLine(String.Join("; ", ss));

public static class SO_EXTENSIONS
{
    public static IEnumerable<IEnumerable<T>> SubSets_LB<T>(
        this IEnumerable<T> enumerable)
    {
        List<T> list = enumerable.ToList();
        ulong upper = (ulong)1 << list.Count;
        for (ulong i = 0; i < upper; i++)
        {
            List<T> l = new List<T>(list.Count);
            for (int j = 0; j < sizeof(ulong) * 8; j++)
            {
                if (((ulong)1 << j) >= upper) break;
                if (((i >> j) & 1) == 1)
                    l.Add(list[j]);
            }
            yield return l;
        }
    }
}
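To get the layout asked for in the question (largest subsets first), one option is to group the bitmask results by size. This is only a minimal sketch, assuming .NET with LINQ available; the names `SubsetSketch`, `AllSubsets` and `BySizeDescending` are hypothetical and do not come from any answer in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubsetSketch
{
    // Enumerates every non-empty subset; bit j of the mask selects items[j].
    public static IEnumerable<List<T>> AllSubsets<T>(IReadOnlyList<T> items)
    {
        // Assumption: a single ulong mask is enough, so at most 63 items.
        if (items.Count > 63)
            throw new ArgumentException("Too many items for a 64-bit mask.");

        ulong upper = 1UL << items.Count;
        for (ulong mask = 1; mask < upper; mask++)
        {
            var subset = new List<T>();
            for (int j = 0; j < items.Count; j++)
            {
                if (((mask >> j) & 1UL) == 1UL)
                    subset.Add(items[j]);
            }
            yield return subset;
        }
    }

    // Groups the subsets by element count, largest groups first.
    public static IEnumerable<IGrouping<int, List<T>>> BySizeDescending<T>(IReadOnlyList<T> items)
    {
        return AllSubsets(items)
            .GroupBy(s => s.Count)
            .OrderByDescending(g => g.Key);
    }
}

public static class Program
{
    public static void Main()
    {
        var data = new List<string> { "Text1", "Text2", "Text3", "Text4" };
        foreach (var group in SubsetSketch.BySizeDescending(data))
        {
            Console.WriteLine("#Subset Dimension " + group.Key);
            foreach (var subset in group)
                Console.WriteLine(string.Join(";", subset));
        }
    }
}
```

Within a group, subsets appear in mask order, which may differ from the order listed in the question.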
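【Discussion】:
+1, because I largely plagiarized your answer, with a few tweaks.
If you're interested, I've had another go at merging the approaches. See my extended answer.
【Answer 2】:
I think the answers to this question need some performance testing. I'll give it a try. This is a community wiki, so feel free to update it.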
void PerfTest()
{
    var list = Enumerable.Range(0, 21).ToList();
    var t1 = GetDurationInMs(list.SubSets_LB);
    var t2 = GetDurationInMs(list.SubSets_Jodrell2);
    var t3 = GetDurationInMs(() => list.CalcCombinations(20));
    Console.WriteLine("{0}\n{1}\n{2}", t1, t2, t3);
}

long GetDurationInMs(Func<IEnumerable<IEnumerable<int>>> fxn)
{
    fxn(); //JIT???
    var count = 0;
    var sw = Stopwatch.StartNew();
    foreach (var ss in fxn())
    {
        count = ss.Sum();
    }
    return sw.ElapsedMilliseconds;
}
Output:
1281
1604 (_Jodrell not _Jodrell2)
6817
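Jodrell's update
I built in Release mode, i.e. with optimizations. When I run through Visual Studio I don't get a consistent bias between 1 and 2, but over repeated runs L.B's answer wins. I get results approaching something like: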
1190
1260
more
But if I run the test harness from the command line rather than through Visual Studio, I get results more like this:
987
879
still more
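【Discussion】:
Upvoted all the previous posts. Thanks for your valuable contributions. Of course I have some rework to do :)
@Abaco, as I state in my extended answer, my merged version gives (in my tests) the best performance. http://***.com/a/13768100/659190
Considering how useful all the answers are, I decided to accept the wiki as a kind of summary.
【Answer 3】:
EDIT
I've accepted the performance challenge. What follows is my merged version, which takes the best of all the answers. In my tests it seems to have the best performance.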
public static IEnumerable<IEnumerable<T>> SubSets_Jodrell2<T>(
    this IEnumerable<T> source)
{
    var list = source.ToList();
    // 1UL keeps the shift in 64 bits; (ulong)(1 << n) would shift a 32-bit int first.
    var limit = 1UL << list.Count;
    // i == limit selects no element, so the empty set is yielded first.
    for (var i = limit; i > 0; i--)
    {
        yield return list.SubSet(i);
    }
}

private static IEnumerable<T> SubSet<T>(
    this IList<T> source, ulong bits)
{
    for (var i = 0; i < source.Count; i++)
    {
        if (((bits >> i) & 1) == 1)
            yield return source[i];
    }
}
Same idea, almost the same as L.B's answer, but my own interpretation.
I avoid the use of an internal List and Math.Pow.
public static IEnumerable<IEnumerable<T>> SubSets_Jodrell<T>(
    this IEnumerable<T> source)
{
    var count = source.Count();
    // A ulong mask has 64 bits, and 1UL << 64 wraps around to 1.
    if (count > 63)
        throw new OverflowException("Not Supported ...");
    // 2^count - 2 skips the full set; the loop stops before the empty set (0).
    var limit = (1UL << count) - 2;
    for (var i = limit; i > 0; i--)
    {
        yield return source.SubSet(i);
    }
}

private static IEnumerable<T> SubSet<T>(
    this IEnumerable<T> source,
    ulong bits)
{
    var check = (ulong)1;
    foreach (var t in source)
    {
        if ((bits & check) > 0)
            yield return t;
        check <<= 1;
    }
}
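You'll notice that these methods stop working once the initial set outgrows a 64-bit mask (roughly 64 elements), but enumeration starts to take a while long before that anyway.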
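One detail worth checking when building the upper limit: in C#, `1 << n` with an `int` left operand is a 32-bit shift whose count is reduced to its low five bits, so casting the result to `ulong` afterwards does not recover the lost bits. The sketch below only illustrates that language rule; `ShiftSketch`, `IntShift` and `ULongShift` are hypothetical names.

```csharp
using System;

public static class ShiftSketch
{
    // 32-bit shift: the count is masked to its low 5 bits, then the int result is widened.
    public static ulong IntShift(int n)
    {
        return unchecked((ulong)(1 << n));
    }

    // 64-bit shift: the left operand is ulong, so the count is masked to its low 6 bits.
    public static ulong ULongShift(int n)
    {
        return 1UL << n;
    }

    public static void Main()
    {
        foreach (var n in new[] { 4, 21, 32, 40 })
        {
            Console.WriteLine("n={0}: int shift={1}, ulong shift={2}",
                n, IntShift(n), ULongShift(n));
        }
    }
}
```

For small lists (such as the 21 elements used in the performance test) both forms agree. They diverge once the count reaches 31, and even the `ulong` form wraps at 64.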
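【Discussion】:
Jodrell, a nice piece of code. However, in terms of performance, my test results say otherwise (I used the PerfTest code below (or above :)).
@L.B, my test code must have been wrong. I've retested and amended the wiki.
【Answer 4】:
I developed a simple ExtensionMethod for lists: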
/// <summary>
/// Obtain all the combinations of the elements contained in a list
/// </summary>
/// <param name="list">Source list</param>
/// <param name="subsetDimension">Subset Dimension</param>
/// <returns>IEnumerable containing all the different subsets</returns>
public static IEnumerable<List<T>> CalcCombinations<T>(this List<T> list, int subsetDimension)
{
    //First of all we create a binary matrix. Each row must be as long as the
    //list we are working on (we need a 0 or a 1 for every single element), so
    //to cover every row of length list.Count we populate the matrix with the
    //first 2^list.Count binary numbers
    int rowDimension = Convert.ToInt32(Math.Pow(2, list.Count));

    //Now we start counting! We fill our matrix with every number from 1
    //(0 is meaningless) up to, but not including, rowDimension.
    //We are creating binary masks, hence the name
    List<int[]> combinationMasks = new List<int[]>();
    for (int i = 1; i < rowDimension; i++)
    {
        //Grab the binary representation of the number
        string binaryString = Convert.ToString(i, 2);

        //Initialize an array of the appropriate dimension
        int[] mask = new int[list.Count];

        //Now we have to convert our string into an array of 0s and 1s: first we
        //obtain an array of int, then we copy it into our mask (which has the
        //appropriate dimension). Reverse() is needed because CopyTo() fills the
        //mask from index 0, so the least significant bit must come first
        binaryString.Select(x => x == '0' ? 0 : 1).Reverse().ToArray().CopyTo(mask, 0);

        //Why keep masks whose dimension isn't the one of the subset?
        //We filter them out here
        if (mask.Sum() == subsetDimension) combinationMasks.Add(mask);
    }

    //And now we apply the matrix to our list
    foreach (int[] mask in combinationMasks)
    {
        List<T> temporaryList = new List<T>(list);

        //Run the loop in reverse order to avoid index-out-of-bounds errors
        for (int iter = mask.Length - 1; iter >= 0; iter--)
        {
            //Whenever a 0 is found the corresponding item is removed from the list
            if (mask[iter] == 0)
                temporaryList.RemoveAt(iter);
        }
        yield return temporaryList;
    }
}
So, considering the example in the question:
# Row Dimension of 4 (list.Count)
Binary Numbers to 2^4
# Binary Matrix (subsetDimension = 3; the rightmost digit maps to Text1)
0 0 0 1 => skip
0 0 1 0 => skip
[...]
0 1 1 1 => added // Text1;Text2;Text3
[...]
1 0 1 1 => added // Text1;Text2;Text4
1 1 0 0 => skip
1 1 0 1 => added // Text1;Text3;Text4
1 1 1 0 => added // Text2;Text3;Text4
1 1 1 1 => skip
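The mask table can also be reproduced in code. The sketch below is a hedged illustration, not the answer's implementation: it walks the masks for a list, keeps those with exactly `size` bits set, and maps bit 0 (the rightmost printed digit) to the first element, which is the effect the `Reverse()` call has in `CalcCombinations`. `MaskSketch`, `CountBits` and `OfSize` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class MaskSketch
{
    // Counts the set bits of a non-negative mask by clearing the lowest set bit each round.
    public static int CountBits(int mask)
    {
        int bits = 0;
        while (mask != 0)
        {
            mask &= mask - 1;
            bits++;
        }
        return bits;
    }

    // Yields (binary text, subset) pairs for every mask with exactly `size` bits set.
    public static IEnumerable<KeyValuePair<string, List<T>>> OfSize<T>(IList<T> items, int size)
    {
        // Assumption: the sketch uses a 32-bit int mask, so it caps the list at 30 items.
        if (items.Count > 30)
            throw new ArgumentException("This sketch uses a 32-bit mask.");

        int upper = 1 << items.Count;
        for (int mask = 1; mask < upper; mask++)
        {
            if (CountBits(mask) != size) continue;

            var subset = new List<T>();
            for (int j = 0; j < items.Count; j++)
            {
                if (((mask >> j) & 1) == 1)
                    subset.Add(items[j]);
            }

            // Left-pad so every row shows one digit per list element.
            string text = Convert.ToString(mask, 2).PadLeft(items.Count, '0');
            yield return new KeyValuePair<string, List<T>>(text, subset);
        }
    }

    public static void Main()
    {
        var data = new List<string> { "Text1", "Text2", "Text3", "Text4" };
        foreach (var row in OfSize(data, 3))
            Console.WriteLine(row.Key + " => " + string.Join(";", row.Value));
    }
}
```

Unlike `CalcCombinations`, this sketch builds each subset as it goes instead of storing every mask first, which keeps memory use flat at the cost of still visiting all 2^n masks.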
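Hope this helps someone :)
If you need clarification or want to contribute, feel free to add an answer or comments (whichever is more appropriate).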