用LINQ将列表拆分为子列表

是否有任何方法可以将List<SomeObject>分离为SomeObject的几个单独的列表，使用项目索引作为每个分割的分隔符?

让我举个例子:

我有一个List<SomeObject>，我需要一个List<List<SomeObject>>或List<SomeObject>[]，这样每个结果列表将包含一组原始列表的3个项目(依次)。

eg.:

原始列表:[a, g, e, w, p, s, q, f, x, y, i, m, c] 结果列表:[a、g e], [w、p, s], [q, f, x]、[y,我,m], [c]

我还需要结果列表的大小是这个函数的参数。

当前回答

好吧，以下是我的看法:

完全懒惰:工作在无限枚举上没有中间复制/缓冲 O(n)执行时间当内部序列仅被部分消耗时也适用

public static IEnumerable<IEnumerable<T>> Chunks<T>(this IEnumerable<T> enumerable, int chunkSize) { if (chunkSize < 1) throw new ArgumentException("chunkSize must be positive"); using (var e = enumerable.GetEnumerator()) while (e.MoveNext()) { var remaining = chunkSize; // elements remaining in the current chunk var innerMoveNext = new Func<bool>(() => --remaining > 0 && e.MoveNext()); yield return e.GetChunk(innerMoveNext); while (innerMoveNext()) {/* discard elements skipped by inner iterator */} } } private static IEnumerable<T> GetChunk<T>(this IEnumerator<T> e, Func<bool> innerMoveNext) { do yield return e.Current; while (innerMoveNext()); } Example Usage var src = new [] {1, 2, 3, 4, 5, 6}; var c3 = src.Chunks(3); // {{1, 2, 3}, {4, 5, 6}}; var c4 = src.Chunks(4); // {{1, 2, 3, 4}, {5, 6}}; var sum = c3.Select(c => c.Sum()); // {6, 15} var count = c3.Count(); // 2 var take2 = c3.Select(c => c.Take(2)); // {{1, 2}, {4, 5}} Explanations The code works by nesting two yield based iterators. The outer iterator must keep track of how many elements have been effectively consumed by the inner (chunk) iterator. This is done by closing over remaining with innerMoveNext(). Unconsumed elements of a chunk are discarded before the next chunk is yielded by the outer iterator. This is necessary because otherwise you get inconsistent results, when the inner enumerables are not (completely) consumed (e.g. c3.Count() would return 6). Note: The answer has been updated to address the shortcomings pointed out by @aolszowka.

2014-01-06 15:40:39

其他回答

问题是如何“用LINQ将列表拆分为子列表”，但有时你可能希望这些子列表是对原始列表的引用，而不是副本。这允许您从子列表中修改原始列表。在这种情况下，这可能对你有用。

public static IEnumerable<Memory<T>> RefChunkBy<T>(this T[] array, int size)
{
    if (size < 1 || array is null)
    {
        throw new ArgumentException("chunkSize must be positive");
    }

    var index = 0;
    var counter = 0;

    for (int i = 0; i < array.Length; i++)
    {
        if (counter == size)
        {
            yield return new Memory<T>(array, index, size);
            index = i;
            counter = 0;
        }
        counter++;

        if (i + 1 == array.Length)
        {
            yield return new Memory<T>(array, index, array.Length - index);
        }
    }
}

用法:

var src = new[] { 1, 2, 3, 4, 5, 6 };

var c3 = RefChunkBy(src, 3);      // {{1, 2, 3}, {4, 5, 6}};
var c4 = RefChunkBy(src, 4);      // {{1, 2, 3, 4}, {5, 6}};

// as extension method
var c3 = src.RefChunkBy(3);      // {{1, 2, 3}, {4, 5, 6}};
var c4 = src.RefChunkBy(4);      // {{1, 2, 3, 4}, {5, 6}};

var sum = c3.Select(c => c.Span.ToArray().Sum());    // {6, 15}
var count = c3.Count();                 // 2
var take2 = c3.Select(c => c.Span.ToArray().Take(2));  // {{1, 2}, {4, 5}}

请随意修改代码。

2021-02-10 15:06:01

这是一个老问题，但这是我最后得出的结论;它只枚举可枚举对象一次，但是为每个分区创建列表。当调用ToArray()时，它不会像某些实现那样遭受意外行为:

    public static IEnumerable<IEnumerable<T>> Partition<T>(IEnumerable<T> source, int chunkSize)
    {
        if (source == null)
        {
            throw new ArgumentNullException("source");
        }

        if (chunkSize < 1)
        {
            throw new ArgumentException("Invalid chunkSize: " + chunkSize);
        }

        using (IEnumerator<T> sourceEnumerator = source.GetEnumerator())
        {
            IList<T> currentChunk = new List<T>();
            while (sourceEnumerator.MoveNext())
            {
                currentChunk.Add(sourceEnumerator.Current);
                if (currentChunk.Count == chunkSize)
                {
                    yield return currentChunk;
                    currentChunk = new List<T>();
                }
            }

            if (currentChunk.Any())
            {
                yield return currentChunk;
            }
        }
    }

2014-05-13 13:56:30

系统。Interactive为此提供了Buffer()。一些快速测试显示性能与Sam的解决方案类似。

2012-05-03 06:24:02

没有办法在一个解决方案中结合所有理想的特性，如完全懒惰、无复制、完全通用性和安全性。最根本的原因是不能保证在访问块之前输入不发生变化。假设我们有一个如下签名的函数:

public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunkSize)
{
    // Some implementation
}

那么下面的使用方式就有问题了:

var myList = new List<int>()
{
    1,2,3,4
};
var myChunks = myList.Chunk(2);
myList.RemoveAt(0);
var firstChunk = myChunks.First();    
Console.WriteLine("First chunk:" + String.Join(',', firstChunk));
myList.RemoveAt(0);
var secondChunk = myChunks.Skip(1).First();
Console.WriteLine("Second chunk:" + String.Join(',', secondChunk));
// What outputs do we see for first and second chunk? Probably not what you would expect...

根据具体的实现，代码将失败并产生运行时错误或产生不直观的结果。

所以，至少有一个属性需要减弱。如果你想要一个无懈无击的惰性解决方案，你需要将输入类型限制为不可变类型，即使这样也不能直接覆盖所有用例。但是，如果您可以控制使用，您仍然可以选择最通用的解决方案，只要您确保以一种有效的方式使用它。否则，你可能会放弃懒惰，接受一定数量的复制。

最后，这完全取决于您的用例和需求，哪种解决方案最适合您。

2020-06-21 02:17:24

我认为下面的建议是最快的。为了能够使用数组，我牺牲了源Enumerable的惰性。复制和提前知道每个子列表的长度。

public static IEnumerable<T[]> Chunk<T>(this IEnumerable<T> items, int size)
{
    T[] array = items as T[] ?? items.ToArray();
    for (int i = 0; i < array.Length; i+=size)
    {
        T[] chunk = new T[Math.Min(size, array.Length - i)];
        Array.Copy(array, i, chunk, 0, chunk.Length);
        yield return chunk;
    }
}

2014-09-19 21:03:29

用LINQ将列表拆分为子列表

推荐文章

最新文章

标签