我为自己编写了一个实用程序,将列表分解为给定大小的批次。我只是想知道是否已经有任何apache commons util用于此。

public static <T> List<List<T>> getBatches(List<T> collection,int batchSize){
    int i = 0;
    List<List<T>> batches = new ArrayList<List<T>>();
    while(i<collection.size()){
        int nextInc = Math.min(collection.size()-i,batchSize);
        List<T> batch = collection.subList(i,i+nextInc);
        batches.add(batch);
        i = i + nextInc;
    }

    return batches;
}

请让我知道是否有任何现有的公用事业已经相同。


当前回答

还有一个问题和这个问题完全一样,但如果你仔细阅读,你会发现它有微妙的不同。因此,如果有人(比如我)真的想将一个列表分割成给定数量的几乎相同大小的子列表,那么请继续阅读。

我只是简单地将这里描述的算法移植到Java。

@Test
public void shouldPartitionListIntoAlmostEquallySizedSublists() {

    List<String> list = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
    int numberOfPartitions = 3;

    List<List<String>> split = IntStream.range(0, numberOfPartitions).boxed()
            .map(i -> list.subList(
                    partitionOffset(list.size(), numberOfPartitions, i),
                    partitionOffset(list.size(), numberOfPartitions, i + 1)))
            .collect(toList());

    assertThat(split, hasSize(numberOfPartitions));
    assertEquals(list.size(), split.stream().flatMap(Collection::stream).count());
    assertThat(split, hasItems(Arrays.asList("a", "b", "c"), Arrays.asList("d", "e"), Arrays.asList("f", "g")));
}

private static int partitionOffset(int length, int numberOfPartitions, int partitionIndex) {
    return partitionIndex * (length / numberOfPartitions) + Math.min(partitionIndex, length % numberOfPartitions);
}

其他回答

您可以使用下面的代码来获得该批列表。

Iterable<List<T>> batchIds = Iterables.partition(list, batchSize);

你需要导入谷歌番石榴库来使用上面的代码。

另一种方法是使用收集器。索引的groupingBy,然后将分组索引映射到实际元素:

    final List<Integer> numbers = range(1, 12)
            .boxed()
            .collect(toList());
    System.out.println(numbers);

    final List<List<Integer>> groups = range(0, numbers.size())
            .boxed()
            .collect(groupingBy(index -> index / 4))
            .values()
            .stream()
            .map(indices -> indices
                    .stream()
                    .map(numbers::get)
                    .collect(toList()))
            .collect(toList());
    System.out.println(groups);

输出:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11]

下面是一个简单的Java 8+解决方案:

public static <T> Collection<List<T>> prepareChunks(List<T> inputList, int chunkSize) {
    AtomicInteger counter = new AtomicInteger();
    return inputList.stream().collect(Collectors.groupingBy(it -> counter.getAndIncrement() / chunkSize)).values();
}

解决这个问题的另一个方法是:

public class CollectionUtils {

    /**
    * Splits the collection into lists with given batch size
    * @param collection to split in to batches
    * @param batchsize size of the batch
    * @param <T> it maintains the input type to output type
    * @return nested list
    */
    public static <T> List<List<T>> makeBatch(Collection<T> collection, int batchsize) {

        List<List<T>> totalArrayList = new ArrayList<>();
        List<T> tempItems = new ArrayList<>();

        Iterator<T> iterator = collection.iterator();

        for (int i = 0; i < collection.size(); i++) {
            tempItems.add(iterator.next());
            if ((i+1) % batchsize == 0) {
                totalArrayList.add(tempItems);
                tempItems = new ArrayList<>();
            }
        }

        if (tempItems.size() > 0) {
            totalArrayList.add(tempItems);
        }

        return totalArrayList;
    }

}

下面是使用Java 8 Streams的解决方案:

        //Sample Input
        List<String> input = new ArrayList<String>();
        IntStream.range(1,999).forEach((num) -> {
            input.add(""+num);
        });
        
        //Identify no. of batches
        int BATCH_SIZE = 10;
        int multiples = input.size() /  BATCH_SIZE;
        if(input.size()%BATCH_SIZE!=0) {
            multiples = multiples + 1;
        }
        
        //Process each batch
        IntStream.range(0, multiples).forEach((indx)->{
            List<String> batch = input.stream().skip(indx * BATCH_SIZE).limit(BATCH_SIZE).collect(Collectors.toList());
            System.out.println("Batch Items:"+batch);
        });