如何将整个输入流读到字节数组?


当前回答

在将S3对象转换为ByteArray时,我们看到一些AWS事务的延迟。

注意:S3对象为PDF文档(最大大小为3mb)。

我们使用选项#1将S3对象转换为ByteArray。我们注意到S3提供了内置IOUtils方法来将S3对象转换为ByteArray,我们请求您确认将S3对象转换为ByteArray的最佳方法以避免延迟。

选项1:

import org.apache.commons.io.IOUtils;
is = s3object.getObjectContent();
content =IOUtils.toByteArray(is);

选项2:

import com.amazonaws.util.IOUtils;
is = s3object.getObjectContent();
content =IOUtils.toByteArray(is);

也让我知道,如果我们有任何其他更好的方法来转换s3对象到bytearray

其他回答

我用这个。

public static byte[] toByteArray(InputStream is) throws IOException {
        ByteArrayOutputStream output = new ByteArrayOutputStream();
        try {
            byte[] b = new byte[4096];
            int n = 0;
            while ((n = is.read(b)) != -1) {
                output.write(b, 0, n);
            }
            return output.toByteArray();
        } finally {
            output.close();
        }
    }

下面是一个优化的版本,尽量避免复制数据字节:

private static byte[] loadStream (InputStream stream) throws IOException {
   int available = stream.available();
   int expectedSize = available > 0 ? available : -1;
   return loadStream(stream, expectedSize);
}

private static byte[] loadStream (InputStream stream, int expectedSize) throws IOException {
   int basicBufferSize = 0x4000;
   int initialBufferSize = (expectedSize >= 0) ? expectedSize : basicBufferSize;
   byte[] buf = new byte[initialBufferSize];
   int pos = 0;
   while (true) {
      if (pos == buf.length) {
         int readAhead = -1;
         if (pos == expectedSize) {
            readAhead = stream.read();       // test whether EOF is at expectedSize
            if (readAhead == -1) {
               return buf;
            }
         }
         int newBufferSize = Math.max(2 * buf.length, basicBufferSize);
         buf = Arrays.copyOf(buf, newBufferSize);
         if (readAhead != -1) {
            buf[pos++] = (byte)readAhead;
         }
      }
      int len = stream.read(buf, pos, buf.length - pos);
      if (len < 0) {
         return Arrays.copyOf(buf, pos);
      }
      pos += len;
   }
}

Kotlin中的解决方案(当然也可以在Java中工作),其中包括当你知道大小时的两种情况:

    fun InputStream.readBytesWithSize(size: Long): ByteArray? {
        return when {
            size < 0L -> this.readBytes()
            size == 0L -> ByteArray(0)
            size > Int.MAX_VALUE -> null
            else -> {
                val sizeInt = size.toInt()
                val result = ByteArray(sizeInt)
                readBytesIntoByteArray(result, sizeInt)
                result
            }
        }
    }

    fun InputStream.readBytesIntoByteArray(byteArray: ByteArray,bytesToRead:Int=byteArray.size) {
        var offset = 0
        while (true) {
            val read = this.read(byteArray, offset, bytesToRead - offset)
            if (read == -1)
                break
            offset += read
            if (offset >= bytesToRead)
                break
        }
    }

如果您知道大小,那么与其他解决方案相比,它可以节省两倍的内存(在很短的时间内,但仍然可能有用)。这是因为您必须将整个流读到末尾,然后将其转换为字节数组(类似于将数组转换为数组的ArrayList)。

例如,如果你在Android上,你有一些Uri要处理,你可以尝试用这个来获取大小:

    fun getStreamLengthFromUri(context: Context, uri: Uri): Long {
        context.contentResolver.query(uri, arrayOf(MediaStore.MediaColumns.SIZE), null, null, null)?.use {
            if (!it.moveToNext())
                return@use
            val fileSize = it.getLong(it.getColumnIndex(MediaStore.MediaColumns.SIZE))
            if (fileSize > 0)
                return fileSize
        }
        //if you wish, you can also get the file-path from the uri here, and then try to get its size, using this: https://stackoverflow.com/a/61835665/878126
        FileUtilEx.getFilePathFromUri(context, uri, false)?.use {
            val file = it.file
            val fileSize = file.length()
            if (fileSize > 0)
                return fileSize
        }
        context.contentResolver.openInputStream(uri)?.use { inputStream ->
            if (inputStream is FileInputStream)
                return inputStream.channel.size()
            else {
                var bytesCount = 0L
                while (true) {
                    val available = inputStream.available()
                    if (available == 0)
                        break
                    val skip = inputStream.skip(available.toLong())
                    if (skip < 0)
                        break
                    bytesCount += skip
                }
                if (bytesCount > 0L)
                    return bytesCount
            }
        }
        return -1L
    }

Java 8方式(感谢BufferedReader和Adam Bien)

private static byte[] readFully(InputStream input) throws IOException {
    try (BufferedReader buffer = new BufferedReader(new InputStreamReader(input))) {
        return buffer.lines().collect(Collectors.joining("\n")).getBytes(<charset_can_be_specified>);
    }
}

注意,这个解决方案删除回车符('\r'),可能是不合适的。

下面的代码

public static byte[] serializeObj(Object obj) throws IOException {
  ByteArrayOutputStream baOStream = new ByteArrayOutputStream();
  ObjectOutputStream objOStream = new ObjectOutputStream(baOStream);

  objOStream.writeObject(obj); 
  objOStream.flush();
  objOStream.close();
  return baOStream.toByteArray(); 
} 

OR

BufferedImage img = ...
ByteArrayOutputStream baos = new ByteArrayOutputStream(1000);
ImageIO.write(img, "jpeg", baos);
baos.flush();
byte[] result = baos.toByteArray();
baos.close();