如果您有java.io.InputStream对象,应该如何处理该对象并生成String?


假设我有一个包含文本数据的InputStream,我想将其转换为String,例如,我可以将其写入日志文件。

获取InputStream并将其转换为字符串的最简单方法是什么?

public String convertStreamToString(InputStream is) {
// ???
}

当前回答

我在这里对14个不同的答案做了一个基准测试(很抱歉没有提供学分,但有太多重复)。

结果非常令人惊讶。事实证明,Apache IOUtils是最慢的解决方案,ByteArrayOutputStream是最快的解决方案:

因此,首先是最好的方法:

public String inputStreamToString(InputStream inputStream) throws IOException {
    try(ByteArrayOutputStream result = new ByteArrayOutputStream()) {
        byte[] buffer = new byte[1024];
        int length;
        while ((length = inputStream.read(buffer)) != -1) {
            result.write(buffer, 0, length);
        }

        return result.toString(UTF_8);
    }
}

20个周期内20 MB随机字节的基准结果

时间(毫秒)

字节数组输出流测试:194NioStream:198Java9ISTransferTo:201Java9ISReadAllBytes:205缓冲输入流VsByteArray输出流:314ApacheStringWriter2:574GuavaCharStreams:589扫描仪读取器无下一测试:614扫描仪读数:633ApacheStringWriter:1544StreamApi:错误ParallelStreamApi:错误BufferReaderTest:错误InputStreamAndStringBuilder:错误

基准源代码

import com.google.common.io.CharStreams;
import org.apache.commons.io.IOUtils;

import java.io.*;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.stream.Collectors;

/**
 * Created by Ilya Gazman on 2/13/18.
 */
public class InputStreamToString {


    private static final String UTF_8 = "UTF-8";

    public static void main(String... args) {
        log("App started");
        byte[] bytes = new byte[1024 * 1024];
        new Random().nextBytes(bytes);
        log("Stream is ready\n");

        try {
            test(bytes);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private static void test(byte[] bytes) throws IOException {
        List<Stringify> tests = Arrays.asList(
                new ApacheStringWriter(),
                new ApacheStringWriter2(),
                new NioStream(),
                new ScannerReader(),
                new ScannerReaderNoNextTest(),
                new GuavaCharStreams(),
                new StreamApi(),
                new ParallelStreamApi(),
                new ByteArrayOutputStreamTest(),
                new BufferReaderTest(),
                new BufferedInputStreamVsByteArrayOutputStream(),
                new InputStreamAndStringBuilder(),
                new Java9ISTransferTo(),
                new Java9ISReadAllBytes()
        );

        String solution = new String(bytes, "UTF-8");

        for (Stringify test : tests) {
            try (ByteArrayInputStream inputStream = new ByteArrayInputStream(bytes)) {
                String s = test.inputStreamToString(inputStream);
                if (!s.equals(solution)) {
                    log(test.name() + ": Error");
                    continue;
                }
            }
            long startTime = System.currentTimeMillis();
            for (int i = 0; i < 20; i++) {
                try (ByteArrayInputStream inputStream = new ByteArrayInputStream(bytes)) {
                    test.inputStreamToString(inputStream);
                }
            }
            log(test.name() + ": " + (System.currentTimeMillis() - startTime));
        }
    }

    private static void log(String message) {
        System.out.println(message);
    }

    interface Stringify {
        String inputStreamToString(InputStream inputStream) throws IOException;

        default String name() {
            return this.getClass().getSimpleName();
        }
    }

    static class ApacheStringWriter implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            StringWriter writer = new StringWriter();
            IOUtils.copy(inputStream, writer, UTF_8);
            return writer.toString();
        }
    }

    static class ApacheStringWriter2 implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            return IOUtils.toString(inputStream, UTF_8);
        }
    }

    static class NioStream implements Stringify {

        @Override
        public String inputStreamToString(InputStream in) throws IOException {
            ReadableByteChannel channel = Channels.newChannel(in);
            ByteBuffer byteBuffer = ByteBuffer.allocate(1024 * 16);
            ByteArrayOutputStream bout = new ByteArrayOutputStream();
            WritableByteChannel outChannel = Channels.newChannel(bout);
            while (channel.read(byteBuffer) > 0 || byteBuffer.position() > 0) {
                byteBuffer.flip();  //make buffer ready for write
                outChannel.write(byteBuffer);
                byteBuffer.compact(); //make buffer ready for reading
            }
            channel.close();
            outChannel.close();
            return bout.toString(UTF_8);
        }
    }

    static class ScannerReader implements Stringify {

        @Override
        public String inputStreamToString(InputStream is) throws IOException {
            java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
            return s.hasNext() ? s.next() : "";
        }
    }

    static class ScannerReaderNoNextTest implements Stringify {

        @Override
        public String inputStreamToString(InputStream is) throws IOException {
            java.util.Scanner s = new java.util.Scanner(is).useDelimiter("\\A");
            return s.next();
        }
    }

    static class GuavaCharStreams implements Stringify {

        @Override
        public String inputStreamToString(InputStream is) throws IOException {
            return CharStreams.toString(new InputStreamReader(
                    is, UTF_8));
        }
    }

    static class StreamApi implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            return new BufferedReader(new InputStreamReader(inputStream))
                    .lines().collect(Collectors.joining("\n"));
        }
    }

    static class ParallelStreamApi implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            return new BufferedReader(new InputStreamReader(inputStream)).lines()
                    .parallel().collect(Collectors.joining("\n"));
        }
    }

    static class ByteArrayOutputStreamTest implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            try(ByteArrayOutputStream result = new ByteArrayOutputStream()) {
                byte[] buffer = new byte[1024];
                int length;
                while ((length = inputStream.read(buffer)) != -1) {
                    result.write(buffer, 0, length);
                }

                return result.toString(UTF_8);
            }
        }
    }

    static class BufferReaderTest implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            String newLine = System.getProperty("line.separator");
            BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
            StringBuilder result = new StringBuilder(UTF_8);
            String line;
            boolean flag = false;
            while ((line = reader.readLine()) != null) {
                result.append(flag ? newLine : "").append(line);
                flag = true;
            }
            return result.toString();
        }
    }

    static class BufferedInputStreamVsByteArrayOutputStream implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            BufferedInputStream bis = new BufferedInputStream(inputStream);
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            int result = bis.read();
            while (result != -1) {
                buf.write((byte) result);
                result = bis.read();
            }

            return buf.toString(UTF_8);
        }
    }

    static class InputStreamAndStringBuilder implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            int ch;
            StringBuilder sb = new StringBuilder(UTF_8);
            while ((ch = inputStream.read()) != -1)
                sb.append((char) ch);
            return sb.toString();
        }
    }

    static class Java9ISTransferTo implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            inputStream.transferTo(bos);
            return bos.toString(UTF_8);
        }
    }

    static class Java9ISReadAllBytes implements Stringify {

        @Override
        public String inputStreamToString(InputStream inputStream) throws IOException {
            return new String(inputStream.readAllBytes(), UTF_8);
        }
    }

}

其他回答

这里是在不使用任何第三方库的情况下将InputStream转换为String的完整方法。对于单线程环境使用StringBuilder,否则使用StringBuffer。

public static String getString( InputStream is) throws IOException {
    int ch;
    StringBuilder sb = new StringBuilder();
    while((ch = is.read()) != -1)
        sb.append((char)ch);
    return sb.toString();
}

在Groovy中

inputStream.getText()

基于已接受的Apache Commons答案的第二部分,但在始终关闭流的情况下填补了一个小缺口:

    String theString;
    try {
        theString = IOUtils.toString(inputStream, encoding);
    } finally {
        IOUtils.closeQuietly(inputStream);
    }

我做了一些计时测试,因为时间总是很重要的。

我试图以3种不同的方式将响应转换为字符串。(如下所示)为了可读性,我省略了try/catch块。

为了给出上下文,这是所有3种方法的前面代码:

   String response;
   String url = "www.blah.com/path?key=value";
   GetMethod method = new GetMethod(url);
   int status = client.executeMethod(method);

1)

 response = method.getResponseBodyAsString();

2)

InputStream resp = method.getResponseBodyAsStream();
InputStreamReader is=new InputStreamReader(resp);
BufferedReader br=new BufferedReader(is);
String read = null;
StringBuffer sb = new StringBuffer();
while((read = br.readLine()) != null) {
    sb.append(read);
}
response = sb.toString();

3)

InputStream iStream  = method.getResponseBodyAsStream();
StringWriter writer = new StringWriter();
IOUtils.copy(iStream, writer, "UTF-8");
response = writer.toString();

因此,在使用相同的请求/响应数据对每种方法运行了500次测试之后,以下是数字。再次,这些是我的发现,你的发现可能不完全相同,但我写这篇文章是为了向其他人说明这些方法的效率差异。

排名:方法#1进近#3-比#1慢2.6%2号进近——比1号进近慢4.3%

任何这些方法都是获取响应并从中创建字符串的适当解决方案。

总结我找到的11种主要方法(见下文)。我写了一些性能测试(见下面的结果):

将InputStream转换为字符串的方法:

使用IOUtils.toString(Apache Utils)字符串结果=IOTils.toString(inputStream,StandardCharsets.UTF_8);使用CharStreams(Guava)字符串结果=CharStreams.toString(新InputStreamReader(inputStream、字符集UTF_8));使用扫描仪(JDK)扫描仪s=新扫描仪(inputStream)。使用分隔符(“\\A”);字符串结果=s.hasNext()?s.next():“”;使用流API(Java 8)。警告:此解决方案将不同的换行符(如\r\n)转换为\n。字符串结果=新BufferedReader(新InputStreamReader(inputStream)).line().collector(Collectors.joining(“\n”));使用并行流API(Java8)。警告:此解决方案将不同的换行符(如\r\n)转换为\n。字符串结果=新BufferedReader(新InputStreamReader(inputStream)).line().allel().collector(Collectors.joining(“\n”));使用InputStreamReader和StringBuilder(JDK)int bufferSize=1024;char[]缓冲区=新字符[bufferSize];StringBuilder out=新StringBuilder();Reader in=新InputStreamReader(流,StandardCharsets.UTF_8);for(int numRead;(numRead=in.read(buffer,0,buffer.length))>0;){out.append(缓冲区,0,numRead);}return out.toString();使用StringWriter和IOUtils.copy(Apache Commons)StringWriter writer=新StringWriter();IOUtils.copy(inputStream,writer,“UTF-8”);return writer.toString();使用ByteArrayOutputStream和inputStream.read(JDK)ByteArrayOutputStream result=new ByteArrayOutputStream();byte[]缓冲区=新字节[1024];for(int length;(length=inputStream.read(缓冲区))!=-1; ) {result.write(缓冲区,0,长度);}//StandardCharsets.UTF_8.name()>JDK 7return result.toString(“UTF-8”);使用BufferedReader(JDK)。警告:此解决方案将不同的换行符(如\n\r)转换为line.separator系统属性(例如,在Windows中转换为“\r\n”)。String newLine=System.getProperty(“line.sepaper”);BufferedReader读取器=新的BufferedReader(新的InputStreamReader(inputStream));StringBuilder result=new StringBuilder();for(字符串行;(行=reader.readLine())!=null;){如果(result.length()>0){result.append(换行符);}result.append(行);}return result.toString();使用BufferedInputStream和ByteArrayOutputStream(JDK)BufferedInputStream bis=新缓冲输入流(inputStream);ByteArrayOutputStream buf=新ByteArrayOutputStream();for(int result=bis.read();结果!=-1.result=bis.read()){buf.write((字节)结果);}//StandardCharsets.UTF_8.name()>JDK 7return buf.toString(“UTF-8”);使用inputStream.read()和StringBuilder(JDK)。警告:此解决方案存在Unicode问题,例如俄语文本(仅适用于非Unicode文本)StringBuilder sb=新StringBuilder();for(int ch;(ch=inputStream.read())!=-1; ) {sb.append((char)ch);}return sb.toString();

警告:

解决方案4、5和9将不同的换行符转换为一个。解决方案11无法正确使用Unicode文本

性能测试

小字符串(长度=175)的性能测试,github中的url(模式=平均时间,系统=Linux,分数1343最好):

              Benchmark                         Mode  Cnt   Score   Error  Units
 8. ByteArrayOutputStream and read (JDK)        avgt   10   1,343 ± 0,028  us/op
 6. InputStreamReader and StringBuilder (JDK)   avgt   10   6,980 ± 0,404  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10   7,437 ± 0,735  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10   8,977 ± 0,328  us/op
 7. StringWriter and IOUtils.copy (Apache)      avgt   10  10,613 ± 0,599  us/op
 1. IOUtils.toString (Apache Utils)             avgt   10  10,605 ± 0,527  us/op
 3. Scanner (JDK)                               avgt   10  12,083 ± 0,293  us/op
 2. CharStreams (guava)                         avgt   10  12,999 ± 0,514  us/op
 4. Stream Api (Java 8)                         avgt   10  15,811 ± 0,605  us/op
 9. BufferedReader (JDK)                        avgt   10  16,038 ± 0,711  us/op
 5. parallel Stream Api (Java 8)                avgt   10  21,544 ± 0,583  us/op

对大字符串(长度=50100)、github中的url进行性能测试(模式=平均时间,系统=Linux,得分200715最好):

               Benchmark                        Mode  Cnt   Score        Error  Units
 8. ByteArrayOutputStream and read (JDK)        avgt   10   200,715 ±   18,103  us/op
 1. IOUtils.toString (Apache Utils)             avgt   10   300,019 ±    8,751  us/op
 6. InputStreamReader and StringBuilder (JDK)   avgt   10   347,616 ±  130,348  us/op
 7. StringWriter and IOUtils.copy (Apache)      avgt   10   352,791 ±  105,337  us/op
 2. CharStreams (guava)                         avgt   10   420,137 ±   59,877  us/op
 9. BufferedReader (JDK)                        avgt   10   632,028 ±   17,002  us/op
 5. parallel Stream Api (Java 8)                avgt   10   662,999 ±   46,199  us/op
 4. Stream Api (Java 8)                         avgt   10   701,269 ±   82,296  us/op
10. BufferedInputStream, ByteArrayOutputStream  avgt   10   740,837 ±    5,613  us/op
 3. Scanner (JDK)                               avgt   10   751,417 ±   62,026  us/op
11. InputStream.read() and StringBuilder (JDK)  avgt   10  2919,350 ± 1101,942  us/op

图表(性能测试取决于Windows 7系统中的输入流长度)

性能测试(平均时间)取决于Windows 7系统中的输入流长度:

 length  182    546     1092    3276    9828    29484   58968

 test8  0.38    0.938   1.868   4.448   13.412  36.459  72.708
 test4  2.362   3.609   5.573   12.769  40.74   81.415  159.864
 test5  3.881   5.075   6.904   14.123  50.258  129.937 166.162
 test9  2.237   3.493   5.422   11.977  45.98   89.336  177.39
 test6  1.261   2.12    4.38    10.698  31.821  86.106  186.636
 test7  1.601   2.391   3.646   8.367   38.196  110.221 211.016
 test1  1.529   2.381   3.527   8.411   40.551  105.16  212.573
 test3  3.035   3.934   8.606   20.858  61.571  118.744 235.428
 test2  3.136   6.238   10.508  33.48   43.532  118.044 239.481
 test10 1.593   4.736   7.527   20.557  59.856  162.907 323.147
 test11 3.913   11.506  23.26   68.644  207.591 600.444 1211.545