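If you have a java.io.InputStream object, how should you process that object and produce a String?
Suppose I have an InputStream that contains text data, and I want to convert it to a String so that, for example, I can write it to a log file.
What is the easiest way to take the InputStream and convert it to a String?
public String convertStreamToString(InputStream is) {
    // ???
}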
Current answer
If you can't use Commons IO (FileUtils/IOUtils/CopyUtils), here is an example that uses a BufferedReader to read the file line by line:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class StringFromFile {
    public static void main(String[] args) /*throws UnsupportedEncodingException*/ {
        InputStream is = StringFromFile.class.getResourceAsStream("file.txt");
        BufferedReader br = new BufferedReader(new InputStreamReader(is/*, "UTF-8"*/));
        final int CHARS_PER_PAGE = 5000; //counting spaces
        StringBuilder builder = new StringBuilder(CHARS_PER_PAGE);
        try {
            // readLine() strips the line terminator, so '\n' is appended after every line
            for(String line=br.readLine(); line!=null; line=br.readLine()) {
                builder.append(line);
                builder.append('\n');
            }
        }
        catch (IOException ignore) { }
        String text = builder.toString();
        System.out.println(text);
    }
}
Or, if you want raw speed, I would propose a variation on what Paul de Vrieze suggested (which avoids using a StringWriter, which uses a StringBuffer internally):
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class StringFromFileFast {
    public static void main(String[] args) /*throws UnsupportedEncodingException*/ {
        InputStream is = StringFromFileFast.class.getResourceAsStream("file.txt");
        InputStreamReader input = new InputStreamReader(is/*, "UTF-8"*/);
        final int CHARS_PER_PAGE = 5000; //counting spaces
        final char[] buffer = new char[CHARS_PER_PAGE];
        StringBuilder output = new StringBuilder(CHARS_PER_PAGE);
        try {
            // Copy the decoded characters in chunks; read() returns -1 at end of stream
            for(int read = input.read(buffer, 0, buffer.length);
                    read != -1;
                    read = input.read(buffer, 0, buffer.length)) {
                output.append(buffer, 0, read);
            }
        } catch (IOException ignore) { }
        String text = output.toString();
        System.out.println(text);
    }
}
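For anyone who wants the chunked-read loop above in the shape the question asks for, here is a minimal sketch. The class name StreamToStringSketch and the extra Charset parameter are my own additions, not part of the answer above; it also closes the reader with try-with-resources and lets the IOException propagate instead of swallowing it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class StreamToStringSketch {
    // Hypothetical helper: same char-buffer loop as above, with an explicit charset.
    public static String convertStreamToString(InputStream is, Charset charset) throws IOException {
        char[] buffer = new char[5000];
        StringBuilder output = new StringBuilder();
        // try-with-resources closes the reader (and the wrapped stream) on exit
        try (Reader input = new InputStreamReader(is, charset)) {
            int read;
            while ((read = input.read(buffer, 0, buffer.length)) != -1) {
                output.append(buffer, 0, read);
            }
        }
        return output.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "line one\nline two".getBytes(StandardCharsets.UTF_8);
        System.out.println(convertStreamToString(new ByteArrayInputStream(data), StandardCharsets.UTF_8));
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding, which is what the commented-out "UTF-8" argument in the original code hints at.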
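Other answers
Note: This is probably not a good idea. This method uses recursion, so it will hit a StackOverflowError very quickly:
public String read(InputStream is) throws IOException {
    int next = is.read(); // read() returns an int: 0-255, or -1 at end of stream
    return next == -1 ? "" : (char) next + read(is); // Recursive part: reads next byte recursively
}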
The following code worked for me.
URL url = MyClass.class.getResource("/" + configFileName);
BufferedInputStream bi = (BufferedInputStream) url.getContent();
byte[] buffer = new byte[bi.available() ];
int bytesRead = bi.read(buffer);
String out = new String(buffer);
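Note that, according to the Java docs, the available() method might not work with InputStream, but it always works with BufferedInputStream. If you don't want to use the available() method, you can always use the following code: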
URL url = MyClass.class.getResource("/" + configFileName);
BufferedInputStream bi = (BufferedInputStream) url.getContent();
File f = new File(url.getPath());
byte[] buffer = new byte[ (int) f.length()];
int bytesRead = bi.read(buffer);
String out = new String(buffer);
I'm not sure whether there will be any encoding issues. If there is any problem with the code, please comment.
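On the open question above: two things seem worth checking. new String(buffer) decodes with the platform default charset, and a single read(buffer) call is allowed to return fewer bytes than requested, even when available() reports more. The sketch below is only an illustration; the class PartialReadSketch and its trickle() stream are made up for the demo, simulating a source that hands out at most 3 bytes per call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class PartialReadSketch {
    // Hypothetical stream that returns at most 3 bytes per bulk read, like a slow socket might.
    public static InputStream trickle(byte[] data) {
        return new FilterInputStream(new ByteArrayInputStream(data)) {
            @Override
            public int read(byte[] b, int off, int len) throws IOException {
                return super.read(b, off, Math.min(len, 3));
            }
        };
    }

    // The pattern from the answer: size the buffer with available(), then read once.
    public static int singleRead(InputStream in) throws IOException {
        byte[] buffer = new byte[in.available()];
        return in.read(buffer); // number of bytes actually delivered
    }

    // Loop until end of stream and decode with an explicit charset.
    public static String readLoop(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n);
        }
        return new String(out.toByteArray(), charset);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world".getBytes(StandardCharsets.UTF_8);
        System.out.println(singleRead(trickle(data)));
        System.out.println(readLoop(trickle(data), StandardCharsets.UTF_8));
    }
}
```

With this simulated stream the single read delivers only part of the data while the loop recovers all of it, so the loop is the safer default whenever the source is not known to be a fully buffered local file.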
This is not the simplest solution to the problem, but since NIO streams and channels have not been mentioned, here is a version that uses NIO channels and a ByteBuffer to convert a stream to a string.
public static String streamToStringChannel(InputStream in, String encoding, int bufSize) throws IOException {
    ReadableByteChannel channel = Channels.newChannel(in);
    ByteBuffer byteBuffer = ByteBuffer.allocate(bufSize);
    ByteArrayOutputStream bout = new ByteArrayOutputStream();
    WritableByteChannel outChannel = Channels.newChannel(bout);
    while (channel.read(byteBuffer) > 0 || byteBuffer.position() > 0) {
        byteBuffer.flip(); //make buffer ready for write
        outChannel.write(byteBuffer);
        byteBuffer.compact(); //make buffer ready for reading
    }
    channel.close();
    outChannel.close();
    return bout.toString(encoding);
}
Here is an example of how to use it:
try (InputStream in = new FileInputStream("/tmp/large_file.xml")) {
    String x = streamToStringChannel(in, "UTF-8", 1);
    System.out.println(x);
}
For large files, this method should perform well.
Here is a summary of the 11 main ways I found to do this (see below). I also wrote some performance tests (see the results below):
Ways to convert an InputStream to a String:
1. Using IOUtils.toString (Apache Utils)
    String result = IOUtils.toString(inputStream, StandardCharsets.UTF_8);
2. Using CharStreams (Guava)
    String result = CharStreams.toString(new InputStreamReader(inputStream, Charsets.UTF_8));
3. Using Scanner (JDK)
    Scanner s = new Scanner(inputStream).useDelimiter("\\A");
    String result = s.hasNext() ? s.next() : "";
4. Using Stream API (Java 8). Warning: this solution converts different line breaks (like \r\n) to \n.
    String result = new BufferedReader(new InputStreamReader(inputStream)).lines().collect(Collectors.joining("\n"));
5. Using parallel Stream API (Java 8). Warning: this solution converts different line breaks (like \r\n) to \n.
    String result = new BufferedReader(new InputStreamReader(inputStream)).lines().parallel().collect(Collectors.joining("\n"));
6. Using InputStreamReader and StringBuilder (JDK)
    int bufferSize = 1024;
    char[] buffer = new char[bufferSize];
    StringBuilder out = new StringBuilder();
    Reader in = new InputStreamReader(stream, StandardCharsets.UTF_8);
    for (int numRead; (numRead = in.read(buffer, 0, buffer.length)) > 0; ) {
        out.append(buffer, 0, numRead);
    }
    return out.toString();
7. Using StringWriter and IOUtils.copy (Apache Commons)
    StringWriter writer = new StringWriter();
    IOUtils.copy(inputStream, writer, "UTF-8");
    return writer.toString();
8. Using ByteArrayOutputStream and inputStream.read (JDK)
    ByteArrayOutputStream result = new ByteArrayOutputStream();
    byte[] buffer = new byte[1024];
    for (int length; (length = inputStream.read(buffer)) != -1; ) {
        result.write(buffer, 0, length);
    }
    // StandardCharsets.UTF_8.name() > JDK 7
    return result.toString("UTF-8");
9. Using BufferedReader (JDK). Warning: this solution converts different line breaks (like \n\r) to the line.separator system property (for example, to "\r\n" on Windows).
    String newLine = System.getProperty("line.separator");
    BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
    StringBuilder result = new StringBuilder();
    for (String line; (line = reader.readLine()) != null; ) {
        if (result.length() > 0) {
            result.append(newLine);
        }
        result.append(line);
    }
    return result.toString();
10. Using BufferedInputStream and ByteArrayOutputStream (JDK)
    BufferedInputStream bis = new BufferedInputStream(inputStream);
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    for (int result = bis.read(); result != -1; result = bis.read()) {
        buf.write((byte) result);
    }
    // StandardCharsets.UTF_8.name() > JDK 7
    return buf.toString("UTF-8");
11. Using inputStream.read() and StringBuilder (JDK). Warning: this solution has problems with Unicode, for example with Russian text (it works correctly only with non-Unicode text).
    StringBuilder sb = new StringBuilder();
    for (int ch; (ch = inputStream.read()) != -1; ) {
        sb.append((char) ch);
    }
    return sb.toString();
Warnings:
Solutions 4, 5 and 9 convert different line breaks to a single one. Solution 11 does not work correctly with Unicode text.
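To make the two warnings concrete, here is a small sketch that runs solution 4 and solution 11 against in-memory data. The class name WarningsSketch and the sample inputs are my own; the method bodies follow the solutions listed above, except that the charset is passed explicitly for solution 4.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

class WarningsSketch {
    // Solution 4: lines() drops the original terminators, joining re-inserts "\n".
    public static String viaLines(byte[] data) {
        return new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), StandardCharsets.UTF_8))
                .lines().collect(Collectors.joining("\n"));
    }

    // Solution 11: each byte is cast to a char, so multi-byte UTF-8 sequences are split.
    public static String viaByteCast(byte[] data) throws IOException {
        InputStream in = new ByteArrayInputStream(data);
        StringBuilder sb = new StringBuilder();
        for (int ch; (ch = in.read()) != -1; ) {
            sb.append((char) ch);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Windows line breaks come back as "\n", and the trailing one is lost
        String lines = viaLines("a\r\nb\r\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(lines.equals("a\nb"));

        // One Cyrillic letter (U+043F) is two bytes in UTF-8 and becomes two wrong chars
        String cast = viaByteCast("\u043f".getBytes(StandardCharsets.UTF_8));
        System.out.println(cast.length());
    }
}
```

So if the exact bytes matter (for example when hashing or round-tripping the content), the buffer-based solutions such as 6 or 8 are the safer picks.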
Performance tests
Performance tests for a small String (length = 175), url in github (mode = Average Time, system = Linux, score 1,343 is the best):
Benchmark Mode Cnt Score Error Units
8. ByteArrayOutputStream and read (JDK) avgt 10 1,343 ± 0,028 us/op
6. InputStreamReader and StringBuilder (JDK) avgt 10 6,980 ± 0,404 us/op
10. BufferedInputStream, ByteArrayOutputStream avgt 10 7,437 ± 0,735 us/op
11. InputStream.read() and StringBuilder (JDK) avgt 10 8,977 ± 0,328 us/op
7. StringWriter and IOUtils.copy (Apache) avgt 10 10,613 ± 0,599 us/op
1. IOUtils.toString (Apache Utils) avgt 10 10,605 ± 0,527 us/op
3. Scanner (JDK) avgt 10 12,083 ± 0,293 us/op
2. CharStreams (guava) avgt 10 12,999 ± 0,514 us/op
4. Stream Api (Java 8) avgt 10 15,811 ± 0,605 us/op
9. BufferedReader (JDK) avgt 10 16,038 ± 0,711 us/op
5. parallel Stream Api (Java 8) avgt 10 21,544 ± 0,583 us/op
Performance tests for a big String (length = 50100), url in github (mode = Average Time, system = Linux, score 200,715 is the best):
Benchmark Mode Cnt Score Error Units
8. ByteArrayOutputStream and read (JDK) avgt 10 200,715 ± 18,103 us/op
1. IOUtils.toString (Apache Utils) avgt 10 300,019 ± 8,751 us/op
6. InputStreamReader and StringBuilder (JDK) avgt 10 347,616 ± 130,348 us/op
7. StringWriter and IOUtils.copy (Apache) avgt 10 352,791 ± 105,337 us/op
2. CharStreams (guava) avgt 10 420,137 ± 59,877 us/op
9. BufferedReader (JDK) avgt 10 632,028 ± 17,002 us/op
5. parallel Stream Api (Java 8) avgt 10 662,999 ± 46,199 us/op
4. Stream Api (Java 8) avgt 10 701,269 ± 82,296 us/op
10. BufferedInputStream, ByteArrayOutputStream avgt 10 740,837 ± 5,613 us/op
3. Scanner (JDK) avgt 10 751,417 ± 62,026 us/op
11. InputStream.read() and StringBuilder (JDK) avgt 10 2919,350 ± 1101,942 us/op
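Graphs (performance tests depending on input stream length, on a Windows 7 system)
Performance test (Average Time) depending on input stream length, on a Windows 7 system: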
length 182 546 1092 3276 9828 29484 58968
test8 0.38 0.938 1.868 4.448 13.412 36.459 72.708
test4 2.362 3.609 5.573 12.769 40.74 81.415 159.864
test5 3.881 5.075 6.904 14.123 50.258 129.937 166.162
test9 2.237 3.493 5.422 11.977 45.98 89.336 177.39
test6 1.261 2.12 4.38 10.698 31.821 86.106 186.636
test7 1.601 2.391 3.646 8.367 38.196 110.221 211.016
test1 1.529 2.381 3.527 8.411 40.551 105.16 212.573
test3 3.035 3.934 8.606 20.858 61.571 118.744 235.428
test2 3.136 6.238 10.508 33.48 43.532 118.044 239.481
test10 1.593 4.736 7.527 20.557 59.856 162.907 323.147
test11 3.913 11.506 23.26 68.644 207.591 600.444 1211.545
Use java.io.InputStream.transferTo(OutputStream), supported since Java 9, together with ByteArrayOutputStream.toString(String), which takes the charset name:
public static String gobble(InputStream in, String charsetName) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    in.transferTo(bos);
    return bos.toString(charsetName);
}
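A closely related option, also available since Java 9, is InputStream.readAllBytes(). The sketch below is a variant of the method above, not a replacement for it; the class name ReadAllBytesSketch is made up, and it takes a Charset object instead of a charset name so that no UnsupportedEncodingException can occur.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ReadAllBytesSketch {
    // Reads the whole stream into memory, then decodes it in one step.
    public static String gobble(InputStream in, Charset charset) throws IOException {
        return new String(in.readAllBytes(), charset);
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream("some text".getBytes(StandardCharsets.UTF_8));
        System.out.println(gobble(in, StandardCharsets.UTF_8));
    }
}
```

Like the transferTo version, this holds the entire content in memory, so it fits log-sized text better than multi-gigabyte streams.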