我需要使用Java逐行读取大约5-6 GB的大型文本文件。

我怎样才能快速做到呢?


当前回答

在Java 7中:

String folderPath = "C:/folderOfMyFile";
Path path = Paths.get(folderPath, "myFileName.csv"); //or any text file eg.: txt, bat, etc
Charset charset = Charset.forName("UTF-8");

try (BufferedReader reader = Files.newBufferedReader(path , charset)) {
  while ((line = reader.readLine()) != null ) {
    //separate all csv fields into string array
    String[] lineVariables = line.split(","); 
  }
} catch (IOException e) {
    System.err.println(e);
}

其他回答

下面是一个完整的错误处理示例,并支持java 7之前的字符集规范。在Java 7中,您可以使用try-with-resources语法,这使得代码更简洁。

如果你只想要默认字符集,你可以跳过InputStream,使用FileReader。

InputStream ins = null; // raw byte-stream
Reader r = null; // cooked reader
BufferedReader br = null; // buffered for readLine()
try {
    String s;
    if (true) {
        String data = "#foobar\t1234\n#xyz\t5678\none\ttwo\n";
        ins = new ByteArrayInputStream(data.getBytes());
    } else {
        ins = new FileInputStream("textfile.txt");
    }
    r = new InputStreamReader(ins, "UTF-8"); // leave charset out for default
    br = new BufferedReader(r);
    while ((s = br.readLine()) != null) {
        System.out.println(s);
    }
}
catch (Exception e)
{
    System.err.println(e.getMessage()); // handle exception
}
finally {
    if (br != null) { try { br.close(); } catch(Throwable t) { /* ensure close happens */ } }
    if (r != null) { try { r.close(); } catch(Throwable t) { /* ensure close happens */ } }
    if (ins != null) { try { ins.close(); } catch(Throwable t) { /* ensure close happens */ } }
}

下面是Groovy版本,有完整的错误处理:

File f = new File("textfile.txt");
f.withReader("UTF-8") { br ->
    br.eachLine { line ->
        println line;
    }
}

看看这个博客:

Java逐行读取文件- Java教程

可以指定缓冲区大小或 可以使用默认大小。的 违约对大多数人来说已经足够大了 目的。

// Open the file
FileInputStream fstream = new FileInputStream("textfile.txt");
BufferedReader br = new BufferedReader(new InputStreamReader(fstream));

String strLine;

//Read File Line By Line
while ((strLine = br.readLine()) != null)   {
  // Print the content on the console
  System.out.println (strLine);
}

//Close the input stream
fstream.close();

你可以使用流更精确地做到这一点:

Files.lines(Paths.get("input.txt")).forEach(s -> stringBuffer.append(s);

在Java 8中,除了使用Files.lines(),还有另一种方法。如果您的输入源不是文件,而是更抽象的东西,如Reader或InputStream,则可以通过BufferedReaders lines()方法对行进行流处理。

例如:

try (BufferedReader reader = new BufferedReader(...)) {
  reader.lines().forEach(line -> processLine(line));
}

BufferedReader读取的每个输入行都会调用processLine()。

BufferedReader br;
FileInputStream fin;
try {
    fin = new FileInputStream(fileName);
    br = new BufferedReader(new InputStreamReader(fin));

    /*Path pathToFile = Paths.get(fileName);
    br = Files.newBufferedReader(pathToFile,StandardCharsets.US_ASCII);*/

    String line = br.readLine();
    while (line != null) {
        String[] attributes = line.split(",");
        Movie movie = createMovie(attributes);
        movies.add(movie);
        line = br.readLine();
    }
    fin.close();
    br.close();
} catch (FileNotFoundException e) {
    System.out.println("Your Message");
} catch (IOException e) {
    System.out.println("Your Message");
}

这对我很管用。希望它也能帮助到你。