How do I read specific lines from a file (by line number)?

I'm using a for loop to read a file, but I only want to read specific lines, say line 26 and line 30. Is there any built-in feature to achieve this?

Current answer

def getitems(iterable, items):
  items = sorted(set(items)) # our own sorted copy from any iterable, duplicates
                             # dropped, since we modify it
  if items:
    for n, v in enumerate(iterable):
      if n == items[0]:
        yield v
        items.pop(0)
        if not items:
          break

with open("/usr/share/dict/words") as f:
  print(list(getitems(f, [25, 29])))
# ['Abelson\n', 'Abernathy\n']
# note that index 25 is the 26th item

2010-01-17 17:33:49

Other answers

File objects have a .readlines() method that gives you the file's contents as a list, one line per list item. After that, you can use ordinary list slicing.

http://docs.python.org/library/stdtypes.html#file.readlines
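
As a minimal sketch of the slicing approach, assuming the file fits comfortably in memory: the temporary file and its "line N" contents are made up for illustration, and the variable names are hypothetical.

```python
import os
import tempfile

# Build a small sample file (made-up data) so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"line {i}\n" for i in range(1, 41)))
    path = tmp.name

with open(path) as f:
    lines = f.readlines()      # whole file in memory, one list item per line

line_26 = lines[25]            # list indices are 0-based, so line 26 is index 25
line_30 = lines[29]
lines_26_to_30 = lines[25:30]  # slice covering lines 26 through 30

print(line_26, line_30, sep="")
os.remove(path)
```

Note the off-by-one: line number k lives at index k - 1.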

2010-01-17 17:18:33

I prefer this approach because it is more general: you can use it on a file, on the result of f.readlines(), on a StringIO object, whatever:

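def read_specific_lines(file, lines_to_read):
    """file is any iterable; lines_to_read is an iterable containing int values (1-based)"""
    lines = set(lines_to_read)
    if not lines:
        return  # nothing requested; max() would fail on an empty set
    last = max(lines)
    for n, line in enumerate(file, start=1):
        if n in lines:
            yield line
        if n >= last:
            return  # stop right after the last requested line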

>>> with open(r'c:\temp\words.txt') as f:
...     [s for s in read_specific_lines(f, [1, 2, 3, 1000])]
...
['A\n', 'a\n', 'aa\n', 'accordant\n']

2010-01-17 18:37:36

If your large text file is strictly well-structured (meaning every line has the same length l), you can jump straight to the n-th line:

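with open(file, 'rb') as f:  # binary mode, so seek() offsets are plain byte counts
    f.seek(n*l)              # n is 0-based; l is the line length in bytes, newline included
    line = f.readline()
    last_pos = f.tell()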

This only works for files in which every line has the same length!
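
The seek arithmetic can be tried out on a small generated file. This is only a sketch under stated assumptions: the helper name read_fixed_width_line, the parameter record_len, and the 8-byte sample records are hypothetical, and the file is opened in binary mode so that offsets are exact byte counts.

```python
import os
import tempfile

def read_fixed_width_line(path, n, record_len):
    """Return line n (0-based) from a file whose lines all occupy
    record_len bytes, newline included."""
    with open(path, "rb") as f:
        f.seek(n * record_len)  # jump directly to the start of line n
        return f.readline()

# Sample file (made-up data): 100 lines of 7 digits + newline = 8 bytes each.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    for i in range(100):
        tmp.write(b"%07d\n" % i)
    path = tmp.name

line = read_fixed_width_line(path, 25, 8)  # index 25 is the 26th line
print(line)
os.remove(path)
```

No earlier line is read at all, which is why this is attractive for very large files, but a single line of a different length shifts every later offset.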

2018-09-30 14:55:39

Don't use readlines!

My solution is:


with open(filename) as f:
    specify = [26, 30]  # 0-based indices from enumerate(f): the 27th and 31st lines
    results = list(
        map(lambda line: line[1],
            filter(lambda line: line[0] in specify,
                   enumerate(f))
            )
    )

Tested on a 6.5G file as follows:

import time

filename = 'a.txt'
start = time.time()
with open(filename, 'w') as f:
    for i in range(10_000_000):
        f.write(f'{str(i)*100}\n')       
end1 = time.time()

with open(filename) as f:
    specify = [26, 30]
    results = list(
        map(lambda line: line[1],
            filter(lambda line: line[0] in specify,
                   enumerate(f))
            )
    )
end2 = time.time()
print(f'write time: {end1-start}')
print(f'read time: {end2-end1}')
# write time: 14.38945460319519
# read time: 8.380386352539062

2022-04-07 13:01:15

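Reading files is very fast. Reading a 100MB file takes less than 0.1 seconds (see my article Reading and Writing Files with Python). So you should read it in full and then work with the individual lines.

Most answers here are not wrong, but they are bad style. Opening a file should always be done with with, because it makes sure the file gets closed again.

So you should do it like this: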

with open("path/to/file.txt") as f:
    lines = f.readlines()
print(lines[26])  # index 26 is the 27th line (0-based); do whatever you want with it
print(lines[30])  # index 30 is the 31st line

Huge files

If you have a huge file and memory consumption is a concern, you can process it line by line:

with open("path/to/file.txt") as f:
    for i, line in enumerate(f):
        pass  # process line i
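
For the original question (lines 26 and 30), the line-by-line loop could be filled in roughly as follows. This is a sketch, not the answer author's code: pick_lines is a hypothetical helper, the StringIO sample stands in for a real file, and line numbers are assumed to be 1-based.

```python
import io

def pick_lines(f, wanted):
    """Yield (line_number, line) for the 1-based line numbers in wanted,
    reading lazily and stopping after the last wanted line."""
    wanted = set(wanted)
    if not wanted:
        return
    last = max(wanted)
    for lineno, line in enumerate(f, start=1):
        if lineno in wanted:
            yield lineno, line
        if lineno >= last:
            break  # nothing further is needed, so stop reading

# Made-up sample data; any iterable of lines (such as an open file) works.
sample = io.StringIO("".join(f"row {i}\n" for i in range(1, 101)))
picked = dict(pick_lines(sample, [26, 30]))
print(picked)
```

Only one line is held in memory at a time, and the loop ends at line 30 instead of running through the rest of the file.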

2015-03-23 20:41:26
