Using Pandas to pd.read_excel() for multiple worksheets of the same workbook

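I have a large spreadsheet file (.xlsx) that I'm processing with Python pandas. It happens that I need data from two tabs (sheets) in that large file. One of the tabs holds a lot of data, while the other is only a few cells.

When I use pd.read_excel() on any worksheet, it looks as though the whole file is loaded (not just the worksheet I'm interested in). So when I call it twice (once per sheet), I effectively have to sit through the entire workbook being read twice, even though only the specified sheet is used each time.

How do I load only specific sheets with pd.read_excel()?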

Current answer

You can read all the sheets with the following lines:

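import pandas as pd

# open the workbook once and reuse the same object for every sheet
file_instance = pd.ExcelFile('your_file.xlsx')

# pass file_instance (not the path) so the file is not re-opened for each sheet
main_df = pd.concat([pd.read_excel(file_instance, sheet_name=name) for name in file_instance.sheet_names], axis=0)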

2021-09-01 13:09:21

Other answers

Try pd.ExcelFile:

import pandas as pd

# open the workbook once, then read each sheet from the same object
xls = pd.ExcelFile('path_to_file.xls')
df1 = pd.read_excel(xls, 'Sheet1')
df2 = pd.read_excel(xls, 'Sheet2')

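As @HaPsantran points out, the entire Excel file is read in during the ExcelFile() call (there doesn't appear to be a way around this). It merely saves you from re-reading the same file every time you access a new sheet.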
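As a rough illustration of the pattern above, the sketch below opens a workbook once and pulls two sheets from the same ExcelFile object. The in-memory workbook, its sheet names, and its contents are made up for the demo, and it assumes openpyxl is installed as the .xlsx engine.

```python
import io

import pandas as pd

# Build a small two-sheet workbook in memory (hypothetical demo data).
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"a": [1, 2, 3]}).to_excel(writer, sheet_name="Sheet1", index=False)
    pd.DataFrame({"b": [10, 20]}).to_excel(writer, sheet_name="Sheet2", index=False)
buffer.seek(0)

# Open the workbook once, then read each sheet from the same ExcelFile object.
with pd.ExcelFile(buffer) as xls:
    df1 = pd.read_excel(xls, sheet_name="Sheet1")
    df2 = xls.parse("Sheet2")  # ExcelFile.parse is the method form of the same read

print(df1.shape, df2.shape)
```

Using the ExcelFile as a context manager also closes the underlying file handle once both sheets have been read.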

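Note that the sheet_name argument to pd.read_excel() can be the name of a sheet (as above), an integer giving the sheet position (e.g. 0, 1, etc.), a list of sheet names or indices, or None. If a list is provided, it returns a dictionary whose keys are the sheet names/indices and whose values are the DataFrames. The default is simply to return the first sheet (i.e. sheet_name=0).

If None is specified, all sheets are returned, as a {sheet_name: dataframe} dictionary.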
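To make the different sheet_name forms concrete, here is a minimal sketch. The three-sheet workbook and the names "first", "second" and "third" are invented for the example, and openpyxl is assumed to be available.

```python
import io

import pandas as pd

# Hypothetical three-sheet workbook, built in memory for the demo.
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    for name in ["first", "second", "third"]:
        pd.DataFrame({"sheet": [name]}).to_excel(writer, sheet_name=name, index=False)
buffer.seek(0)

with pd.ExcelFile(buffer) as xls:
    by_name = pd.read_excel(xls, sheet_name="second")      # a single DataFrame
    by_index = pd.read_excel(xls, sheet_name=0)            # a single DataFrame (first sheet)
    by_list = pd.read_excel(xls, sheet_name=["first", 2])  # dict keyed by the items requested
    everything = pd.read_excel(xls, sheet_name=None)       # dict keyed by sheet name

print(list(by_list))
print(list(everything))
```

For the question as asked, the list form is probably the most direct fit: one call returns just the two sheets that are needed.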

2014-10-23 05:16:38

df = pd.read_excel('FileName.xlsx', 'SheetName')

This reads the sheet SheetName from the file FileName.xlsx.

2021-06-27 10:32:51

pd.read_excel('filename.xlsx')

By default, this reads the first sheet of the workbook.

pd.read_excel('filename.xlsx', sheet_name = 'sheetname')

Reads a specific sheet of the workbook.

pd.read_excel('filename.xlsx', sheet_name = None)

Reads every worksheet in the Excel file into a dictionary (an OrderedDict in older pandas versions) in which each key is a sheet name and each value is that sheet's DataFrame.
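A short sketch of working with that dictionary follows. The sheet names "big" and "small" and their contents are made up, and openpyxl is assumed to be installed.

```python
import io

import pandas as pd

# Hypothetical workbook with one larger and one tiny sheet.
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"x": range(5)}).to_excel(writer, sheet_name="big", index=False)
    pd.DataFrame({"y": [1]}).to_excel(writer, sheet_name="small", index=False)
buffer.seek(0)

# sheet_name=None loads every sheet into a mapping of name -> DataFrame.
sheets = pd.read_excel(buffer, sheet_name=None)

# Iterate like any other dictionary.
for name, frame in sheets.items():
    print(name, frame.shape)

# Collect the number of data rows per sheet.
row_counts = {name: len(frame) for name, frame in sheets.items()}
```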

2019-08-01 17:01:23

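If you want multiple worksheets, but not all of them, and you want a single df as the output,

then you can pass a list of worksheet names, which you can populate manually: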

import pandas as pd

path = "C:\\Path\\To\\Your\\Data\\"
file = "data.xlsx"
sheet_lst_wanted = ["01_SomeName", "05_SomeName", "12_SomeName"]  # tab names from Excel

### import and compile data ###

# read all sheets in the list into a dictionary of {sheet name: DataFrame}
dict_temp = pd.read_excel(path + file, sheet_name=sheet_lst_wanted)

# concatenate the dictionary's DataFrames into a single DataFrame
df = pd.concat(dict_temp, axis=0, ignore_index=True)
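One thing ignore_index=True discards is which sheet each row came from. If that matters, a possible variation (a sketch, not part of the original answer) keeps the dictionary keys as a column. The workbook, sheet names, and the "value" and "sheet" column names below are assumptions for the demo, and openpyxl is assumed to be installed.

```python
import io

import pandas as pd

# Hypothetical workbook: two wanted sheets and one unwanted sheet.
buffer = io.BytesIO()
with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
    pd.DataFrame({"value": [1, 2]}).to_excel(writer, sheet_name="01_SomeName", index=False)
    pd.DataFrame({"value": [99]}).to_excel(writer, sheet_name="02_Other", index=False)
    pd.DataFrame({"value": [3]}).to_excel(writer, sheet_name="05_SomeName", index=False)
buffer.seek(0)

sheet_lst_wanted = ["01_SomeName", "05_SomeName"]
dict_temp = pd.read_excel(buffer, sheet_name=sheet_lst_wanted)

# pd.concat uses the dict keys as the outer index level; naming that level
# "sheet" and resetting it turns the source sheet name into a regular column.
df = (
    pd.concat(dict_temp, names=["sheet", "row"])
    .reset_index(level="sheet")
    .reset_index(drop=True)
)

print(df)
```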

If the worksheets you want share a naming convention that also distinguishes them from the ones you don't want, a little automation is possible:

# substitute the following block for the sheet_lst_wanted line in the block above

# string common to only the worksheets you want
str_like = "SomeName"

### create list of sheet names in Excel file ###
# pd.ExcelFile is used here because xlrd 2.0+ no longer opens .xlsx files
xls = pd.ExcelFile(path + file)
sheet_lst = xls.sheet_names

### create list of sheets meeting criteria ###
# note: this condition assumes the wanted sheet names end with str_like
sheet_lst_wanted = [s for s in sheet_lst if s.endswith(str_like)]

2020-08-17 21:32:42

You can also specify the sheet name as a parameter:

data_file = pd.read_excel('path_to_file.xls', sheet_name="sheet_name")

This will load only the sheet "sheet_name".

2017-02-11 19:37:17
