Having spent a decent amount of time watching both the r and pandas tags on SO, the impression that I get is that pandas questions are less likely to contain reproducible data. This is something that the R community has been pretty good about encouraging, and thanks to guides like this, newcomers are able to get some help on putting together these examples. People who are able to read these guides and come back with reproducible data will often have much better luck getting answers to their questions.
How can we create good reproducible examples for pandas questions? Simple dataframes can be put together, e.g.:
import pandas as pd
df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'],
                   'income': [40000, 50000, 42000]})
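But many example datasets need more complicated structure, e.g.:
- datetime indices or data
- multiple categorical variables (is there an equivalent to R's expand.grid() function, which produces all possible combinations of some given variables?)
- MultiIndex or Panel data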
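To make those three cases concrete, here is a minimal sketch of the kind of data I mean. The variable names, dates, and category values are made up for illustration, and using itertools.product and pd.MultiIndex.from_product for the combinations is only one possible approach, not necessarily the closest match to expand.grid().

```python
import itertools
import pandas as pd

# Datetime index: five consecutive daily timestamps (the start date is arbitrary).
dates = pd.date_range("2024-01-01", periods=5, freq="D")
ts = pd.DataFrame({"value": range(5)}, index=dates)

# Every combination of two categorical variables, similar in spirit to expand.grid().
groups = ["a", "b"]
regions = ["north", "south", "east"]
grid = pd.DataFrame(list(itertools.product(groups, regions)),
                    columns=["group", "region"])

# The same combinations used as a two-level MultiIndex.
idx = pd.MultiIndex.from_product([groups, regions], names=["group", "region"])
multi = pd.DataFrame({"score": range(len(idx))}, index=idx)
```

Here grid has one row per (group, region) pair, and multi carries the same pairs in its index instead of in columns.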
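For datasets that are hard to mock up using a few lines of code, is there an equivalent to R's dput() that allows you to generate copy-pasteable code to regenerate your data structure?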
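The closest thing I can sketch is a round trip through DataFrame.to_dict(). This assumes a small frame with a default integer index and plain Python values; it is not a true dput() equivalent, since a custom index or non-default dtypes would not necessarily survive.

```python
import pandas as pd

df = pd.DataFrame({'user': ['Bob', 'Jane', 'Alice'],
                   'income': [40000, 50000, 42000]})

# Dump the columns as a plain dict of lists and wrap it in a constructor call,
# giving a line of text that can be pasted into a question.
snippet = repr(df.to_dict(orient="list"))
print(f"pd.DataFrame({snippet})")

# Rebuilding from the same dict should reproduce this simple frame.
rebuilt = pd.DataFrame(df.to_dict(orient="list"))
```

This works for the toy frame above, but I would not expect it to handle datetime or MultiIndex data without extra steps.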