我已经用python编程两年了;主要是数据(pandas, mpl, numpy),但也有自动化脚本和小型web应用程序。我正在努力成为一个更好的程序员,增加我的python知识,其中一件让我困扰的事情是我从来没有使用过一个类(除了为小型web应用程序复制随机flask代码)。我基本上理解它们是什么,但我似乎无法理解为什么我需要它们而不是一个简单的功能。

为我的问题补充一点:我写了大量的自动化报告,这些报告总是涉及从多个数据源(mongo, sql, postgres, api)提取数据,执行大量或少量的数据修改和格式化,将数据写入csv/excel/html,并通过电子邮件发送出去。脚本的范围从~250行到~600行。是否有任何理由让我使用类来做到这一点,为什么?


当前回答

我觉得你做得对。当您需要模拟一些业务逻辑或具有困难关系的困难的实际流程时,类是合理的。 为例:

几个具有共享状态的函数 相同状态变量的多个副本 扩展现有功能的行为

我也建议你看一下这个经典视频

其他回答

类是面向对象编程的支柱。OOP高度关注代码组织、可重用性和封装性。

首先,免责声明:OOP在一定程度上与函数式编程相反,后者是Python中经常使用的不同范式。不是每个用Python(或者大多数语言)编程的人都使用OOP。你可以在Java 8中做很多不是非常面向对象的事情。如果你不想使用OOP,那就不要。如果您只是编写一次性脚本来处理永远不会再使用的数据,那么请继续按照您现在的方式编写。

然而,使用OOP有很多原因。

一些原因:

Organization: OOP defines well known and standard ways of describing and defining both data and procedure in code. Both data and procedure can be stored at varying levels of definition (in different classes), and there are standard ways about talking about these definitions. That is, if you use OOP in a standard way, it will help your later self and others understand, edit, and use your code. Also, instead of using a complex, arbitrary data storage mechanism (dicts of dicts or lists or dicts or lists of dicts of sets, or whatever), you can name pieces of data structures and conveniently refer to them. State: OOP helps you define and keep track of state. For instance, in a classic example, if you're creating a program that processes students (for instance, a grade program), you can keep all the info you need about them in one spot (name, age, gender, grade level, courses, grades, teachers, peers, diet, special needs, etc.), and this data is persisted as long as the object is alive, and is easily accessible. In contrast, in pure functional programming, state is never mutated in place. Encapsulation: With encapsulation, procedure and data are stored together. Methods (an OOP term for functions) are defined right alongside the data that they operate on and produce. In a language like Java that allows for access control, or in Python, depending upon how you describe your public API, this means that methods and data can be hidden from the user. What this means is that if you need or want to change code, you can do whatever you want to the implementation of the code, but keep the public APIs the same. Inheritance: Inheritance allows you to define data and procedure in one place (in one class), and then override or extend that functionality later. For instance, in Python, I often see people creating subclasses of the dict class in order to add additional functionality. A common change is overriding the method that throws an exception when a key is requested from a dictionary that doesn't exist to give a default value based on an unknown key. This allows you to extend your own code now or later, allow others to extend your code, and allows you to extend other people's code. Reusability: All of these reasons and others allow for greater reusability of code. Object oriented code allows you to write solid (tested) code once, and then reuse over and over. If you need to tweak something for your specific use case, you can inherit from an existing class and overwrite the existing behavior. If you need to change something, you can change it all while maintaining the existing public method signatures, and no one is the wiser (hopefully).

同样,有几个不使用OOP的原因,您也不需要这样做。但幸运的是,对于像Python这样的语言,您可以使用少量或大量,这取决于您。

一个学生用例的例子(不保证代码质量,只是一个例子):

面向对象的

class Student(object):
    def __init__(self, name, age, gender, level, grades=None):
        self.name = name
        self.age = age
        self.gender = gender
        self.level = level
        self.grades = grades or {}

    def setGrade(self, course, grade):
        self.grades[course] = grade

    def getGrade(self, course):
        return self.grades[course]

    def getGPA(self):
        return sum(self.grades.values())/len(self.grades)

# Define some students
john = Student("John", 12, "male", 6, {"math":3.3})
jane = Student("Jane", 12, "female", 6, {"math":3.5})

# Now we can get to the grades easily
print(john.getGPA())
print(jane.getGPA())

标准的东西

def calculateGPA(gradeDict):
    return sum(gradeDict.values())/len(gradeDict)

students = {}
# We can set the keys to variables so we might minimize typos
name, age, gender, level, grades = "name", "age", "gender", "level", "grades"
john, jane = "john", "jane"
math = "math"
students[john] = {}
students[john][age] = 12
students[john][gender] = "male"
students[john][level] = 6
students[john][grades] = {math:3.3}

students[jane] = {}
students[jane][age] = 12
students[jane][gender] = "female"
students[jane][level] = 6
students[jane][grades] = {math:3.5}

# At this point, we need to remember who the students are and where the grades are stored. Not a huge deal, but avoided by OOP.
print(calculateGPA(students[john][grades]))
print(calculateGPA(students[jane][grades]))

当你需要维护你的函数的状态,它不能用生成器完成(函数产生而不是返回)。生成器维护自己的状态。

如果要重写任何标准操作符,则需要一个类。

无论何时需要使用Visitor模式,都需要类。任何其他设计模式都可以通过生成器、上下文管理器(作为生成器实现比作为类更好)和POD类型(字典、列表和元组等)更有效、更干净地完成。

如果你想写“python”代码,你应该更喜欢上下文管理器和生成器而不是类。这样会更干净。

如果您想扩展功能,几乎总是可以通过包含而不是继承来实现。

就像所有规则一样,这也有例外。如果您想快速封装功能(例如,编写测试代码而不是库级别的可重用代码),您可以在类中封装状态。它很简单,不需要重用。

如果你需要一个c++风格的析构函数(RIIA),你肯定不想使用类。你需要上下文管理器。

我觉得你做得对。当您需要模拟一些业务逻辑或具有困难关系的困难的实际流程时,类是合理的。 为例:

几个具有共享状态的函数 相同状态变量的多个副本 扩展现有功能的行为

我也建议你看一下这个经典视频

类定义了一个真实的实体。如果您正在处理一些单独存在的东西,并且有自己的逻辑,与其他逻辑是分开的,那么您应该为它创建一个类。例如,封装数据库连接的类。

如果不是这样,就不需要创建类

dantiston给出了一个很好的答案,为什么面向对象编程是有用的。但是,值得注意的是,OOP在大多数情况下并不是更好的选择。面向对象编程的优点是将数据和方法结合在一起。就应用程序而言,我认为只有在所有函数/方法都只处理特定数据集而不处理其他数据时才使用OOP。

考虑dentiston示例的函数式编程重构:

def dictMean( nums ):
    return sum(nums.values())/len(nums)
# It's good to include automatic tests for production code, to ensure that updates don't break old codes
assert( dictMean({'math':3.3,'science':3.5})==3.4 )

john = {'name':'John', 'age':12, 'gender':'male', 'level':6, 'grades':{'math':3.3}}

# setGrade
john['grades']['science']=3.5

# getGrade
print(john['grades']['math'])

# getGPA
print(dictMean(john['grades']))

乍一看,这3个方法似乎都专门处理GPA,直到您意识到学生. getgpa()可以被推广为一个函数来计算dict的平均值,并重用于其他问题,而其他2个方法重新定义了dict已经可以做的事情。

函数实现获得:

简单。没有样板类或self。 在每个测试之后轻松添加自动测试代码 功能方便维护。 随着代码的扩展,很容易分成几个程序。 可重用性用于计算GPA以外的目的。

函数实现丢失:

每次都在字典键中输入“姓名”、“年龄”、“性别”不是很DRY(不要重复)。可以通过将dict改为list来避免这种情况。当然,列表不如字典清晰,但如果您在下面包含自动测试代码,这就不是问题。

本例没有涵盖的问题:

OOP继承可以用函数回调代替。 调用OOP类必须首先创建它的实例。当你在__init__(self)中没有数据时,这可能会很无聊。