如何使一个Python类序列化?

class FileItem:
    def __init__(self, fname):
        self.fname = fname

尝试序列化为JSON:

>>> import json
>>> x = FileItem('/foo/bar')
>>> json.dumps(x)
TypeError: Object of type 'FileItem' is not JSON serializable

当前回答

我选择使用装饰器来解决datetime对象序列化问题。 这是我的代码:

#myjson.py
#Author: jmooremcc 7/16/2017

import json
from datetime import datetime, date, time, timedelta
"""
This module uses decorators to serialize date objects using json
The filename is myjson.py
In another module you simply add the following import statement:
    from myjson import json

json.dumps and json.dump will then correctly serialize datetime and date 
objects
"""

def json_serial(obj):
    """JSON serializer for objects not serializable by default json code"""

    if isinstance(obj, (datetime, date)):
        serial = str(obj)
        return serial
    raise TypeError ("Type %s not serializable" % type(obj))


def FixDumps(fn):
    def hook(obj):
        return fn(obj, default=json_serial)

    return hook

def FixDump(fn):
    def hook(obj, fp):
        return fn(obj,fp, default=json_serial)

    return hook


json.dumps=FixDumps(json.dumps)
json.dump=FixDump(json.dump)


if __name__=="__main__":
    today=datetime.now()
    data={'atime':today, 'greet':'Hello'}
    str=json.dumps(data)
    print str

通过导入上述模块,我的其他模块以正常的方式(没有指定默认关键字)使用json来序列化包含日期时间对象的数据。datetime序列化器代码会自动为json调用。Dumps和json.dump。

其他回答

我最喜欢Lost Koder的方法。当我试图序列化成员/方法不可序列化的更复杂的对象时,我遇到了问题。这是我的实现,工作在更多的对象:

class Serializer(object):
    @staticmethod
    def serialize(obj):
        def check(o):
            for k, v in o.__dict__.items():
                try:
                    _ = json.dumps(v)
                    o.__dict__[k] = v
                except TypeError:
                    o.__dict__[k] = str(v)
            return o
        return json.dumps(check(obj).__dict__, indent=2)

正如在许多其他答案中提到的,您可以将函数传递给json。转储将不是默认支持的类型之一的对象转换为受支持的类型。令人惊讶的是,他们都没有提到最简单的情况,即使用内置函数vars将对象转换为包含其所有属性的dict:

json.dumps(obj, default=vars)

注意,这只涵盖了基本的情况,如果你需要对某些类型进行更具体的序列化(例如排除某些属性或没有__dict__属性的对象),你需要使用自定义函数或JSONEncoder,如其他答案中所述。

Kyle Delaney的评论是正确的,所以我尝试使用https://stackoverflow.com/a/15538391/1497139以及https://stackoverflow.com/a/10254820/1497139的改进版本

创建一个“JSONAble”mixin。

因此,要使一个类JSON可序列化使用“JSONAble”作为超类,并调用:

 instance.toJSON()

or

 instance.asJSON()

对于这两种方法。您还可以使用本文提供的其他方法扩展JSONAble类。

家庭和个人单元测试样本的测试示例结果如下:

toJSOn ():

{
    "members": {
        "Flintstone,Fred": {
            "firstName": "Fred",
            "lastName": "Flintstone"
        },
        "Flintstone,Wilma": {
            "firstName": "Wilma",
            "lastName": "Flintstone"
        }
    },
    "name": "The Flintstones"
}

asJSOn ():

{'name': 'The Flintstones', 'members': {'Flintstone,Fred': {'firstName': 'Fred', 'lastName': 'Flintstone'}, 'Flintstone,Wilma': {'firstName': 'Wilma', 'lastName': 'Flintstone'}}}

使用家庭和个人样本进行单元测试

def testJsonAble(self):
        family=Family("The Flintstones")
        family.add(Person("Fred","Flintstone")) 
        family.add(Person("Wilma","Flintstone"))
        json1=family.toJSON()
        json2=family.asJSON()
        print(json1)
        print(json2)

class Family(JSONAble):
    def __init__(self,name):
        self.name=name
        self.members={}
    
    def add(self,person):
        self.members[person.lastName+","+person.firstName]=person

class Person(JSONAble):
    def __init__(self,firstName,lastName):
        self.firstName=firstName;
        self.lastName=lastName;

JSONAble .py定义JSONAble mixin

 '''
Created on 2020-09-03

@author: wf
'''
import json

class JSONAble(object):
    '''
    mixin to allow classes to be JSON serializable see
    https://stackoverflow.com/questions/3768895/how-to-make-a-class-json-serializable
    '''

    def __init__(self):
        '''
        Constructor
        '''
    
    def toJSON(self):
        return json.dumps(self, default=lambda o: o.__dict__, 
            sort_keys=True, indent=4)
        
    def getValue(self,v):
        if (hasattr(v, "asJSON")):
            return v.asJSON()
        elif type(v) is dict:
            return self.reprDict(v)
        elif type(v) is list:
            vlist=[]
            for vitem in v:
                vlist.append(self.getValue(vitem))
            return vlist
        else:   
            return v
    
    def reprDict(self,srcDict):
        '''
        get my dict elements
        '''
        d = dict()
        for a, v in srcDict.items():
            d[a]=self.getValue(v)
        return d
    
    def asJSON(self):
        '''
        recursively return my dict elements
        '''
        return self.reprDict(self.__dict__)   

您将发现这些方法现在集成在https://github.com/WolfgangFahl/pyLoDStorage项目中,该项目可在https://pypi.org/project/pylodstorage/上获得

为了在10年前的火灾中再添加一个日志,我还将为这个任务提供数据类向导,假设您使用的是Python 3.6+。这可以很好地用于数据类,这实际上是3.7+版本的python内置模块。

dataclass-wizard库将把对象(及其所有属性递归地)转换为dict,并使用fromdict使反向(反序列化)非常简单。另外,这里是PyPi链接:https://pypi.org/project/dataclass-wizard/。

import dataclass_wizard
import dataclasses

@dataclasses.dataclass
class A:
    hello: str
    a_field: int

obj = A('world', 123)
a_dict = dataclass_wizard.asdict(obj)
# {'hello': 'world', 'aField': 123}

或者如果你想要一个字符串:

a_str = jsons.dumps(dataclass_wizard.asdict(obj))

或者您的类是否从dataclass_wizard扩展。JSONWizard:

a_str = your_object.to_json()

最后,标准库还支持Union类型的数据类,这基本上意味着可以将dict反序列化为类C1或C2的对象。例如:

from dataclasses import dataclass

from dataclass_wizard import JSONWizard

@dataclass
class Outer(JSONWizard):

    class _(JSONWizard.Meta):
        tag_key = 'tag'
        auto_assign_tags = True

    my_string: str
    inner: 'A | B'  # alternate syntax: `inner: typing.Union['A', 'B']`

@dataclass
class A:
    my_field: int

@dataclass
class B:
    my_field: str


my_dict = {'myString': 'test', 'inner': {'tag': 'B', 'myField': 'test'}}
obj = Outer.from_dict(my_dict)

# True
assert repr(obj) == "Outer(my_string='test', inner=B(my_field='test'))"

obj.to_json()
# {"myString": "test", "inner": {"myField": "test", "tag": "B"}}

我们经常在日志文件中转储JSON格式的复杂字典。虽然大多数字段携带重要信息,但我们不太关心内置的类对象(例如子进程)。Popen对象)。由于存在这些不可序列化的对象,对json.dumps()的调用会失败。

为了解决这个问题,我构建了一个小函数来转储对象的字符串表示形式,而不是转储对象本身。如果您正在处理的数据结构嵌套太多,您可以指定嵌套的最大级别/深度。

from time import time

def safe_serialize(obj , max_depth = 2):

    max_level = max_depth

    def _safe_serialize(obj , current_level = 0):

        nonlocal max_level

        # If it is a list
        if isinstance(obj , list):

            if current_level >= max_level:
                return "[...]"

            result = list()
            for element in obj:
                result.append(_safe_serialize(element , current_level + 1))
            return result

        # If it is a dict
        elif isinstance(obj , dict):

            if current_level >= max_level:
                return "{...}"

            result = dict()
            for key , value in obj.items():
                result[f"{_safe_serialize(key , current_level + 1)}"] = _safe_serialize(value , current_level + 1)
            return result

        # If it is an object of builtin class
        elif hasattr(obj , "__dict__"):
            if hasattr(obj , "__repr__"):
                result = f"{obj.__repr__()}_{int(time())}"
            else:
                try:
                    result = f"{obj.__class__.__name__}_object_{int(time())}"
                except:
                    result = f"object_{int(time())}"
            return result

        # If it is anything else
        else:
            return obj

    return _safe_serialize(obj)

由于字典也可以有不可序列化的键,转储它们的类名或对象表示将导致所有键都具有相同的名称,这将抛出错误,因为所有键都需要有唯一的名称,这就是为什么当前时间Since epoch被int(time())附加到对象名称。

可以使用以下具有不同级别/深度的嵌套字典来测试该函数

d = {
    "a" : {
        "a1" : {
            "a11" : {
                "a111" : "some_value" ,
                "a112" : "some_value" ,
            } ,
            "a12" : {
                "a121" : "some_value" ,
                "a122" : "some_value" ,
            } ,
        } ,
        "a2" : {
            "a21" : {
                "a211" : "some_value" ,
                "a212" : "some_value" ,
            } ,
            "a22" : {
                "a221" : "some_value" ,
                "a222" : "some_value" ,
            } ,
        } ,
    } ,
    "b" : {
        "b1" : {
            "b11" : {
                "b111" : "some_value" ,
                "b112" : "some_value" ,
            } ,
            "b12" : {
                "b121" : "some_value" ,
                "b122" : "some_value" ,
            } ,
        } ,
        "b2" : {
            "b21" : {
                "b211" : "some_value" ,
                "b212" : "some_value" ,
            } ,
            "b22" : {
                "b221" : "some_value" ,
                "b222" : "some_value" ,
            } ,
        } ,
    } ,
    "c" : subprocess.Popen("ls -l".split() , stdout = subprocess.PIPE , stderr = subprocess.PIPE) ,
}

执行以下命令将会得到-

print("LEVEL 3")
print(json.dumps(safe_serialize(d , 3) , indent = 4))

print("\n\n\nLEVEL 2")
print(json.dumps(safe_serialize(d , 2) , indent = 4))

print("\n\n\nLEVEL 1")
print(json.dumps(safe_serialize(d , 1) , indent = 4))

结果:

LEVEL 3
{
    "a": {
        "a1": {
            "a11": "{...}",
            "a12": "{...}"
        },
        "a2": {
            "a21": "{...}",
            "a22": "{...}"
        }
    },
    "b": {
        "b1": {
            "b11": "{...}",
            "b12": "{...}"
        },
        "b2": {
            "b21": "{...}",
            "b22": "{...}"
        }
    },
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}



LEVEL 2
{
    "a": {
        "a1": "{...}",
        "a2": "{...}"
    },
    "b": {
        "b1": "{...}",
        "b2": "{...}"
    },
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}



LEVEL 1
{
    "a": "{...}",
    "b": "{...}",
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}

[注意]:仅在不关心内置类对象的序列化时使用此选项。