How do I make a Python class serializable?
class FileItem:
    def __init__(self, fname):
        self.fname = fname
Attempting to serialize it to JSON:
>>> import json
>>> x = FileItem('/foo/bar')
>>> json.dumps(x)
TypeError: Object of type 'FileItem' is not JSON serializable
Current answer
jsonweb seems to be the best solution for me. See http://www.jsonweb.info/en/latest/
from jsonweb.encode import to_object, dumper

@to_object()
class DataModel(object):
    def __init__(self, id, value):
        self.id = id
        self.value = value

>>> data = DataModel(5, "foo")
>>> dumper(data)
'{"__type__": "DataModel", "id": 5, "value": "foo"}'
Other answers
Do you have an idea of the expected output? For example, will this do?
>>> f = FileItem("/foo/bar")
>>> magic(f)
'{"fname": "/foo/bar"}'
In that case you can simply call json.dumps(f.__dict__).
If you want more customized output, you will have to subclass JSONEncoder and implement your own custom serialization.
For a trivial example, see below.
>>> from json import JSONEncoder
>>> class MyEncoder(JSONEncoder):
        def default(self, o):
            return o.__dict__
>>> MyEncoder().encode(f)
'{"fname": "/foo/bar"}'
Then you pass this class to json.dumps() as the cls kwarg:
json.dumps(f, cls=MyEncoder)
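Putting the pieces above together, a minimal self-contained sketch might look like the following. The fallback to super().default() for objects without a __dict__ is my own addition rather than part of the answer; it keeps the usual TypeError for types that really cannot be serialized.

```python
import json
from json import JSONEncoder


class FileItem:
    def __init__(self, fname):
        self.fname = fname


class MyEncoder(JSONEncoder):
    def default(self, o):
        # default() is only called for objects json cannot encode natively.
        if hasattr(o, "__dict__"):
            return o.__dict__
        # Assumption: anything else should still raise the standard TypeError.
        return super().default(o)


f = FileItem("/foo/bar")
encoded = json.dumps(f, cls=MyEncoder)
print(encoded)  # {"fname": "/foo/bar"}
```

Note that the object to serialize is always the first argument; cls only selects the encoder class.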
If you also want to decode, you will have to supply a custom object_hook to the JSONDecoder class. For example:
>>> from json import JSONDecoder
>>> def from_json(json_object):
        if 'fname' in json_object:
            return FileItem(json_object['fname'])
>>> f = JSONDecoder(object_hook=from_json).decode('{"fname": "/foo/bar"}')
>>> f
<__main__.FileItem object at 0x9337fac>
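For a round trip, the same hook can also be passed straight to json.loads. The sketch below is only an illustration; returning non-matching dicts unchanged is an assumption of mine, since a hook that returns nothing would replace every other JSON object with None.

```python
import json


class FileItem:
    def __init__(self, fname):
        self.fname = fname


def from_json(json_object):
    # object_hook is called with every decoded JSON object (a dict).
    if "fname" in json_object:
        return FileItem(json_object["fname"])
    # Assumption: dicts that do not look like a FileItem pass through as-is.
    return json_object


text = json.dumps(FileItem("/foo/bar").__dict__)
restored = json.loads(text, object_hook=from_json)
print(type(restored).__name__, restored.fname)  # FileItem /foo/bar

other = json.loads('{"size": 3}', object_hook=from_json)
print(other)  # {'size': 3}
```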
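We often dump complex dictionaries in JSON format to log files. While most of the fields carry important information, we do not care much about built-in class objects (for example, a subprocess.Popen object). Because of these non-serializable objects, the call to json.dumps() fails.
To get around this, I built a small function that dumps the object's string representation instead of the object itself. If the data structure you are dealing with is too deeply nested, you can specify the maximum nesting level/depth.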
from time import time

def safe_serialize(obj , max_depth = 2):
    max_level = max_depth

    def _safe_serialize(obj , current_level = 0):
        nonlocal max_level

        # If it is a list
        if isinstance(obj , list):
            if current_level >= max_level:
                return "[...]"
            result = list()
            for element in obj:
                result.append(_safe_serialize(element , current_level + 1))
            return result

        # If it is a dict
        elif isinstance(obj , dict):
            if current_level >= max_level:
                return "{...}"
            result = dict()
            for key , value in obj.items():
                result[f"{_safe_serialize(key , current_level + 1)}"] = _safe_serialize(value , current_level + 1)
            return result

        # If it is an object of builtin class
        elif hasattr(obj , "__dict__"):
            if hasattr(obj , "__repr__"):
                result = f"{obj.__repr__()}_{int(time())}"
            else:
                try:
                    result = f"{obj.__class__.__name__}_object_{int(time())}"
                except:
                    result = f"object_{int(time())}"
            return result

        # If it is anything else
        else:
            return obj

    return _safe_serialize(obj)
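Since dicts can also have non-serializable keys, dumping only their class name or object representation could give several keys the same name, and keys in a dict must be unique. That is why the current time since the epoch, int(time()), is appended to the object's name.
The function can be tested with the following nested dictionary, which has several levels of depth: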
import subprocess

d = {
    "a" : {
        "a1" : {
            "a11" : {
                "a111" : "some_value" ,
                "a112" : "some_value" ,
            } ,
            "a12" : {
                "a121" : "some_value" ,
                "a122" : "some_value" ,
            } ,
        } ,
        "a2" : {
            "a21" : {
                "a211" : "some_value" ,
                "a212" : "some_value" ,
            } ,
            "a22" : {
                "a221" : "some_value" ,
                "a222" : "some_value" ,
            } ,
        } ,
    } ,
    "b" : {
        "b1" : {
            "b11" : {
                "b111" : "some_value" ,
                "b112" : "some_value" ,
            } ,
            "b12" : {
                "b121" : "some_value" ,
                "b122" : "some_value" ,
            } ,
        } ,
        "b2" : {
            "b21" : {
                "b211" : "some_value" ,
                "b212" : "some_value" ,
            } ,
            "b22" : {
                "b221" : "some_value" ,
                "b222" : "some_value" ,
            } ,
        } ,
    } ,
    "c" : subprocess.Popen("ls -l".split() , stdout = subprocess.PIPE , stderr = subprocess.PIPE) ,
}
Run the following:
print("LEVEL 3")
print(json.dumps(safe_serialize(d , 3) , indent = 4))
print("\n\n\nLEVEL 2")
print(json.dumps(safe_serialize(d , 2) , indent = 4))
print("\n\n\nLEVEL 1")
print(json.dumps(safe_serialize(d , 1) , indent = 4))
Result:
LEVEL 3
{
    "a": {
        "a1": {
            "a11": "{...}",
            "a12": "{...}"
        },
        "a2": {
            "a21": "{...}",
            "a22": "{...}"
        }
    },
    "b": {
        "b1": {
            "b11": "{...}",
            "b12": "{...}"
        },
        "b2": {
            "b21": "{...}",
            "b22": "{...}"
        }
    },
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}

LEVEL 2
{
    "a": {
        "a1": "{...}",
        "a2": "{...}"
    },
    "b": {
        "b1": "{...}",
        "b2": "{...}"
    },
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}

LEVEL 1
{
    "a": "{...}",
    "b": "{...}",
    "c": "<Popen: returncode: None args: ['ls', '-l']>"
}
[Note]: Use this only if you do not care about the serialization of built-in class objects.
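If depth limiting is not needed, a lighter variation on the same idea is to let json.dumps fall back to a string representation through its default parameter. This is a sketch of an alternative, not the function from this answer; the Unserializable class is a hypothetical stand-in for something like a Popen object.

```python
import json


class Unserializable:
    """Hypothetical stand-in for an object json cannot encode."""

    def __repr__(self):
        return "<Unserializable demo>"


record = {"name": "job-1", "handle": Unserializable(), "tags": ["a", "b"]}

# default= is called only for values json cannot encode natively;
# here every such value is replaced by its repr() string.
dumped = json.dumps(record, default=repr)
print(dumped)
# {"name": "job-1", "handle": "<Unserializable demo>", "tags": ["a", "b"]}
```

The default hook is applied to values only. A dictionary key that is not a str, int, float, bool or None still raises TypeError, so this shortcut does not cover the non-serializable keys handled by the longer function.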
For more complex classes, you could consider the jsonpickle tool:
jsonpickle is a Python library for serialization and deserialization of complex Python objects to and from JSON. The standard Python libraries for encoding Python into JSON, such as the stdlib’s json, simplejson, and demjson, can only handle Python primitives that have a direct JSON equivalent (e.g. dicts, lists, strings, ints, etc.). jsonpickle builds on top of these libraries and allows more complex data structures to be serialized to JSON. jsonpickle is highly configurable and extendable–allowing the user to choose the JSON backend and add additional backends.
(link to jsonpickle on PyPI)
This class can do the trick; it converts an object to standard JSON.
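import json

class Serializer(object):
    @staticmethod
    def serialize(obj):
        # For objects json cannot encode, fall back to their attribute dict.
        # (Indexing o.__dict__.values() fails on Python 3 and keeps only one attribute.)
        return json.dumps(obj, default=lambda o: o.__dict__)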
Usage:
Serializer.serialize(my_object)
Works in Python 2.7 and Python 3.
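A commonly used variant of this idea hands each unknown object's attribute dict to json.dumps through default=. Because the hook is invoked again for every nested object, it also handles objects that contain other objects. The sketch below assumes Python 3; File, Folder and to_json are hypothetical names used only for illustration.

```python
import json


class File:
    def __init__(self, fname):
        self.fname = fname


class Folder:
    def __init__(self, name, files):
        self.name = name
        self.files = files


def to_json(obj):
    # vars(o) returns o.__dict__; json calls this hook for each object
    # it cannot encode, including ones nested inside lists or other objects.
    return json.dumps(obj, default=vars)


folder = Folder("docs", [File("a.txt"), File("b.txt")])
result = to_json(folder)
print(result)
# {"name": "docs", "files": [{"fname": "a.txt"}, {"fname": "b.txt"}]}
```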
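To throw another log on this 11-year-old fire, I want a solution that meets the following criteria:
- Allows an instance of class FileItem to be serialized using only json.dumps(obj)
- Allows FileItem instances to have properties: fileItem.fname
- Allows FileItem instances to be given to any library that will serialize them using json.dumps(obj)
- Does not require any other arguments to be passed to json.dumps (such as a custom serializer)
That is: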
fileItem = FileItem('filename.ext')
assert json.dumps(fileItem) == '{"fname": "filename.ext"}'
assert fileItem.fname == 'filename.ext'
My solution is:
- Have obj's class inherit from dict
- Map each object property to the underlying dict
class FileItem(dict):
    def __init__(self, fname):
        self['fname'] = fname

    #fname property
    fname: str = property()

    @fname.getter
    def fname(self):
        return self['fname']

    @fname.setter
    def fname(self, value: str):
        self['fname'] = value

    #Repeat for other properties
Yes, this is somewhat long-winded if you have lots of properties, but it is JSON serializable, it behaves like an object, and you can give it to any library that is going to json.dumps(obj) it.
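If writing one property per field feels too verbose, one possible shortcut is a small base class that routes attribute access to the underlying dict. This is a sketch of a variation rather than the approach in this answer, and AttrDict is a hypothetical name.

```python
import json


class AttrDict(dict):
    """Hypothetical base class: attribute access backed by dict items."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # Store every attribute as a dict item so json.dumps can see it.
        self[name] = value


class FileItem(AttrDict):
    def __init__(self, fname):
        super().__init__()
        self.fname = fname


item = FileItem("filename.ext")
print(json.dumps(item))  # {"fname": "filename.ext"}
item.fname = "other.ext"
print(item.fname)        # other.ext
```

The trade-off is that field names which clash with dict methods (keys, items, update and so on) cannot be read back as attributes, because normal lookup finds the method first. Explicit properties avoid that.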