Python Json.load Missing Array Hook And Parse Callbacks?
Solution 1:
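Actually, json.loads calls its scanner to parse the input string, and every parsing step can be hooked by rebuilding the scanner in your customized decoder class. The code uses py_make_scanner, the pure-Python scanner, because the C-accelerated scanner does not consult parse_array.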
import json
from json.scanner import py_make_scanner
from json.decoder import JSONArray


class CustomizedDecoder(json.JSONDecoder):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

        def parse_array(*_args, **_kwargs):
            # Delegate to the stock array parser, then inspect the result.
            values, end = JSONArray(*_args, **_kwargs)
            for item in values:
                print(item)  # Is it what you want?
            return values, end

        # Install the wrapper, then rebuild the scanner so it picks it up.
        self.parse_array = parse_array
        self.scan_once = py_make_scanner(self)


json.loads('{"a": [1, [2, [3.1], ["4"]]]}', cls=CustomizedDecoder)
This outputs:
3.1
4
2
[3.1]
['4']
1
[2, [3.1], ['4']]
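The same trick can be turned into something closer to the "array hook" the question asks for. This is only a minimal sketch: `ArrayHookDecoder` and its `array_hook` keyword are hypothetical names, not part of the json module, and `JSONArray` and `py_make_scanner` are undocumented internals that may change between Python versions.

```python
import json
from json.decoder import JSONArray
from json.scanner import py_make_scanner


class ArrayHookDecoder(json.JSONDecoder):
    """Decoder with a hypothetical array_hook, mirroring object_hook."""

    def __init__(self, *args, array_hook=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.array_hook = array_hook

        def parse_array(*p_args, **p_kwargs):
            # Parse the array with the stock parser, then let the hook
            # replace the resulting list.
            values, end = JSONArray(*p_args, **p_kwargs)
            if self.array_hook is not None:
                values = self.array_hook(values)
            return values, end

        # The pure-Python scanner reads parse_array when it is built,
        # so it has to be rebuilt after the attribute is replaced.
        self.parse_array = parse_array
        self.scan_once = py_make_scanner(self)


result = json.loads(
    '{"a": [1, [2, [3.1], ["4"]]]}',
    cls=ArrayHookDecoder,
    array_hook=tuple,
)
print(result)
```

json.loads forwards unknown keyword arguments to the decoder class, which is why `array_hook=tuple` reaches `__init__`. Inner arrays are finished first, so the hook sees them before their parents.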
Moreover, there are several other functions you can hook by doing exactly the same thing. These are the attributes set in JSONDecoder.__init__:
self.object_hook = object_hook
self.parse_float = parse_float or float
self.parse_int = parse_int or int
self.parse_constant = parse_constant or _CONSTANTS.__getitem__
self.object_pairs_hook = object_pairs_hook
self.parse_object = JSONObject
self.parse_array = JSONArray
self.parse_string = scanstring
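Some of the attributes listed above (object_hook, parse_float, parse_int, parse_constant, object_pairs_hook) are also documented keyword arguments of json.loads, so they need no subclass at all. A small sketch of that route; the `record_pairs` helper and the sample document are made up for illustration:

```python
import json
from decimal import Decimal

pairs_seen = []


def record_pairs(pairs):
    # object_pairs_hook receives each object as a list of (key, value) tuples.
    pairs_seen.append(pairs)
    return dict(pairs)


doc = json.loads(
    '{"price": 1.10, "limit": Infinity}',
    parse_float=Decimal,               # called with the float's source text
    parse_constant=lambda name: None,  # called for NaN, Infinity, -Infinity
    object_pairs_hook=record_pairs,
)
print(doc)
print(pairs_seen)
```

Only parse_object, parse_array and parse_string lack such a keyword, which is where rebuilding the scanner comes in.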
Solution 2:
If you are willing to accept somewhat slower parsing, you can use the ruamel.yaml parser for this (disclaimer: I am the author of that package). As YAML 1.2 is a superset of JSON for all practical purposes, you can subclass the Constructor:
import sys
from ruamel.yaml import YAML, SafeConstructor

json_str = '{"a": [1, [2.0, True, [3, null]]]}'


class MyConstructor(SafeConstructor):
    def construct_yaml_null(self, node):
        print('null')
        data = SafeConstructor.construct_yaml_null(self, node)
        return data

    def construct_yaml_bool(self, node):
        print('bool')
        data = SafeConstructor.construct_yaml_bool(self, node)
        return data

    def construct_yaml_int(self, node):
        print('int')
        data = SafeConstructor.construct_yaml_int(self, node)
        return data

    def construct_yaml_float(self, node):
        print('float')
        data = SafeConstructor.construct_yaml_float(self, node)
        return data

    def construct_yaml_str(self, node):
        print('str')
        data = SafeConstructor.construct_yaml_str(self, node)
        return data

    def construct_yaml_seq(self, node):
        print('seq')
        # The base method is a generator; exhaust it so the list is complete.
        for data in SafeConstructor.construct_yaml_seq(self, node):
            pass
        return data

    def construct_yaml_map(self, node):
        print('map')
        # Same two-step unwrapping as for sequences.
        for data in SafeConstructor.construct_yaml_map(self, node):
            pass
        return data


# Register each override for the YAML tag it handles.
MyConstructor.add_constructor(
    u'tag:yaml.org,2002:null',
    MyConstructor.construct_yaml_null)
MyConstructor.add_constructor(
    u'tag:yaml.org,2002:bool',
    MyConstructor.construct_yaml_bool)
MyConstructor.add_constructor(
    u'tag:yaml.org,2002:int',
    MyConstructor.construct_yaml_int)
MyConstructor.add_constructor(
    u'tag:yaml.org,2002:float',
    MyConstructor.construct_yaml_float)
MyConstructor.add_constructor(
    u'tag:yaml.org,2002:str',
    MyConstructor.construct_yaml_str)
MyConstructor.add_constructor(
    u'tag:yaml.org,2002:seq',
    MyConstructor.construct_yaml_seq)
MyConstructor.add_constructor(
    u'tag:yaml.org,2002:map',
    MyConstructor.construct_yaml_map)

yaml = YAML(typ='safe')
yaml.Constructor = MyConstructor
data = yaml.load(json_str)
print(data)
Just replace the code in each construct_yaml_XYZ method with code that creates the objects you want and returns them.
The "funny business" with the for loop when creating a mapping/dict or a sequence/list is there to unwrap the two-step process of creating these objects (necessary for "real" YAML input to deal with recursive data structures using anchors/aliases).
The above outputs:
map
str
seq
int
seq
float
bool
seq
int
null
{'a': [1, [2.0, True, [3, None]]]}
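To see why the two-step construction mentioned above matters, here is a library-free sketch of the pattern. It does not use ruamel.yaml at all; `construct_seq`, `exhaust` and `make_children` are hypothetical names chosen for the illustration. The container is yielded while still empty, so anything that refers back to it already has an object to point at, and it is filled in afterwards.

```python
def construct_seq(make_children):
    data = []
    yield data                        # step 1: hand out the empty container
    data.extend(make_children(data))  # step 2: fill it in afterwards


def exhaust(generator):
    # Same job as the for/pass loop: run the generator to completion
    # and keep the (by then fully populated) object it yielded.
    result = None
    for result in generator:
        pass
    return result


plain = exhaust(construct_seq(lambda _container: [1, 2, 3]))
recursive = exhaust(construct_seq(lambda container: [1, container]))

print(plain)                      # [1, 2, 3]
print(recursive[1] is recursive)  # True
```

JSON input can never be self-referential, so for JSON the loop is pure ceremony, but the constructor methods are written for the general YAML case.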
You can also hook into the YAML parser at a lower level, but that doesn't make the implementation any easier and is probably only marginally faster.
Solution 3:
Martijn Pieters' comment had the correct approach:
Performance. And the hooks are not meant to allow for a wholesale replacement of the JSON format.
I was going in the wrong direction attempting to use hooks to parse JSON. They exist to augment parsing, but not replace it.
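One way to act on that, offered as a sketch under my own assumptions rather than anything taken from the comment, is to let json.loads parse normally and process arrays afterwards by walking the result. The `walk` helper and its `on_array` parameter are hypothetical names.

```python
import json


def walk(value, on_array):
    """Rebuild a parsed JSON value bottom-up, passing every list to on_array."""
    if isinstance(value, list):
        return on_array([walk(item, on_array) for item in value])
    if isinstance(value, dict):
        return {key: walk(item, on_array) for key, item in value.items()}
    return value


parsed = json.loads('{"a": [1, [2, [3.1], ["4"]]]}')
converted = walk(parsed, tuple)
print(converted)
```

This keeps the fast C parser in play and touches no internals, at the cost of a second pass over the data.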