This short post will present a way to deal with a common problem in nested objects: the presence of indexes in paths. Take for example example_dict[“access”][“to”][“certain”][“level”][“access”][“other”][“level”].
How do we navigate through such a nested object? The proposed solution relies on the Nob package.
We’ll work with a generic example to illustrate the problem we wish to tackle.
We have a list of properties owned by several people in different locations. We’ll store this list in YAML format and name it input.yml. Its content is given below.
# content of input.yml
europe:
  france:
    toulouse:
      rue_alsace:
        - person: sarah_connor
          type: house
        - person: jack_burton
          type: house
      rue_lautrec:
        - person: chuck_norris
          type: house
        - person: sarah_connor
          type: house
        - person: john_mclane
          type: flat
          apt: 31
    paris:
      rue_alsace:
        - person: sarah_connor
          type: flat
          apt: 44
        - person: john_doe
          type: flat
          apt: 43
      rue_rivoli:
        - person: john_carter
          type: house
        - person: bruce_wayne
          type: mansion
  spain:
    madrid:
      castellana:
        - person: bruce_wayne
          type: mansion
    valencia:
      sanpedro:
        - person: sancho_panca
          type: hut
mars:
  olympus_mons:
    barsoom:
      - person: john_carter
        type: palace
Given the list (input.yml), we wish to modify some of the entries from within a Python script. This is an ordinary task, and chances are high that you will encounter it in one form or another in scientific computing (editing inputs, outputs, etc.).
Bruce Wayne got into a financial pinch due to the covid-19 crisis. He decides
- to downgrade his property in Madrid by selling his mansion and acquiring a flat in the same district instead
- to sell his mansion in Paris
Let’s modify the above list accordingly.
We can read the input.yml file in Python into a dictionary (mydict)
import yaml

with open("input.yml", 'r') as fin:
    mydict = yaml.load(fin, Loader=yaml.FullLoader)
which will result in a nested dictionary.
Now, to update the list according to the suggested scenario, the method would be,
- for the downgrade
mydict["europe"]["spain"]["madrid"]["castellana"][0]["type"] = "flat"
- for the sale (we will add a status key)
mydict["europe"]["france"]["paris"]["rue_rivoli"][1]["status"] = "for sale"
Both actions require knowledge of the index (0 and 1, respectively) of the properties listed for Bruce Wayne. The issue arises because of the occurrence of dictionaries inside lists which are themselves inside a dictionary: dict[list[dict]].
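To make the dict[list[dict]] layout concrete, here is a minimal sketch on a hand-written, trimmed-down copy of the Madrid branch. The inline dictionary is an assumption standing in for the parsed input.yml. It shows that the street key maps to a list, so a string key fails until an integer index is supplied.

```python
# Hypothetical, trimmed-down copy of the Madrid branch of input.yml
mydict = {
    "europe": {
        "spain": {
            "madrid": {
                "castellana": [
                    {"person": "bruce_wayne", "type": "mansion"},
                ]
            }
        }
    }
}

street = mydict["europe"]["spain"]["madrid"]["castellana"]
print(type(street).__name__)  # the street maps to a list of property records

# Indexing the list with a string key fails: this is why an index is needed
try:
    street["type"] = "flat"
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__
print(error_name)

# With the integer index in place the assignment goes through
street[0]["type"] = "flat"
print(street[0])
```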
In principle we could achieve what we want with the above, but
- it depends on knowing in advance the exact index of Bruce Wayne’s property within each list, which is not desirable and could easily lead to problems. A “smarter” way would be to look up the index of the item we wish to modify, thus removing the need for a priori knowledge. Here, this lookup could be based on the owner’s name.
- it is extremely cumbersome to have to specify every level of the nested dictionary before reaching the end point.
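The index lookup by owner name can be sketched in plain Python. This is only an illustration under the assumption that every record carries a person key; index_of_owner is a hypothetical helper and the street list is a hand-written copy of the rue_rivoli entries.

```python
# Hand-written copy of the rue_rivoli records (assumption, mirrors input.yml)
street = [
    {"person": "john_carter", "type": "house"},
    {"person": "bruce_wayne", "type": "mansion"},
]


def index_of_owner(records, owner):
    """Return the index of the single record owned by `owner`."""
    matches = [i for i, rec in enumerate(records) if rec.get("person") == owner]
    (index,) = matches  # raises ValueError unless exactly one record matches
    return index


idx = index_of_owner(street, "bruce_wayne")
street[idx]["status"] = "for sale"
print(idx, street[idx])
```

This removes the hard-coded index, but the full chain of keys leading to the street still has to be written out by hand.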
So let’s explore another way to do so.
Instead of working with a standard dictionary we will rely on the Nob package which offers an elegant way to manipulate nested objects.
Let’s convert our dictionary into a Nob object.
from nob import Nob

nob_tree = Nob(mydict)
In case of Bruce Wayne’s property downgrade, to access the same key we wish to manipulate we can do several things (see also PyPI description on Nob).
# full path
nob_tree["/europe/spain/madrid/castellana/0/type"][:]
# shorter path 1
nob_tree.castellana["/0/type"][:]
# shorter path 2
nob_tree.castellana[0].type[:]
# shortest path
nob_tree.castellana.type[:]
Note: the trailing [:] returns the value associated with the nested key.
The full path is similar to what we would write with a dictionary, with the advantage that the absolute path is specified in one go rather than through a chain of bracketed lookups, but it remains cumbersome. The advantage of Nob becomes visible in the other options. First, we can shorten the path leading up to the index, as in the shorter path alternatives: nob_tree.castellana. This relies on the uniqueness of the key (here: castellana) in the nested object. We can then access the type of property associated with Bruce Wayne in a similar manner: shorter path 2 again relies on the uniqueness of the “type” key. The issue remains that we need to know the index (0) to continue our search. In this specific case an even quicker method is possible, as shown by shortest path, which relies on the fact that only one owner and one property are listed under castellana.
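Since the short paths depend on a key being unique, it can help to check this before relying on them. The following stdlib-only sketch counts how often each key occurs in a trimmed, hand-written copy of the data. count_keys is a hypothetical helper and says nothing about how Nob resolves keys internally.

```python
from collections import Counter


def count_keys(node, counts=None):
    """Count the occurrences of every dictionary key in a nested structure."""
    counts = Counter() if counts is None else counts
    if isinstance(node, dict):
        for key, child in node.items():
            counts[key] += 1
            count_keys(child, counts)
    elif isinstance(node, list):
        for child in node:
            count_keys(child, counts)
    return counts


# Trimmed copy of input.yml (assumption: only a few streets are kept)
tree = {
    "europe": {
        "france": {
            "toulouse": {"rue_alsace": [{"person": "sarah_connor", "type": "house"}]},
            "paris": {"rue_alsace": [{"person": "john_doe", "type": "flat"}]},
        },
        "spain": {
            "madrid": {"castellana": [{"person": "bruce_wayne", "type": "mansion"}]},
        },
    }
}

counts = count_keys(tree)
print(counts["castellana"], counts["rue_alsace"])
```

Here castellana occurs once, whereas rue_alsace occurs in both Toulouse and Paris, so a shortcut through rue_alsace alone would presumably be ambiguous.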
In the second case we could use
nob_tree.rue_rivoli[1]["status"] = "for sale"
which still requires an index.
Let’s describe a generic approach to avoid the need for knowledge on the indexes.
In the present case we will assume that the “castellana” and “rue_rivoli” keys are unique, and that the user knows this. There are two solution paths that can be explored.
Making use of list comprehension, Bruce Wayne’s property type can readily be obtained. We first define an intermediate subtree, followed by a search operation within that subtree.
sub_tree = nob_tree.castellana
out_ = [
    path[:-1]
    for path in sub_tree.find("person")
    if sub_tree[path][:] == "bruce_wayne"
]
bruce_path, = out_  # unpacking with a trailing comma checks that only one value was found
The method relies on the find() functionality of the nested object. Note that the index lookup could be written in a single line as follows, at the cost of reduced readability.
bruce_path, = [path[:-1] for path in sub_tree.find("person") if sub_tree[path][:] == "bruce_wayne"]
Then we can perform our modification as follows
# check current property type
In : sub_tree[bruce_path].type[:]
Out: 'mansion'
# set our desired value
sub_tree[bruce_path].type = "flat"
# check if everything worked fine
In : sub_tree[bruce_path].type[:]
Out: 'flat'
For the sale of Bruce Wayne’s mansion in Paris we follow the same approach, but on a different subtree.
# define new subtree
sub_tree = nob_tree.rue_rivoli
# use same procedure as previously
bruce_path, = [path[:-1] for path in sub_tree.find("person") if sub_tree[path][:] == "bruce_wayne"]
# add key
sub_tree[bruce_path]["status"] = "for sale"
If multiple keys are to be manipulated, it can be handy to write a function that returns the path to the person of interest, or even the property type associated with that person, while still relying on a list comprehension.
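Such a function can be sketched without any dependency. The version below uses plain Python rather than Nob, so iter_paths, owner_paths and get_at are hypothetical helpers, and paths are represented as tuples of keys and indexes (an assumption made for this illustration). It keeps the same idea as above: find the matching "person" leaves and drop the last path element to get the record.

```python
def iter_paths(node, path=()):
    """Yield (path, value) for every leaf of a nested dict/list structure."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from iter_paths(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from iter_paths(child, path + (index,))
    else:
        yield path, node


def owner_paths(tree, owner, key="person"):
    """Return the paths of all records whose `key` leaf equals `owner`."""
    return [
        path[:-1]
        for path, value in iter_paths(tree)
        if path and path[-1] == key and value == owner
    ]


def get_at(tree, path):
    """Follow `path` (a tuple of keys and indexes) down from `tree`."""
    for step in path:
        tree = tree[step]
    return tree


# Trimmed, hand-written copy of input.yml (assumption)
tree = {
    "europe": {
        "france": {"paris": {"rue_rivoli": [
            {"person": "john_carter", "type": "house"},
            {"person": "bruce_wayne", "type": "mansion"},
        ]}},
        "spain": {"madrid": {"castellana": [
            {"person": "bruce_wayne", "type": "mansion"},
        ]}},
    }
}

paths = owner_paths(tree, "bruce_wayne")
for path in paths:
    print(path, get_at(tree, path)["type"])
```

Unlike the subtree-based search, this scans the whole tree, so it returns every property of the owner and the caller still has to pick the one to modify.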
The suggested method does not depend on the presence of lists within nested objects. What it performs is a filtering task, which can be carried out on any nested object, whether or not indexes occur in it. Take for example the following database:
europe:
  france:
    person: bruce_wayne
    type: mansion
  italy:
    person: bruce_wayne
    type: flat
If we wish to get the types of properties associated with Bruce Wayne in the above example we would have to perform a similar search operation as detailed in this post’s context but no indexing would be encountered.
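As a rough illustration of that search on the list-free database, here is a plain-Python sketch. property_types is a hypothetical helper, and the inline dictionary is a hand-written copy of the YAML above.

```python
# Hand-written copy of the list-free database above
database = {
    "europe": {
        "france": {"person": "bruce_wayne", "type": "mansion"},
        "italy": {"person": "bruce_wayne", "type": "flat"},
    }
}


def property_types(node, owner):
    """Collect the `type` of every dictionary whose `person` is `owner`."""
    found = []
    if isinstance(node, dict):
        if node.get("person") == owner:
            found.append(node.get("type"))
        for child in node.values():
            found.extend(property_types(child, owner))
    return found


types = property_types(database, "bruce_wayne")
print(types)
```

No index appears anywhere: the filter only compares the value stored under person and reads the neighbouring type.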
Look into Nob and get a grasp of its use. It will save you the effort in trying to create functionalities to search through nested objects and manipulate them at will.