This short post presents a way to deal with a common problem in nested objects: the presence of indexes in paths. Take for example:

example_dict["access"]["to"]["certain"]["level"][1]["access"]["other"]["level"]

How do we navigate through such a nested object without prior knowledge of the specific index (here, 1)? The proposed solution relies on the Nob Python package. A dedicated blog post discusses the common usage of Nob adopted at COOP. In the present post we will dive into a slightly different usage.

reading time: 7 min

The context

We’ll work with a generic example to illustrate the problem we wish to tackle. We have a list of properties from several people in different locations. We’ll cast this list in a YAML format and name it input.yml. Its content is given below.

# content of input.yml
europe:
  france:
    toulouse:
      rue_alsace:
        - person: sarah_connor
          type: house
        - person: jack_burton
          type: house
      rue_lautrec:
        - person: chuck_norris
          type: house
        - person: sarah_connor
          type: house
        - person: john_mclane
          type: flat
          apt: 31
    paris:
      rue_alsace:
        - person: sarah_connor
          type: flat
          apt: 44
        - person: john_doe
          type: flat
          apt: 43
      rue_rivoli:
        - person: john_carter
          type: house
        - person: bruce_wayne
          type: mansion
  spain:
    madrid:
      castellana:
        - person: bruce_wayne
          type: mansion
    valencia:
      sanpedro:
        - person: sancho_panca
          type: hut
mars:
  olympus_mons:
    barsoom:
      - person: john_carter
        type: palace

The desired outcome

Given the list (input.yml), we wish to modify some of the entries within a Python script. This is an ordinary task, and chances are high that you will encounter it in one form or another in scientific computing (editing inputs, outputs, etc.).

Suggested scenario

Bruce Wayne got into a financial pinch due to the COVID-19 crisis. He decides:

  1. to downgrade his property in Madrid by selling his mansion and acquiring a flat in the same district
  2. to sell his mansion in Paris

Let’s modify the above list accordingly.

The issue

We can read the input.yml file in Python and store it in a dictionary (mydict):

import yaml
with open("input.yml", 'r') as fin:
    mydict = yaml.load(fin, Loader=yaml.FullLoader)

which will result in a nested dictionary.

To update the list according to the suggested scenario, the direct approach would be:

  • for the downgrade
mydict["europe"]["spain"]["madrid"]["castellana"][0]["type"] = "flat"
  • for the sale (we will add a status key)
mydict["europe"]["france"]["paris"]["rue_rivoli"][1]["status"] = "for sale"

Both actions require knowing the index ([0], [1]) of the properties listed for Bruce Wayne. The issue arises from the occurrence of dictionaries inside lists that are themselves inside a dictionary: dict[list[dict]].

In principle we could achieve what we want with the above, but

  • it depends on knowing in advance the exact index of Bruce Wayne’s property in each list, which is fragile: the index changes as soon as an entry is added or removed. A “smarter” way would be to obtain the index of the item we wish to modify from the data itself, here based on the owner’s name.
  • it is extremely cumbersome to specify every level of the nested dictionary before reaching the end point.
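As a minimal sketch of that “smarter” lookup in plain Python (no extra package), the index can be recovered from the owner’s name. The helper name index_of_owner is hypothetical and not part of the original script; the inline list mirrors the rue_rivoli entry of input.yml.

```python
def index_of_owner(entries, person):
    """Return the index of the single entry owned by `person`.

    Raises ValueError if there is no match or more than one match.
    """
    matches = [i for i, entry in enumerate(entries) if entry.get("person") == person]
    index, = matches  # unpacking fails unless exactly one match exists
    return index


# Same content as europe/france/paris/rue_rivoli in input.yml
rue_rivoli = [
    {"person": "john_carter", "type": "house"},
    {"person": "bruce_wayne", "type": "mansion"},
]

idx = index_of_owner(rue_rivoli, "bruce_wayne")
rue_rivoli[idx]["status"] = "for sale"
```

This removes the hard-coded [1], but the full chain of keys down to the street still has to be spelled out to reach the list.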

So let’s explore another way to do so.

The alternative

Instead of working with a standard dictionary we will rely on the Nob package which offers an elegant way to manipulate nested objects.

Let’s convert our dictionary into a Nob object.

from nob import Nob
nob_tree = Nob(mydict)

For Bruce Wayne’s property downgrade, there are several ways to access the key we wish to manipulate (see also the Nob description on PyPI).

# full path
nob_tree["/europe/spain/madrid/castellana/0/type"][:]

# shorter path 1
nob_tree.castellana["/0/type"][:]

# shorter path 2
nob_tree.castellana[0].type[:]

# shortest path
nob_tree.castellana.type[:]

Note: [:] returns the value stored at the nested key.

The full path is similar to what we would write with a dictionary, with the advantage of specifying an absolute path at once rather than through multiple [], but it remains cumbersome. The advantage of Nob becomes visible in the other options. First, we can shorten the part of the path that precedes the index, as in the shorter path alternatives: nob_tree.castellana. This relies on the uniqueness of the key (here: castellana) in the nested object. Shorter path 2 then accesses the type of property associated with Bruce Wayne in the same manner, relying again on the uniqueness of the “type” key below that entry. The issue remains that we need to know the index [0] to continue our search. In this specific case an even quicker method is possible, as shown by shortest path, which relies on the fact that only one owner and property are listed under castellana.
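Whether a key qualifies for these shortcuts can be checked beforehand. The snippet below is a plain-Python sketch that counts how often a key occurs in a nested structure; it does not use Nob and is not meant to describe how Nob resolves keys internally. The helper name count_key and the reduced data set are assumptions made for illustration.

```python
def count_key(node, key):
    """Count how many times `key` appears as a dictionary key in `node`."""
    if isinstance(node, dict):
        own = 1 if key in node else 0
        return own + sum(count_key(child, key) for child in node.values())
    if isinstance(node, list):
        return sum(count_key(child, key) for child in node)
    return 0  # scalars hold no keys


# Reduced version of input.yml, enough to show both situations
data = {
    "europe": {
        "france": {
            "toulouse": {"rue_alsace": [{"person": "sarah_connor", "type": "house"}]},
            "paris": {"rue_alsace": [{"person": "john_doe", "type": "flat", "apt": 43}]},
        },
        "spain": {
            "madrid": {"castellana": [{"person": "bruce_wayne", "type": "mansion"}]},
        },
    },
}

castellana_count = count_key(data, "castellana")  # appears once
rue_alsace_count = count_key(data, "rue_alsace")  # appears under toulouse and paris
```

A count of 1 means the key is unique, as assumed for castellana. rue_alsace appears under both Toulouse and Paris, so a key-only shortcut would be ambiguous there and a longer path is needed.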

In the second case we could use

nob_tree.rue_rivoli[1]["status"] = "for sale"

which still requires an index.

Let’s describe a generic approach that avoids the need to know the indexes.

In the present case we will assume that the “castellana” and “rue_rivoli” keys are unique and that the user knows this. There are two solution paths that can be explored.

Solution with list comprehension

Making use of list comprehension, Bruce Wayne’s property type can readily be obtained. We first define an intermediate subtree, followed by a search operation within that subtree.

sub_tree = nob_tree.castellana
out_ = [
    path[:-1]
    for path in sub_tree.find("person")
    if sub_tree[path][:] == "bruce_wayne"
]
bruce_path, = out_   # unpacking fails unless exactly one match was found

The method relies on the find() functionality of the nested object. Note that the lookup could be written in a single line as follows, at the cost of reduced readability.

bruce_path, = [path[:-1] for path in sub_tree.find("person") if sub_tree[path][:] == "bruce_wayne"]

Then we can perform our modification as follows:

# check current property type
In [1]: sub_tree[bruce_path].type[:]
Out[1]: 'mansion'

# set our desired value
In [2]: sub_tree[bruce_path].type = "flat"

# check if everything worked fine
In [3]: sub_tree[bruce_path].type[:]
Out[3]: 'flat'

For the sale of Bruce Wayne’s mansion in Paris we follow the same approach, working with a different subtree.

# define new subtree
sub_tree = nob_tree.rue_rivoli
# use same procedure as previously
bruce_path, = [path[:-1] for path in sub_tree.find("person") if sub_tree[path][:] == "bruce_wayne"]
# add key
sub_tree[bruce_path]["status"] = "for sale"

If several keys are to be manipulated, it becomes handy to wrap the search in a function that returns the path to the person of interest, or even the property type associated with that person, while still relying on list comprehension.
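Such a function could look like the sketch below. It relies only on the Nob calls already used in this post (find(), indexing with a path, and [:]); the function name owner_path is hypothetical, and the subtree is passed in rather than created here.

```python
def owner_path(sub_tree, person):
    """Return the path of the single entry in `sub_tree` owned by `person`.

    `sub_tree` is expected to behave like the Nob subtrees used above.
    Raises ValueError if there is no match or more than one match.
    """
    matches = [
        path[:-1]  # drop the trailing "person" component of the path
        for path in sub_tree.find("person")
        if sub_tree[path][:] == person
    ]
    match, = matches  # unpacking fails unless exactly one match was found
    return match
```

With the subtrees defined earlier, the call would read bruce_path = owner_path(nob_tree.rue_rivoli, "bruce_wayne"), after which sub_tree[bruce_path] can be modified as before.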

Final comments

The suggested method does not depend on the presence of lists within nested objects. It is in fact a filtering task, which can be performed on any nested object and is therefore not limited to cases where indexes occur. Take for example the following database:

europe:
  france:
    person: bruce_wayne
    type: mansion
  italy:
    person: bruce_wayne
    type: flat

If we wish to get the types of properties associated with Bruce Wayne in the above example, we would perform a search similar to the one detailed in this post, but no indexing would be encountered.
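For this list-free database, the same filtering idea can be sketched in plain Python as a recursive search over dictionaries only. The helper name property_types is hypothetical, and the snippet deliberately avoids Nob so that it runs with the standard library alone.

```python
def property_types(node, person):
    """Collect the `type` of every dictionary in `node` owned by `person`."""
    if not isinstance(node, dict):
        return []  # scalars cannot hold an owner
    found = []
    if node.get("person") == person and "type" in node:
        found.append(node["type"])
    for child in node.values():
        found.extend(property_types(child, person))
    return found


# Same content as the small database above
database = {
    "europe": {
        "france": {"person": "bruce_wayne", "type": "mansion"},
        "italy": {"person": "bruce_wayne", "type": "flat"},
    }
}

types = property_types(database, "bruce_wayne")
```

No index appears anywhere: the search only follows dictionary keys, which is why the approach carries over unchanged to structures without lists.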

Take away

Look into Nob and get a grasp of its use. It will save you the effort of writing your own functions to search through nested objects and manipulate them.

Jimmy-John Hoste is a postdoctoral researcher in computer science engineering with a focus on CFD related topics.
