Indexing Lists in Python With an Integer or Object’s Name

In this post, we will see how to implement a list in Python that can be indexed with either an integer or a string matching the name attribute of a stored object.

Our Task

Consider a situation where we have a list of objects that provide a name attribute. For example:

class File:
	def __init__(self, name):
		self.name = name

	# Other useful methods...

	def __str__(self):
		return self.name

files = [File('a.txt'), File('b.txt'), File('c.txt')]

Obviously, you can index such a list with integers:

>>> print(files[1])
b.txt

Additionally, though, you would like to be able to index the list with file names. That is, you would like to do this:

>>> print(files['b.txt'])
b.txt

However, when you try it, you get the following (expected) exception:

>>> print(files['b.txt'])
Traceback (most recent call last):
    print(files['b.txt'])
TypeError: list indices must be integers or slices, not str

In the rest of this post, I will show you a way of implementing a list that supports such indexing.

Implementing the List

In what follows, I will use Python 3. We start by creating a class that inherits from Python’s built-in list:

class NamedObjectList(list):
	...

This gives us support for all the methods and behavior that list provides. So, instances of our class will behave like an ordinary list. Additionally, however, we need to add support for indexing objects by their names. This is done by overriding the __getitem__() method, which Python automatically calls upon accessing an object with the indexing operator [].

class NamedObjectList(list):
    def __getitem__(self, key):
        ...
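To see what the indexing operator actually hands over to __getitem__(), here is a minimal sketch. KeyEcho is a hypothetical class made up for this illustration; it simply reports the type of the key it receives.

```python
class KeyEcho:
    """Hypothetical helper class: reports what the [] operator passes in."""

    def __getitem__(self, key):
        # Python passes whatever appears between the brackets as `key`.
        return type(key).__name__


echo = KeyEcho()
print(echo[1])        # int
print(echo['b.txt'])  # str
print(echo[0:2])      # slice
```

The key arrives unchanged, so the method is free to decide how to treat each type of key.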

Python calls __getitem__() automatically to evaluate self[key]. For standard lists, it works only for integer keys and slices. We check whether the key is a string; if so, we find the object whose name equals the key and return it. Otherwise, we simply delegate the evaluation to the base class, list:

class NamedObjectList(list):
    def __getitem__(self, key):
        if isinstance(key, str):
            for item in self:
                if item.name == key:
                    return item
            raise IndexError('no object named {!r}'.format(key))
        return list.__getitem__(self, key)

After checking that key is a string, we iterate through the list to try to find an object with the given name. Notice that as we inherited from list, self is actually a list, so we can use for item in self. When there is no object with the given name, we raise an exception to mimic the behavior of list when no such index exists. Finally, when key is not a string, we simply delegate the evaluation to the base class.

Now, when you try the following code, it should work as expected :).

>>> files = NamedObjectList([File('a.txt'), File('b.txt'), File('c.txt')])
>>> print(files[1])
b.txt
>>> print(files['b.txt'])
b.txt
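It may also be worth checking a few cases beyond the two lookups above. The following sketch repeats the File and NamedObjectList classes from this post so that it runs on its own; the name 'missing.txt' is just an example of a file that is not in the list.

```python
class File:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


class NamedObjectList(list):
    def __getitem__(self, key):
        if isinstance(key, str):
            for item in self:
                if item.name == key:
                    return item
            raise IndexError('no object named {!r}'.format(key))
        return list.__getitem__(self, key)


files = NamedObjectList([File('a.txt'), File('b.txt'), File('c.txt')])

# Negative indices are delegated to list, so they keep working.
print(files[-1])  # c.txt

# Slices are delegated as well; note that list returns a plain list here.
print([str(f) for f in files[0:2]])  # ['a.txt', 'b.txt']

# An unknown name raises IndexError, like an out-of-range integer index.
try:
    files['missing.txt']
except IndexError as ex:
    print(ex)  # no object named 'missing.txt'
```

Because slicing goes through list.__getitem__(), the result of files[0:2] is an ordinary list and cannot be indexed by name.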

Finishing the Implementation

Even though the above code works in that particular scenario, it fails when you assign to or delete from the list using a string index:

>>> files['b.txt'] = File('b.txt')
Traceback (most recent call last):
    files['b.txt'] = File('b.txt')
TypeError: list indices must be integers or slices, not str

>>> del files['b.txt']
Traceback (most recent call last):
    del files['b.txt']
TypeError: list indices must be integers or slices, not str

To support such code, we also need to override the __setitem__() and __delitem__() methods:

def __setitem__(self, key, value):
	# Called to evaluate self[key] = value.
	...

def __delitem__(self, key):
	# Called to evaluate del self[key].
	...

The full source code of the class, along with unit tests, is available on GitHub. To avoid duplicating code across the three magic methods, it implements the overridden methods slightly differently from the way shown in this post.
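For illustration, here is one possible way to fill in all three methods without repeating the lookup loop. This is only a sketch and not necessarily how the GitHub version does it; the helper name _to_index is an assumption made up for this example.

```python
class File:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name


class NamedObjectList(list):
    def _to_index(self, key):
        # Hypothetical helper: turns a name into a position in the list.
        # Non-string keys (integers, slices) are passed through unchanged.
        if isinstance(key, str):
            for i, item in enumerate(self):
                if item.name == key:
                    return i
            raise IndexError('no object named {!r}'.format(key))
        return key

    def __getitem__(self, key):
        return super().__getitem__(self._to_index(key))

    def __setitem__(self, key, value):
        # Called to evaluate self[key] = value.
        super().__setitem__(self._to_index(key), value)

    def __delitem__(self, key):
        # Called to evaluate del self[key].
        super().__delitem__(self._to_index(key))


files = NamedObjectList([File('a.txt'), File('b.txt'), File('c.txt')])

files['b.txt'] = File('B.txt')   # replaces the object at position 1
print(files[1])                  # B.txt

del files['a.txt']
print([str(f) for f in files])   # ['B.txt', 'c.txt']
```

Note that in this sketch, assigning to a name that is not in the list raises IndexError instead of appending a new object, which mirrors how a list treats an out-of-range integer index.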

2 Comments

  1. Other than order, how is this different than using a dictionary?

    At least on this example, I don’t get what’s the advantage.

    • Hi! The approach presented in the post has the following advantages:

      1. The container is a subclass of list, so it supports list methods, is ordered, etc.
      2. You can access the container by both an index and name (in a dictionary, you can only use a key).
      3. “Keys” do not have to be explicitly specified (in a dictionary, you would need to duplicate the key, as in files['a.txt'] = File('a.txt')).

      Of course, a dictionary also has its advantages (like faster lookup when a key is used to access it). However, the approach that I described in the post proved useful in a project I worked on, so I wanted to share the idea.
