In this post, we will see a way of implementing a list in Python that can be indexed with an integer or string representing the name
attribute of a stored object.
Our Task
Consider a situation when we have a list of objects providing a name
attribute. For example:
class File: def __init__(self, name): self.name = name # Other useful methods... def __str__(self): return self.name files = [File('a.txt'), File('b.txt'), File('c.txt')]
Obviously, you can index such a list with integers:
>>> print(files[1]) b.txt
Additionally, though, you would like to be able to index the list with file names. That is, you would like to do this:
>>> print(files['b.txt']) b.txt
However, when you try it, you get the following (expected) exception:
>>> print(files['b.txt']) Traceback (most recent call last): print(files['b.txt']) TypeError: list indices must be integers, not str
In the remainder of the present post, I will show you a way of implementing a list that supports such indexing.
Implementing the List
In what follows, I will use Python 3. We start by creating a class that inherits from the standard Python’s list
:
class NamedObjectList(list): ...
This gives us support for all the methods and behavior that list
provides. So, instances of our class will behave like an ordinary list
. Additionally, however, we need to add support for indexing objects by their names. This is done by overriding the __getitem__()
method, which Python automatically calls upon accessing an object with the indexing operator []
.
class NamedObjectList(list): def __getitem__(self, key): ...
This method is automatically called to implement the evaluation of self[key]
. For standard lists, it only works for integral keys. What we do is that we check whether the key is a string, and if so, then we find an object whose name equals the key and return it. Otherwise, we simply delegate the evaluation to the base class, list
:
class NamedObjectList(list): def __getitem__(self, key): if isinstance(key, str): for item in self: if item.name == key: return item raise IndexError('no object named {!r}'.format(key)) return list.__getitem__(self, key)
After checking that key
is a string, we iterate through the list to try to find an object with the given name. Notice that as we inherited from list
, self
is actually a list, so we can use for item in self
. When there is no object with the given name, we raise an exception to mimic the behavior of list
when no such index exists. Finally, when key
is not a string, we simply delegate the evaluation to the base class.
Now, when you try the following code, it should work as expected :).
>>> files = NamedObjectList([File('a.txt'), File('b.txt'), File('c.txt')]) >>> print(files[1]) b.txt >>> print(files['b.txt']) b.txt
Finishing the Implementation
Even though the above code works in that particular scenario, it fails when assigning or deleting from a list by using a string index:
>>> files['b.txt'] = File('b.txt') Traceback (most recent call last): files['b.txt'] = File('b.txt') TypeError: list indices must be integers, not str >>> del files['b.txt'] Traceback (most recent call last): del files['b.txt'] TypeError: list indices must be integers, not str
To support such code, we need to also override the __setitem__()
and __delitem__()
methods:
def __setitem__(self, key, value): # Called to evaluate self[key] = value. ... def __delitem__(self, key): # Called to evaluate del self[key]. ...
The full source code of the implemented class, alongside with unit tests, is available on GitHub. It uses a slightly different approach of implementing the overridden methods to the one given in the present post to avoid code duplication when implementing all the three magic methods.
Other than order, how is this different than using a dictionary?
At least on this example, I don’t get what’s the advantage.
Hi! The approach presented in the post has the following advantages:
list
, so it supportslist
methods, is ordered, etc.files['a.txt'] = File('a.txt')
).Of course, a dictionary has also its advantages (like faster lookup when a key is used to access it). However, the approach that I described in the post appeared to be useful in a project I worked on, so I wanted to share the idea.