Handling missing keys in str.format_map properly
str.format_map
was introduced in Python 3.2, it allows users to a pass a dictionary instead of individual keyword arguments. This can be very useful in case some of the format arguments are missing from the dictionary, take this example from docs:
class Default(dict):
def __missing__(self, key):
return key
print ('{name} was born in {country}'.format_map(Default(name='Guido')))
# Guido was born in country
But this fails:
>>> print ('{name} was born in {country.state}'.format_map(Default(name='Guido')))
Traceback (most recent call last):
File "<ipython-input-324-1012aa68ba8d>", line 1, in <module>
print ('{name} was born in {country.state}'.format_map(Default(name='Guido')))
AttributeError: 'str' object has no attribute 'state'
That is obvious because we are returning a string from __missing__
and that string doesn't have any attribute of the name state.
Note that the above way is also possible in Python 2 and Python 3.0-3.1 using the Formatter
class's vformat
method.
from string import Formatter
f = Formatter()
print f.vformat('{name} was born in {country}', (), Default(name='Guido'))
# Guido was born in country
Dealing with dot notations, conversions(!s
or !r
) and format specs(^
, >
etc)
The solution is, for missing keys instead of returning simple string, return an instance of a class that can handle these attribute calls along with creating full string back:
class MissingAttrHandler(object):
def __init__(self, format):
self.format = format
def __getattr__(self, attr):
return type(self)('{}.{}'.format(self.format, attr))
def __repr__(self):
return self.format + '}'
class Default(dict):
def __missing__(self, key):
return MissingAttrHandler('{{{}'.format(key))
Now let's test this:
>>> print('{name} was born in {country.state} and his last '
'name is {Person.full_name.last_name}'.format_map(Default(name='Guido')))
Guido was born in {country.state} and his last name is {Person.full_name.last_name}
Some of you may have already noticed, this solution has one issue though, it will fail if other formatting details like ^, 10d etc are present:
>>> print('{name} was born in {country.state} and his last '
'name is {Person.full_name.last_name:*^30}'.format_map(Default(name='Guido')))
Traceback (most recent call last):
File "<ipython-input-94-b375bfa3e06c>", line 2, in <module>
'name is {Person.full_name.last_name:*^30}'.format_map(Default(name='Guido')))
TypeError: non-empty format string passed to object.__format__
This is because MissingAttrHandler has no __format__`` method of its own, hence the
formatlookup goes to its base class object(
object.format`)
>>> MissingAttrHandler.__format__ is object.__format__
True
>>> object.__format__(MissingAttrHandler(''), '^*30s')
Traceback (most recent call last):
File "<ipython-input-129-c4e00a46bd28>", line 1, in <module>
object.__format__(MissingAttrHandler(''), '^*30s')
TypeError: non-empty format string passed to object.__format__
So, let's define a `format`` method in our class that takes care of this:
def __format__(self, format):
return '{}:{}}}'.format(self.format, format)
Let's test it:
>>> print('{name} was born in {country.state} and dict has '
'{dict.get:*^30} method.'.format_map(Default(name='Guido')))
Guido was born in {country.state:} and dict has {dict.get:*^30} method.
>>> print('{name} was born in {country.state} and dict has '
'{dict.get:>30d} method.'.format_map(Default(name='Guido')))
Guido was born in {country.state:} and dict has {dict.get:>30d} method.
Seems to be working fine, let's try one more thing:
>>> print('{name} was born in {country.state} and dict has '
... '{dict.get!s:*^30} method.'.format_map(Default(name='Guido')))
Guido was born in {country.state:} and dict has **********{dict.get}********** method.
Well this was quite unexpected, what exactly happened there?
Well due to the !s
present in the format string after getting the value of these fields using either str()
or repr()
(which is a string object), Python will now call __format__
on it with *^30
as an argument. But as we returned a string object and not a MissingAttrHandler
object the format call goes to that str.
>>> '{dict.get}'.__format__('*^30')
'**********{dict.get}**********'
We can try to return an instance of MissingAttrHandler rather than a string from its __repr__
method. But to return MissingAttrHandle
instance from __str__
or __repr__
we will have to inherit from str as well because Python expects us to return an instance of type str. Now __repr__
will look like:
def __repr__(self):
return MissingAttrHandler(self.format + '!r}')
Note that now we need to define __str__
as well because our class does not inherit from str which provides a __str__
method, hence calling __str__
on it won't fallback to __repr__
anymore.
And one cool thing about __format__
is that once defined, it is the function that is by default called during string formatting unless we provide !r
or !s explicitly. If !r
or !s
are present on the format string then __repr__
and __str__
are called respectively and then __format__
is called on the resulting object.
Ah! ha that's exactly what we needed right? Using this we can also add !r or !s in our format strings and later complete it with the __format__
method.
So, in the end our class will look like:
class MissingAttrHandler(str):
def __init__(self, format):
self.format = format
def __getattr__(self, attr):
return type(self)('{}.{}'.format(self.format, attr))
def __repr__(self):
return MissingAttrHandler(self.format + '!r}')
def __str__(self):
return MissingAttrHandler(self.format + '!s}')
def __format__(self, format):
if self.format.endswith('}'):
self.format = self.format[:-1]
return '{}:{}}}'.format(self.format, format)
class Default(dict):
def __missing__(self, key):
return MissingAttrHandler('{{{}'.format(key))
Let's try it:
>>> print('{name} was born in {country.state} and dict has '
'{dict.get!r:*^30} method.'.format_map(Default(name='Guido', dct=dict)))
Guido was born in {country.state:} and dict has {dict.get!r:*^30} method.
>>> print('{name} was born in {country.state!r:=20s} and dict has '
'{dict.get!s:*^30} method.'.format_map(Default(name='Guido', dct=dict)))
Guido was born in {country.state!r:=20s} and dict has {dict.get!s:*^30} method.
Works! ;-)
I hope you must have learned something about string formatting in Python with the aforementioned method.
But is there any other way to do this?
Yes!
Second way:
We can achieve the same thing as above using Formatter
class from string module, the parse()
method of this class can be used to parse the format string. It returns an iterable that yields a tuple containing (literal_text, field_name, format_spec, conversion). We can use these fields to re-create our string.
from functools import reduce
from operator import attrgetter
from string import Formatter
def get_field_value(field_name, mapping):
try:
if '.' not in field_name:
return mapping[field_name], True
else:
obj, attrs = field_name.split('.', 1)
return attrgetter(attrs)(mapping[obj]), True
except Exception as e:
return field_name, False
def str_format_map(format_string, mapping):
f = Formatter()
parsed = f.parse(format_string)
output = []
for literal_text, field_name, format_spec, conversion in parsed:
conversion = '!' + conversion if conversion is not None else ''
format_spec = ':' + format_spec if format_spec else ''
if field_name is not None:
field_value, found = get_field_value(field_name, mapping)
if not found:
text = '{{{}{}{}}}'.format(field_value,
conversion,
format_spec)
else:
format_string = '{{{}{}}}'.format(conversion, format_spec)
text = format_string.format(field_value)
output.append(literal_text + text)
text = ''
return ''.join(output)
Demo:
>>> s = '{name} was born in {country.state} and dict has {dict.get!r:*^30} method.'
>>> print(str_format_map(s, dict(dict=dict, name="guido")))
guido was born in {country.state} and dict has <method 'get' of 'dict' objects> method.
>>> s = '{name} was born in {country.state!r:=20s} and dict has {dict.get!s:*^30} method.'
>>> print(str_format_map(s, dict(dct=dict, name="guido")))
guido was born in {country.state!r:=20s} and dict has {dict.get!s:*^30} method.