Dive Into Python: Python from novice to pro
So we're looking at words, which at least in English are strings of characters. And we have rules that say we need to find different combinations of characters, and then do different things to them. This sounds like a job for regular expressions.
    import re

    def plural(noun):
        if re.search('[sxz]$', noun):
            return re.sub('$', 'es', noun)
        elif re.search('[^aeioudgkprt]h$', noun):
            return re.sub('$', 'es', noun)
        elif re.search('[^aeiou]y$', noun):
            return re.sub('y$', 'ies', noun)
        else:
            return noun + 's'
OK, the test in the first if statement is a regular expression, but it uses a syntax we didn't see in the Regular Expressions chapter. The square brackets mean “match exactly one of these characters”. So [sxz] means “s, or x, or z”, but only one of them. The $ should be familiar; it matches the end of the string. So we're checking to see if noun ends with s, x, or z.
The re.sub function on the next line is totally new, in the sense that we never covered it in the regular expressions chapter. It performs regular expression-based string substitutions. Let's look at it in more detail.
    >>> import re
    >>> re.search('[abc]', 'Mark')
    <_sre.SRE_Match object at 0x001C1FA8>
    >>> re.sub('[abc]', 'o', 'Mark')
    'Mork'
    >>> re.sub('[abc]', 'o', 'rock')
    'rook'
    >>> re.sub('[abc]', 'o', 'caps')
    'oops'
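The re.sub('$', 'es', noun) calls in plural rely on the same mechanism: $ matches the empty position at the end of the string, so replacing it appends text. Here is a minimal sketch, assuming Python 3 and a noun with no trailing newline. The helper name append_es is hypothetical and is not part of plural.py.

```python
import re

def append_es(noun):
    # '$' matches the zero-width position at the end of the string,
    # so substituting 'es' there has the same effect as noun + 'es'
    # (assuming noun does not end with a newline).
    return re.sub('$', 'es', noun)

print(append_es('box'))   # boxes
print(append_es('buzz'))  # buzzes
```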
    import re

    def plural(noun):
        if re.search('[sxz]$', noun):
            return re.sub('$', 'es', noun)
        elif re.search('[^aeioudgkprt]h$', noun):
            return re.sub('$', 'es', noun)
        elif re.search('[^aeiou]y$', noun):
            return re.sub('y$', 'ies', noun)
        else:
            return noun + 's'
    >>> import re
    >>> re.search('[^aeiou]y$', 'vacancy')
    <_sre.SRE_Match object at 0x001C1FA8>
    >>> re.search('[^aeiou]y$', 'boy')
    >>>
    >>> re.search('[^aeiou]y$', 'day')
    >>>
    >>> re.search('[^aeiou]y$', 'pita')
    >>>
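The second rule in plural uses the same negated character class idea with a longer list of excluded characters. The sketch below is an illustration with sample nouns chosen for this example, not taken from the original. It checks which words end in h preceded by a character outside the set aeioudgkprt.

```python
import re

# Same pattern as the second rule in plural: an 'h' at the end of the
# string, preceded by any one character that is NOT in the listed set.
pattern = '[^aeioudgkprt]h$'

# Sample nouns are illustrative only.
results = {noun: re.search(pattern, noun) is not None
           for noun in ['coach', 'rash', 'bath', 'cheetah']}

for noun, matched in results.items():
    print(noun, matched)
# coach True     ('c' is not in the excluded set)
# rash True      ('s' is not in the excluded set)
# bath False     ('t' is excluded)
# cheetah False  ('a' is excluded)
```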
    >>> re.sub('y$', 'ies', 'vacancy')
    'vacancies'
    >>> re.sub('y$', 'ies', 'agency')
    'agencies'
    >>> re.sub('([^aeiou])y$', r'\1ies', 'vacancy')
    'vacancies'
The first two substitutions turn vacancy into vacancies and agency into agencies, which is what we wanted. Note that the same pattern would also turn boy into boies, but that will never happen in our function because we did that re.search first to find out whether we should do this re.sub.
Just in passing, I want to point out that it is possible to combine these two regular expressions (one to find out if the rule applies, and another to actually apply it) into a single regular expression. The last re.sub call in the session shows what that would look like. Most of it should look familiar: we're using a remembered group, which we learned in Case study: parsing phone numbers, to remember the character before the y. Then in the substitution string, we use a new syntax, \1, which means “hey, that first group you remembered? put it here”. In this case, we remember the c before the y, and then when we do the substitution, we substitute c in place of c, and ies in place of y. (If you have more than one remembered group, you can use \2 and \3 and so on.)
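To make the remembered-group syntax concrete, here is a small sketch, assuming Python 3. The first two calls reuse the combined pattern discussed above. The two-group pattern and the 'hello world' input are illustrative additions, not part of plural.py.

```python
import re

# One remembered group: keep the consonant before the final 'y'.
# When the pattern does not match, re.sub returns the string unchanged.
print(re.sub('([^aeiou])y$', r'\1ies', 'agency'))  # agencies
print(re.sub('([^aeiou])y$', r'\1ies', 'boy'))     # boy

# Two remembered groups (illustrative pattern): swap two words by
# referring to them as \2 and \1 in the substitution string.
swapped = re.sub(r'(\w+) (\w+)', r'\2 \1', 'hello world')
print(swapped)  # world hello
```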
Regular expression substitutions are extremely powerful, and the \1 syntax makes them even more powerful. But combining the entire operation into one regular expression is also much harder to read, and it doesn't directly map to the way we first described the pluralizing rules. We originally laid out rules like “if the word ends in S, X, or Z, then add ES”. And if you look at this function, we have two lines of code that say “if the word ends in S, X, or Z, then add ES”. It doesn't get much more direct than that.
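For comparison, here is one way the combined approach might look if it were applied to every rule. This is a hypothetical sketch, not the author's code: the function name plural_combined and the rules list are assumptions, and re.subn is used because it returns both the new string and the number of substitutions made.

```python
import re

def plural_combined(noun):
    # Hypothetical variant: each rule is a single substitution that both
    # tests and applies the rule, using \1 to keep the matched ending.
    rules = [
        (r'([sxz])$', r'\1es'),
        (r'([^aeioudgkprt]h)$', r'\1es'),
        (r'([^aeiou])y$', r'\1ies'),
    ]
    for pattern, replacement in rules:
        # re.subn returns (new_string, number_of_substitutions).
        result, count = re.subn(pattern, replacement, noun)
        if count:
            return result
    # No rule fired, so fall back to the default rule.
    return noun + 's'

for noun in ['box', 'coach', 'vacancy', 'boy', 'bath']:
    print(noun, plural_combined(noun))
```

It behaves like the if/elif version on these inputs, but the rules no longer read like the plain-English descriptions, which is the readability trade-off described above.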