re.match always matches from the start of the string. From the docs:
re.match(pattern, string, flags=0)
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object.
Emphasis mine.
You need to use re.search to have Python search for a pattern anywhere in the string:
>>> import re
>>> tests = ["Milt Jackson, vibraphone, piano, guitar, 1923 (d. October 9, 1999)",
...          "Howard Johnson, alto sax, 1908 (d. December 28, 1991)",
...          "Sonny Greenwich, guitar, 1936",
...          "Eiichi Hayashi, alto sax, 1960",
...          "Yoshio Ikeda, bass, 1942",
...          "Urs Leimgruber, saxophones, bass clarinet. 1952"]
>>>
>>> for test in tests:
...     m = re.search(r"\(d\.(.*)\)", test)
...     if m:
...         print(m.groups())
...
(' October 9, 1999',)
(' December 28, 1991',)
>>>
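To see the difference directly, here is a small sketch that runs both functions on one of the strings above. The variable names (line, pattern, anchored, scanned) are only illustrative.

```python
import re

line = "Howard Johnson, alto sax, 1908 (d. December 28, 1991)"
pattern = r"\(d\.(.*)\)"

# re.match only tries the pattern at position 0, where the string starts with "H".
anchored = re.match(pattern, line)

# re.search scans forward and finds the parenthesised part later in the string.
scanned = re.search(pattern, line)

print(anchored)          # None
print(scanned.group(1))  # the captured date, with its leading space
```

The pattern is identical in both calls; only the function changes, which is why swapping match for search is enough here.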
Also, I escaped the . after d in your pattern so that Python matches a literal period. Unescaped, . matches any character there (except a newline).
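As a rough illustration of why the escape matters, the sketch below uses a made-up string, "(dx 1999)", which is not in your data and is assumed here only to show the effect.

```python
import re

# Hypothetical sample: there is an "x" where the period should be.
sample = "(dx 1999)"

# Unescaped "." accepts any character after "d", so the "x" slips through.
unescaped = re.search(r"\(d.(.*)\)", sample)

# Escaped "\." insists on a literal period, so this string is rejected.
escaped = re.search(r"\(d\.(.*)\)", sample)

print(unescaped.group(1))  # the text after "dx", with its leading space
print(escaped)             # None
```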