
Finding the contents of parentheses

I have a few thousand blocks of text which may or may not contain a date of death for the person in the record, which is always in the form:

(d. xxxxxxxxxxxxx)

That is, it starts with an opening parenthesis, followed by a d and a ., then some date text, and ends with a closing parenthesis.

I wrote the following code, with a few test samples, to try out my regex:

import re

tests = ["Milt Jackson, vibraphone, piano, guitar, 1923 (d. October 9, 1999)",
         "Howard Johnson, alto sax, 1908 (d. December 28, 1991)",
         "Sonny Greenwich, guitar, 1936",
         "Eiichi Hayashi, alto sax, 1960",
         "Yoshio Ikeda, bass, 1942",
         "Urs Leimgruber, saxophones, bass clarinet. 1952"]

for test in tests:
    m = re.match ("\(d.(.*)\)", test)
    if m:
        print(m.groups())

However, it prints no results.

I've tested the regex in an online regex tester and it works for valid test input.

So I guess my code is wrong. Can anyone suggest why, please?

Finally, what I want to extract is the date of death itself (not the parentheses and the d.). Any suggestions on how I could do that?
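
One likely cause, offered as a sketch rather than a definitive answer: re.match only tries to match at the very start of the string, whereas online testers usually search anywhere in the input, which is what re.search does. The version below also uses a raw string, escapes the dot after d, and puts only the date text in the capture group. The names DEATH_DATE and death_date are hypothetical, chosen for illustration.

```python
import re

# Raw string so the backslashes reach the regex engine unchanged.
# \(d\.  literal "(d."      \s*  optional whitespace after the dot
# (.*?)  the date text, lazy, so it stops at the first ")"
DEATH_DATE = re.compile(r"\(d\.\s*(.*?)\)")

def death_date(text):
    """Return the date text inside "(d. ...)", or None if there is none."""
    # search scans the whole string; match would only try position 0
    m = DEATH_DATE.search(text)
    return m.group(1) if m else None

tests = [
    "Milt Jackson, vibraphone, piano, guitar, 1923 (d. October 9, 1999)",
    "Sonny Greenwich, guitar, 1936",
]
for test in tests:
    print(death_date(test))
```

With these two samples, the first prints October 9, 1999 and the second prints None. This assumes the date text itself never contains a closing parenthesis.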

