Looping over group in a Python regex


Edit: I've gotten this to work. I had forgotten to put in a space separator for multiple edges.

I've got a Python regex that handles most of the strings I have to parse.

edge_value_pattern = re.compile(r'(?P<edge>e[0-9]+) +(?P<label1>[^ ]*)[^"]+"(?P<word>[^"]+)"[^:]+:: (?P<label2>[^\n]+)')

Here is an example string that the regex is meant to parse:

'e0 bike-event 1 "biking" 2'

It correctly stores e0 as the edge group, bike-event as the label1 group, and "biking" as the word group. The last group, label2, is for a different variation of the string, shown below. Note that the label2 regex group behaves as expected when given a string like the one below.

'e29 e30 "of" :: of, of'

However, the regex pattern fills in the label1 value as e30. In truth, this string does not have a label1 value; it should be None or at least the empty string. An ad hoc solution is to parse label1 with another regex to determine whether it's an actual label or another edge. I want to know if there is a way to modify the original regex so that the edge group takes in all the edges. E.g., the output for the above string would be:

edge = "e29 e30"

label1 = None

word = of

label2 = of, of
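
The mismatch is easy to reproduce. This is a minimal sketch, assuming the pattern is applied with match() (the question does not say which method is used) and that the named groups are written with an uppercase P, as Python's re module requires:

```python
import re

# Original pattern from the question, unchanged apart from the uppercase P.
edge_value_pattern = re.compile(
    r'(?P<edge>e[0-9]+) +(?P<label1>[^ ]*)[^"]+"(?P<word>[^"]+)"[^:]+:: (?P<label2>[^\n]+)'
)

m = edge_value_pattern.match('e29 e30 "of" :: of, of')
groups = m.groupdict()
print(groups)
# {'edge': 'e29', 'label1': 'e30', 'word': 'of', 'label2': 'of, of'}
```

The second edge ends up in label1 because the edge group stops at the first space.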

I tried the solution below, which I thought would translate to looping on the first group, edge (this would be trivial if I had an actual FSA), but it doesn't change the behavior of the regex.

edge_value_pattern = re.compile(r'(?P<edge>(e[0-9]+)+) +(?P<label1>[^ ]*)[^"]+"(?P<word>[^"]+)"[^:]+:: (?P<label2>[^\n]+)')
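
Running the attempted pattern shows that the result is the same as before. The character after e29 is a space, which the inner group cannot match, so the + repeats exactly once. A sketch under the same assumptions as the question's code (match() is assumed, and the variable name attempt is only for illustration):

```python
import re

# The attempted pattern: a repeated inner group, but no space inside it.
attempt = re.compile(
    r'(?P<edge>(e[0-9]+)+) +(?P<label1>[^ ]*)[^"]+"(?P<word>[^"]+)"[^:]+:: (?P<label2>[^\n]+)'
)

m = attempt.match('e29 e30 "of" :: of, of')
attempt_groups = m.groupdict()  # named groups only
print(attempt_groups)
# {'edge': 'e29', 'label1': 'e30', 'word': 'of', 'label2': 'of, of'}
```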

If you want edge to match "e29 e30", you have to put the repetition inside the group, not outside.

You did that by sticking a new group inside the edge group with a + repetition (which is fine, although you probably wanted a non-capturing group there), but you forgot to include the space inside the repeating group.

(You also left the external repeat in place, and used a capturing group where you wanted a non-capturing one, but that's less serious.)

Look at this fragment:

(?P<edge>(e[0-9]+)+)


Here, the expression catches e29 as one match, then e30 as a subsequent match. So, if you add anything else to the expression, it's either going to miss e29, or fail. But add the space:

(?P<edge>(e[0-9]+ )+)
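
The difference between the two fragments can be checked with finditer() on the string from the question. This is only an illustrative sketch; the variable names are made up here:

```python
import re

text = 'e29 e30 "of" :: of, of'

# Fragment without the space: each edge is a separate match.
without_space = re.compile(r'(?P<edge>(e[0-9]+)+)')
# Fragment with the space inside the repeated group: one match covers both edges.
with_space = re.compile(r'(?P<edge>(e[0-9]+ )+)')

separate = [m.group('edge') for m in without_space.finditer(text)]
together = [m.group('edge') for m in with_space.finditer(text)]

print(separate)  # ['e29', 'e30']
print(together)  # ['e29 e30 ']
```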


And now it's matching e29 e30 plus the trailing space as a single match, which means you can tack on the additional stuff and it will work (as long as the additional stuff is right: you still need to remove the " +" that follows the group, and I think you may need to make a couple of the other repetitions non-greedy…).
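
Here is one way the full pattern might look once the space is inside the repeating group. It is a sketch, not the pattern from the answer. Several changes are assumptions made here so that label1 can come out empty: the non-capturing group, dropping the external " +", excluding the quote character from label1, and relaxing [^"]+ to [^"]*. The helper name parse_edge_line is hypothetical.

```python
import re

edge_value_pattern = re.compile(
    r'(?P<edge>(?:e[0-9]+ +)+)'    # one or more edges, each followed by spaces
    r'(?P<label1>[^ "]*)'          # optional label; must not run into the quoted word
    r'[^"]*"(?P<word>[^"]+)"'      # skip ahead to the quoted word
    r'[^:]+:: (?P<label2>[^\n]+)'  # everything after the ":: " separator
)

def parse_edge_line(line):
    """Return the named groups as a dict, or None if the line does not match."""
    m = edge_value_pattern.match(line)
    if m is None:
        return None
    groups = m.groupdict()
    groups['edge'] = groups['edge'].strip()     # drop the trailing space
    groups['label1'] = groups['label1'] or None  # empty string becomes None
    return groups

result = parse_edge_line('e29 e30 "of" :: of, of')
print(result)
# {'edge': 'e29 e30', 'label1': None, 'word': 'of', 'label2': 'of, of'}
```

Stripping the trailing space and mapping the empty label to None are done after matching, since the edge group itself still captures the space that follows each edge.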

