Do Python Regexes Support Something Like Perl's \G?
Solution 1:
Try these:
import re
re.sub()
re.findall()
re.finditer()
For example:
# Finds all words of length 3 or 4
s = "the quick brown fox jumped over the lazy dogs."
print(re.findall(r'\b\w{3,4}\b', s))
# prints ['the', 'fox', 'over', 'the', 'lazy', 'dogs']
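For completeness, here is a minimal sketch of the other two functions listed above, reusing the same sentence. The variable names (word, spans, shouted) are illustrative only.

```python
import re

s = "the quick brown fox jumped over the lazy dogs."
word = re.compile(r'\b\w{3,4}\b')

# finditer yields match objects, so the position of each hit is available
spans = [(m.group(), m.start()) for m in word.finditer(s)]

# sub replaces every non-overlapping match, much like s///g in Perl
shouted = word.sub(lambda m: m.group().upper(), s)

print(spans)
print(shouted)
```

Both calls walk the whole string, which is the part of Perl's /g behaviour that Python exposes through separate functions rather than a modifier.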
Solution 2:
Python does not have the /g modifier for its regexen, and so does not have the \G regex token either. A pity, really.
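A quick way to check this yourself, assuming a reasonably recent Python 3 in which unknown letter escapes in a pattern are rejected (older versions may treat \G differently). The helper name supports_G is made up for this sketch.

```python
import re

def supports_G():
    # Try to compile a pattern that starts with the \G anchor;
    # the re module reports unknown escapes by raising re.error.
    try:
        re.compile(r'\G\d+')
    except re.error:
        return False
    return True

print(supports_G())
```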
Solution 3:
You can use re.match to match anchored patterns. re.match will only match at the beginning (position 0) of the text; the match method of a compiled pattern also accepts a pos argument, so it can be anchored wherever you specify.
import re

def match_sequence(pattern, text, pos=0):
    pat = re.compile(pattern)
    match = pat.match(text, pos)
    while match:
        yield match
        if match.end() == pos:
            break  # infinite loop otherwise
        pos = match.end()
        match = pat.match(text, pos)
This matches pattern only at the given position, and then only at the point where the previous match ended, so each further match must follow the last one with 0 characters in between.
>>> for match in match_sequence(r'[^\W\d]+|\d+', "he11o world!"):
...     print(match.group())
...
he
11
o
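The reason match_sequence calls the compiled pattern's match method rather than search is that only match is anchored at pos, which is the behaviour \G gives in Perl. A small sketch of the difference, using a made-up sample string:

```python
import re

pat = re.compile(r'\d+')
text = "ab12cd"

# search() scans forward from pos, so it skips "ab" and still finds "12"
found = pat.search(text, 0)

# match() must succeed exactly at pos, so it fails at 0 and succeeds at 2
anchored_at_0 = pat.match(text, 0)
anchored_at_2 = pat.match(text, 2)

print(found.group(), anchored_at_0, anchored_at_2.group())
```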
Solution 4:
I know I'm a little late, but here's an alternative to the \G approach:
import re

def replace(match):
    if match.group(0)[0] == '/':
        return match.group(0)
    else:
        return '<' + match.group(0) + '>'

source = '''http://a.com http://b.com
//http://etc.'''
pattern = re.compile(r'(?m)^//.*$|http://\S+')
result = re.sub(pattern, replace, source)
print(result)
output (via Ideone):
<http://a.com> <http://b.com>
//http://etc.
The idea is to use a regex that matches both kinds of string: a URL or a commented line. Then you use a callback (delegate, closure, embedded code, etc.) to find out which one you matched and return the appropriate replacement string.
As a matter of fact, this is my preferred approach even in flavors that do support \G, and even in Java, where I have to write a bunch of boilerplate code to implement the callback.
(I'm not a Python guy, so forgive me if the code is terribly un-pythonic.)
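One possible variation on the callback, sketched here as an assumption rather than as part of the original answer: give each alternative a named group and let the callback check match.lastgroup instead of inspecting the first character. The group names comment and url are invented for this example.

```python
import re

# Hypothetical group names: 'comment' for a commented line, 'url' for a URL
pattern = re.compile(r'(?m)(?P<comment>^//.*$)|(?P<url>http://\S+)')

def replace(match):
    # lastgroup names the alternative that actually matched
    if match.lastgroup == 'comment':
        return match.group(0)
    return '<' + match.group(0) + '>'

source = '''http://a.com http://b.com
//http://etc.'''
result = pattern.sub(replace, source)
print(result)
```

This keeps the decision tied to the pattern itself, which may help if more kinds of string are added to the alternation later.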
Solution 5:
Don't try to put everything into one expression, as it becomes very hard to read, translate (as you can see for yourself) and maintain.
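import re
# text_block is assumed to hold the input text
# Lines starting with '//' are skipped, so they do not appear in the result
lines = [re.sub(r'http://[^\s]+', r'<\g<0>>', line)
         for line in text_block.splitlines()
         if not line.startswith('//')]
print('\n'.join(lines))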
Python is usually not at its best when you translate literally from Perl; it has its own programming patterns.
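Note that the comprehension above filters commented lines out of the result. If they should stay in the output untouched, a conditional expression is one way to do it. A minimal sketch, with a made-up text_block since the answer does not define one:

```python
import re

# Hypothetical sample input; text_block is not defined in the answer above
text_block = "http://a.com http://b.com\n//http://etc."

url = re.compile(r'http://\S+')

# Rewrite URLs on ordinary lines and pass commented lines through unchanged
lines = [
    line if line.startswith('//') else url.sub(r'<\g<0>>', line)
    for line in text_block.splitlines()
]
result = '\n'.join(lines)
print(result)
```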