Sub last match

Jul 5th, 2000 10:03
Nathan Wallace, Hans Nowak, Snippet 363, Andrew M. Kuchling


"""
Packages: text.regular_expressions
"""

"""
>re.sub can be limited to the first match, but how about the first match
>from the end? Is it possible?

 Tricky, and difficult for me to figure out without a
motivating example; can you explain what you're trying to do?  Putting .*
in front of your pattern will gobble all the way to the end of the string
and then backtrack, so that *might* do it, but you'd have to rework the
pattern carefully because of the backtracking.

 For example, let's say you want to find a run of characters ending in
a ';' character, and replace it with 'XXX'.  Replacing just the first
such run is easy:
"""

import re, string   # added by PSST

pat = re.compile('[^;]*;')
s = "compatible; MSIE 4.01; Windows NT;"
print(pat.sub('XXX', s, 1))

"""
This prints 'XXX MSIE 4.01; Windows NT;'.  To do the last string
instead of the first, you might try:
"""

pat = re.compile('(.*) [^;]* ;', re.VERBOSE)
s = "compatible; MSIE 4.01; Windows NT;"
print(pat.sub(r'\g<1>XXX', s, 1))

"""
But this outputs 'compatible; MSIE 4.01; Windows NTXXX', because .*
first grabs the whole string and then backtracks only as far as the
final ';', so [^;]* matches a zero-length string.  In this case you'd
be better off doing:
"""

pat = re.compile("""
(
  (?: [^;]* ;)    # Match a single string+semicolon ...
     *            # ... repeated 0 or more times.
) 
[^;]* ;           # Final string + semicolon
""", re.VERBOSE)

s = "compatible; MSIE 4.01; Windows NT;"
print(pat.sub(r'\g<1>XXX', s, 1))

"""
This prints 'compatible; MSIE 4.01;XXX'.

 Greg Ward suggests that re.split() might help; you could split
your string into a list of strings separated by some pattern, replace
the last field in that list, and join the pieces back together:
"""

s = "compatible; MSIE 4.01; Windows NT;"

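# Prints ['compatible', 'MSIE 4.01', 'Windows NT', '']
L = re.split(r';\s*', s)
print(L)

# Replace the last field and the empty string that follows the final ';'
L[-2:] = ["XXX"]

# Assemble the string again; prints 'compatible;MSIE 4.01;XXX'
# (the spaces after each ';' were consumed by the split pattern)
print(';'.join(L))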
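
"""
Joining on a bare ';' loses the spaces that the split pattern consumed.
A minimal sketch of one way around that, assuming the original spacing
should be preserved: wrap the separator pattern in parentheses so that
re.split() returns the separators as well.  The names parts and result
are only illustrative.
"""

```python
import re

s = "compatible; MSIE 4.01; Windows NT;"

# The capturing group makes re.split() keep each separator as a list item.
parts = re.split(r'(;\s*)', s)
# parts is now ['compatible', '; ', 'MSIE 4.01', '; ', 'Windows NT', ';', '']

# Replace the last field, its trailing separator, and the empty tail.
parts[-3:] = ['XXX']

# Joining on the empty string restores the original separators.
result = ''.join(parts)
print(result)   # compatible; MSIE 4.01; XXX
```

"""
The slice covers three items only because this string ends with a
separator; without the trailing ';' just the last item would need
replacing.
"""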

"""
It's impossible to give more advice without knowing exactly what
you're trying to do.
"""