Using REMatch versus REReplace in ColdFusion 8

Posted on Aug 08, 2007

A new function in ColdFusion 8 that I hadn't seen discussed yet is REMatch (and the related REMatchNoCase). Today I was writing an application where I needed to strip all but alphanumeric characters from a string. Looking through a regular expression reference (my favorite is RegExLib), you will see that "\w" matches alphanumeric characters (strictly speaking, it also matches the underscore). However, prior to ColdFusion 8 there were only REFind, which returns the position (and optionally the length) of a match and so clearly won't work in this case, and REReplace, which would work if I wanted to replace the alphanumeric characters. What I really wanted was all the matching characters returned, and this is where REMatch comes in.

Now, in this specific scenario, my result was possible without REMatch because there is an "opposite" of sorts to "\w", namely "\W", which matches all non-alphanumeric characters, as in the example below:

<cfset blogTitle = "Using REMatch versus REReplace: A Comparison" />
<cfset friendlyBlogURL = REReplaceNoCase(blogTitle,"\W","","All") />
<cfdump var="#friendlyBlogURL#" />

However, finding a simple "opposite" that you can use with replace isn't always so easy. Thankfully, ColdFusion 8 has filled this gap with REMatch, which returns an array of all matches to your regular expression within a string. Since what we want in this case is our string returned with only alphanumeric characters, we can simply use the ArrayToList function with an empty delimiter, as in this example:

<cfset blogTitle = "Using REMatch versus REReplace: A Comparison" />
<cfset friendlyBlogURL = REMatchNoCase("\w",blogTitle) />
<cfdump var="#arrayToList(friendlyBlogURL,'')#" />

Both examples will output the blog title as "UsingREMatchversusREReplaceAComparison" - notice how the spaces and colon have been removed.
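
To give a feel for a case where no handy "opposite" exists, here is a minimal sketch that pulls every run of digits out of a string. The variable names (orderNote, numbersFound) and the sample text are made up for illustration; only REMatch and ArrayToList come from the examples above.

```cfml
<!--- Hypothetical sample text: we want the numbers themselves, not the text with something stripped out --->
<cfset orderNote = "Order 1042 shipped 3 items on day 17" />

<!--- REMatch returns an array with one element per match of the pattern --->
<cfset numbersFound = REMatch("[0-9]+", orderNote) />

<!--- Join the matches with a comma; this should display 1042,3,17 --->
<cfdump var="#arrayToList(numbersFound, ',')#" />
```

Doing the same with REReplace would mean describing everything that is not a number and replacing it with a delimiter, which gets awkward quickly. With REMatch you just describe what you want to keep.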

If you know the power of regular expressions, you can probably think of a number of cases where you wished you had something like REMatch - thankfully you no longer have to wish, just upgrade :)

Comments

Chris Phillips Did you know the ^ negates a set of characters?
I do the same thing all the time like this:

locKey = reReplace(Location,"[^[:alpha:]]","","all")

[:alpha:] is a POSIX character class. They are worth looking into. I have Ben's "Teach Yourself Regular Expressions in 10 Minutes" book at my desk. I highly recommend it as a good pocket reference.

Posted By Chris Phillips / Posted on 08/13/2007 at 12:31 PM


About

My name is Brian Rinaldi and I am the Web Community Manager for Flash Platform at Adobe. I am a regular blogger, speaker and author. I also founded RIA Unleashed conference in Boston. The views expressed on this site are my own & not those of my employer.