Jun 09 2009

Using Regular Expressions in the Visual Studio find-and-replace window

Category: Uncategorized
zvolkov @ 12:48 pm

In the best spirit of Yesterday’s news I keep posting on stuff everybody knows about. Today I finally had a task that required using RegEx in the Visual Studio find-and-replace window.

I had a bunch of NHibernate mappings with the same string pattern repeated multiple times:

<property name="***Name1***">
  <column name="***Name1***" />
</property>
<property name="***Name2***">
  <column name="***Name2***" />
</property>
<property name="***Name3***">
  <column name="***Name3***" />
</property>

I wanted to replace each of them with the following:

    <property name="***NameX***"/>

As you see, not only does this span multiple lines, but it also requires the regular expression to find all places where the column name is the same as the property name and to use that name in the resulting string.

First of all I went to the VS find-and-replace window (Ctrl+H) and enabled regular expressions by checking the “Use” checkbox and selecting Regular Expressions from the drop-down box.

The next step was to figure out the regex. Now, if your experience with regular expressions is limited to copy-pasting them from the web, you may think it’s too complicated to be used for an ad hoc find-and-replace operation like this, but in reality it turns out to be very simple. Here’s what you should do:

  • Escape any non-alphanumeric symbols with \, like so: \<property name\= and so on.
  • Use [A-Z]+ to match one or more letters (the match is case-insensitive), like so: \<property name\=\"[A-Z]+\"\>
  • Use \n to match the end of a line, like so: \<property name\=\"[A-Z]+\"\>\n
  • Use a space followed by a star (i.e. " *") to match any number of spaces, like so: \<property name\=\"[A-Z]+\"\>\n *
  • Finally, to refer back to a part of the string matched by this expression, you need to use so-called capture groups. Surround the corresponding part of the regex with curly braces {} and reference it in the subsequent part of the regex with \1, like so: \<property name\=\"{[A-Z]+}"\>\n *\<column name\=\"\1\" *\/\>\n *\<\/property\>
  • As for the resulting string to replace with, keep in mind it’s not a regex, but it does support some advanced features like \n for a new line and \1 to reference capture group #1. Remember to escape non-alphanumeric characters. The final expression may look like so: \<property name\=\"\1"\/\>\n

As you see, the {} piece differs from the usual RegEx syntax, where round brackets () denote capture groups.
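For comparison, here is a minimal sketch of the same replacement in conventional regex syntax, using Python's re module rather than the Visual Studio dialog. It is not what I ran in Visual Studio; the sample text, the names MAPPINGS, PATTERN and collapse_mappings, and the choice of [A-Za-z0-9]+ (so that names containing digits also match) are assumptions made for illustration. The capture group uses round brackets and is referenced with \1 both inside the pattern and in the replacement string.

```python
import re

# Illustrative input modeled on the mappings above (names are made up).
MAPPINGS = """\
<property name="Name1">
  <column name="Name1" />
</property>
<property name="Name2">
  <column name="Name2" />
</property>
<property name="Other">
  <column name="DifferentColumn" />
</property>
"""

# Group 1 captures the property name; \1 then requires the column name
# on the next line to be exactly the same text.
PATTERN = re.compile(
    r'<property name="([A-Za-z0-9]+)">\n'
    r' *<column name="\1" */>\n'
    r' *</property>'
)


def collapse_mappings(text: str) -> str:
    """Collapse property/column pairs with identical names into one tag."""
    return PATTERN.sub(r'<property name="\1"/>', text)


if __name__ == "__main__":
    print(collapse_mappings(MAPPINGS))
```

The third mapping is left untouched because its column name differs from its property name, which is exactly the job the backreference does in the Visual Studio expression.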

Happy replacing, and remember: it’s always better to spend more time but learn something new than to save time and stay dumb!

P.S. Here’s a Coding Horror post from 2006 that covers this topic in more detail: http://www.codinghorror.com/blog/archives/000633.html