Monday, March 09, 2009

Find and replace a pattern using VIM

I have not written shell scripts in the recent past, and so I had lost touch with the powerful search and replace options for patterns.

As the Windows find and replace options don’t support much regex, I resorted to VIM.

My intention was to fix up an XML file into a structured format.

I have an XML file with elements like the following (this is just a snippet of a big file):

<PolicyNo>012345678</PolicyNo>
<DateOfCommencement>03DEC1983</DateOfCommencement>
<Plan_Term>579-60</Plan_Term>
<SumAssured>2,00,000</SumAssured>
<GrievanceRedressalOfficer>011-57293184</GrievanceRedressalOfficer>

And I wanted it in the format of

<Detail PolicyNo="012345678"></Detail>
<Detail DateOfCommencement="03DEC1983"></Detail>
<Detail Plan_Term="579-60"></Detail>
<Detail SumAssured="2,00,000"></Detail>
<Detail GrievanceRedressalOfficer="011-57293184"></Detail>

The following expression did the trick. I had to tweak the second back reference a little for a few different cases, depending on the characters present in the value.

:s/<\(\w\+\)>\(.\+\)<\/\(\w\+\)>/<Detail \1="\2"><\/Detail>/gc

I could also prefix line numbers to confine the operation to a limited set of lines, like

:9,26s…..
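As a sketch of how a range prefix combines with the expression, here are a few range forms that :s accepts. The numbers 9 and 26 are only the sample figures from the text, and the last two ranges are extra illustrations rather than something the original edit used.

```vim
" Lines 9 through 26 only
:9,26s/<\(\w\+\)>\(.\+\)<\/\(\w\+\)>/<Detail \1="\2"><\/Detail>/gc

" Every line in the file
:%s/<\(\w\+\)>\(.\+\)<\/\(\w\+\)>/<Detail \1="\2"><\/Detail>/gc

" From the current line to the last line
:.,$s/<\(\w\+\)>\(.\+\)<\/\(\w\+\)>/<Detail \1="\2"><\/Detail>/gc
```

With no range at all, :s works on the current line only.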

As usual, this is the simple search and replace syntax, in the format :s/<find string>/<replace string>/<options, in this case gc>

Here the find string is made up of

  1. starting with ‘<’
  2. start of first back reference \(
  3. a word of one or more characters \w\+
  4. close of first back reference \)
  5. ending with ‘>’
  6. start of second back reference \(
  7. one or more of any character .\+ which runs up to the ‘<’ of the closing tag. I changed this for a few lines
  8. close of second back reference \)
  9. ‘<’ followed by an escaped ‘/’, written <\/
  10. the closing tag name as a third group \(\w\+\). This need not be a back reference; I just did it
  11. ending with ‘>’
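The .\+ in the second group is greedy, so on a line holding two elements it would run on to the last closing tag. One possible tightening, offered as a sketch and not necessarily the change made for those few lines, is to stop the value at the next ‘<’ with [^<]\+ and to require the closing tag to repeat the opening name with \1:

```vim
" Value may not contain '<'; closing tag name must equal the opening one (\1)
:s/<\(\w\+\)>\([^<]\+\)<\/\1>/<Detail \1="\2"><\/Detail>/gc
```

On a line such as <A>1</A><B>2</B>, this rewrites each element separately, whereas .\+ would capture 1</A><B>2 as a single value.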

Now the replace string is

  1. ‘<Detail ’ (with a trailing space)
  2. first back reference \1
  3. followed by ‘=’
  4. followed by an opening double quote "
  5. second back reference \2
  6. followed by a closing double quote "
  7. closing tag syntax ><\/Detail>, with the ‘/’ escaped
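To see how the pieces fit, here is a worked example on the PolicyNo line, plus the same command with # as the delimiter so that ‘/’ no longer needs escaping. The choice of # is just an illustration; Vim accepts other punctuation characters as the delimiter too.

```vim
" Before:  <PolicyNo>012345678</PolicyNo>
"   \1 = PolicyNo    \2 = 012345678    \3 = PolicyNo (unused)
" After:   <Detail PolicyNo="012345678"></Detail>
:s/<\(\w\+\)>\(.\+\)<\/\(\w\+\)>/<Detail \1="\2"><\/Detail>/gc

" Same substitution with # as the delimiter, so / is written plainly
:s#<\(\w\+\)>\(.\+\)</\(\w\+\)>#<Detail \1="\2"></Detail>#gc
```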

And the options include

g – all occurrences in a line

c – ask for confirmation before each replacement
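A couple of related flag combinations, as a sketch. The n flag and the prompt keys are standard Vim features, but they were not part of the edit described here.

```vim
" Count the matches without changing anything
:%s/<\(\w\+\)>\(.\+\)<\/\(\w\+\)>//gn

" Replace with confirmation; at the prompt:
"   y = replace this match      n = skip this match
"   a = replace all remaining   q = quit
"   l = replace this match, then quit
:%s/<\(\w\+\)>\(.\+\)<\/\(\w\+\)>/<Detail \1="\2"><\/Detail>/gc
```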

The following links might be useful:

VIM Regular Expressions

And my favorite regex site: Regular Expressions Reference
