CodeIgor

A simple script to serialize files

Special params

Jun 15, 2010 by cesla

When you define params, e.g.

<param pattern="SOME_UNIQUE_ID">SOME VALUE</param>

You may want to use special parameters, such as the current solution's name, or extract values from the config file, such as the number of solutions. CodeIgor can do this for you if you put a special value in place of "SOME VALUE" in your param definition:

  • {codeigor:solution_name} - current solution's name
  • {codeigor:solution_id} - current solution's id (a natural number 1..N)
  • {codeigor:number_of_solutions} - number of solutions (N)
  • {codeigor:output_filename} - currently processed filename
  • {codeigor:output_extension} - currently processed filename's extension
  • {codeigor:output_file} - currently processed file (filename+'.'+extension)
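
As an illustration only, the expansion of these special values can be sketched in Python. This is not CodeIgor's actual code: the helper name `expand_special` and the `context` dictionary (with its made-up values) are hypothetical; only the placeholder names come from the list above.

```python
import re

# Hypothetical helper, not part of CodeIgor: finds {codeigor:NAME} placeholders.
_SPECIAL = re.compile(r"\{codeigor:([a-z_]+)\}")

def expand_special(value, context):
    """Replace each known {codeigor:NAME} in value with str(context[NAME]).

    Names missing from context are left untouched.
    """
    def repl(match):
        name = match.group(1)
        if name in context:
            return str(context[name])
        return match.group(0)
    return _SPECIAL.sub(repl, value)

# Example context for the first of three solutions (all values are made up).
context = {
    "solution_name": "brute",
    "solution_id": 1,
    "number_of_solutions": 3,
    "output_filename": "main",
    "output_extension": "cpp",
}
# output_file is filename + '.' + extension, as described in the list above.
context["output_file"] = context["output_filename"] + "." + context["output_extension"]

print(expand_special("{codeigor:solution_name}", context))  # brute
print(expand_special("{codeigor:output_file}", context))    # main.cpp
```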

Simple regular expressions

Jun 15, 2010 by cesla

There is also a special pattern that helps more advanced users achieve their goals. This pattern is {codeigor:regexp} and, as one may expect, it allows regular expressions to be used instead of simple substitutions. The format is as follows:

<param pattern="{codeigor:regexp}"> /regular expr. pattern/regular expr. replacement/ </param>

Since you may want to use special characters, like '>' or '<', you should use a CDATA section, like this:

<param pattern="{codeigor:regexp}"><![CDATA[ /regular expr. pattern/regular expr. replacement/ ]]></param>

For example, if you want to get rid of all whitespace (spaces and tabs) in your file, you would use a regexp like this:

<param pattern="{codeigor:regexp}"><![CDATA[ /\s+// ]]></param>

Note that matching is evaluated line by line, so a pattern cannot match more than one line. If you need to match across lines, use 'fullregex' blocks (described below).
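
The line-by-line behaviour can be sketched in Python as follows. This is a rough model, not CodeIgor's implementation: `parse_rule` and `apply_per_line` are hypothetical names, and the sketch assumes that '/' never appears inside the pattern or the replacement.

```python
import re

def parse_rule(text):
    """Split ' /pattern/replacement/ ' into (pattern, replacement).

    Assumption: '/' is only used as the delimiter, never inside the rule.
    """
    body = text.strip()
    if len(body) < 3 or not (body.startswith("/") and body.endswith("/")):
        raise ValueError("expected /pattern/replacement/")
    parts = body[1:-1].split("/")
    if len(parts) != 2:
        raise ValueError("expected exactly one '/' between pattern and replacement")
    return parts[0], parts[1]

def apply_per_line(lines, rule):
    """Apply the rule to each line separately, so no match can span two lines."""
    pattern, replacement = parse_rule(rule)
    regex = re.compile(pattern)
    return [regex.sub(replacement, line) for line in lines]

lines = ["int  a = 1;", "return\ta;"]
print(apply_per_line(lines, r" /\s+// "))          # ['inta=1;', 'returna;']

# A pattern containing \n never matches, because each line is seen on its own.
print(apply_per_line(["a", "b"], r" /a\nb/X/ "))   # ['a', 'b']
```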

And if you want consistent spacing in your 'for' loops, you would use something like this (note that the regexp pattern in the CDATA section must be written on one line):

<param pattern="{codeigor:regexp}"><![CDATA[ /for\s*\(\s*([^;]+?)\s*;\s*([^;]+?)\s*;\s*([^;]+?)\s*\)/for (\1; \2; \3)/ ]]></param>

This last example changes all those nasty 'for' loops:

for(int i=0;i<5;i++)
for (int i=0 ; i<5 ; i++)

into proper code:

for (int i=0; i<5; i++)
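
This substitution can be checked with Python's re module, assuming regexp params behave like `re.sub`. The pattern in this sketch uses non-greedy groups (`[^;]+?`) so that whitespace before each ';' is not captured; `tidy_for` is a hypothetical helper name.

```python
import re

# Non-greedy groups keep whitespace before ';' and ')' out of the captures.
FOR_LOOP = re.compile(r"for\s*\(\s*([^;]+?)\s*;\s*([^;]+?)\s*;\s*([^;]+?)\s*\)")

def tidy_for(line):
    """Rewrite a 'for' header on a single line with uniform spacing."""
    return FOR_LOOP.sub(r"for (\1; \2; \3)", line)

for line in ["for(int i=0;i<5;i++)", "for (int i=0 ; i<5 ; i++)"]:
    print(tidy_for(line))
# Both lines print: for (int i=0; i<5; i++)
```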

(For the syntax, see Wikipedia on POSIX regexp syntax, but note that POSIX character classes such as [:alpha:] are not available here.)

Multiline regular expressions ('fullregex' blocks)

Jun 24, 2010 by cesla

If you need to match more than one line in your regular expression, the {codeigor:regexp} pattern isn't enough, because it is evaluated like the rest of the params: line by line. This means that CodeIgor reads one line from the input file, applies the params to it (changing it where they match) and stores it in a buffer, which is later written to the output file. A multiline substitution therefore cannot happen during this pass; it is performed on the buffer after all params have been applied to the input file.

The syntax is much the same as for the {codeigor:regexp} param, but instead of a 'param' block, you specify a 'fullregex' block:

<fullregex flags="m|i"><![CDATA[ /multiline\ncontent/substitution/ ]]></fullregex>
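
The two passes described above can be sketched like this. It is a simplified model under assumptions, not CodeIgor's code: the function `process`, the tuple layout of the rules, and joining the buffer with '\n' are all hypothetical choices made for the example.

```python
import re

def process(lines, line_rules, full_rules):
    """Model of the two passes.

    line_rules: list of (pattern, replacement), applied to one line at a time.
    full_rules: list of (pattern, replacement, flags), applied to the whole buffer.
    """
    # Pass 1: every param sees a single line, then the line goes into the buffer.
    buffer = []
    for line in lines:
        for pattern, replacement in line_rules:
            line = re.sub(pattern, replacement, line)
        buffer.append(line)

    # Pass 2: 'fullregex' rules see the complete text, so '\n' can match.
    text = "\n".join(buffer)
    for pattern, replacement, flags in full_rules:
        text = re.sub(pattern, replacement, text, flags=flags)
    return text

source = ["int main()", "{", "    return 0;", "}"]
result = process(
    source,
    line_rules=[(r"return 0", "return EXIT_SUCCESS")],
    full_rules=[(r"\)\n\{", ") {", 0)],  # pull the opening brace onto the previous line
)
print(result)
```

The same multiline pattern placed in `line_rules` would never match, which is why a separate block type is needed.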

You can specify several matching flags (separated by '|'). The flags are those described in the Python documentation (see the re module docs):

  • i: Perform case-insensitive matching; expressions like '[A-Z]' will match lowercase letters, too.
  • m: When specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.
  • s: Make the '.' special character match any character at all, including a newline; without this flag, '.' will match anything except a newline.
  • u: Make '\w', '\W', '\b', '\B', '\d', '\D', '\s' and '\S' dependent on the Unicode character properties database.
  • l: Make '\w', '\W', '\b', '\B', '\d', '\D', '\s' and '\S' dependent on the current locale.
  • x: This flag allows you to write regular expressions that look nicer. Whitespace within the pattern is ignored, except when in a character class or preceded by an unescaped backslash. When a line contains a '#' that is neither in a character class nor preceded by an unescaped backslash, all characters from the leftmost such '#' through the end of the line are ignored.
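
A flags attribute such as "m|i" could be turned into Python re flags roughly as follows. The mapping uses the real constants of the re module, but `parse_flags` and `FLAG_LETTERS` are hypothetical names, and how CodeIgor itself parses the attribute is an assumption.

```python
import re

# Letter-to-flag mapping, following the list above.
FLAG_LETTERS = {
    "i": re.IGNORECASE,
    "m": re.MULTILINE,
    "s": re.DOTALL,
    "u": re.UNICODE,
    "l": re.LOCALE,   # in Python 3 this flag is only valid with bytes patterns
    "x": re.VERBOSE,
}

def parse_flags(spec):
    """Combine flag letters separated by '|' into one value for re functions."""
    flags = 0
    for letter in spec.split("|"):
        letter = letter.strip().lower()
        if not letter:
            continue
        if letter not in FLAG_LETTERS:
            raise ValueError("unknown flag: %r" % letter)
        flags |= FLAG_LETTERS[letter]
    return flags

flags = parse_flags("m|i")
# With 'm', ^ and $ match at every line; with 'i', 'End' matches as well.
print(re.sub(r"^end$", "END", "begin\nEnd\nend", flags=flags))
```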