2010/01/18

The Data Generation Framework in MbUnit v3 (Part 3)

MbUnit v3 is able to generate pseudo-random string data for your unit tests. A standard regular expression syntax is used to constrain the generated values.
using MbUnit.Framework;

[TestFixture]
public class LuxembourgMobilePhoneValidatorTest
{
    // Runs 20 times, each time with a random string matching the pattern.
    [Test]
    public void ValidateLuxembourgMobilePhoneNumber(
        [RandomStrings(Count = 20, Pattern = "6[269]1[0-9]{6}")] string phoneNumber)
    {
        var validator = new LuxembourgMobilePhoneValidator();
        Assert.IsTrue(validator.IsValid(phoneNumber));
    }
}
Dead easy, isn't it?

It's interesting to note that regular expressions are most often used to validate a given input: we take the input first, and then we check it against a regular expression pattern. Here it is the other way round: we start from an existing pattern, and we try to find a random value that matches it. The .NET regular expression framework does not support this scenario. That's why Gallio implements its own very light regular expression parsing engine.
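To make the "reverse" direction concrete, here is a minimal hand-written sketch that builds a random value for the single pattern used above and then checks it with the regular .NET Regex class. It is not how Gallio does it internally; the PhoneNumberSketch class and its Generate method are hypothetical names invented for this illustration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, written only for the pattern 6[269]1[0-9]{6}.
public static class PhoneNumberSketch
{
    public static string Generate(Random random)
    {
        var builder = new StringBuilder();
        builder.Append('6');                       // literal '6'
        builder.Append("269"[random.Next(3)]);     // one character from the set [269]
        builder.Append('1');                       // literal '1'
        for (int i = 0; i < 6; i++)                // [0-9]{6}: exactly six digits
        {
            builder.Append((char)('0' + random.Next(10)));
        }
        return builder.ToString();
    }

    public static void Main()
    {
        var random = new Random();
        for (int i = 0; i < 5; i++)
        {
            string candidate = Generate(random);
            // The usual direction: validate the generated value against the pattern.
            bool matches = Regex.IsMatch(candidate, "^6[269]1[0-9]{6}$");
            Console.WriteLine(candidate + " -> " + matches);
        }
    }
}
```

A real generator has to derive those steps from the pattern text itself, which is the job of the small parsing engine.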

It's light because it does not need to support all those numerous nifty language elements. In fact, we really do not want to support constructs and metacharacters that match non-finite sets of strings. For example, how could the generator reasonably handle the "A+" pattern? Literally, it matches the character 'A' repeated at least once. But what would be the upper limit on the number of repetitions? Would it be reasonable to pass the test method a string containing a sequence of 1 million A's? Surely not.
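The point about upper limits can be sketched in a few lines. The QuantifierSketch class and its Repeat method below are hypothetical and only illustrate the idea: a bounded quantifier gives the generator both ends of the range, so picking a repetition count is trivial.

```csharp
using System;

// Hypothetical helper illustrating a bounded quantifier such as A{2,5}.
public static class QuantifierSketch
{
    // Repeats 'symbol' a random number of times between min and max (inclusive).
    public static string Repeat(char symbol, int min, int max, Random random)
    {
        if (min < 0 || max < min)
        {
            throw new ArgumentOutOfRangeException(nameof(max), "Expected 0 <= min <= max.");
        }
        // The upper bound of Random.Next(int, int) is exclusive, hence max + 1.
        int count = random.Next(min, max + 1);
        return new string(symbol, count);
    }

    public static void Main()
    {
        var random = new Random();
        // Equivalent to generating a value for the pattern A{2,5}.
        Console.WriteLine(Repeat('A', 2, 5, random));
    }
}
```

An unbounded quantifier such as "+" offers no max to pass in, so the generator would have to invent an arbitrary one.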

That's why only a tiny subset of regular expression metacharacters is supported.

  • Logical Grouping - Group a part of the expression ((...)).
  • Explicit Quantifier - Specify the number of times the previous expression must be repeated. Constant ({N}) or Range ({N,M}) syntaxes are both accepted.
  • Zero Or One Quantifier Metacharacter - 0 or 1 of the previous expression (?). Same effect as {0,1}.
  • Escape Character - Makes the next character literal instead of a special character (\).
  •