Converting code to the new Regex Source Generator

 
 
  • Gérald Barré

.NET 7 brings a new feature for Regex: a Roslyn Source Generator can generate the source code of a regular expression at compile time. Generating the source code at compile time instead of at runtime has multiple advantages:

  • The first regex execution is faster, because the regex does not need to be parsed and the code to execute it does not need to be generated at runtime.
  • The regex takes advantage of all the optimizations. At compile time, the generator can spend more time optimizing the regex. Currently, the Source Generator applies the same optimizations as the RegexOptions.Compiled option, but this could change in the future.
  • For platforms that do not allow code generation at runtime, such as iOS, you can get maximum regex performance.
  • The source code is more readable, as you must give a meaningful name to the method that returns the regular expression.
  • The generated code contains an English description of what the regex matches, so you can understand what the regular expression does even if you don't know some of the syntax.
  • The code is more trimmable as you don't need to include all the code to support Regex parsing and code generation.
  • The code is debuggable using breakpoints if needed.
  • You can learn nice optimizations by looking at the generated code 🙂

To use the Regex source generator, you need .NET 7 and C# 11 (preview). In addition, all Regex parameters (pattern, options, timeout) must be constant.

C#
public static bool IsLowercase(string value)
{
    // ✔️ The pattern is constant
    // => can be converted to use the source generator
    var lowercaseLettersRegex = new Regex("[a-z]+");
    return lowercaseLettersRegex.IsMatch(value);
}

public static bool IsLowercase(string value)
{
    // ✔️ The pattern, options, and timeout are constant
    // => can be converted to use the source generator
    return Regex.IsMatch(value, "[a-z]+", RegexOptions.CultureInvariant, TimeSpan.FromSeconds(1));
}

public static bool Match(string value, string pattern)
{
    // ❌ The pattern is not constant => cannot use the source generator
    return Regex.IsMatch(value, pattern);
}

To convert the previous code to the source generator, you need to extract the Regex to a partial method and decorate the method with the [GeneratedRegex] attribute:

C#
// The Source Generator generates the code of the method at compile time
[GeneratedRegex("^[a-z]+$", RegexOptions.CultureInvariant, matchTimeoutMilliseconds: 1000)]
private static partial Regex LowercaseLettersRegex();

public static bool IsLowercase(string value)
{
    return LowercaseLettersRegex().IsMatch(value);
}
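
As a minimal sketch of what a complete file can look like, the example below places the partial method in a partial type, which C# requires so that the generator can supply the method body. The names `TextValidation` and `LowercasePattern` are hypothetical and only serve the illustration. It also assumes that a `const` field is an acceptable way to provide the pattern, since attribute arguments must be compile-time constants.

```csharp
using System.Text.RegularExpressions;

// Hypothetical type name. The type is declared partial so that the
// source generator can add the implementation of the partial method.
public static partial class TextValidation
{
    // A const field is a compile-time constant, so it can be used
    // as an attribute argument.
    private const string LowercasePattern = "^[a-z]+$";

    // The Source Generator generates the body of this method at compile time.
    [GeneratedRegex(LowercasePattern, RegexOptions.CultureInvariant, matchTimeoutMilliseconds: 1000)]
    private static partial Regex LowercaseLettersRegex();

    public static bool IsLowercase(string value)
    {
        return LowercaseLettersRegex().IsMatch(value);
    }
}
```

With this sketch, `TextValidation.IsLowercase("abc")` returns true and `TextValidation.IsLowercase("Abc")` returns false.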

You can see the generated code in the Solution Explorer or by using "Go to Definition" on the partial method.

#Automating the conversion of a Regex to a Source Generator

Meziantou.Analyzer contains a rule that detects Regex usages that can benefit from the Source Generator and lets you convert them easily. First, add the NuGet package to your project:

Shell
dotnet add package Meziantou.Analyzer

Then, the rule MA0110 reports any Regex that can benefit from the Source Generator. You can use the code fixer to convert the Regex to a Source Generator. The fixer adds the partial keyword to the parent types if needed, extracts the Regex to a partial method, and enters rename mode so that you can name the new method.

Meziantou.Analyzer only suggests the refactoring when the project targets C# 11 and the [GeneratedRegex] attribute is available (.NET 7).
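
To illustrate the steps the fixer performs, here is a hand-written before/after sketch. It is not actual output of the code fixer: the type names, the method name `SlugRegex`, and the exact placement of the generated member are assumptions, and the real result may differ. The point it shows is that every containing type, including the outer type of a nested class, ends up with the partial keyword.

```csharp
using System.Text.RegularExpressions;

// Before: a constant pattern and constant options, so the Regex is a
// candidate for the Source Generator (hypothetical example types).
public class BlogBefore
{
    public class SlugValidator
    {
        public static bool IsSlug(string value)
        {
            return Regex.IsMatch(value, "^[a-z0-9-]+$", RegexOptions.CultureInvariant);
        }
    }
}

// After: roughly what the converted code can look like.
// Both the outer type and the nested type are now partial.
public partial class BlogAfter
{
    public partial class SlugValidator
    {
        public static bool IsSlug(string value)
        {
            return SlugRegex().IsMatch(value);
        }

        // The Regex has been extracted to a partial method.
        [GeneratedRegex("^[a-z0-9-]+$", RegexOptions.CultureInvariant)]
        private static partial Regex SlugRegex();
    }
}
```

Both versions accept and reject the same inputs. Only the way the Regex is created changes.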
