Performance: Lambda Expressions, Method Groups, and Delegate Caching

  • Gérald Barré

Delegates are used to pass methods as arguments to other methods. The most common delegate types are Action, Func<T>, and EventHandler. You can create a delegate from a lambda expression or from a method group. You can also cache the delegate in a field and reuse that instance when needed.

C#
void Demo(Func<string> action) { }
string GetString() => "";

// Method Group
Demo(GetString);

// Lambda expression
Demo(() => GetString());

// delegate caching
// store the delegate into a field to reuse it later
Func<string> _instance = new Func<string>(GetString);
Demo(_instance);

Using a method group is shorter, and arguably more readable. However, depending on the compiler version, it can be slower. Let's see why!

#Method group (before C# 11)

Before C# 11, when you pass a method group, the compiler creates a new delegate instance on every call:

C#
// C# code
Demo(GetString);
C#
// Code generated by the compiler
// It allocates a new Func<string> delegate for each call
Demo(new Func<string>(GetString));

So, you have one allocation per call of the method.
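One way to check this on your own machine is to count the bytes allocated around a loop. The following is a minimal sketch, not part of the original benchmark: the AllocationProbe class and its member names are hypothetical, and the number it reports depends on the language version and runtime you build with (a C# 11 or later compiler is expected to report a value close to zero for a static method group).

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper, for illustration only.
public static class AllocationProbe
{
    private static string GetString() => "";

    // NoInlining keeps the JIT from optimizing the call away entirely.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Demo(Func<string> action) { }

    // Returns the bytes allocated on the current thread while passing
    // the method group 'iterations' times.
    public static long MeasureMethodGroup(int iterations)
    {
        Demo(GetString); // warm-up so one-time initialization is not counted

        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            Demo(GetString);
        }

        return GC.GetAllocatedBytesForCurrentThread() - before;
    }
}
```

Calling AllocationProbe.MeasureMethodGroup(1000) and dividing the result by the iteration count gives a rough per-call figure. BenchmarkDotNet with [MemoryDiagnoser] remains the more reliable tool for real measurements.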

#Lambda without closure / Method group (C# 11)

C# 11 changed the way Method Groups are handled. So, method groups and Lambda expressions now generate the same code when there is no closure (source).

When there is no closure, the compiler caches the delegate and reuses it for each call:

C#
// C# code
Demo(() => GetString());
Demo(GetString);
C#
// Code generated by the compiler
// note: I renamed the generated symbols to make them readable
[Serializable]
[CompilerGenerated]
private sealed class Generated
{
    public static readonly Generated instance = new Generated();
    public static Func<string> CachedLambda;

    internal string Lambda() => GetString();
}

// Create the delegate instance the first time, then reuse it.
// There is a null-check at each call to lazily instantiate the Func.
Demo(Generated.CachedLambda ?? (Generated.CachedLambda = new Func<string>(Generated.instance.Lambda)));

#Lambda without closure with manual caching

When the compiler automatically caches the delegate, it uses a null-check to lazy-create the delegate. If performance is an issue, you can manually cache the delegate, so you can avoid the null check:

C#
// Cache the delegate
static readonly Func<string> Instance = new Func<string>(GetString);

// Then, use it
Demo(Instance);

This time there is no null-check, so it should be slightly faster than the code generated by the compiler for Lambda expressions.

#Lambda with closure

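When there is a closure, the compiler creates a new delegate for each call, as there is no way to reuse the delegate. This means it allocates one delegate per call and, when a local variable is captured, an instance of the compiler-generated closure class as well.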

C#
// C# code
var instance = new Instance();
Demo(() => instance.GetString());
C#
// Code generated by the compiler
// note: I renamed the generated symbols to make them readable
[CompilerGenerated]
private sealed class Generated
{
    public Instance instance;
    internal string Lambda() => instance.GetString();
}

var lambdaInstance = new Generated();
lambdaInstance.instance = new Instance();
Demo(new Func<string>(lambdaInstance.Lambda));
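
The difference between the two cases can be observed by comparing delegate references. This is a minimal sketch with hypothetical names (ClosureDemo, NonCapturing, Capturing). Reusing the non-capturing delegate is an implementation detail of current Roslyn compilers, not a guarantee of the language.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ClosureDemo
{
    // The lambda captures nothing, so the compiler can cache the delegate.
    public static Func<string> NonCapturing() => () => "hello";

    // The lambda captures 'value', so each call needs its own closure and delegate.
    public static Func<string> Capturing(string value) => () => value;

    // Compares the delegates returned by two consecutive calls.
    public static bool NonCapturingIsReused() =>
        ReferenceEquals(NonCapturing(), NonCapturing());

    public static bool CapturingIsReused() =>
        ReferenceEquals(Capturing("a"), Capturing("a"));
}
```

With the compilers discussed in this post, NonCapturingIsReused() is expected to return true and CapturingIsReused() to return false.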

#Bug with nullable reference types?

There is a bug (or a limitation?) in Roslyn (version 4.1.0-2.21607.5) that prevents the compiler from correctly handling nullable reference type attributes when using method groups. So, in some cases you may need to use a lambda expression to get the correct nullable reference type annotations.

C#
var array = Array.Empty<string>();

// Lambda => ok
IEnumerable<string> a = array.Select(x => Convert(x));

// Method group => not ok
// Warning CS8619 Nullability of reference types in value of type 'IEnumerable<string?>'
// doesn't match target type 'IEnumerable<string>'.
IEnumerable<string> b = array.Select(Convert);

[return: NotNullIfNotNull("value")]
static string? Convert(string? value) => value;

#Benchmark

C#
[MemoryDiagnoser] // Analyse the memory usage
public class DelegateBenchmark
{
    private static readonly Func<string?> s_cachedAction = GetStringStatic;
    private readonly DelegateBenchmark _instance = new();

    [Benchmark]
    public void MethodGroup() => M(GetStringStatic);

    [Benchmark]
    public void Lambda() => M(() => GetStringStatic());

    [Benchmark]
    public void Lambda_Cached() => M(s_cachedAction);

    [Benchmark]
    public void MethodGroup_Closure() => M(_instance.GetString);

    [Benchmark]
    public void Lambda_Closure() => M(() => _instance.GetString());

    private static void M(Func<string?> _) { }
    private string? GetString() => null;
    private static string? GetStringStatic() => null;
}
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
AMD Ryzen 7 5800X, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.101
  [Host]     : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT
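  DefaultJob : .NET 6.0.1 (6.0.121.56705), X64 RyuJIT

| Method                       | Mean     | Error     | StdDev    | Gen 0  | Allocated |
|------------------------------|----------|-----------|-----------|--------|-----------|
| MethodGroup (C# 10)          | 6.537 ns | 0.1355 ns | 0.1267 ns | 0.0038 | 64 B      |
| Lambda / MethodGroup (C# 11) | 1.353 ns | 0.0407 ns | 0.0381 ns | -      | -         |
| Lambda_Cached                | 1.311 ns | 0.0350 ns | 0.0310 ns | -      | -         |
| MethodGroup_Closure          | 5.301 ns | 0.0936 ns | 0.0782 ns | 0.0038 | 64 B      |
| Lambda_Closure               | 5.286 ns | 0.1030 ns | 0.0913 ns | 0.0038 | 64 B      |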

You can see why it is important to avoid closures for performance reasons. Some closures are very easy to remove. For instance, ConcurrentDictionary has many overloads that help you remove them. You can read my previous post about ConcurrentDictionary and closures for more details.
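
As an illustration of such an overload, ConcurrentDictionary.GetOrAdd accepts a factoryArgument that is forwarded to the value factory, so the lambda does not need to capture anything. The sketch below assumes C# 9 or later (for the static lambda and target-typed new). The ClosureFreeCache class and its method names are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper, for illustration only.
public static class ClosureFreeCache
{
    private static readonly ConcurrentDictionary<int, string> s_cache = new();

    // Captures 'prefix': a closure object and a delegate are allocated on each call.
    public static string GetWithClosure(int key, string prefix) =>
        s_cache.GetOrAdd(key, k => prefix + k);

    // Passes 'prefix' as factoryArgument. The 'static' modifier makes the
    // compiler reject any accidental capture, so the delegate can be cached.
    public static string GetWithoutClosure(int key, string prefix) =>
        s_cache.GetOrAdd(key, static (k, p) => p + k, prefix);
}
```

Both methods return the same values. Only the second one keeps the call free of per-call delegate allocations.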
