StringBuilder performance pitfalls

In .NET, strings are immutable. Each operation that appears to modify a string actually creates a new string. When you concatenate an unknown number of strings in a loop, you should use a StringBuilder to improve performance. StringBuilder represents a mutable string of characters, so you can modify it without allocating a new string for each change. I sometimes see usages of StringBuilder that reduce its benefits. In this post, we'll look at some patterns that can be optimized.
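To make the difference concrete, here is a minimal sketch of the two approaches. The class and method names (ConcatenationDemo, WithConcatenation, WithStringBuilder) are hypothetical and only serve as an illustration; both methods return the same string, but the first one allocates a new string on every iteration.

```csharp
using System.Text;

public static class ConcatenationDemo
{
    // Each += creates a new string that copies everything built so far.
    public static string WithConcatenation(string[] parts)
    {
        var result = "";
        foreach (var part in parts)
        {
            result += part;
        }
        return result;
    }

    // StringBuilder appends into its internal buffer and only creates
    // the final string when ToString is called.
    public static string WithStringBuilder(string[] parts)
    {
        var sb = new StringBuilder();
        foreach (var part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }
}
```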

Call Append multiple times instead of concatenating strings

The following code is not optimal because it allocates a new string for the concatenation of "test" and i, and then calls AppendLine with it:

[Benchmark]
public string AppendLineWithStringConcatenation()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.AppendLine("test" + i);
    }
    return sb.ToString();
}

You should call Append for each part of the concatenation and then call AppendLine. Note that internally, Append uses ISpanFormattable to avoid allocating while converting the number to a string. Here's the improved code:

[Benchmark]
public string MultipleAppend()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append("test").Append(i).AppendLine();
    }
    return sb.ToString();
}

The second version is faster and allocates less memory.

Use Append(char) instead of Append(string) when possible

If you need to add a single character, you should use Append(char) instead of Append(string). The former method is about 40% faster. You can also replace AppendLine("a") with Append('a').AppendLine().

[Benchmark]
public string AppendString()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append("a");
    }
    return sb.ToString();
}

[Benchmark]
public string AppendChar()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append('a');
    }
    return sb.ToString();
}
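The AppendLine("a") replacement mentioned above can be sketched as follows. The class and method names are hypothetical; the point is only that both forms produce the same text, a single 'a' followed by the environment's line terminator.

```csharp
using System.Text;

public static class AppendCharDemo
{
    // Appends the one-character string, then the line terminator.
    public static string WithAppendLineString()
    {
        var sb = new StringBuilder();
        sb.AppendLine("a");
        return sb.ToString();
    }

    // Appends the character with Append(char), then the line terminator.
    public static string WithAppendCharThenAppendLine()
    {
        var sb = new StringBuilder();
        sb.Append('a').AppendLine();
        return sb.ToString();
    }
}
```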

Use AppendFormat instead of Append(string.Format())

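To append a formatted value, you can call ToString with a format string, call string.Format, or call AppendFormat. Using ToString is a little bit faster, but AppendFormat is the one that allocates the least memory.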

[Benchmark]
public string AppendToString() // Renamed so the benchmark does not hide object.ToString()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append(42.ToString("N2"));
    }
    return sb.ToString();
}

[Benchmark]
public string StringFormat()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append(string.Format("{0:N2}", 42)); // Or sb.Append($"{42:N2}")
    }
    return sb.ToString();
}

[Benchmark]
public string AppendFormat()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.AppendFormat("{0:N2}", 42);
    }
    return sb.ToString();
}

Use Append(ReadOnlySpan<char>) instead of Append(str.Substring())

Instead of using string.Substring, you can use the overload of Append that takes a start index and a length. It is similar to Substring except that it doesn't create the intermediate string, so it allocates less memory. In .NET Core, you can also use AsSpan to get a ReadOnlySpan<char> over part of the string and append it.

private const string Str = "abcdefghijklmnopqrstuvwxyz";

[Benchmark]
public string Substring()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append(Str.Substring(0, 6));
    }
    return sb.ToString();
}

[Benchmark]
public string Append()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append(Str, 0, 6);
    }
    return sb.ToString();
}

[Benchmark]
public string Span()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append(Str.AsSpan(0, 6));
    }
    return sb.ToString();
}

[Benchmark]
public string SpanSlice()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append(Str.AsSpan().Slice(0, 6));
    }
    return sb.ToString();
}

Use AppendJoin instead of Append(string.Join())

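The StringBuilder.AppendJoin method exists in .NET Core but not in .NET Framework, so some people may not know it. You should use it instead of Append(string.Join()) because it writes the joined values directly into the StringBuilder without creating an intermediate string.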

[Benchmark]
public string StringJoin()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.Append(string.Join(' ', s_values));
    }
    return sb.ToString();
}

[Benchmark]
public string AppendJoin()
{
    var sb = new StringBuilder();
    for (int i = 0; i < 10000; i++)
    {
        sb.AppendJoin(' ', s_values);
    }
    return sb.ToString();
}
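The s_values field used by these benchmarks is not shown in the post, so here is a small self-contained sketch with hypothetical sample data (s_sampleValues) and a hypothetical class name. It only illustrates how AppendJoin(char, string[]) writes the separated values into an existing StringBuilder.

```csharp
using System.Text;

public static class AppendJoinDemo
{
    // Hypothetical sample data; the post's s_values field is not shown.
    private static readonly string[] s_sampleValues = { "a", "b", "c" };

    public static string Build()
    {
        var sb = new StringBuilder();
        sb.Append("values: ");
        // Writes "a b c" into the builder without an intermediate string.
        sb.AppendJoin(' ', s_sampleValues);
        return sb.ToString();
    }
}
```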

Set the capacity of the StringBuilder

If you know the approximate final size of the string you want to build, you should set the initial capacity. It doesn't change a lot in terms of speed, but it can reduce the number of allocations. Note that you can use EnsureCapacity to increase the capacity of an existing StringBuilder.

[Params(1, 1_000, 10_000, 50_000, 999_999, 1_000_000, 1_500_000)]
public int Size { get; set; }

[Benchmark]
public string InitialSize()
{
    var sb = new StringBuilder(Size);
    for (int i = 0; i < 1_000_000; i++)
    {
        sb.Append('a');
    }
    return sb.ToString();
}
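As a rough sketch of the two options described above, the following shows the capacity constructor and EnsureCapacity side by side. The class and method names are hypothetical, and the sizes are whatever the caller passes in.

```csharp
using System.Text;

public static class CapacityDemo
{
    // Reserves room for all characters up front, so the builder
    // does not need to grow while appending.
    public static string Build(int count)
    {
        var sb = new StringBuilder(capacity: count);
        for (int i = 0; i < count; i++)
        {
            sb.Append('a');
        }
        return sb.ToString();
    }

    // Grows an existing builder to at least minimumCapacity
    // and returns the resulting capacity.
    public static int Grow(StringBuilder sb, int minimumCapacity)
    {
        return sb.EnsureCapacity(minimumCapacity);
    }
}
```

EnsureCapacity is mostly useful when the expected size only becomes known after the StringBuilder has been created, for example when it comes from a pool.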

Use a pool of StringBuilder

If you use lots of StringBuilder instances, you may want to use a reusable pool to avoid lots of allocations. Instead of creating a new StringBuilder each time you need one, you get an existing instance from the pool. Then, you return the instance to the pool once you finish using it. It seems to be a little bit slower, but it drastically reduces allocations. This also means that you'll spend less time in the GC.

[Benchmark]
public void WithoutPool()
{
    for (int i = 0; i < 10000; i++)
    {
        var sb = new StringBuilder();
        sb.Append("sample");
        _ = sb.ToString();
    }
}

[Benchmark]
public void WithPool()
{
    // Use NuGet package Microsoft.Extensions.ObjectPool
    var objectPoolProvider = new DefaultObjectPoolProvider();
    var stringBuilderPool = objectPoolProvider.CreateStringBuilderPool();

    for (var i = 0; i < 10000; i++)
    {
        var sb = stringBuilderPool.Get();
        sb.Append("sample");
        _ = sb.ToString();
        stringBuilderPool.Return(sb);
    }
}

Automatically fix your code using a Roslyn analyzer

You can check the usages of these methods in your applications using a Roslyn analyzer. The good news is that the free analyzer I've made already contains rules for that: https://github.com/meziantou/Meziantou.Analyzer
