Caching an IEnumerable<T> instance

 
 
Gérald Barré

An IEnumerable<T> can be costly to enumerate, and if you need to enumerate it twice or more, you pay that cost each time. For instance, Directory.EnumerateFiles() walks the file system every time you start enumerating its results. You can use ToList or ToArray to read all the items up front into a data structure you can then enumerate very quickly. However, you then have to read the full sequence before you can process anything, so you lose the ability to stream the items.
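
To make the cost visible, here's a minimal sketch (MyCostlySource and its Console.WriteLine side effect are made up for illustration, and it assumes using System, System.Collections.Generic, and System.Linq). Each LINQ call restarts the enumeration, so the side effect runs twice:

C#
static IEnumerable<int> MyCostlySource()
{
    // Stands in for an expensive source such as the file system
    Console.WriteLine("Enumerating the source...");
    yield return 1;
    yield return 2;
    yield return 3;
}

static void Demo()
{
    var items = MyCostlySource();

    // Each call below enumerates the sequence from scratch,
    // so "Enumerating the source..." is printed twice
    Console.WriteLine(items.Count()); // Enumerating the source... then 3
    Console.WriteLine(items.First()); // Enumerating the source... then 1
}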

The idea of the following code is to wrap an IEnumerable<T> instance and store the items in a list as they are enumerated. In my case, the enumerable may be iterated from multiple threads, so the wrapper must be thread-safe. This means the underlying IEnumerable<T> must be enumerated only once, even when multiple threads enumerate the items at the same time.

The code is straightforward and doesn't need many comments. Here it is:

C#
using System;
using System.Collections;
using System.Collections.Generic;

public static class CachedEnumerable
{
    public static CachedEnumerable<T> Create<T>(IEnumerable<T> enumerable)
    {
        return new CachedEnumerable<T>(enumerable);
    }
}

public sealed class CachedEnumerable<T> : IEnumerable<T>, IDisposable
{
    private readonly List<T> _cache = new List<T>();
    private readonly IEnumerable<T> _enumerable;
    private IEnumerator<T> _enumerator;
    private bool _enumerated = false;

    public CachedEnumerable(IEnumerable<T> enumerable)
    {
        _enumerable = enumerable ?? throw new ArgumentNullException(nameof(enumerable));
    }

    public IEnumerator<T> GetEnumerator()
    {
        var index = 0;
        while (true)
        {
            if (TryGetItem(index, out var result))
            {
                yield return result;
                index++;
            }
            else
            {
                // There are no more items
                yield break;
            }
        }
    }

    private bool TryGetItem(int index, out T result)
    {
        // Fast path: if the item is already in the cache, return it without taking the lock
        if (index < _cache.Count)
        {
            result = _cache[index];
            return true;
        }

        lock (_cache)
        {
            // Create the underlying enumerator on first use
            if (_enumerator == null && !_enumerated)
            {
                _enumerator = _enumerable.GetEnumerator();
            }

            // Another thread may have added the item while we were acquiring the lock
            if (index < _cache.Count)
            {
                result = _cache[index];
                return true;
            }

            // If we have already enumerated the whole stream, there is nothing left to do
            if (_enumerated)
            {
                result = default;
                return false;
            }

            // Get the next item and store it to the cache
            if (_enumerator.MoveNext())
            {
                result = _enumerator.Current;
                _cache.Add(result);
                return true;
            }
            else
            {
                // There are no more items, we can dispose the underlying enumerator
                _enumerator.Dispose();
                _enumerator = null;
                _enumerated = true;
                result = default;
                return false;
            }
        }
    }

    public void Dispose()
    {
        if (_enumerator != null)
        {
            _enumerator.Dispose();
            _enumerator = null;
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

Here's how to use the CachedEnumerable<T>:

C#
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static void Main(string[] args)
{
    var enumerable = MyEnumerable();
    using var cachedEnumerable = CachedEnumerable.Create(enumerable);

    Parallel.ForEach(cachedEnumerable, item => Console.WriteLine(item));

    foreach (var item in cachedEnumerable)
    {
        Console.WriteLine(item);
    }
}

static IEnumerable<int> MyEnumerable()
{
    // This sequence should be enumerated only once,
    // even though the cached enumerable is consumed twice above
    yield return 1;
    yield return 2;
    yield return 3;
}

Do you have a question or a suggestion about this post? Contact me!
