Writing a language-agnostic Roslyn Analyzer using IOperation

In a previous blog post, I explained how to write a Roslyn Analyzer for C#. That analyzer uses the C# syntax tree and the semantic model to detect patterns and report warnings, and its code fix replaces nodes in the C# syntax tree. As a result, the analyzer cannot be used for VB.NET, whose syntax tree is different. You would need to create and maintain one analyzer for C# and another one for VB.NET. This is possible, and lots of analyzers work this way, but the maintenance cost is high. So, the Roslyn team came up with a new solution: IOperation!

C# and VB.NET have different syntaxes, but they can express the same things, i.e. the syntax is different but the semantics are the same. So, the idea chosen by Roslyn is to build a kind of semantic representation on top of the syntax tree, while still being able to get back from it to the underlying C# or VB.NET syntax tree. This representation is not the same as the actual semantic model; it's more a mix of the syntax tree and the semantic model.

For instance, Dim numbers(1) As Integer and var numbers = new int[2] are semantically identical (in VB.NET, the number in parentheses is the upper bound of the array, not its length). Both statements create a single-dimension array of 2 numbers. But you can also create the array using var numbers = new int[] { 1, 2 }. While the syntax trees are different, they all create a new array. If you use operations, the three expressions are represented as an IArrayCreationOperation. The operation exposes the dimensions of the array.
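To see this in practice, here is a minimal sketch that compiles a C# snippet in memory and asks the semantic model for the operation behind each array creation expression. The class name ArrayCreationInspector and the method Inspect are hypothetical names chosen for this illustration, and the sketch assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced. It only covers the C# side; a VB.NET version would use the VisualBasic equivalents of the parsing and compilation types.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Operations;

// Hypothetical helper for illustration only (not part of Roslyn)
public static class ArrayCreationInspector
{
    // Returns, for each explicit array creation in the source:
    // the source text, the number of dimensions, and the constant size of the first dimension (or null)
    public static List<(string Syntax, int Dimensions, object FirstSize)> Inspect(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        var compilation = CSharpCompilation.Create(
            "Demo",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));
        var model = compilation.GetSemanticModel(tree);

        var result = new List<(string Syntax, int Dimensions, object FirstSize)>();
        foreach (var node in tree.GetRoot().DescendantNodes().OfType<ArrayCreationExpressionSyntax>())
        {
            // GetOperation returns the language-agnostic operation for a syntax node (or null)
            if (model.GetOperation(node) is IArrayCreationOperation operation && operation.DimensionSizes.Length > 0)
            {
                var size = operation.DimensionSizes[0].ConstantValue;
                result.Add((node.ToString(), operation.DimensionSizes.Length, size.HasValue ? size.Value : null));
            }
        }

        return result;
    }
}
```

Calling ArrayCreationInspector.Inspect("class C { void M() { var a = new int[2]; var b = new int[] { 1, 2 }; } }") should return two entries, one per array creation, each with a single dimension. Even though the second expression never spells out a size, the operation still exposes one.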

Let's create a language-agnostic analyzer to replace zero-length array creations (new T[0] or new T[] { }) with Array.Empty<T>().

The structure is very similar to the analyzer from the previous post. But this time, we can add Visual Basic to the list of supported languages, and we use RegisterOperationAction instead of RegisterSyntaxNodeAction to register the analysis callback.

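using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Diagnostics;
using Microsoft.CodeAnalysis.Operations;

// You can declare CSharp and Visual Basic as the analyzer is language agnostic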
[DiagnosticAnalyzer(LanguageNames.CSharp, LanguageNames.VisualBasic)]
public class UseArrayEmptyAnalyzer : DiagnosticAnalyzer
{
    private static readonly DiagnosticDescriptor s_rule = new DiagnosticDescriptor(
        "Sample",
        title: "Use Array.Empty<T>()",
        messageFormat: "Use Array.Empty<T>()",
        "Usage",
        DiagnosticSeverity.Warning,
        isEnabledByDefault: true,
        description: "");

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics => ImmutableArray.Create(s_rule);

    public override void Initialize(AnalysisContext context)
    {
        context.EnableConcurrentExecution();
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);

        // Will call AnalyzeArrayCreationOperation for each ArrayCreation operation
        context.RegisterOperationAction(AnalyzeArrayCreationOperation, OperationKind.ArrayCreation);
    }

    private void AnalyzeArrayCreationOperation(OperationAnalysisContext context)
    {
        // Cast the operation to the actual operation type. You can get the type by looking at the XML documentation of the OperationKind enum members.
        // https://github.com/dotnet/roslyn/blob/6e63c8f74ab8e8af9a545a6625907f529c843d62/src/Compilers/Core/Portable/Operations/OperationKind.cs
        var operation = (IArrayCreationOperation)context.Operation;
        if (IsZeroLengthArrayCreation(operation))
        {
            // We can access the original C# or VB.NET syntax node using operation.Syntax.
            // This way we can get the location to report the diagnostic
            var diagnostic = Diagnostic.Create(s_rule, operation.Syntax.GetLocation());
            context.ReportDiagnostic(diagnostic);
        }
    }

    private static bool IsZeroLengthArrayCreation(IArrayCreationOperation operation)
    {
        // Check if the array has only 1 dimension
        // new int[]  : 1 dimension
        // new int[,] : 2 dimensions
        if (operation.DimensionSizes.Length != 1)
            return false;

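        // Get the size of the first dimension
        // ConstantValue is an Optional<object>: it has a value only when the size is a compile-time constant
        var dimensionSize = operation.DimensionSizes[0].ConstantValue;

        // The constant is boxed and is not necessarily an int (e.g. new int[0L]), so use a type test instead of a cast
        return dimensionSize.HasValue && dimensionSize.Value is int size && size == 0;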
    }
}
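As a quick illustration of what the rule above matches, the following sketch lists a few C# expressions and notes, based on the logic of IsZeroLengthArrayCreation, which ones should be reported. The class and method names are hypothetical, and Array.Empty<T>() is assumed to be available in the target framework (it was added in .NET Framework 4.6).

```csharp
using System;

// Hypothetical samples for illustration only
public static class ZeroLengthArraySamples
{
    // Reported: one dimension with a constant size of 0
    public static int[] ExplicitZeroLength() => new int[0];

    // Reported: the empty initializer gives an implicit size of 0
    public static int[] EmptyInitializer() => new int[] { };

    // Not reported: the size is a constant, but it is not 0
    public static int[] NonEmpty() => new int[2];

    // Not reported: the size is not a compile-time constant
    public static int[] RuntimeSize(int length) => new int[length];

    // Not reported: the array has 2 dimensions
    public static int[,] TwoDimensions() => new int[0, 0];

    // What the code fix should produce for the reported cases
    public static int[] Suggested() => Array.Empty<int>();
}
```

Note that Array.Empty<T>() returns a cached instance, so repeated calls for the same T do not allocate a new array each time.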

Now, we have to create the CodeFix. The CodeFix should also be language agnostic. So, instead of using the syntax tree directly, we'll use the SyntaxGenerator which will generate the right SyntaxNode depending on the language of the document. The syntax generator looks like CodeDom, so it's very easy to use.

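using System.Collections.Immutable;
using System.Composition;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CodeActions;
using Microsoft.CodeAnalysis.CodeFixes;
using Microsoft.CodeAnalysis.Editing;

// You can declare CSharp and Visual Basic as the code fix is language agnostic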
[ExportCodeFixProvider(LanguageNames.CSharp, LanguageNames.VisualBasic), Shared]
public sealed class UseArrayEmptyFixer : CodeFixProvider
{
    public override ImmutableArray<string> FixableDiagnosticIds => ImmutableArray.Create("Sample");

    public override FixAllProvider GetFixAllProvider() => WellKnownFixAllProviders.BatchFixer;

    public override async Task RegisterCodeFixesAsync(CodeFixContext context)
    {
        var root = await context.Document.GetSyntaxRootAsync(context.CancellationToken).ConfigureAwait(false);
        var nodeToFix = root.FindNode(context.Span, getInnermostNodeForTie: true);
        if (nodeToFix == null)
            return;

        var title = "Use Array.Empty<T>()";
        var codeAction = CodeAction.Create(
            title,
            ct => ConvertToArrayEmpty(context.Document, nodeToFix, ct),
            equivalenceKey: title);

        context.RegisterCodeFix(codeAction, context.Diagnostics);
    }

    private static async Task<Document> ConvertToArrayEmpty(Document document, SyntaxNode nodeToFix, CancellationToken cancellationToken)
    {
        var editor = await DocumentEditor.CreateAsync(document, cancellationToken).ConfigureAwait(false);

        var semanticModel = await document.GetSemanticModelAsync(cancellationToken).ConfigureAwait(false);

        // Get the generator that will generate the SyntaxNode for the expected language (C# or VB.NET)
        var generator = editor.Generator;

        // Get the type of the elements of the array (new int[] => int)
        var elementType = GetArrayElementType(nodeToFix, semanticModel, cancellationToken);
        if (elementType == null)
            return document;

        // Generate the new node "Array.Empty<T>()" (replace T with elementType)
        var arrayTypeSymbol = semanticModel.Compilation.GetTypeByMetadataName("System.Array");
        var arrayEmptyName = generator.MemberAccessExpression(
            generator.TypeExpression(arrayTypeSymbol),
            generator.GenericName("Empty", elementType));
        var arrayEmptyInvocation = generator.InvocationExpression(arrayEmptyName);

        // Replace the old node with the new node in the document
        editor.ReplaceNode(nodeToFix, arrayEmptyInvocation);
        return editor.GetChangedDocument();
    }

    private static ITypeSymbol GetArrayElementType(SyntaxNode arrayCreationExpression, SemanticModel semanticModel, CancellationToken cancellationToken)
    {
        var typeInfo = semanticModel.GetTypeInfo(arrayCreationExpression, cancellationToken);
        // Use "as" so that a node whose type is not an array yields null instead of throwing
        var arrayType = (typeInfo.Type ?? typeInfo.ConvertedType) as IArrayTypeSymbol;
        return arrayType?.ElementType;
    }
}
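To make the role of the SyntaxGenerator more concrete, here is a minimal sketch that builds the same Array.Empty call for both languages outside of a code fix. The class GeneratorDemo and its method are hypothetical names for this illustration. The sketch assumes the Microsoft.CodeAnalysis.CSharp.Workspaces and Microsoft.CodeAnalysis.VisualBasic.Workspaces packages are referenced, so that the workspace can find the language services, and it uses int as the element type instead of a symbol taken from the semantic model.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

// Hypothetical helper for illustration only
public static class GeneratorDemo
{
    // language is LanguageNames.CSharp or LanguageNames.VisualBasic
    public static string GenerateArrayEmpty(string language)
    {
        using (var workspace = new AdhocWorkspace())
        {
            // The generator builds syntax nodes for the requested language
            var generator = SyntaxGenerator.GetGenerator(workspace, language);

            // Array.Empty<int> (the exact syntax depends on the language)
            var memberAccess = generator.MemberAccessExpression(
                generator.IdentifierName("Array"),
                generator.GenericName("Empty", generator.TypeExpression(SpecialType.System_Int32)));

            // Invoke the method without arguments and render the node as text
            var invocation = generator.InvocationExpression(memberAccess);
            return invocation.NormalizeWhitespace().ToFullString();
        }
    }
}
```

With LanguageNames.CSharp, the result should look like Array.Empty<int>(), and with LanguageNames.VisualBasic, like Array.Empty(Of Integer)(). The calls to the generator are identical in both cases; only the language name passed to GetGenerator changes.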

I think writing an analyzer using IOperation is clearly easier and more readable than using language-specific syntax nodes. However, you have less control over the code generation, so some analyzers cannot be written using this new mechanism. Also, IOperation only covers a subset of each language, which further limits the possibilities. I hope we'll be able to use this new way of writing analyzers for more cases.
