This repository has been archived by the owner on Dec 12, 2020. It is now read-only.

How to debug and troubleshoot my code generator #76

Closed
mwpowellhtx opened this issue Jun 7, 2018 · 18 comments

@mwpowellhtx

Basically that. Is there some sort of "print line", console output, etc, I could wire into?

I see evidence that the generator is at least trying, literally the .generated.cs file is landing in my obj/$(Configuration) folder. However, what I expect to be generated is not being generated, but I'd at least like to get some feedback as to what is happening.

Do you have a Gitter I could post to?

@mwpowellhtx
Author

Notes for above; the generated file is there, the comments are there, but there is no output. But I assure you there is a tree I am building which should be there.

Ideally, I should be able to examine the output of the test assembly for the Type of interest, the expected elements, methods, constructors, etc, whether public, private, static, or what have you. But if the generator is not being invoked, how do I detect that? Never mind what the solution is.

Consequently, I am also chewing on a Generators parallel to the Analyzers Diagnostic Verifiers, but I wonder if that is even possible, considering whether it is possible to do a couple of things:

  • Introduce NuGet package dependencies. i.e. to the requisite CodeGeneration.Roslyn.BuildTime package
  • Introduce DotNet CLI references, along similar lines, dependencies on dotnet-codegen

Assuming presence of the Microsoft.CodeAnalysis.Project instance. It is possible, for instance, to reference a concrete Assembly MetadataReference, but I do not think there is one for BuildTime?

But basically, somewhere between the given source code, and the compilation result, i.e.

Project project;
// Ostensibly, my Generator is invoked during this call:
var diagnostics = project.GetCompilationAsync().Result.GetDiagnostics();
// Do something with the Diagnostics...

And/or, ostensibly, additional analytics on the built source tree. Which, I assume would invoke the generator.

Short of this approach, would a "simple" Console.WriteLine be sufficient? Something which I could at least scan the build output to determine whether I am even hitting the generator in the first place? Or other CodeGeneration.Roslyn logs which would yield as much?

@amis92
Collaborator

amis92 commented Jun 8, 2018

I'm no expert, but I've had my share of experience with this codebase, so allow me to lay out how this framework/tool works.

First, let's go through all of the artifacts this project (CodeGeneration.Roslyn) produces:

  • CodeGeneration.Roslyn package contains the actual worker logic that compiles your project and calls custom code generators for appropriate syntax nodes. It consists of:

    • ICodeGenerator interface that custom generators must implement, through which they are called. An important thing to note is that the custom implementation must also provide a public constructor with a single AttributeData parameter.
    • TransformationContext class that is the context parameter passed into ICodeGenerator's method. It references the source node on which the generator-associated attribute was found, the whole compilation of the project (in the state before any code is generated), and a semantic model for it. The directory of the project file is also provided, as a string.
    • CompilationGenerator is a public API class that handles everything from getting input parameters for generation, compiling, loading assemblies (including dependencies), monitoring, building, formatting and writing output files, to publishing the results into output parameters. There is a single point of delegation: for each SyntaxTree, a method of DocumentTransform (described below) is called, with three retry attempts in case a file lock somewhere blocks reading a required file.
    • DocumentTransform is the class responsible for creating a generated SyntaxTree, by visiting all appropriate SyntaxNodes (root, namespace and type declaration), discovering generator-associated attributes they're decorated with, calling these generators and joining all results from these generators in an appropriate "parent" node (e.g. namespace). This is the class that calls custom generator's GenerateAsync method.
  • CodeGeneration.Roslyn.Attributes contains a single attribute definition that must decorate the generator-associated attribute. That attribute in turn must then be applied to any member you want your custom generator to be called with.
    Important to note is that when decorating your custom attribute with CodeGenerationAttribute and using its string-parametrized constructor, you must pass in the custom generator's full assembly-qualified type name in CLR notation, e.g. for a MyGenerators.Generators.CustomGenerator class in a MyGenerators assembly, you must provide the string in the form of MyGenerators.Generators.CustomGenerator, MyGenerators.

  • CodeGeneration.Roslyn.Tool contains an implementation of a command line (CLI) executable that is published as the dotnet-codegen DotNetCliTool project tool. This executable parses arguments, sets up and calls into CodeGeneration.Roslyn.CompilationGenerator, and writes out a file listing the names of the generated files.

  • CodeGeneration.Roslyn.Tasks, packaged into CodeGeneration.Roslyn.BuildTime, contains:

    • GenerateCodeFromAttributes is an MSBuild ToolTask implementation. This is plumbing that defines the Task's Input and Output parameters so that MSBuild understands them, calls the dotnet-codegen executable with arguments built from the inputs, and then reads the generated-files-list file and pushes it into the appropriate output.
    • CodeGeneration.Roslyn.BuildTime .props and .targets MSBuild files. When the package is referenced by NuGet (or the files are directly imported in project file, as in Tests projects), MSBuild's pipeline is extended to call the Task described above in appropriate build phase.
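
As a small illustration of the assembly-qualified name format described above, the string can be derived from a Type instead of being typed by hand. This is only a sketch: the GeneratorTypeName helper and the empty CustomGenerator class are hypothetical, and nothing beyond the base class library is used.

```csharp
using System;

namespace MyGenerators.Generators
{
    // Hypothetical stand-in for a real ICodeGenerator implementation.
    public class CustomGenerator { }
}

public static class GeneratorTypeName
{
    // Builds "Namespace.TypeName, AssemblySimpleName", the short form of an
    // assembly-qualified name (no version, culture, or public key token).
    public static string For(Type generatorType)
    {
        if (generatorType == null) throw new ArgumentNullException(nameof(generatorType));
        return $"{generatorType.FullName}, {generatorType.Assembly.GetName().Name}";
    }
}
```

When compiled into an assembly named MyGenerators, GeneratorTypeName.For(typeof(MyGenerators.Generators.CustomGenerator)) would produce the string form quoted above. Printing it once and comparing it with the literal passed to CodeGenerationAttribute is a cheap way to rule out a typo in the type name.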

Now, armed with understanding of the CodeGeneration framework structure, we can analyze a typical code generation-involving build process.

  1. MSBuild begins a build.
  2. After MSBuild discovers all source files, resolves assembly references and does all other necessary steps, the MSBuild target defined in .targets file from CodeGeneration.Roslyn.BuildTime package is resolved.
    1. This target calls GenerateCodeFromAttributes MSBuild Task, which invokes dotnet-codegen executable with appropriately-prepared argument list.
    2. The dotnet-codegen executable parses arguments and, after assigning appropriate values to CompilationGenerator's properties, invokes its method.
    3. CompilationGenerator does its job, and assigns results to its output properties.
    4. dotnet-codegen writes the generated-file-list to file.
    5. MSBuild task outputs the list of generated files read from the file.
  3. MSBuild target adds generated files to the Compile Item and continues on to the next build steps.

To clear up:

  • This framework introduced a new build step, which involves compiling all existing project files using Roslyn Compiler Platform API (Microsoft.CodeAnalysis), but it's not the final compilation - this is done later, and includes generated files.
  • The ICodeGenerator custom implementation is created and called for each generator-referencing-attribute-decorated SyntaxNode (most often a type declaration).
  • Unit testing MSBuild targets/tasks is very difficult, and practically undoable.
  • Unit testing a custom generator should not involve any of this framework's components. If you're not using TransformationContext's SemanticModel or Compilation, I'd suggest simply parsing some code into a SyntaxTree, passing an appropriate descendant node into the context, passing that context into your ICodeGenerator implementation, and validating that the received result is/contains what you expect. That'd satisfy unit testing's core assumptions. In case you use any of the context's additional parameters, you'll have to extend the preparation steps accordingly.
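
The preparation steps mentioned in the last point could look roughly like the sketch below. It uses only Roslyn APIs (Microsoft.CodeAnalysis.CSharp); the GeneratorTestInputs class, its Prepare method and the assembly name are hypothetical names chosen for illustration.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class GeneratorTestInputs
{
    // Parses source text and returns the pieces a generator test typically needs:
    // the first class declaration, the compilation, and the semantic model.
    public static (ClassDeclarationSyntax Node, CSharpCompilation Compilation, SemanticModel Model)
        Prepare(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // Reference the core library so that basic types such as string resolve.
        MetadataReference coreLib =
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location);

        CSharpCompilation compilation = CSharpCompilation.Create(
            "GeneratorTestAssembly", // hypothetical assembly name
            new[] { tree },
            new[] { coreLib },
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        ClassDeclarationSyntax node = tree.GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .First();

        return (node, compilation, compilation.GetSemanticModel(tree));
    }
}
```

The returned node, compilation and semantic model are the values a TransformationContext would be built from. The exact constructor shape depends on the CodeGeneration.Roslyn version in use, so that last step is deliberately left out here.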

I hope this helps somewhat.

PS @AArnott having put a bit of effort into writing this down, would you consider using this comment somewhere in wiki or in README, pending additions and modifications you deem necessary? I'm inclined to believe it could help some more people get a clearer picture of how this framework actually works. :)

@mwpowellhtx
Author

In general, I've got an idea how it works, but I wonder if the DotNetCli references could at all be migrated to NuGet style package references? In other words, make the whole code-gen process more seamless. That would certainly help with the dependencies, keeping the whole pipeline flowing. That, or there being first class support for propagating DotNetCli references via NuGet, somehow...

@amis92
Collaborator

amis92 commented Jun 8, 2018

@mwpowellhtx No, DotNetCliTools are by design non-propagating. They are project-level, development-time only tools, and are useful only during the build of the project in which they are referenced. Other similar tools are EntityFrameworkCore tool, XmlSerializer tool etc.

That said, if you're referencing the issue of including the reference to the tool package, then that's a whole other story - for example it could be "done" (more like "hacked" but it'd work) by preparing an MSBuild .props file that'd add a reference to it. But that works only for direct references (including packages props and targets into build) - for indirectly referenced packages it doesn't.

So if you're planning to distribute your custom generator package, currently it's rather hard. You could hack up a props/targets and include that in your package, but it'll likely cause conflicts if the consumer references more than one code-generating package.

That said, this project is not yet quite stable regarding the API and generators based on different dotnet-codegen tool will likely crash anyway.

@mwpowellhtx
Author

@amis92 Yes, I understand that. But what I'm suggesting, what I wonder about, is whether the NuGet package reference PrivateAssets="All", or development only, propagation couldn't be leveraged here?

It would certainly make unit testing a heck of a lot easier, I think. That said concerning Code Gen, per se...

Concerning Verifiers efforts, yes, I am breaking up the dependencies a bit to better support at least a compilation-oriented treatment of my generator callbacks. I'd followed a similar approach as you did concerning calling helper Generate methods and so forth. That should make at least those somewhat testable, just short of integration bits.

For this purpose, I wouldn't need Code Fixes, per se, so to allow some flexibility whether analyzers are required, and so on. In this instance, all I need is the compilation result, diagnostic, etc. Anyway, that's a bit tangential to this, apart from ensuring my integration scaffold is in order.

@mwpowellhtx
Author

Following up here, I am at a point, I have the Microsoft.CodeAnalysis.AttributeData in hand after successful compilation. If I create a new MyGenerator(myAttrData), and subsequently invoke the GenerateAsync(...) method, this assumes that I land with appropriate TransformationContext and potentially also IProgress<Diagnostic>. However, where do I get those from?

Short of that, sort of integration tests, I don't know if it is appropriate from a unit test perspective to simply test whatever underlying bits are contributing to the generation of the SyntaxList<MemberDeclarationSyntax>. At least not without incurring the Code Generation overhead in a unit test context, which is kind of why I'm here in this issue in the first place; cutting out that glue as much as possible in order to verify whether code generation is happening at all.

Assuming that it is, why doesn't it land in the expected generated code documents?
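
For the IProgress<Diagnostic> part of that question, a test can supply its own sink. The sketch below is one possible shape, not something the framework provides: CollectingProgress, GeneratorDiagnostics, and the "GEN0001" id, title and category are all hypothetical placeholders. Whether diagnostics reported this way also show up in the build output depends on the host and is not confirmed anywhere in this thread.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.CodeAnalysis;

// Hypothetical test helper: records every diagnostic a generator reports.
public sealed class CollectingProgress : IProgress<Diagnostic>
{
    private readonly List<Diagnostic> reported = new List<Diagnostic>();

    public IReadOnlyList<Diagnostic> Reported => reported;

    public void Report(Diagnostic value) => reported.Add(value);
}

public static class GeneratorDiagnostics
{
    // Hypothetical descriptor; the id, title and category are placeholders.
    private static readonly DiagnosticDescriptor Trace = new DiagnosticDescriptor(
        "GEN0001",
        "Generator trace",
        "{0}",
        "CodeGeneration",
        DiagnosticSeverity.Info,
        true);

    // Reports a free-form trace message through the progress sink, if there is one.
    public static void ReportTrace(IProgress<Diagnostic> progress, string message, Location location = null)
    {
        progress?.Report(Diagnostic.Create(Trace, location ?? Location.None, message));
    }
}
```

Inside GenerateAsync, a call such as GeneratorDiagnostics.ReportTrace(progress, "generator invoked") would then leave a record that a test can assert on through CollectingProgress.Reported.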

@mwpowellhtx
Author

It seems I can do something like this:

var context = new CodeGeneration.Roslyn.TransformationContext(classDecl
    , compilation.GetSemanticModel(compilation.SyntaxTrees.First()), compilation, ProjectDirectory);

var members = g.GenerateAsync(context, new DiagnosticProgress(), CancellationToken.None).Result;

But this is generating something that clearly is not correct. Does this tell me I must even provide white space elements? The really important bits are there, to be sure; but the white space elements, new lines, etc, are not. Really?

| Name | Value | Type |
| -- | -- | -- |
| $"{members}" | "partialclassCardinalDirection{privateCardinalDirection(byte[]bytes):base(bytes){}publicstaticCardinalDirectionoperator~(CardinalDirectionother)=>other?.BitwiseNot();publicstaticCardinalDirectionoperator&(CardinalDirectiona,CardinalDirectionb)=>a?.BitwiseAnd(b);publicstaticCardinalDirectionoperator\|(CardinalDirectiona,CardinalDirectionb)=>a?.BitwiseOr(b);publicstaticCardinalDirectionoperator^(CardinalDirectiona,CardinalDirectionb)=>a?.BitwiseXor(b);}" | string |

@mwpowellhtx
Author

Okay, with .NormalizeWhitespace, thank goodness! 👍
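
For anyone hitting the same thing, here is a minimal sketch of the difference, assuming nothing beyond Roslyn's SyntaxFactory and the NormalizeWhitespace extension method. The WhitespaceDemo class is a hypothetical name, and the class built here is just an empty shell, not the real generated type.

```csharp
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class WhitespaceDemo
{
    // Builds an empty partial class from factory calls only,
    // so the resulting tokens carry no whitespace trivia.
    public static ClassDeclarationSyntax BuildRaw() =>
        SyntaxFactory.ClassDeclaration("CardinalDirection")
            .AddModifiers(SyntaxFactory.Token(SyntaxKind.PartialKeyword));

    // Text as emitted by the factory: the tokens run together.
    public static string RawText() => BuildRaw().ToFullString();

    // Text after Roslyn inserts conventional spacing and line breaks.
    public static string NormalizedText() =>
        BuildRaw().NormalizeWhitespace().ToFullString();
}
```

Nodes produced by SyntaxFactory contain only the tokens asked for, which is why the unnormalized string reads as one run of characters. Nodes obtained by parsing text keep their original trivia and do not show this effect.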

@mwpowellhtx
Author

mwpowellhtx commented Jun 14, 2018

Alright, so at this point, I am fairly confident that everything from the CSharpCompilation is good, and everything in the ICodeGenerator itself is good, if acceptable. I have not tested deep in the generated code yet, but I can tell from visual inspection that it is good, as expected.

Literally, what is generated is this:

| Name | Value | Type |
| -- | -- | -- |
| $"{members}" | "partial class CardinalDirection\r\n{\r\n    private CardinalDirection(byte[] bytes): base(bytes)\r\n    {\r\n    }\r\n\r\n    public static CardinalDirection operator ~(CardinalDirection other) => other?.BitwiseNot();\r\n    public static CardinalDirection operator &(CardinalDirection a, CardinalDirection b) => a?.BitwiseAnd(b);\r\n    public static CardinalDirection operator \|(CardinalDirection a, CardinalDirection b) => a?.BitwiseOr(b);\r\n    public static CardinalDirection operator ^(CardinalDirection a, CardinalDirection b) => a?.BitwiseXor(b);\r\n}" | string |

So... Something glue-wise, in between, is not working.

@mwpowellhtx
Author

Ugh, there's also a whole slew of possible combinations for parent syntax declarations that could happen. Such as,

namespace MyClasses { }
namespace My.Classes { }
namespace My { namespace Classes { } }

To name a few. Which all has an impact on the syntax that must be digested. I'll have to study that in more depth I think for the actual semantic model, how it breaks down.

Resulting syntax elements are something like, Json-ish in pseudo code:

{ "MyClasses" : NamespaceDeclarationSyntax }
{ "My.Classes" : NamespaceDeclarationSyntax }

And,

{ "My" : NamespaceDeclarationSyntax { "Classes" : NamespaceDeclarationSyntax } }

In the last couple of use cases, I might expect "My.Classes" to fall out in the semantic model, however.
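
A quick way to check that expectation is to ask the semantic model for the declared symbol and read its containing namespace. The sketch below assumes only Roslyn APIs; NamespaceProbe and the assembly name are hypothetical.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class NamespaceProbe
{
    // Returns the fully qualified namespace of the first class declared in the source.
    public static string NamespaceOfFirstClass(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);
        CSharpCompilation compilation = CSharpCompilation.Create(
            "NamespaceProbeAssembly", // hypothetical assembly name
            new[] { tree });

        SemanticModel model = compilation.GetSemanticModel(tree);
        ClassDeclarationSyntax classDecl = tree.GetRoot()
            .DescendantNodes()
            .OfType<ClassDeclarationSyntax>()
            .First();

        // The symbol reports its containing namespace regardless of how
        // the namespace declarations were spelled in the syntax tree.
        INamedTypeSymbol symbol = model.GetDeclaredSymbol(classDecl);
        return symbol.ContainingNamespace.ToDisplayString();
    }
}
```

Feeding it "namespace My.Classes { class C { } }" and "namespace My { namespace Classes { class C { } } }" should give the same dotted name for both, which keeps test assertions independent of how the declarations are nested.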

@amis92
Collaborator

amis92 commented Jun 15, 2018

I don't know what's not working for you, but I can assure you that CodeGeneration.Roslyn takes care of inserting the results of your generator into the parent syntax correctly.

using System;

namespace A
{
  namespace B
  {
    [CustomGenerator]
    class ToBeProcessed
    {
      public string Text { get; }
    }
  }
}

If your generator returns SyntaxList of class Generated { } you should expect the resulting generated file to contain:

using System;

namespace A
{
  namespace B
  {
    class Generated { }
  }
}

That will also cover nested classes - their "parent" classes will be auto-inserted as well.

@mwpowellhtx
Author

I don't know, either; at the moment, I am 99.9% confident the generator itself would provide the correct code, unless, let's say, it wasn't before and was silently failing for some reason. This is plausible considering I had to NormalizeWhitespace after all the important bits had been arranged. At any rate, it's helped me improve a couple of things, and raised my awareness of a couple of potentially nasty corner cases re: namespaces. For the typical, sort of, dot delimited namespace, no worries. But for nested ones, or anything more complicated than that, I shall not worry about it for now. Really, it's just more of a concern verifying not only the compilation under test, but also the generated bits aligning with that compilation. The next steps for me are to investigate the connective tissue in more depth, and there are a half dozen or so that deserve a closer look most likely.

@amis92
Collaborator

amis92 commented Jun 15, 2018

NormalizeWhitespace is also called by CodeGeneration.Roslyn just before writing to the file anyway, so that was not the problem.

I'd really like to help you out, but without access to your project I don't think I can. Maybe you could prepare a minimal reproduction on GitHub, or something similar.

@mwpowellhtx
Author

@amis92 Well, I'd appreciate it. My project is on GitHub, for starters. I am in the process of making the transition to .NET Standard/Core. If you are interested you could start there. I am fleshing out my Generator unit tests at the moment. As soon as that is more or less presentable, I will commit that and reconsider the Code Generation integration bits.

@mwpowellhtx
Author

@amis92 I pushed my work thus far into the repo just now. The next bits for me are to review my comprehension of the Code Generation integration. Otherwise, should be ready for you to have a gander at it if you wouldn't mind.

@amis92
Collaborator

amis92 commented Jan 22, 2019

Afaik the issue was resolved. Closing.

@amis92 amis92 closed this as completed Jan 22, 2019
@daiplusplus

daiplusplus commented Apr 18, 2020

Just to chime-in (and crosspost from my reply to #186 ) - I'm able to debug and step-through my ICodeGenerator instances by calling System.Diagnostics.Debugger.Launch(); from my ICodeGenerator implementation's class constructor:

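using System;
using System.Threading;
using System.Threading.Tasks;
using CodeGeneration.Roslyn;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public class DuplicateWithSuffixGenerator : ICodeGenerator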
{
	private readonly string suffix;

	public DuplicateWithSuffixGenerator( AttributeData attributeData )
	{
		this.suffix = (string)attributeData.ConstructorArguments[0].Value;

		System.Diagnostics.Debugger.Launch();
		while( !System.Diagnostics.Debugger.IsAttached )
		{
			Thread.Sleep( 500 ); // eww, eww, eww
		}
	}

	public Task<SyntaxList<MemberDeclarationSyntax>> GenerateAsync( TransformationContext context, IProgress<Diagnostic> progress, CancellationToken cancellationToken )
	{
		// Our generator is applied to any class that our attribute is applied to.
		ClassDeclarationSyntax applyToClass = (ClassDeclarationSyntax)context.ProcessingNode;

		// Apply a suffix to the name of a copy of the class.
		ClassDeclarationSyntax copy = applyToClass.WithIdentifier(SyntaxFactory.Identifier(applyToClass.Identifier.ValueText + this.suffix));

		// Return our modified copy. It will be added to the user's project for compilation.
		SyntaxList<MemberDeclarationSyntax> results = SyntaxFactory.SingletonList<MemberDeclarationSyntax>(copy);

		return Task.FromResult( results );
	}
}

@dszryan

dszryan commented Jul 11, 2020

my code generation attribute inherits from the below

    [AttributeUsage(AttributeTargets.Interface)]
    public class GeneratorAttribute : Attribute
    {
        public bool LaunchDebuggerDuringBuild { get; set; }
    }

    public static class GeneratorAttributeExtensions
    {
        internal static void IfRequestedLaunchDebugger(this AttributeData attributeData)
        {
            if (!attributeData.NamedArguments.Any(n => n.Key == nameof(GeneratorAttribute.LaunchDebuggerDuringBuild) && n.Value.ToCSharpString() == "true")) return;
            Debugger.Launch();
            while (!Debugger.IsAttached) Thread.Sleep(500); // eww, eww, eww
        }
    }

N.B.: // eww, eww, eww is still present.

that way at build-runtime I can decide to launch the debugger when I need to investigate something - simply by adding the attribute property 'LaunchDebuggerDuringBuild' when required.
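
One possible refinement of the wait loop in both snippets above is to bound it, so that a build does not hang forever when nobody attaches. This is only a sketch using the base class library; DebuggerGate, WaitForAttach and LaunchAndWait are hypothetical names, and the attach check is passed in as a delegate purely so the loop can be exercised without a real debugger.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class DebuggerGate
{
    // Polls isAttached until it returns true or the timeout elapses.
    // Returns true when a debugger attached in time, false otherwise.
    public static bool WaitForAttach(TimeSpan timeout, Func<bool> isAttached, int pollMilliseconds = 100)
    {
        Stopwatch clock = Stopwatch.StartNew();
        while (!isAttached())
        {
            if (clock.Elapsed >= timeout) return false;
            Thread.Sleep(pollMilliseconds);
        }
        return true;
    }

    // Requests a debugger, then waits for at most the given time.
    public static bool LaunchAndWait(TimeSpan timeout)
    {
        Debugger.Launch(); // result ignored: a debugger may still be attached manually
        return WaitForAttach(timeout, () => Debugger.IsAttached);
    }
}
```

A generator constructor could call DebuggerGate.LaunchAndWait(TimeSpan.FromSeconds(30)) in place of the unbounded loop and simply carry on with generation when it returns false.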


4 participants