Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up CodeGenerator by moving compilation-global data and logic to Context #1190

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

mzabaluev
Copy link
Contributor

@mzabaluev mzabaluev commented Nov 19, 2024

Refactoring improvements to prost-build:

  • Move compilation-global context data and logic around it from CodeGenerator to a new Context struct.
  • Lift similar logic into Context from MessageGraph, where it caused some data duplication.
  • Resolve some awkwardness about the Config reference needing to be mutable for its user-provided service generator, by exposing both from Context in a more disciplined way.

This also fixes a bug in the logic that determines whether Copy can be derived on a message, where a boxed config rule matching a field that is part of a oneof would be disregarded.

@mzabaluev
Copy link
Contributor Author

mzabaluev commented Nov 19, 2024

I would like this to be merged before I add an enum type registry, which is necessary to properly implement imports (and later with editions, the enum_type feature) for #1079.

Copy link
Collaborator

@caspermeijn caspermeijn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see plenty of internal changes that I like: better naming, cleanup of boxed, more helper functions.

  • Is it possible to separate those from the API changes? Then we can that merge that before doing a version number bump
  • Is it possible to separate each improvement into a separate commit? That will make reviewing easier.

buf: &mut String,
) {
impl<'a, 'b> CodeGenerator<'a, 'b> {
pub fn generate(context: &mut Context<'b>, file: FileDescriptorProto, buf: &mut String) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an API breaking change, right? I don't understand how the Context instance should be created by prost users as there is no documentation changes included.

Copy link
Contributor Author

@mzabaluev mzabaluev Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, CodeGenerator is private to the crate, as is the newly added Context.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see: Config::generate is public, but CodeGenerator is private. Can you confirm that no breaking change happened using cargo semver-checks?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. I also think it would help to make this method pub(crate) to alleviate any confusion.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes pub(crate) is a great idea

pub struct CodeGenerator<'a> {
config: &'a mut Config,
pub struct CodeGenerator<'a, 'b> {
context: &'a mut Context<'b>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why some fields are in CodeGenerator, Context or Config. For example, depth also feels like it could be part of Context.

Can you explain when a field should be in each of those structs?

Copy link
Contributor Author

@mzabaluev mzabaluev Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config is the public settings configured by the API user.

Context is a private helper that captures data global to the compilation requests: the message graph, the map of extern paths, and the reference to the Config. I also plan to add a map of enum openness flags for a proper implementation of open enums in #1079.
It is populated before code generation and the instance is shared between successive calls to CodeGenerator::generate for the requested members. This replaces several arguments that did the same thing with the disparate data structures.

The owned members of CodeGenerator manage state specific to code generation for a proto file. So depth belongs there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Context fields are used for multiple files and CodeGenerator fields are discarded after a single file. That makes sense.

Can you add this explanation as a doc comment to the types?

prost-build/src/context.rs Show resolved Hide resolved
pub struct CodeGenerator<'a> {
config: &'a mut Config,
pub struct CodeGenerator<'a, 'b> {
context: &'a mut Context<'b>,
Copy link
Contributor Author

@mzabaluev mzabaluev Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config is the public settings configured by the API user.

Context is a private helper that captures data global to the compilation requests: the message graph, the map of extern paths, and the reference to the Config. I also plan to add a map of enum openness flags for a proper implementation of open enums in #1079.
It is populated before code generation and the instance is shared between successive calls to CodeGenerator::generate for the requested members. This replaces several arguments that did the same thing with the disparate data structures.

The owned members of CodeGenerator manage state specific to code generation for a proto file. So depth belongs there.

prost-build/src/context.rs Show resolved Hide resolved
prost-build/src/context.rs Show resolved Hide resolved
prost-build/src/context.rs Outdated Show resolved Hide resolved
@mzabaluev mzabaluev changed the title Clean up CodeGenerator by moving global data and code to Context Clean up CodeGenerator by moving compilation-global data and logic to Context Nov 20, 2024
Move global context data and logic around it from CodeGenerator
to a new Context struct. Also move the logic of can_message_derive_copy
and can_field_derive_copy from MessageGraph where it caused some
data duplication.
Copy link
Collaborator

@caspermeijn caspermeijn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not every line is improved, but I think this is an improvement overall. I have a few questions that I like to be answered before merging.

pub struct CodeGenerator<'a> {
config: &'a mut Config,
pub struct CodeGenerator<'a, 'b> {
context: &'a mut Context<'b>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Context fields are used for multiple files and CodeGenerator fields are discarded after a single file. That makes sense.

Can you add this explanation as a doc comment to the types?

buf: &mut String,
) {
impl<'a, 'b> CodeGenerator<'a, 'b> {
pub fn generate(context: &mut Context<'b>, file: FileDescriptorProto, buf: &mut String) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see: Config::generate is public, but CodeGenerator is private. Can you confirm that no breaking change happened using cargo semver-checks?

self.buf.push_str(&format!(
"impl {}::Name for {} {{\n",
self.config.prost_path.as_deref().unwrap_or("::prost"),
"impl {prost_path}::Name for {} {{\n",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this. It improves readability.

@@ -297,15 +285,16 @@ impl CodeGenerator<'_> {
self.pop_mod();
}

if self.config.enable_type_names {
if self.context.config().enable_type_names {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a decrease in readability. It adds an extra layer .context and an extra indirection config().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can add a .config() helper method to CodeGenerator.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's leave it like this for now. I wanted to point to a concrete example where I think readability suffered. But I think this PR is a net positive.

prost-build/src/code_generator.rs Outdated Show resolved Hide resolved
prost-build/src/context.rs Show resolved Hide resolved
prost-build/src/context.rs Outdated Show resolved Hide resolved
prost-build/src/context.rs Outdated Show resolved Hide resolved
prost-build/src/context.rs Outdated Show resolved Hide resolved
It was not obvious immediately from the declaration that
CodeGenerator is private to the crate.
Add doc comments clarifying the lifetime of `Context` and
`CodeGenerator` in the code generation process.
Rename to service_generator_mut so it's evident that we are obtaining
a mutable reference even though the Context reference is shared.
Avoid behavioral changes in the Context refactoring.
Avoid behavioral changes in the Context refactoring.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants