-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(gen): implement struct generation from schema registry subjects and versions #498
feat(gen): implement struct generation from schema registry subjects and versions #498
Conversation
gen/gen.go
Outdated
} | ||
|
||
func (g *Generator) generate(schema avro.Schema) string { | ||
func (g *Generator) generate(schema avro.Schema, schemaMetadata *SchemaMetadata, level int) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not entirely sure I see the purpose of schemaMetadata
. It is a very intrusive change for something that brings little value. They are also values that are meaningless outside of a specific registry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I see your point. It does feel intrusive. These values are mostly useful when using Schema registry. The metadata provides information about the subject and the version of some specific schema that is fetched from Schema registry.
However the benefit of these is that the subject and version can be embedded in the generated struct. Whoever will be working with schema registry might certainly appreciate the extra information provided by the generated struct.
I wonder if there could be some other way to embed this information into struct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Trying to attach it to the first struct is not really correct in this case. For instance a schema that is a union -> array -> record
would attach it in a weird place. IMO it would be better to make them consts in the package, meaning it could be attached to the Generator
itself, and handle in the Write
. This would be simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this is a good idea. This will keep the method signatures backwards compatible and still provide information about subject and version. I will implement these changes tomorrow and ask again for a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made the changes, the API is backwards compatible, but there are a few things and limitations I would like to point out and clarify.
At the moment, I can only generate structs for a single subject and version pair, because the SchemaMetadata can only be attached on the generator config, so only once. With files, we can still do multiple files per single generate command.
The subject and version are now constants on the package, but I would really like to attach them on the "root" type, the reason being that it would allow us to work with schemaregistry (de)serializer, e.g.:
type AvroMessage interface {
Schema() avro.Schema
Subject() string
Version() int
}
func (s *Deserializer) DeserializeInto(payload []byte, msg interface{}) error {
avroMsg, ok := v.(AvroMessage)
if !ok {
return nil, errors.WithStack(NoAvroMessageError{v})
}
schemaMetadata, err := m.schemaRegistryClient.GetSchemaMetadata(avroMsg.Subject(), avroMsg.Version())
if err != nil {
return nil, err
}
fmt.Println(schemaMetadata.ID)
<...>
}
func (m *AvroMarshaler) Marshal(v interface{}) ([]byte, error) {
avroMsg, ok := v.(AvroMessage)
if !ok {
return nil, errors.WithStack(NoAvroMessageError{v})
}
schemaMetadata, err := m.schemaRegistryClient.GetSchemaMetadata(avroMsg.Subject(), avroMsg.Version())
if err != nil {
return nil, err
}
config := avrov2.NewSerializerConfig()
config.AutoRegisterSchemas = false
config.UseSchemaID = schemaMetadata.ID
serializer, err := avrov2.NewSerializer(m.schemaRegistryClient, serde.ValueSerde, config)
if err != nil {
return nil, err
}
b, err := serializer.Serialize(avroMsg.Subject(), avroMsg)
if err != nil {
return nil, err
}
return b, nil
}
I am not sure what would be the best way to achieve this and whether it's possible at all. I could also generate the Subject() and Version() functions on each of the types that come from current subject and version pair
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe one of the options here would be to provide a map that could be used to generate custom functions on specified types. E.g.:
Customizations: map[string]map[string]any
Where root map key would be type name, child map key would be function name and value would be the value it returns.
e.g.:
test := map[string]map[string]any{
"SomeType": map[string]string{
"AFunction": "a",
"BFunction": "b",
},
}
============================================
type SomeType struct {
// SomeString is a string.
SomeString string `avro:"someString"`
SomeInt int `avro:"someInt"`
}
func (o *SomeType) AFunction() string {
return "a"
}
func (o *SomeType) BFunction() string {
return "b"
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you are hoping to generate seems out of the scope of avrogen
, but you could use gen
and a custom template to achieve what you need. In this case, the only change necessary is a permissive (read any
) attachment of metadata in the generator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's really a very specific usecase... I guess it's a bit too much to ask without affecting other users.
I have updated avrogen to support reading from schema registry subject versions. It can do it with multiple entries.
And I have also added the any
metadata field for gen
to be able to implement a custom version of avrogen
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor nitpicks
cmd/avrogen/main.go
Outdated
for _, entry := range flgs.Args() { | ||
var schema avro.Schema | ||
|
||
if cfg.SchemaRegistry != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A switch here would be better, as else
is in general a code smell.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Example usage:
I will be happy to help with anything that could potentially speed up the review, merge and release of this change. 🙏