Error in serializing avro with union type null and byte #319

Open
pandsaurus opened this issue Jan 8, 2025 · 4 comments
Labels
✨ Feature Request New feature or request 💪 Help Wanted Extra attention is needed

Comments

pandsaurus commented Jan 8, 2025

Hi @mostafa,

Happy new year!

I have a question about serializing Avro with a union type of null and bytes. I am working on a k6 script and hitting the error below (I omitted or anonymized anything I consider sensitive).

time="2025-01-08T02:36:25Z" level=error msg="GoError: Failed to encode data, OriginalError: %!w(*errors.errorString=&{cannot decode textual record \"com.wallet.transaction.avro.TransactionProcessedV1\": cannot decode textual union: cannot decode textual map: cannot determine codec: \"bytes\" for key: \"payoutAmount\"})\n\tat github.com/mostafa/xk6-kafka.(*Kafka).schemaRegistryClientClass.func4 (native)\n\tat serializeSchemaAvro (file:////dist/test.js:2:4419(13))\n\tat produce (file:///test.js:2:4572(11))\n\tat produceTransaction (file://test.js:2:23437(9))\n\tat Ae (file:///test.js:2:66172(8))\n" executor=ramping-arrival-rate scenario=sport_trans_sc source=stacktrace

Below is a snippet of the Avro schema:

{
  "type": "record",
  "name": "transactionProcessedV1",
  "namespace": "com.wallet.transaction.avro",
  "fields": [
    {
      "name": "payoutAmount",
      "type": [
        "null",
        {
          "type": "bytes",
          "logicalType": "decimal",
          "precision": 19,
          "scale": 4
        }
      ],
      "doc": "Payout amount",
      "default": null
    }
  ]
}

I already saw issue #220 and explicitly defined 'bytes' as the type. That was the first thing I tried, since it was also the fix I found a while ago when I hit the same error with a union type of string. As shown above, I still get the error.

I also saw #285 and tried it, but it gave me this error:

panic: cannot encode binary record "com.wallet.transaction.avro.TransactionProcessedV1" field "payoutAmount": value does not match its schema: cannot encode binary union: no member schema types support datum: allowed types: [null bytes.decimal]; received: map[string]interface {}

schema.json
data.json

I am not sure whether this is a limitation of goavro (are bytes not supported in union types?), whether I am doing something wrong, or whether there is a workaround. I am totally clueless about why I am encountering this issue :( .

Owner

mostafa commented Jan 8, 2025

Hey,

Read this #220 (comment).

@pandsaurus
Author

Hi @mostafa,

Thanks for the quick response :) Actually, the approach from the #220 comment is the first thing I tried, but I still get the error above (failed to encode data). Here is a snippet of the data:

{
  "payoutAmount": {
    "bytes": "\u0000"
  }
}

I also tried using nested-avro-schema to debug it, but it gives me this error:

panic: cannot encode binary record "com.wallet.transaction.avro.TransactionProcessedV1" field "payoutAmount": value does not match its schema: cannot encode binary union: no member schema types support datum: allowed types: [null bytes.decimal]; received: map[string]interface {}

I attached the schema and data JSON files I used.

Schema:

{
	"type": "record",
	"name": "TransactionProcessedV1",
	"namespace": "com.wallet.transaction.avro",
	"fields": [
		{
			"name": "payoutAmount",
			"type": [
				"null",
				{
					"type": "bytes",
					"logicalType": "decimal",
					"precision": 19,
					"scale": 4
				}
			],
			"doc": "Payout amount",
			"default": null
		}
	]
}

Data:

{
  "payoutAmount": {
    "bytes": "\u0000"
  }
}

Owner

mostafa commented Jan 8, 2025

The problem arises in handling and converting JSON to Avro, specifically with logical types. This appears to be a missing feature, and it would be great if someone could implement it.

I tested this JSON snippet using the nested-avro-schema tool:

{
  "payoutAmount": {
    "bytes.decimal": "\u0000"
  }
}

Note the bytes.decimal key instead of the bytes key. However, it resulted in the following error:

panic: cannot encode binary record "com.wallet.transaction.avro.TransactionProcessedV1" field "payoutAmount": value does not match its schema: cannot transform to bytes, expected *big.Rat, received string

This error indicates that goavro expects the bytes.decimal logical type to be explicitly provided as a *big.Rat (math/big package) in Go. If you pass a number, it will be interpreted as a float64, which cannot be automatically converted to *big.Rat by goavro.
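To make the type mismatch concrete, here is a minimal sketch that uses only the Go standard library. It shows that encoding/json decodes a JSON number into float64 when the target is a generic map, and that an exact *big.Rat can be built from a decimal string. The function names decodedType and parseDecimal and the sample value 12.3400 are made up for illustration; none of this is xk6-kafka or goavro code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/big"
)

// decodedType reports the Go type that encoding/json chooses for the
// "payoutAmount" field when decoding into a generic map.
// Hypothetical helper, for illustration only.
func decodedType(raw string) (string, error) {
	var doc map[string]interface{}
	if err := json.Unmarshal([]byte(raw), &doc); err != nil {
		return "", err
	}
	return fmt.Sprintf("%T", doc["payoutAmount"]), nil
}

// parseDecimal turns a decimal string such as "12.3400" into an exact
// *big.Rat, the type named in the goavro error message above.
// Hypothetical helper, for illustration only.
func parseDecimal(s string) (*big.Rat, error) {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return nil, fmt.Errorf("invalid decimal %q", s)
	}
	return r, nil
}

func main() {
	t, err := decodedType(`{"payoutAmount": 12.34}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(t) // float64

	r, err := parseDecimal("12.3400")
	if err != nil {
		panic(err)
	}
	fmt.Println(r.FloatString(4)) // 12.3400
}
```

Parsing from a string rather than from the float64 avoids picking up binary floating-point rounding before the value reaches the serializer.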

To address this, all logical types in the JSON input must be identified and converted separately to their corresponding Go types before being passed to the Avro serializer. This should be handled in the AvroSerde Serialize function—either immediately after or possibly before the toJSONBytes call—so that goavro can correctly process these fields.

If you are (or anyone is) interested in contributing this enhancement, I’d be happy to review the PR and merge it promptly.

mostafa added the ✨ Feature Request label Jan 8, 2025
mostafa added this to xk6-kafka Jan 8, 2025
github-project-automation bot moved this to Todo in xk6-kafka Jan 8, 2025
mostafa added the 💪 Help Wanted label Jan 8, 2025
Author

pandsaurus commented Jan 9, 2025

Hi @mostafa,

Thank you for the clarification. I am not sure I have the technical knowledge for this, but I will be happy to help and give it a try as soon as I can. For now I have a workaround (sending a set of API requests to our internal service).

Projects
Status: Todo
Development

No branches or pull requests

2 participants