Avro to C# and Java
Signed-off-by: Clemens Vasters <clemens@vasters.com>
clemensv committed Apr 8, 2024
1 parent d76c216 commit ef5cb0c
Showing 8 changed files with 803 additions and 4 deletions.
66 changes: 65 additions & 1 deletion README.md
@@ -8,7 +8,12 @@ You can use the tool to convert between Avro Schema and other schema formats
like JSON Schema, XML Schema (XSD), Protocol Buffers (Protobuf), ASN.1, and
database schema formats like Kusto Data Table Definition (KQL) and T-SQL Table
Definition (SQL). That means you can also convert from JSON Schema to Protobuf
going via Avro Schema.

You can also generate C# and Java code from Avro Schema documents with
Avrotize. Unlike the native Avro tools, Avrotize can emit data classes
without Avro library dependencies and, optionally, with annotations
for JSON serialization libraries like Jackson or System.Text.Json.

The tool does not convert data (instances of schemas), only the data structure
definitions.
@@ -126,6 +131,11 @@ Converting from Avro Schema:
- [`avrotize a2tsql`](#convert-avro-schema-to-t-sql-table-definition) - Convert Avro schema to T-SQL table definition.
- [`avrotize a2pq`](#convert-avro-schema-to-empty-parquet-file) - Convert Avro schema to empty Parquet file.

Generate code from Avro Schema:

- [`avrotize a2csharp`](#generate-c-code-from-avro-schema) - Generate C# code from Avro schema.
- [`avrotize a2java`](#generate-java-code-from-avro-schema) - Generate Java code from Avro schema.

### Convert Proto schema to Avro schema

```bash
@@ -392,6 +402,60 @@ Conversion notes:
to structures, not to Parquet unions since those are not supported by the
PyArrow library used here.

### Generate C# code from Avro schema

```bash
avrotize a2csharp --avsc <path_to_avro_schema_file> --csharp <path_to_csharp_directory> [--avro-annotation] [--system-text-json-annotation] [--newtonsoft-json-annotation] [--pascal-properties]
```

Parameters:
- `--avsc`: The path to the Avro schema file to be converted.
- `--csharp`: The path to the C# directory to write the conversion result to.
- `--avro-annotation`: (optional) If set, the tool will add Avro annotations to the C# classes.
- `--system-text-json-annotation`: (optional) If set, the tool will add System.Text.Json annotations to the C# classes.
- `--newtonsoft-json-annotation`: (optional) If set, the tool will add Newtonsoft.Json annotations to the C# classes.
- `--pascal-properties`: (optional) If set, the tool will use PascalCase properties in the C# classes.

Conversion notes:
- The tool generates C# classes that represent the Avro schema as data classes.
- Using the `--system-text-json-annotation` or `--newtonsoft-json-annotation` option
will add annotations for the respective JSON serialization library to the generated
C# classes. Because the [`JSON Schema to Avro`](#convert-json-schema-to-avro-schema) conversion generally
preserves the JSON Schema structure in the Avro schema, the generated C# classes
can be used to serialize and deserialize data that is valid per the input JSON schema.
- The classes are generated into a directory structure that reflects the Avro namespace
structure. The tool drops a minimal, default `.csproj` project file into the given
directory if none exists.
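
The following is a minimal sketch, not part of Avrotize itself, of driving `a2csharp` from a Python script. It uses only the flags listed above. The helper name `build_a2csharp_command` and the sample `Person` schema are hypothetical and chosen purely for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical sample schema: record name, namespace, and fields are illustrative only.
SAMPLE_SCHEMA = {
    "type": "record",
    "name": "Person",
    "namespace": "com.example",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": ["null", "int"], "default": None},
    ],
}


def build_a2csharp_command(avsc_path, csharp_dir, *, system_text_json=False,
                           newtonsoft_json=False, pascal_properties=False):
    """Assemble an a2csharp command line from the flags documented above."""
    cmd = ["avrotize", "a2csharp", "--avsc", str(avsc_path), "--csharp", str(csharp_dir)]
    if system_text_json:
        cmd.append("--system-text-json-annotation")
    if newtonsoft_json:
        cmd.append("--newtonsoft-json-annotation")
    if pascal_properties:
        cmd.append("--pascal-properties")
    return cmd


if __name__ == "__main__":
    # Write the sample schema to a temporary directory and show the command.
    work_dir = Path(tempfile.mkdtemp())
    avsc_path = work_dir / "person.avsc"
    avsc_path.write_text(json.dumps(SAMPLE_SCHEMA, indent=2), encoding="utf-8")
    cmd = build_a2csharp_command(avsc_path, work_dir / "csharp",
                                 system_text_json=True, pascal_properties=True)
    # With avrotize installed, the list can be passed to subprocess.run(cmd, check=True).
    print(" ".join(cmd))
```

Building the command as an argument list rather than a single shell string avoids quoting problems when paths contain spaces.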


### Generate Java code from Avro schema

```bash
avrotize a2java --avsc <path_to_avro_schema_file> --java <path_to_java_directory> [--package <java_package_name>] [--avro-annotation] [--jackson-annotation] [--pascal-properties]
```

Parameters:
- `--avsc`: The path to the Avro schema file to be converted.
- `--java`: The path to the Java directory to write the conversion result to.
- `--package`: (optional) The Java package name to use in the generated Java classes.
- `--avro-annotation`: (optional) If set, the tool will add Avro annotations to the Java classes.
- `--jackson-annotation`: (optional) If set, the tool will add Jackson annotations to the Java classes.
- `--pascal-properties`: (optional) If set, the tool will use PascalCase properties in the Java classes.

Conversion notes:

- The tool generates Java classes that represent the Avro schema as data classes.
- Using the `--jackson-annotation` option will add annotations for the Jackson
JSON serialization library to the generated Java classes. Because the
[`JSON Schema to Avro`](#convert-json-schema-to-avro-schema) conversion generally
preserves the JSON Schema structure in the Avro schema, the generated Java classes
can be used to serialize and deserialize data that is valid per the input JSON schema.
- A `src/main/java` directory is created under the specified directory and the
generated Java classes are written to it. The tool drops a
minimal, default `pom.xml` Maven project file into the specified directory if none
exists.
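
To check the result of an `a2java` run, a script can look for the layout described in the notes above. The sketch below is an illustration under assumptions: the helper name `describe_java_output` is hypothetical, and because the notes do not say how classes are arranged beneath `src/main/java`, it searches that directory recursively. The demo simulates an output directory with made-up file names instead of invoking the tool.

```python
import tempfile
from pathlib import Path


def describe_java_output(java_dir):
    """Summarize an a2java output directory: source root, pom.xml, and .java files."""
    root = Path(java_dir)
    source_root = root / "src" / "main" / "java"
    java_files = []
    if source_root.is_dir():
        # Recursive search, because the notes do not say how files are
        # arranged below the source root.
        java_files = sorted(
            p.relative_to(source_root).as_posix() for p in source_root.rglob("*.java")
        )
    return {
        "source_root_exists": source_root.is_dir(),
        "has_pom": (root / "pom.xml").is_file(),
        "java_files": java_files,
    }


if __name__ == "__main__":
    # Simulated output directory; the package path and class name are invented.
    out_dir = Path(tempfile.mkdtemp())
    pkg_dir = out_dir / "src" / "main" / "java" / "com" / "example"
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "Person.java").write_text("// placeholder\n", encoding="utf-8")
    (out_dir / "pom.xml").write_text("<project/>\n", encoding="utf-8")
    print(describe_java_output(out_dir))
```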


## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
38 changes: 37 additions & 1 deletion avrotize/avrotize.py
@@ -1,9 +1,10 @@
import argparse
from avrotize.asn1toavro import convert_asn1_to_avro
from avrotize.avrotocsharp import convert_avro_to_csharp
from avrotize.avrotojava import convert_avro_to_java
from avrotize.avrotojsons import convert_avro_to_json_schema
from avrotize.avrotokusto import convert_avro_to_kusto
from avrotize.avrotoparquet import convert_avro_to_parquet

from avrotize.avrotoproto import convert_avro_to_proto
from avrotize.avrototsql import convert_avro_to_tsql
from avrotize.jsonstoavro import convert_jsons_to_avro
@@ -70,6 +71,23 @@ def main():
    kstruct2a_parser = subparsers.add_parser('kstruct2a', help='Convert Kafka Struct to Avro schema')
    kstruct2a_parser.add_argument('--kstruct', type=str, help='Path to the Kafka Struct file', required=True)
    kstruct2a_parser.add_argument('--avsc', type=str, help='Path to the Avro schema file', required=True)

    a2csharp_parser = subparsers.add_parser('a2csharp', help='Convert Avro schema to C# classes')
    a2csharp_parser.add_argument('--avsc', type=str, help='Path to the Avro schema file', required=True)
    a2csharp_parser.add_argument('--csharp', type=str, help='Output path for the C# classes', required=True)
    a2csharp_parser.add_argument('--avro-annotation', action='store_true', help='Use Avro annotations', default=False)
    a2csharp_parser.add_argument('--system-text-json-annotation', action='store_true', help='Use System.Text.Json annotations', default=False)
    a2csharp_parser.add_argument('--newtonsoft-json-annotation', action='store_true', help='Use Newtonsoft.Json annotations', default=False)
    a2csharp_parser.add_argument('--pascal-properties', action='store_true', help='Use PascalCase properties', default=False)

    a2java_parser = subparsers.add_parser('a2java', help='Convert Avro schema to Java classes')
    a2java_parser.add_argument('--avsc', type=str, help='Path to the Avro schema file', required=True)
    a2java_parser.add_argument('--java', type=str, help='Output path for the Java classes', required=True)
    a2java_parser.add_argument('--package', type=str, help='Java package name', required=False)
    a2java_parser.add_argument('--avro-annotation', action='store_true', help='Use Avro annotations', default=False)
    a2java_parser.add_argument('--jackson-annotation', action='store_true', help='Use Jackson annotations', default=False)
    a2java_parser.add_argument('--pascal-properties', action='store_true', help='Use PascalCase properties', default=False)


    args = parser.parse_args()
    if args.command is None:
@@ -142,6 +160,24 @@ def main():
        avro_schema_path = args.avsc
        print(f'Converting Kafka Struct {kstruct_file_path} to Avro {avro_schema_path}')
        convert_kafka_struct_to_avro_schema(kstruct_file_path, avro_schema_path)
    elif args.command == 'a2csharp':
        avro_schema_path = args.avsc
        csharp_path = args.csharp
        avro_annotation = args.avro_annotation
        system_text_json_annotation = args.system_text_json_annotation
        newtonsoft_json_annotation = args.newtonsoft_json_annotation
        pascal_properties = args.pascal_properties
        print(f'Converting Avro {avro_schema_path} to C# {csharp_path}')
        convert_avro_to_csharp(avro_schema_path, csharp_path, avro_annotation=avro_annotation, system_text_json_annotation=system_text_json_annotation, newtonsoft_json_annotation=newtonsoft_json_annotation, pascal_properties=pascal_properties)
    elif args.command == 'a2java':
        avro_schema_path = args.avsc
        java_path = args.java
        package = args.package
        avro_annotation = args.avro_annotation
        jackson_annotation = args.jackson_annotation
        pascal_properties = args.pascal_properties
        print(f'Converting Avro {avro_schema_path} to Java {java_path}')
        convert_avro_to_java(avro_schema_path, java_path, package_name=package, avro_annotation=avro_annotation, jackson_annotation=jackson_annotation, pascal_properties=pascal_properties)

if __name__ == "__main__":
try:
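
The parser wiring in this file relies on two `argparse` behaviors: dashes in an option name become underscores in the parsed attribute name, and `store_true` flags default to `False` when omitted. A reduced, self-contained sketch of the `a2java` subcommand shows both. It mirrors some of the argument definitions from the diff but is not the tool's actual entry point, and `build_parser` is a hypothetical helper name.

```python
import argparse


def build_parser():
    """Build a reduced parser with only an a2java-like subcommand."""
    parser = argparse.ArgumentParser(description="Reduced sketch of the avrotize CLI")
    subparsers = parser.add_subparsers(dest="command")
    a2java = subparsers.add_parser("a2java", help="Convert Avro schema to Java classes")
    a2java.add_argument("--avsc", type=str, required=True)
    a2java.add_argument("--java", type=str, required=True)
    a2java.add_argument("--package", type=str, required=False)
    # '--jackson-annotation' is exposed as args.jackson_annotation after parsing.
    a2java.add_argument("--jackson-annotation", action="store_true")
    a2java.add_argument("--pascal-properties", action="store_true")
    return parser


if __name__ == "__main__":
    # Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
    args = build_parser().parse_args(
        ["a2java", "--avsc", "person.avsc", "--java", "out", "--jackson-annotation"]
    )
    print(args.command, args.jackson_annotation, args.pascal_properties, args.package)
```

Run as a script, this prints `a2java True False None`: the flag that was passed is `True`, the omitted flag is `False`, and the omitted `--package` option is `None`.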