[feature](serde) support presto compatible output format (#37039) #37253

Merged: 3 commits, Jul 4, 2024

morningman

Backport of #37039.

The output format of some data types differs between Presto/Trino and Doris,
especially for complex types such as array, map, and struct.
Users migrating from Presto to Doris expect the same format, so that they
don't need to modify their business code.

This PR mainly changes:

1. Add a new session variable `serde_dialect`
The default is `doris`; the other options are `presto` and `trino`. When it is
set to `presto` or `trino`, the output format returned to the MySQL client
changes for some data types:

    - Array
        Doris: `["abc", "def", "", null]`
        Presto: `[abc, def, , NULL]`

    - Map
        Doris: `{"k1":null, "k2":"v3"}`
        Presto: `{k1=NULL, k2=v3}`

    - Struct
        Doris: `{"s_id":100, "s_name":"abc , "", "s_address":null}`
        Presto: `{s_id=100, s_name=abc , ", s_address=NULL}`

2. Change the output format of struct type

    Remove the space after `:`

    - Before: `{"s_id": 100, "s_name": "abc , "", "s_address": null}`
    - After: `{"s_id":100, "s_name":"abc , "", "s_address":null}`
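
To make the array and map differences listed under change 1 concrete, here is a minimal, self-contained C++17 sketch. It is not the code in this PR: `Dialect`, `FormatOptions`, `options_for`, `format_array` and `format_map` are hypothetical names, and values are modeled as optional strings only. It assumes the dialects differ in just three things (string quoting, the null literal, and the map key delimiter) and, like the examples above, does no escaping.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration types; not the Doris implementation.
enum class Dialect { Doris, Presto };

struct FormatOptions {
    std::string string_wrapper; // placed around nested strings ("" = no quoting)
    std::string null_literal;   // text emitted for a NULL element
    char map_key_delim;         // separator between a map key and its value
};

FormatOptions options_for(Dialect d) {
    if (d == Dialect::Presto) return {"", "NULL", '='};
    return {"\"", "null", ':'};
}

// A nullable string element; std::nullopt stands for SQL NULL.
using Cell = std::optional<std::string>;

std::string format_cell(const Cell& c, const FormatOptions& o) {
    if (!c) return o.null_literal;
    return o.string_wrapper + *c + o.string_wrapper;
}

std::string format_array(const std::vector<Cell>& items, Dialect d) {
    const FormatOptions o = options_for(d);
    std::string out = "[";
    for (std::size_t i = 0; i < items.size(); ++i) {
        if (i != 0) out += ", ";
        out += format_cell(items[i], o);
    }
    out += "]";
    return out;
}

std::string format_map(const std::vector<std::pair<Cell, Cell>>& entries, Dialect d) {
    const FormatOptions o = options_for(d);
    std::string out = "{";
    for (std::size_t i = 0; i < entries.size(); ++i) {
        if (i != 0) out += ", ";
        out += format_cell(entries[i].first, o);
        out += o.map_key_delim;
        out += format_cell(entries[i].second, o);
    }
    out += "}";
    return out;
}
```

With the array `abc, def, <empty string>, NULL`, `format_array` yields `["abc", "def", "", null]` for `Dialect::Doris` and `[abc, def, , NULL]` for `Dialect::Presto`, matching the examples in the description.
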
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@morningman

run buildall

@github-actions bot left a comment

clang-tidy made some suggestions

@@ -397,7 +397,8 @@ void DataTypeMapSerDe::read_column_from_arrow(IColumn& column, const arrow::Arra
template <bool is_binary_format>
Status DataTypeMapSerDe::_write_column_to_mysql(const IColumn& column,

warning: function '_write_column_to_mysql' has cognitive complexity of 84 (threshold 50) [readability-function-cognitive-complexity]

Status DataTypeMapSerDe::_write_column_to_mysql(const IColumn& column,
                         ^
Additional context

be/src/vec/data_types/serde/data_type_map_serde.cpp:409: +1, including nesting penalty of 0, nesting level increased to 1

    if (0 != result.push_string("{", 1)) {
    ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:413: +1, including nesting penalty of 0, nesting level increased to 1

    for (auto j = offsets[col_index - 1]; j < offsets[col_index]; ++j) {
    ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:414: +2, including nesting penalty of 1, nesting level increased to 2

        if (j != offsets[col_index - 1]) {
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:415: +3, including nesting penalty of 2, nesting level increased to 3

            if (0 != result.push_string(", ", 2)) {
            ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:419: +2, including nesting penalty of 1, nesting level increased to 2

        if (nested_keys_column.is_null_at(j)) {
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:420: +3, including nesting penalty of 2, nesting level increased to 3

            if (0 != result.push_string(options.null_format, options.null_len)) {
            ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:423: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:424: +3, including nesting penalty of 2, nesting level increased to 3

            if (is_key_string && options.wrapper_len > 0) {
            ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:424: +1

            if (is_key_string && options.wrapper_len > 0) {
                              ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:425: +4, including nesting penalty of 3, nesting level increased to 4

                if (0 != result.push_string(options.nested_string_wrapper, options.wrapper_len)) {
                ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:428: +4, including nesting penalty of 3, nesting level increased to 4

                RETURN_IF_ERROR(key_serde->write_column_to_mysql(nested_keys_column, result, j,
                ^

be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:428: +5, including nesting penalty of 4, nesting level increased to 5

                RETURN_IF_ERROR(key_serde->write_column_to_mysql(nested_keys_column, result, j,
                ^

be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:430: +4, including nesting penalty of 3, nesting level increased to 4

                if (0 != result.push_string(options.nested_string_wrapper, options.wrapper_len)) {
                ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:433: +1, nesting level increased to 3

            } else {
              ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:434: +4, including nesting penalty of 3, nesting level increased to 4

                RETURN_IF_ERROR(key_serde->write_column_to_mysql(nested_keys_column, result, j,
                ^

be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:434: +5, including nesting penalty of 4, nesting level increased to 5

                RETURN_IF_ERROR(key_serde->write_column_to_mysql(nested_keys_column, result, j,
                ^

be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:438: +2, including nesting penalty of 1, nesting level increased to 2

        if (0 != result.push_string(&options.map_key_delim, 1)) {
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:441: +2, including nesting penalty of 1, nesting level increased to 2

        if (nested_values_column.is_null_at(j)) {
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:442: +3, including nesting penalty of 2, nesting level increased to 3

            if (0 != result.push_string(options.null_format, options.null_len)) {
            ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:445: +1, nesting level increased to 2

        } else {
          ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:446: +3, including nesting penalty of 2, nesting level increased to 3

            if (is_val_string && options.wrapper_len > 0) {
            ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:446: +1

            if (is_val_string && options.wrapper_len > 0) {
                              ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:447: +4, including nesting penalty of 3, nesting level increased to 4

                if (0 != result.push_string(options.nested_string_wrapper, options.wrapper_len)) {
                ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:450: +4, including nesting penalty of 3, nesting level increased to 4

                RETURN_IF_ERROR(value_serde->write_column_to_mysql(nested_values_column, result, j,
                ^

be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:450: +5, including nesting penalty of 4, nesting level increased to 5

                RETURN_IF_ERROR(value_serde->write_column_to_mysql(nested_values_column, result, j,
                ^

be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:452: +4, including nesting penalty of 3, nesting level increased to 4

                if (0 != result.push_string(options.nested_string_wrapper, options.wrapper_len)) {
                ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:455: +1, nesting level increased to 3

            } else {
              ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:456: +4, including nesting penalty of 3, nesting level increased to 4

                RETURN_IF_ERROR(value_serde->write_column_to_mysql(nested_values_column, result, j,
                ^

be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'

    do {                                \
    ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:456: +5, including nesting penalty of 4, nesting level increased to 5

                RETURN_IF_ERROR(value_serde->write_column_to_mysql(nested_values_column, result, j,
                ^

be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'

        if (UNLIKELY(!_status_.ok())) { \
        ^

be/src/vec/data_types/serde/data_type_map_serde.cpp:461: +1, including nesting penalty of 0, nesting level increased to 1

    if (0 != result.push_string("}", 1)) {
    ^
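
Most of the reported complexity comes from the same null / wrapped-string / bare-value branching being written out twice, once for keys and once for values. One possible way to flatten it, sketched here with stand-in types, is to move that branching into a helper. `Status`, `RowBuffer`, `Options`, `push` and `write_element` below are hypothetical simplifications for illustration, not the Doris classes, and this is not what the PR does.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Stand-in types for illustration only; the real Doris types differ.
struct Status {
    bool success = true;
    std::string message;
    bool ok() const { return success; }
};

Status ok_status() { return Status{}; }

Status error_status(std::string m) {
    Status s;
    s.success = false;
    s.message = std::move(m);
    return s;
}

struct RowBuffer {
    std::string data;
    // Returns 0 on success, mirroring the `0 != result.push_string(...)` checks above.
    int push_string(const char* s, std::size_t len) {
        data.append(s, len);
        return 0;
    }
};

struct Options {
    const char* nested_string_wrapper = "\"";
    std::size_t wrapper_len = 1;
    const char* null_format = "null";
    std::size_t null_len = 4;
};

// Callback that writes the element itself (stands in for the nested serde call).
using WriteFn = std::function<Status(RowBuffer&)>;

Status push(RowBuffer& out, const char* s, std::size_t len) {
    if (out.push_string(s, len) != 0) return error_status("push_string failed");
    return ok_status();
}

// Writes one map key or value: the null literal, a wrapped string, or a bare value.
Status write_element(RowBuffer& out, const Options& opt, bool is_null, bool is_string,
                     const WriteFn& write_value) {
    if (is_null) return push(out, opt.null_format, opt.null_len);
    const bool wrap = is_string && opt.wrapper_len > 0;
    if (wrap) {
        Status st = push(out, opt.nested_string_wrapper, opt.wrapper_len);
        if (!st.ok()) return st;
    }
    Status st = write_value(out);
    if (!st.ok()) return st;
    if (wrap) return push(out, opt.nested_string_wrapper, opt.wrapper_len);
    return ok_status();
}
```

With a helper of this shape, the loop body would call `write_element` once for the key and once for the value, so the deepest nesting stays inside the helper instead of being counted twice in `_write_column_to_mysql`.
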

@morningman

run buildall

@morningman

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.32% (9149/25189)
Line Coverage: 27.86% (74665/267972)
Region Coverage: 26.75% (38502/143907)
Branch Coverage: 23.45% (19511/83212)
Coverage Report: http://coverage.selectdb-in.cc/coverage/be9417607e8415aae346f3c7bfdb927eefc77e9c_be9417607e8415aae346f3c7bfdb927eefc77e9c/report/index.html

@morningman merged commit ceef9ee into apache:branch-2.1 on Jul 4, 2024
19 of 21 checks passed
@yiguolei mentioned this pull request on Jul 19, 2024