error when using parquet broker load to table with generated column from MAP #49760

Open
Frefreak opened this issue Aug 13, 2024 · 0 comments
Labels
type/bug Something isn't working

Frefreak commented Aug 13, 2024

Steps to reproduce the behavior (Required)

CREATE TABLE `test` (
  `a` varchar(65533),
  `b` map<varchar(65533), varchar(65533)>
) ENGINE = OLAP
DUPLICATE KEY (`a`)
DISTRIBUTED BY RANDOM
ORDER BY(`a`);

CREATE TABLE `test_gen` (
  `a` varchar(65533),
  `b` map<varchar(65533), varchar(65533)>,
  `c` varchar(65533) as b['c']
) ENGINE = OLAP
DUPLICATE KEY (`a`)
DISTRIBUTED BY RANDOM
ORDER BY(`a`);

With an arbitrary CSV file like this:

aaa,"{'a':'11','b':'22','c':'3\'3'}"
bbb,"{'a':'111','b':'222','c':'33\'3'}"
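To make the intended contents of column b explicit, here is a minimal sketch that parses the two sample rows with the Python standard library. It assumes the map text is meant to be read as a Python-style dict literal, with \' as an escaped quote inside a value; `SAMPLE_CSV` and `parse_rows` are illustrative names, not part of the report.

```python
import ast
import csv
import io

# The two sample rows from the report, kept verbatim in a raw string.
SAMPLE_CSV = r"""aaa,"{'a':'11','b':'22','c':'3\'3'}"
bbb,"{'a':'111','b':'222','c':'33\'3'}"
"""


def parse_rows(text):
    """Return (a, b) tuples, where b is the map column parsed into a dict."""
    rows = []
    for a, b in csv.reader(io.StringIO(text)):
        # Assumption: the map text is a Python-style dict literal.
        rows.append((a, ast.literal_eval(b)))
    return rows


for a, b in parse_rows(SAMPLE_CSV):
    print(a, b)
```

Under that assumption, the value stored under key 'c' is 3'3 for the first row and 33'3 for the second, which is what the generated column c would be expected to contain.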

Convert it to Parquet, then try to load it with something like:

LOAD LABEL logs.poc1 ( DATA INFILE("file:///poc.parquet") INTO TABLE test FORMAT AS "parquet" ( a, b)) WITH BROKER allin1broker;
LOAD LABEL logs.poc2 ( DATA INFILE("file:///poc.parquet") INTO TABLE test_gen FORMAT AS "parquet" ( a, b)) WITH BROKER allin1broker;
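The report does not say how the CSV was converted. As a hedged sketch, a file with the same shape (a string column a and a string-to-string map column b) could be produced with pyarrow, assuming it is installed. `ROWS` and `write_poc_parquet` are illustrative names, and this is not necessarily the method used for the attached file.

```python
# The two rows, with each map written as a list of (key, value) pairs.
ROWS = [
    ("aaa", [("a", "11"), ("b", "22"), ("c", "3'3")]),
    ("bbb", [("a", "111"), ("b", "222"), ("c", "33'3")]),
]


def write_poc_parquet(path="poc.parquet"):
    """Write ROWS to a Parquet file and return the Arrow schema."""
    # Assumption: pyarrow is available. It is imported here so that the
    # rest of the snippet can be read and run without it.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.table(
        {
            "a": pa.array([a for a, _ in ROWS], type=pa.string()),
            "b": pa.array(
                [b for _, b in ROWS],
                type=pa.map_(pa.string(), pa.string()),
            ),
        }
    )
    pq.write_table(table, path)
    return table.schema
```

Calling `write_poc_parquet()` writes poc.parquet to the current directory; the file can then be inspected with parquet-tools.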

The generated Parquet file, inspected with parquet-tools, looks like this (the file is also attached; the attachment needs to be renamed):
poc.parquet.txt

❯ parquet-tools show poc.parquet
+-----+---------------------------------------------+
| a   | b                                           |
|-----+---------------------------------------------|
| aaa | [('a', '11'), ('b', '22'), ('c', "3'3")]    |
| bbb | [('a', '111'), ('b', '222'), ('c', "33'3")] |
+-----+---------------------------------------------+

❯ parquet-tools inspect poc.parquet

############ file meta data ############
created_by: parquet-cpp-arrow version 16.1.0
num_columns: 3
num_rows: 2
num_row_groups: 1
format_version: 2.6
serialized_size: 1954


############ Columns ############
a
key
value

############ Column(a) ############
name: a
path: a
max_definition_level: 1
max_repetition_level: 0
physical_type: BYTE_ARRAY
logical_type: String
converted_type (legacy): UTF8
compression: SNAPPY (space_saved: -6%)

############ Column(key) ############
name: key
path: b.key_value.key
max_definition_level: 2
max_repetition_level: 1
physical_type: BYTE_ARRAY
logical_type: String
converted_type (legacy): UTF8
compression: SNAPPY (space_saved: -6%)

############ Column(value) ############
name: value
path: b.key_value.value
max_definition_level: 3
max_repetition_level: 1
physical_type: BYTE_ARRAY
logical_type: String
converted_type (legacy): UTF8
compression: SNAPPY (space_saved: -2%)

Expected behavior (Required)

Both loads succeed.

Real behavior (Required)

The first load, into table test (which has no generated column), succeeds, but the second load, into test_gen with the generated column c, fails with this error:

type:ETL_RUN_FAIL; msg:Cannot cast '<slot 4>' from VARCHAR to MAP<VARCHAR(65533),VARCHAR(65533)> 

StarRocks version (Required)

3.3.1-2b87854 (all in one docker)

BTW, the docs seem unclear on how to Stream Load a CSV into a table with a MAP column. I tried several formats, but each one either reports an error or the map isn't parsed correctly. One way I found is to use str_to_map in the columns header, but my data is hard to convert to that format because my value strings contain many symbols such as ':' and '|', and I'm not sure how to escape them.
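One possible way around the escaping problem is to avoid escaping altogether and serialize each map with delimiters that cannot occur in the data. The sketch below assumes that str_to_map simply splits the string on a pair delimiter and then splits each pair on a key/value delimiter, and that both delimiters can be passed as arguments. Whether Stream Load accepts control characters in that position has not been verified here. `encode_map` and `decode_map` are hypothetical helper names.

```python
def encode_map(mapping, pair_sep="\x01", kv_sep="\x02"):
    """Serialize a dict of strings as key<kv_sep>value pairs joined by pair_sep."""
    for key, value in mapping.items():
        for text in (key, value):
            # Refuse data that would be split in the wrong place.
            if pair_sep in text or kv_sep in text:
                raise ValueError(f"delimiter found in {text!r}")
    return pair_sep.join(f"{k}{kv_sep}{v}" for k, v in mapping.items())


def decode_map(text, pair_sep="\x01", kv_sep="\x02"):
    """Mirror the assumed split-then-split behavior to check a round trip."""
    if not text:
        return {}
    return dict(pair.split(kv_sep, 1) for pair in text.split(pair_sep))


sample = {"a": "1:1|x", "c": "3'3"}
assert decode_map(encode_map(sample)) == sample
```

The same two delimiters would then have to be passed to str_to_map in the columns header. The default delimiters are avoided because ':' and '|' appear in the values.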

Frefreak added the type/bug label on Aug 13, 2024
Frefreak changed the title from "error when using parquet broker load to table with generated field from MAP" to "error when using parquet broker load to table with generated column from MAP" on Aug 13, 2024