
Check if can set a default value if some value is not present in materialized column #2594

Closed
nityanandagohain opened this issue Apr 19, 2023 · 32 comments · Fixed by #3275, SigNoz/signoz-db-migrations#4, SigNoz/signoz-otel-collector#164 or SigNoz/signoz.io#1037
Comments

@nityanandagohain
Member

Right now, if a key is not present, the default value is set by ClickHouse.
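
For context, indexOf returns 0 when a key is absent, and array access at index 0 falls back to the element type's default value (0 for Float64), which is where the unwanted default comes from. A minimal sketch:

select
    indexOf(['a', 'b'], 'bytes') as idx,  -- 0: key not present
    [1.5, 2.5][idx] as val                -- 0: out-of-range access returns the Float64 default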

@nityanandagohain
Member Author

nityanandagohain commented Apr 20, 2023

We can use an if condition to set a default value.

If I want to create a materialized column for bytes, this is the code to do it:

alter table logs add column bytes Float64 materialized if(indexOf(attributes_float64_key, 'bytes') != 0, attributes_float64_value[indexOf(attributes_float64_key, 'bytes')], 99999999) CODEC(LZ4)

Where 99999999 is the default value.
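
A quick way to see both branches of that expression as standalone selects (the values here are made up for illustration):

select if(indexOf(['status', 'bytes'], 'bytes') != 0, [200.0, 512.0][indexOf(['status', 'bytes'], 'bytes')], 99999999);  -- key present: 512
select if(indexOf(['status'], 'bytes') != 0, [200.0][indexOf(['status'], 'bytes')], 99999999);  -- key absent: 99999999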

@srikanthccv @ankitnayan

The SHOW CREATE TABLE command returns:

┌─statement──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ CREATE TABLE signoz_logs.logs
(
    `timestamp` UInt64 CODEC(DoubleDelta, LZ4),
    `observed_timestamp` UInt64 CODEC(DoubleDelta, LZ4),
    `id` String CODEC(ZSTD(1)),
    `trace_id` String CODEC(ZSTD(1)),
    `span_id` String CODEC(ZSTD(1)),
    `trace_flags` UInt32,
    `severity_text` LowCardinality(String) CODEC(ZSTD(1)),
    `severity_number` UInt8,
    `body` String CODEC(ZSTD(2)),
    `resources_string_key` Array(String) CODEC(ZSTD(1)),
    `resources_string_value` Array(String) CODEC(ZSTD(1)),
    `attributes_string_key` Array(String) CODEC(ZSTD(1)),
    `attributes_string_value` Array(String) CODEC(ZSTD(1)),
    `attributes_int64_key` Array(String) CODEC(ZSTD(1)),
    `attributes_int64_value` Array(Int64) CODEC(ZSTD(1)),
    `attributes_float64_key` Array(String) CODEC(ZSTD(1)),
    `attributes_float64_value` Array(Float64) CODEC(ZSTD(1)),
    `bytes` Float64 MATERIALIZED if(indexOf(attributes_float64_key, 'bytes') != 0, attributes_float64_value[indexOf(attributes_float64_key, 'bytes')], 99999999) CODEC(LZ4),
    INDEX body_idx body TYPE tokenbf_v1(10240, 3, 0) GRANULARITY 4,
    INDEX id_minmax id TYPE minmax GRANULARITY 1
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp / 1000000000)
ORDER BY (timestamp, id)
TTL toDateTime(timestamp / 1000000000) + toIntervalSecond(604800)
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1 │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

@makeavish
Member

makeavish commented Apr 20, 2023

@nityanandagohain We are still applying a condition on bytes value == 0. But zero can be a value intentionally set by the user.

@nityanandagohain
Member Author

No @makeavish, we are checking whether the bytes key is present or not. Array indexes in ClickHouse start from 1, so indexOf returns 0 only when the key is absent.

@srikanthccv
Member

“Still we are applying condition on bytes value == 0. But zero can be the intended value set by the user”

No, it's checking whether the key exists in the array and using its value; otherwise it defaults to 99999999.

It's good to know we can do that, but we should also get an idea of whether NULL will really be a perf problem for us. We are not going to put the materialized column in the ORDER BY/PRIMARY KEY of the table, so it shouldn't be a problem. We should skip NULLs in the (PRE)WHERE clause for some queries and see whether that changes the perf concern.
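
For example, if bytes were made Nullable instead of using a sentinel, the kind of query meant here would look roughly like this (a sketch, not the final schema):

select count() from signoz_logs.logs prewhere isNotNull(bytes) where bytes > 1024;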

@nityanandagohain
Member Author

Here are some results; there are perf issues with nullable columns. The performance drops by half with nullable columns.

Non-nullable column perf

Elapsed: 9.874 sec. Processed 300.00 million rows, 2.40 GB (30.38 million rows/s., 243.07 MB/s.)

Elapsed: 1.666 sec. Processed 300.00 million rows, 2.40 GB (180.06 million rows/s., 1.44 GB/s.)

Nullable column perf

Elapsed: 12.333 sec. Processed 300.00 million rows, 2.70 GB (24.32 million rows/s., 218.92 MB/s.)

Elapsed: 3.201 sec. Processed 300.00 million rows, 2.70 GB (93.71 million rows/s., 843.42 MB/s.)

As you can see, apart from the first query (which is run with settings min_bytes_to_use_direct_io=1), the performance drops by half.

The ClickHouse team also uses placeholder values for their tables, according to their comment last year: https://groups.google.com/g/clickhouse/c/AP2FbQ-uoj8/m/SKHRSgIYBwAJ

https://groups.google.com/g/clickhouse/c/AP2FbQ-uoj8

https://www.notion.so/signoz/Nullable-columns-in-Clickhouse-ad40c1cb2f6046a18eb73713480f72ee?pvs=4

@nityanandagohain
Member Author

Check out results for a column having:

  • data and max (not null)
  • data and NULL

Also perf on:

  • null-heavy vs not-null-heavy (while filtering out null values)

@nityanandagohain
Member Author

nityanandagohain commented Apr 26, 2023

Steps performed

Create the table

create table test(A Int64, B Int64, nonil Nullable(Int64), halfnil Nullable(Int64), mostnil Nullable(Int64)) Engine=MergeTree order by A;

Insert the data

insert into test select number id, id % 9973 x, x, if(x%2==0,x,NULL), if(rand(1)%5==0,x, NULL) from numbers (300000000);
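
As a quick sanity check on the fill pattern (halfnil should be about half NULL and mostnil about 80% NULL, since rand(1) % 5 == 0 holds for roughly one row in five):

select countIf(isNull(halfnil)) / count() as halfnil_null_frac, countIf(isNull(mostnil)) / count() as mostnil_null_frac from test;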

Group by normal (B)

select count(), B from test group by B format Null settings min_bytes_to_use_direct_io=1;

15.255 sec. Processed 300.00 million rows, 2.40 GB (19.67 million rows/s., 157.32 MB/s.)
12.681 sec. Processed 300.00 million rows, 2.40 GB (23.66 million rows/s., 189.25 MB/s.)

----
select count(), B from test group by B format Null;
1.363 sec. Processed 300.00 million rows, 2.40 GB (220.12 million rows/s., 1.76 GB/s.)
0.850 sec. Processed 300.00 million rows, 2.40 GB (352.85 million rows/s., 2.82 GB/s.)

Group by nullable, all values filled (nonil)

select count(), nonil from test group by nonil format Null settings min_bytes_to_use_direct_io=1;

12.065 sec. Processed 300.00 million rows, 2.70 GB (24.87 million rows/s., 223.79 MB/s.)
14.229 sec. Processed 300.00 million rows, 2.70 GB (21.08 million rows/s., 189.75 MB/s.)

----
select count(), nonil from test group by nonil format Null;
2.651 sec. Processed 300.00 million rows, 2.70 GB (113.16 million rows/s., 1.02 GB/s.)
2.286 sec. Processed 300.00 million rows, 2.70 GB (131.24 million rows/s., 1.18 GB/s.)

Group by nullable, but half the values are null (halfnil)

select count(), halfnil from test where isNotNull(halfnil) group by halfnil format Null settings min_bytes_to_use_direct_io=1;

14.944 sec. Processed 300.00 million rows, 2.70 GB (20.08 million rows/s., 180.68 MB/s.)
9.424 sec. Processed 300.00 million rows, 2.70 GB (31.84 million rows/s., 286.52 MB/s.)
----
select count(), halfnil from test where isNotNull(halfnil) group by halfnil format Null;
2.793 sec. Processed 300.00 million rows, 2.70 GB (107.42 million rows/s., 966.81 MB/s.)
1.923 sec. Processed 300.00 million rows, 2.70 GB (155.98 million rows/s., 1.40 GB/s.)

Group by nullable, most of the values are null (mostnil)

select count(), mostnil from test where isNotNull(mostnil) group by mostnil format Null settings min_bytes_to_use_direct_io=1;

Elapsed: 10.257 sec. Processed 300.00 million rows, 2.70 GB (29.25 million rows/s., 263.22 MB/s.)
Elapsed: 11.072 sec. Processed 300.00 million rows, 2.70 GB (27.10 million rows/s., 243.86 MB/s.)

----
select count(), mostnil from test where isNotNull(mostnil) group by mostnil format Null;
1.470 sec. Processed 300.00 million rows, 2.70 GB (204.12 million rows/s., 1.84 GB/s.)
1.516 sec. Processed 300.00 million rows, 2.70 GB (197.83 million rows/s., 1.78 GB/s.)

Conclusion

From the above results, we can see that the non-nullable column is faster, regardless of how much data is filled in the nullable columns.

ClickHouse also suggests avoiding Nullable columns (https://clickhouse.com/docs/en/optimize/avoid-nullable-columns): "Using Nullable almost always negatively affects performance, keep this in mind when designing your databases."

@srikanthccv
Member

Can you create an associated boolean column ({column}_in_keys) for those which could contain null values, replace isNotNull with != false, and share the result and the size of the associated column? In your test example the additional columns would be halfnil_in_keys and mostnil_in_keys, and where isNotNull(halfnil) becomes where halfnil_in_keys != false, where halfnil_in_keys is true if the key exists and false otherwise.

@nityanandagohain
Member Author

@srikanthccv sorry, I missed your comment. I didn't quite get what you are asking me to try out.

@srikanthccv
Member

I was saying create an associated column for the regular column. For example, halfnil can potentially be nullable, so we create an additional column halfnil_key_exists and fill it with true or false depending on whether the key exists in the list.

That translates to the following, as a clearer example:

ALTER TABLE logs
    ADD COLUMN `bytes` Nullable(Float64) MATERIALIZED if(indexOf(attributes_float64_key, 'bytes') != 0, attributes_float64_value[indexOf(attributes_float64_key, 'bytes')], NULL) CODEC(LZ4)

ALTER TABLE logs
    ADD COLUMN `bytes_key_exists` Bool MATERIALIZED if(indexOf(attributes_float64_key, 'bytes') != 0, true, false) CODEC(LZ4)

Now when we write a query, we will filter out the nulls using bytes_key_exists != false. Your test query will roughly be:

select count(), halfnil from test where halfnil_key_exists != false group by halfnil format Null;

I wanted to see query time and storage results for this way of doing things.
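
For the storage side, per-column sizes can be pulled from system.parts_columns with something along these lines (one common formulation; adjust the table filter as needed):

select
    database, table, column,
    formatReadableSize(sum(column_data_compressed_bytes)) as compressed,
    formatReadableSize(sum(column_data_uncompressed_bytes)) as uncompressed,
    round(sum(column_data_uncompressed_bytes) / sum(column_data_compressed_bytes), 2) as compr_ratio,
    sum(rows) as rows_cnt,
    round(sum(column_data_uncompressed_bytes) / sum(rows), 2) as avg_row_size
from system.parts_columns
where active and table = 'test'
group by database, table, column;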

@nityanandagohain
Member Author

nityanandagohain commented Jul 19, 2023

Here are the results

select count(), halfnil from test where isNotNull(halfnil) group by halfnil format Null settings min_bytes_to_use_direct_io=1;

7.712 sec. Processed 300.00 million rows, 2.70 GB (38.90 million rows/s., 350.08 MB/s.)
7.544 sec. Processed 300.00 million rows, 2.70 GB (39.77 million rows/s., 357.92 MB/s.)
8.235 sec. Processed 300.00 million rows, 2.70 GB (36.43 million rows/s., 327.87 MB/s.)

select count(), halfnil from test where exist != false group by halfnil format Null;

10.495 sec. Processed 300.00 million rows, 3.00 GB (28.58 million rows/s., 285.85 MB/s.)
8.157 sec. Processed 300.00 million rows, 3.00 GB (36.78 million rows/s., 367.79 MB/s.)
8.249 sec. Processed 300.00 million rows, 3.00 GB (36.37 million rows/s., 363.70 MB/s.)

@srikanthccv I don't see any major advantage or disadvantage either way.

@srikanthccv
Member

Thanks. I wanted to see if there is a better way to avoid the magic values for nullable rows, because what you consider magic could be a real value for the user; you never know how people use stuff in the wild. Have you decided on how to handle this for all data types?

@nityanandagohain
Member Author

Are you asking what max values we have considered, since we are not going with nullable columns?

@nityanandagohain
Member Author

The max values for the data types are the max values listed here: https://clickhouse.com/docs/en/sql-reference/data-types/int-uint

For string I am using the value #SIGNOZ_MAX.

For boolean we can't have any sensible default, but as of now we don't handle boolean separately in logs; we just store it as a string. Though we might need to find a solution for boolean.
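
To make the Int64 case concrete, a hypothetical materialized column code would look like this (the column name is illustrative; the sentinel is Int64's max, 9223372036854775807):

alter table logs add column `code` Int64 materialized if(indexOf(attributes_int64_key, 'code') != 0, attributes_int64_value[indexOf(attributes_int64_key, 'code')], 9223372036854775807) CODEC(LZ4);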

@srikanthccv
Member

I still believe the magic values for each data type (and I see boolean is not taken care of) are not a good solution. How would you check if something exists in a map/array in Go (or any other language, for that matter)? There is usually some membership check involved. And developers use INT_MAX in their code with special meaning, which could collide. When somebody looks at the schema, how would they realize that we add these special values? As there is already a majority agreement, I am reiterating my concerns for the record but don't want to block anything.

@nityanandagohain
Member Author

Ah, so you are saying that if we have array and map columns in the future, what kind of default values will we set, right?

Because an empty map/array won't tell us if it was actually there or not.

@srikanthccv
Member

No, the map/array part I was referring to is the if value, exists := m[key]; exists pattern (or similar) used to check whether something exists, vs writing value != {INT_MAX, STRING_MAX, ...}.

@nityanandagohain
Member Author

Ahh okay, would you still recommend nullable columns in that case?

@srikanthccv
Member

I was not convinced by how the "Group by normal (B)" case beat the others, and we concluded "From the above results, we can see that the non-nullable column is faster, regardless of how much data is filled". I was interested in using {}_key_exists and seeing the results, which I think is a better approach than INT_MAX, STRING_MAX and works well for all data types, now or in the future. Since this will be a cross-signal requirement, I want us to take one solid approach everywhere. I am going to try this on my VM and share my observations.

@srikanthccv
Member

Comparing the relative timing for the same queries, I don't see how a non-nil column is faster regardless of how much data is filled.

CREATE TABLE test
(
    `A` Int64,
    `B` Int64,
    `nonil` Int64,
    `nonil_key_exists` bool,
    `halfnil` Int64,
    `halfnil_key_exists` bool,
    `mostnil` Int64,
    `mostnil_key_exists` bool
)
ENGINE = MergeTree
ORDER BY A

INSERT INTO test
SELECT
    number AS id,
    id % 9973 AS x,
    x,
    true,
    if((x % 2) = 0, x, 0),
    if((x % 2) = 0, true, false),
    if((rand(1) % 5) = 0, x, 0),
    if((rand(1) % 5) = 0, true, false)
FROM numbers(300000000);

Inserted data

SELECT
    countDistinct(B),
    countDistinct(nonil),
    countDistinct(halfnil),
    countDistinct(mostnil)
FROM test

Query id: 59c82174-43e1-41c6-808b-c9db3c925ed1

┌─uniqExact(B)─┬─uniqExact(nonil)─┬─uniqExact(halfnil)─┬─uniqExact(mostnil)─┐
│         9973 │             9973 │               4987 │               9973 │
└──────────────┴──────────────────┴────────────────────┴────────────────────┘

Group by normal (B)

select count(), B from test group by B format Null settings min_bytes_to_use_direct_io=1;

0 rows in set. Elapsed: 7.187 sec. Processed 300.00 million rows, 2.40 GB (41.74 million rows/s., 333.94 MB/s.)

0 rows in set. Elapsed: 5.771 sec. Processed 300.00 million rows, 2.40 GB (51.99 million rows/s., 415.88 MB/s.)

select count(), B from test group by B format Null;

0 rows in set. Elapsed: 4.063 sec. Processed 300.00 million rows, 2.40 GB (73.84 million rows/s., 590.73 MB/s.)
0 rows in set. Elapsed: 3.909 sec. Processed 300.00 million rows, 2.40 GB (76.74 million rows/s., 613.92 MB/s.)

Group by nullable, all values filled (nonil)

select count(), nonil from test group by nonil format Null settings min_bytes_to_use_direct_io=1;

0 rows in set. Elapsed: 5.774 sec. Processed 300.00 million rows, 2.40 GB (51.95 million rows/s., 415.63 MB/s.)

0 rows in set. Elapsed: 5.772 sec. Processed 300.00 million rows, 2.40 GB (51.98 million rows/s., 415.83 MB/s.)

select count(), nonil from test group by nonil format Null;

0 rows in set. Elapsed: 3.856 sec. Processed 300.00 million rows, 2.40 GB (77.80 million rows/s., 622.38 MB/s.)

0 rows in set. Elapsed: 3.840 sec. Processed 300.00 million rows, 2.40 GB (78.12 million rows/s., 624.97 MB/s.)

Group by nullable, but half values are null(halfnil)

select count(), halfnil from test where halfnil_key_exists = true group by halfnil format Null settings min_bytes_to_use_direct_io=1;

0 rows in set. Elapsed: 2.952 sec. Processed 300.00 million rows, 2.70 GB (101.64 million rows/s., 914.77 MB/s.)

0 rows in set. Elapsed: 2.964 sec. Processed 300.00 million rows, 2.70 GB (101.23 million rows/s., 911.07 MB/s.)

select count(), halfnil from test where halfnil_key_exists = true group by halfnil format Null;

0 rows in set. Elapsed: 2.540 sec. Processed 300.00 million rows, 2.70 GB (118.13 million rows/s., 1.06 GB/s.)

0 rows in set. Elapsed: 2.141 sec. Processed 300.00 million rows, 2.70 GB (140.14 million rows/s., 1.26 GB/s.)

Group by mostnil, most of the values are null(mostnil)

select count(), mostnil from test where mostnil_key_exists = true group by mostnil format Null settings min_bytes_to_use_direct_io=1;

0 rows in set. Elapsed: 3.145 sec. Processed 300.00 million rows, 2.70 GB (95.38 million rows/s., 858.46 MB/s.)

0 rows in set. Elapsed: 2.952 sec. Processed 300.00 million rows, 2.70 GB (101.63 million rows/s., 914.71 MB/s.)


select count(), mostnil from test where mostnil_key_exists group by mostnil format Null;

0 rows in set. Elapsed: 2.765 sec. Processed 300.00 million rows, 2.70 GB (108.49 million rows/s., 976.40 MB/s.)

0 rows in set. Elapsed: 2.158 sec. Processed 300.00 million rows, 2.70 GB (139.00 million rows/s., 1.25 GB/s.)

@nityanandagohain
Member Author

nityanandagohain commented Jul 24, 2023

Interesting, your results seem to be different from mine. I ran mine on my Mac, though.

What machine configuration did you test this on?

Thanks for running this again

@srikanthccv
Member

4 Core, 16GB VM. And for that reason, I am only comparing the relative timings for different types of queries. I always do this kind of work on VMs because you can't trust a personal Mac for this.

@nityanandagohain
Member Author

I will test this on another machine as well and will get back.

@nityanandagohain
Member Author

Have tested this and found similar results; using a boolean column seems to be the way to go. But let me think about other cases that we might need to handle. @srikanthccv do you think something might come up with ingestion performance?

@srikanthccv
Member

If the total number of columns is in the range of 1000, the system defaults of ClickHouse work fine.

@ankitnayan
Collaborator

The perf impact shared by @srikanthccv is non-trivial (almost twice as performant). My concern is storage cost, but I'm guessing it should not be large. Let's assume we are going forward with having an extra column. Some questions follow:

  • Are the users supposed to use the _keyexists columns in the query? Or do we do it behind the scenes for the query builder?
  • Storage impact? I think the nullable perf impact was mentioned under the storage section, as it needs another column to store whether the value is null or not. We seem to be following the same pattern with a new _keyexists column.

Maybe the users do not run queries with _keyexists=true, but we can slowly train them, as it is the better query. I am seeing all blogs using a weird default value to keep things simple. But we gain in perf if we add a column, as found in our observations.

@nityanandagohain

@nityanandagohain
Member Author

nityanandagohain commented Jul 31, 2023

So the _keyexists columns will be queried behind the scenes. The user will just have context about selected/interesting fields (i.e. indexed/non-indexed).

The reason is that we know, before a query is run, whether the column is indexed; if it is, we can modify the query to include _keyexists.


I will get back on the storage part
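
Roughly, the rewrite would look like this (a sketch; the bytes_exists column name here is illustrative):

-- user's filter in the query builder: bytes > 1024
-- query actually sent to ClickHouse:
select count() from signoz_logs.logs where bytes_exists = true and bytes > 1024;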

@nityanandagohain
Member Author

So I compared the ingestion speed; there is little to no difference due to the addition of the boolean column.

The extra storage used by the bool column is also very small, i.e. 1.27 MiB compressed / 285.99 MiB uncompressed, with all rows set to true, for 300 million rows.


With the extra column:

create table non_empty_exists(A Int64, B Int64, b_exists bool) Engine=MergeTree order by A;

insert into non_empty_exists select number id, id % 9973, true from numbers (300000000);

0 rows in set. Elapsed: 60.136 sec. Processed 300.93 million rows, 2.41 GB (5.00 million rows/s., 40.03 MB/s.)
0 rows in set. Elapsed: 57.210 sec. Processed 300.93 million rows, 2.41 GB (5.26 million rows/s., 42.08 MB/s.)


Size

┌─database─┬─table────────────┬─column───┬─compressed─┬─uncompressed─┬─compr_ratio─┬──rows_cnt─┬─avg_row_size─┐
│ test     │ non_empty_exists │ A        │ 1.12 GiB   │ 2.23 GiB     │           2 │ 300000000 │            8 │
│ test     │ non_empty_exists │ B        │ 1.12 GiB   │ 2.23 GiB     │           2 │ 300000000 │            8 │
│ test     │ non_empty_exists │ b_exists │ 1.27 MiB   │ 285.99 MiB   │      224.44 │ 300000000 │            1 │
└──────────┴──────────────────┴──────────┴────────────┴──────────────┴─────────────┴───────────┴──────────────┘

@ankitnayan
Collaborator

Please go ahead with the implementation, and ask in the ClickHouse community why nullable columns are less performant.

@nityanandagohain
Member Author

@ankitnayan @srikanthccv need a suggestion.

For the top-level columns, the default values are already added by the SDK/Otel collector, so we won't have a default column for them. In our current implementation we don't have an index for all these top-level columns, and users are allowed to convert them to selected fields by creating an index in the backend. Should we also add indexes for them in our migration by default?

The reason is that these fields will always be there regardless of the log line and will be filled with default values sent by the SDK/collector. I think we can add appropriate indexes for these and then add them in the migration.

Also, we can disable update fields for these top-level keys, i.e. adding/removing indexes.

What are your thoughts?

@ankitnayan
Collaborator

ankitnayan commented Aug 7, 2023

Your thoughts LGTM.
We should have added severity_text to the sorting key, maybe. Can't be done now.

cc: @rkssisodiya

@nityanandagohain
Member Author

Added default indexes for top-level keys: SigNoz/signoz-otel-collector#164

But I have not added skip indexes for trace_id and span_id, as there is no point in adding them: they are always unique, with no pattern that could be exploited with some kind of bloom filter.
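
For reference, the kind of default index meant here would look something like this (a hypothetical example for a low-cardinality top-level column; the actual indexes are in the linked PR):

alter table signoz_logs.logs add index severity_text_idx severity_text TYPE set(25) GRANULARITY 4;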
