code usege of sds (and maybe other data types) #739

zvi-code · 2024-07-02T21:40:21Z

Originally posted by @zvi-code in #541 (comment)

This change makes sense to me overall and tradeoffs are good.

My concern is with that conceptually it's not a good idea that we have 2 conflicting patterns:

use the internal encoding information of sds [pointer is odd]

embed sds into some other non-native allocation buffer, potentially at arbitrary offset

I feel this has very big potential for very hard to track bugs

Trying to arrange my thoughts on this. The above is just an example for a wider issue In the code design. IMO there are two types of 'sds' usages:

A way to carry metadata about a string buffer through IO flow
A compact way to encode string buffer metadata in memory with good locality and low memory overhead

It is clear to me why use 'sds' for 2. For 1 I feel it could be wrong choice and this choice also has performance costs as it forces memory access to unpack the info many times in IO flow. For example sdsfree access the memory on free, even though it could have been avoided. This has high costs when memory access is a factor.
I also think there is an alternative, to take an approach slightly similar to rust, have some struct 'runtimeSds' with sds as member and metadata, need to check if it can be passed to functions by value, to avoid the need to allocate it on the heap. Thoughts?

hpatro · 2024-07-09T18:29:25Z

@zvi-code Thanks for starting this issue. Right now I'm a bit unclear about your suggestion. I would like to understand more :).

Also not sure if you're referring to the same thing which I also had experimented a little while ago. Let me know if we are on the same page :).

Overall, my idea was not to create robj/sds during the IO flow for each argument. Hence, an additional copy isn't required and was instead maintaining a seqbuf struct to refer the data in client's querybuf present already.

typedef struct seqbuf {
   char *buf;
   uint32_t len;
} seqbuf;

Few of the helper methods to handle to-fro between data storage layer and network layer:

seqbuf *seqbuf_init_cstr(char *buf, long len);
seqbuf *seqbuf_init_sds(sds buf);
void seqbuf_clonedata(seqbuf *sb);
sds seqbuf_createsds(seqbuf *sb);
struct redisObject *seqbuf_createStringObject(seqbuf *sb);
void seqbuf_clear(seqbuf *sb);
void seqbuf_free(seqbuf *sb);

One of the challenges which I observed with introducing another type in to the Valkey codebase is the developers need to be aware when to use which type and the initial change would be quite messy as there a lot of touchpoints. Currently, it's pretty homogenous throughout the codebase so easier to navigate albeit paying some of the cost.

zvi-code changed the title ~~> This change makes sense to me overall and tradeoffs are good.~~ Should we rethink how we use sds (and maybe other data types) ? Jul 2, 2024

zvi-code changed the title ~~Should we rethink how we use sds (and maybe other data types) ?~~ code uses of sds (and maybe other data types) ? Jul 2, 2024

zvi-code changed the title ~~code uses of sds (and maybe other data types) ?~~ code uses of sds (and maybe other data types) Jul 2, 2024

zvi-code changed the title ~~code uses of sds (and maybe other data types)~~ code usege of sds (and maybe other data types) Jul 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

code usege of sds (and maybe other data types) #739

code usege of sds (and maybe other data types) #739

zvi-code commented Jul 2, 2024 •

edited

Loading

hpatro commented Jul 9, 2024

code usege of sds (and maybe other data types) #739

code usege of sds (and maybe other data types) #739

Comments

zvi-code commented Jul 2, 2024 • edited Loading

hpatro commented Jul 9, 2024

zvi-code commented Jul 2, 2024 •

edited

Loading