Optimize `crdt.Array` Data Structure for Improved Performance #980

hackerwins · 2024-08-27T00:58:38Z

Description:

The current implementation of crdt.Array is based on RGATreeList, which uses both SplayTree and LLRBTree for internal element retrieval. While this structure works, there's potential for optimization, especially considering the random access nature of crdt.Array operations.

The current implementation of crdt.Array relies on a data structure called RGATreeList, which incorporates both SplayTree and LLRBTree for managing internal elements. While this approach is functional, there's room for improvement, particularly given the random access nature of crdt.Array operations. The SplayTree is utilized to locate nodes based on their indices during Document.Update calls, while the LLRBTree is employed to find nodes according to their Element.CreatedAt values in various operations.

However, the use of SplayTree may not be ideal for crdt.Array. Initially designed to efficiently handle text editing workloads, the SplayTree's characteristic of moving recently accessed nodes to the root could potentially create unnecessary overhead in the context of crdt.Array's random access patterns. This suggests that a different data structure might be more appropriate for optimizing performance.

To address this, we propose investigating and implementing an alternative data structure that is better suited to the access patterns of crdt.Array. The goal of this change is to enhance the performance of local editing operations, potentially leading to improved efficiency and responsiveness in the system overall.

Considerations for Implementation

Benchmark current vs. proposed structures in various scenarios
Assess memory usage of the new structure
Consider scalability for future requirements(Array.Move, ...)

Why:

Implementing a more fitting data structure for crdt.Array will likely improve its performance during local editing, especially given the random access nature of the operations. This optimization will lead to smoother user experiences and better resource management within the application.

The text was updated successfully, but these errors were encountered:

hackerwins · 2024-09-03T14:08:20Z

Related to #989

binary-ho · 2024-09-04T06:29:38Z

Can i try this?

m4ushold · 2024-09-04T12:49:10Z

Intuitively, using BBSTs (Balanced Binary Search Trees) such as Treap, AVL, or LLRB trees, which have an upper bound of $\log{n}$ for time complexity, seems preferable.
However, since crdt.Array does not require weight-based access, data structures like Y-fast tries or Van Emde Boas trees could be potential alternatives. Nevertheless, given that searches are based on relative positions between nodes rather than key-based access, it’s uncertain whether these structures can effectively support this. Further research and discussion seem necessary.
If anyone has a better idea, please let me know :)

m4ushold · 2024-09-07T05:16:21Z

I have reviewed the Y-fast Trie and the Van Emde Boas Tree. Both of these data structures are designed for integer-based keys, which may not be suitable unless there is a way to efficiently update the key values over a range. Therefore, it seems more practical to experiment with several balanced binary search trees (BBSTs) instead.

AVL Tree: This data structure requires additional memory to store metadata related to the height of each node, which might result in higher memory usage compared to other trees.
Treap: This appears to be one of the more straightforward choices.
LLRB Tree: It's already implemented, so I think I'll just need to add a few features.
RB Tree: Implementing this could be hard.
It would be advisable to select a few of these options for experimentation.

hackerwins added the enhancement 🌟 New feature or request label Aug 27, 2024

hackerwins added this to Yorkie Project - 2024 Aug 27, 2024

github-project-automation bot moved this to Backlog in Yorkie Project - 2024 Aug 27, 2024

hackerwins mentioned this issue Sep 3, 2024

Apply max height splay method #989

Closed

2 tasks

binary-ho self-assigned this Sep 22, 2024

coderabbitai bot mentioned this issue Oct 2, 2024

Balancing Tree to Prevent Skewness #1025

Open

2 tasks

devleejb moved this from Backlog to In progress in Yorkie Project - 2024 Nov 10, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize `crdt.Array` Data Structure for Improved Performance #980

Optimize `crdt.Array` Data Structure for Improved Performance #980

hackerwins commented Aug 27, 2024

hackerwins commented Sep 3, 2024

binary-ho commented Sep 4, 2024

m4ushold commented Sep 4, 2024 •

edited

Loading

m4ushold commented Sep 7, 2024

Optimize crdt.Array Data Structure for Improved Performance #980

Optimize crdt.Array Data Structure for Improved Performance #980

Comments

hackerwins commented Aug 27, 2024

hackerwins commented Sep 3, 2024

binary-ho commented Sep 4, 2024

m4ushold commented Sep 4, 2024 • edited Loading

m4ushold commented Sep 7, 2024

Optimize `crdt.Array` Data Structure for Improved Performance #980

Optimize `crdt.Array` Data Structure for Improved Performance #980

m4ushold commented Sep 4, 2024 •

edited

Loading