Store messagepack values in database instead of JSON #288

Merged
merged 2 commits into main from binary-database
Jun 28, 2023


hendrikvanantwerpen
Collaborator

Database access (particularly writing, it seems) is fairly slow. The database is currently populated with JSON objects, which are quite verbose. This PR switches to MessagePack values, a binary encoding optimized for small size, if I understand things correctly.

A rough comparison of indexing times shows a speedup of almost 2x for some files and a database size reduction of almost 3x. The time reduction is more pronounced for smaller files, probably because writing the data makes up a larger share of the indexing work there than computing it.
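
To illustrate why a binary encoding tends to be smaller than JSON, here is a minimal sketch in Python. It is not the code from this PR: `pack` is a hand-written encoder for a small subset of the MessagePack format (nil, booleans, unsigned integers up to 32 bits, short strings, and small arrays and maps), and `record` is a made-up value whose field names are purely hypothetical, not the actual database schema.

```python
import json
import struct


def pack(value):
    """Encode a value using a small subset of the MessagePack format."""
    if value is None:
        return b"\xc0"  # nil
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return b"\xc3" if value else b"\xc2"
    if isinstance(value, int):
        if 0 <= value <= 0x7F:
            return struct.pack("B", value)  # positive fixint, one byte total
        if 0x7F < value <= 0xFF:
            return b"\xcc" + struct.pack("B", value)  # uint 8
        if 0xFF < value <= 0xFFFF:
            return b"\xcd" + struct.pack(">H", value)  # uint 16, big-endian
        if 0xFFFF < value <= 0xFFFFFFFF:
            return b"\xce" + struct.pack(">I", value)  # uint 32, big-endian
        raise ValueError("integer outside the subset handled by this sketch")
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) <= 31:
            return struct.pack("B", 0xA0 | len(data)) + data  # fixstr
        if len(data) <= 0xFF:
            return b"\xd9" + struct.pack("B", len(data)) + data  # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(value, list):
        if len(value) > 15:
            raise ValueError("array too long for this sketch")
        items = b"".join(pack(item) for item in value)
        return struct.pack("B", 0x90 | len(value)) + items  # fixarray
    if isinstance(value, dict):
        if len(value) > 15:
            raise ValueError("map too large for this sketch")
        pairs = b"".join(pack(k) + pack(v) for k, v in value.items())
        return struct.pack("B", 0x80 | len(value)) + pairs  # fixmap
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Hypothetical record, invented for illustration only.
record = {"id": 42, "kind": "scope", "span": [10, 4, 10, 27], "is_exported": True}

json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")
msgpack_bytes = pack(record)

print("JSON bytes:       ", len(json_bytes))
print("MessagePack bytes:", len(msgpack_bytes))
```

The saving comes from dropping quotes, separators, and decimal digits in favor of short type-and-length prefixes. How much this helps in practice depends on the shape of the stored data, so the numbers from this toy record say nothing about the database sizes reported above.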

@hendrikvanantwerpen hendrikvanantwerpen requested a review from a team as a code owner June 22, 2023 18:49
@hendrikvanantwerpen hendrikvanantwerpen self-assigned this Jun 22, 2023
@github-actions

Performance Summary

Comparing base 46d016b with head 024a5e9 on microsoft/TypeScript@v4.9.5. For details see workflow artifacts. Note that performance is tested on the last commits with changes in stack-graphs, not on every commit.

Before
--------------------------------------------------------------------------------
Command:            base/target/release/tree-sitter-stack-graphs-typescript index -D base.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 base-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
2.375^                                   #                                    
     |                                   #                                    
     |                                   #                                    
     |                                   #                                    
     |                                   #                                    
     |           @                       #                                    
     |           @   :                  :#           :     :                  
     |           @   :                  :#           :     :                  
     |          :@   :              :   :#           :     :                  
     |     :   @:@   :   :  :      ::   :#: @        :    ::    :    :: :     
     |     :   @:@:  :  :: ::@  :  :: : :#: @        ::   ::    :    ::::     
     |    @::  @:@: :: ::: ::@  :  :: :::#: @      ::::  :::   @:  :@:::: @:  
   0 +----------------------------------------------------------------------->Gi
     0                                                                   229.4
After
--------------------------------------------------------------------------------
Command:            head/target/release/tree-sitter-stack-graphs-typescript index -D head.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 head-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
2.375^                                     #                                  
     |                                     #                                  
     |                                     #                                  
     |                                     #                                  
     |                                     #                                  
     |            @                        #                                  
     |            @                       :#                                  
     |            @       :               :#                                  
     |           :@       :          ::   :#                                  
     |      :    :@  ::   :          ::   :#         @     @            ::    
     |      :   ::@: :  :::  @:     @:: : :#  :  :   @     @        :::::::   
     |    @@:   ::@: :  : :  @::    @:: :::# :: :: : @:    @   :: : :::::::   
   0 +----------------------------------------------------------------------->Gi
     0                                                                   199.0

@github-actions

Performance Summary

Comparing base 46d016b with head 458db76 on microsoft/TypeScript@v4.9.5. For details see workflow artifacts. Note that performance is tested on the last commits with changes in stack-graphs, not on every commit.

Before
--------------------------------------------------------------------------------
Command:            base/target/release/tree-sitter-stack-graphs-typescript index -D base.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 base-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
1.308^                                              #                         
     |      @         :                             #     :                   
     |      @         :                             #     :                   
     |      @         :                             #     :                   
     |      @         :                             #     :                   
     |      @         :            :               :#     :             :     
     |      @         :            ::  :           :#     :           : :     
     |      @    :    :     :      ::  :           :#     :          @:::     
     |      @    :    :  :  :      ::  :           :#     :          @:::     
     |      @    :    :  :: :    : :: ::  :   :  : :#     :          @::::    
     |    @@@    : ::::  :: :    : :: ::  :   :  :::#:    :       :  @::::    
     |    @ @    ::: :: ::: :  ::: :: :: ::   :  :::#: :@@:  :: : :::@:::: :@ 
   0 +----------------------------------------------------------------------->Gi
     0                                                                   207.8
After
--------------------------------------------------------------------------------
Command:            head/target/release/tree-sitter-stack-graphs-typescript index -D head.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 head-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
2.374^                                    ##                                  
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |                                    #                                   
     |       @    :                       #                                   
     |       @    :                       #                                   
     |       @    :                  :   @#                                   
     |       @   ::   :   :          :   @#          :          :    :  :     
     |       @   :: @ :  ::  :    : ::   @#  :       :          :    : ::     
     |     @:@   :@@@::  ::  :@  :: :: ::@# ::     ::::  ::     : ::@:::: @   
   0 +----------------------------------------------------------------------->Gi
     0                                                                   190.6

@github-actions

Performance Summary

Comparing base e11f004 with head acb0fab on microsoft/TypeScript@v4.9.5. For details see workflow artifacts. Note that performance is tested on the last commits with changes in stack-graphs, not on every commit.

Before
--------------------------------------------------------------------------------
Command:            base/target/release/tree-sitter-stack-graphs-typescript index -D base.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 base-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
1.545^           #                                                            
     |           #                                                            
     |           #                                                            
     |           #                                                            
     |           #                                                            
     |           #        ::                                                  
     |          @#        :          :                                  ::    
     |          @#        :          :     :                          : :     
     |          @#     ::@:          :     :                          :::     
     |          @#     : @: :::      : :::::          ::              :::     
     |     @@   @#::   : @: ::    :: : : : : ::      ::            ::::::     
     |    :@    @#: :: : @: ::   :: :: : : : :  : :  ::    :  :::: : :::: : : 
   0 +----------------------------------------------------------------------->Gi
     0                                                                   230.3
After
--------------------------------------------------------------------------------
Command:            head/target/release/tree-sitter-stack-graphs-typescript index -D head.sqlite --max-file-time=30 --hide-error-details -- test/src/compiler
Massif arguments:   --massif-out-file=perf.out
ms_print arguments: --x=72 --y=12 head-perf-results/perf.out
--------------------------------------------------------------------------------


    GB
1.055^            #                                                           
     |            #                                                           
     |            #                                                           
     |           @#                  :                                        
     |           @#                  :                                  :     
     |           @#                  :                                : :     
     |           @#       :  :      ::                               @: :     
     |           @#       :  :      ::    :           :              @: :     
     |       @   @#       :  :      :: : ::        @ ::              @: :     
     |      :@   @# :    ::  :      :: ::::        @ ::    :         @:::     
     |     ::@   @#::  @ ::  :  :   :: :::::   ::: @:::  :::     : ::@::: :   
     |     ::@   @#::::@ :::::  :::::: ::::::  ::::@:::: :@: : :@: ::@:::::@: 
   0 +----------------------------------------------------------------------->Gi
     0                                                                   179.8

@hendrikvanantwerpen hendrikvanantwerpen merged commit d093083 into main Jun 28, 2023
@hendrikvanantwerpen hendrikvanantwerpen deleted the binary-database branch June 28, 2023 10:08