Draft: Feat/glom like key searching #44
base: main
res = f"✨ Scenario: {'🔹' * self.readers}{'🔻' * self.writers} ({self.readers}r{self.writers}w)"
res += ", 🔸 compression" if self.use_compression else ""
res += ", 💎 big file" if self.big_file else ""
I will put this in a different PR to keep this one clean

    """
    The Indexer takes the name of a database file, and tries to load the .index file
    of the corresponding database file.

    The name of the index file is the name of the database file, with the extension
    .index and all "/" replaced with "___"

    The content of the index file is a json object, where the keys are keys inside
    the database json file, and the values are lists of 5 elements:
    - start_index: The index of the first byte of the value of the key in the database file
    - end_index: The index of the last byte of the value of the key in the database file
    - indent_level: The indent level of the key in the database file
    - indent_with: The indent string used.
    - value_hash: The hash of the value bytes
    """

    __slots__ = ("data", "path")

    def __init__(self, db_name: str):
        # Make path of index file
        db_name = db_name.replace("/", "___")
        self.path = os.path.join(config.storage_directory, ".ddb", f"{db_name}.index")

        os.makedirs(os.path.dirname(self.path), exist_ok=True)
        if not os.path.exists(self.path):
            self.data = {}
            return

        try:
            with open(self.path, "rb") as f:
                self.data = orjson.loads(f.read())
        except orjson.JSONDecodeError:
            self.data = {}

    def get(self, key):
        """
        Returns a list of 5 elements for a key if it exists, otherwise None
        Elements:[start_index, end_index, indent_level, indent_with, value_hash]
        """
        return self.data.get(key, None)

    def write(self, key, start_index, end_index, indent_level, indent_with, value_hash, old_value_end):
        """
        Write index information for a key to the index file
        """

        if self.data.get(key, None) is not None:
            delta = end_index - old_value_end
            for entry in self.data.values():
                if entry[0] > old_value_end:
                    entry[0] += delta
                    entry[1] += delta

        self.data[key] = [start_index, end_index, indent_level, indent_with, value_hash]
        with open(self.path, "wb") as f:
            f.write(orjson.dumps(self.data))
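For reviewers, the offset bookkeeping in `write` can be sketched in isolation. This is only a minimal sketch of the in-memory update step, assuming plain dicts and no file I/O; `shift_entries` is a hypothetical helper name and is not part of this PR.

```python
def shift_entries(index, key, new_entry, old_value_end):
    """Hypothetical standalone version of the update step in write().

    index maps key -> [start_index, end_index, indent_level, indent_with, value_hash].
    """
    if index.get(key) is not None:
        # How far the end of the rewritten value moved in the file
        delta = new_entry[1] - old_value_end
        for entry in index.values():
            # Only entries that start after the old value are shifted
            if entry[0] > old_value_end:
                entry[0] += delta
                entry[1] += delta
    index[key] = new_entry
    return index


index = {
    "a": [5, 8, 0, "", "h1"],
    "b": [15, 18, 0, "", "h2"],
}
# The value of "a" grows by 3 bytes: it used to end at 8 and now ends at 11
shift_entries(index, "a", [5, 11, 0, "", "h1b"], old_value_end=8)
```

Under these assumptions, `"b"` moves from `[15, 18]` to `[18, 21]`, while the entry for `"a"` is simply replaced.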
@UmbrellaMalware let's stick with tabs so the diff looks clean
PEP 8 says spaces are preferred.
This can become a problem if someone else wants to propose changes to the project.
Yes, I know. I personally prefer tabs and think they are faster to work with. But if this project becomes bigger, I agree that it should adhere to most Python community standards. I'd like to keep the merge clean, though: the indentation and formatting changes should be part of a separate issue/MR.
25989d2
to
4d0c4b8
@UmbrellaMalware I did a quick rebase to incorporate the latest changes in main
I need to test the new functionality; I just found a bug in the search algorithm :) I'm also doing some documentation updates.
Do you mean the issue with reading a key value that is an integer or float?
No, if you try to read "Key.Subkey1.Subkey2", it returns not found even though Subkey2 exists
Things left to do: