Re-visiting the architecture of DC #554
While working on the issue, I took the approach of generating fixed_value and timed_value tables at runtime for every datasource. Benefits:
Disadvantages:
Another approach we could take is using a NoSQL database, which would allow the same database structure DC currently has and would also address the issues listed above. Because it is a schema-less approach, we can reduce the number of operations: one row could have 10 columns while another could have 100, and there would be no need to generate tables at runtime. However, it would require rewriting the backend from scratch, though it would be a more robust approach.
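The runtime-table-generation approach above could be sketched roughly as follows. The table names `fixed_value` and `timed_value` come from this issue; everything else (the per-datasource naming scheme, the column layout, SQLite as the target) is a hypothetical assumption for illustration only, not DC's actual implementation.

```python
# Sketch: build one wide table per datasource at runtime, with one
# column per attribute, instead of one (subject, attribute) row per value.
# Table name suffixing and column types are assumptions for illustration.

def build_runtime_ddl(datasource_id, attributes):
    """Return CREATE TABLE statements for one datasource's
    fixed_value and timed_value tables."""
    cols = ", ".join(f'"{a}" TEXT' for a in attributes)
    fixed = (f'CREATE TABLE IF NOT EXISTS "fixed_value_{datasource_id}" '
             f'(subject_id INTEGER PRIMARY KEY, {cols})')
    timed = (f'CREATE TABLE IF NOT EXISTS "timed_value_{datasource_id}" '
             f'(subject_id INTEGER, observed_at TEXT, {cols})')
    return fixed, timed
```

With such tables, saving a record is a single row insert regardless of how many attributes the datasource has, which is the benefit the comment is pointing at; the cost is that the schema must be generated and migrated per datasource.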
Description
Consider an example of a table of size 6×8, with 8 attributes and 6 subjects. In the current implementation we save a combination of every subject with every attribute, which for a table this small performs 48 operations instead of just 6, since the subject is common to all 8 attributes of a single record. This significantly slows down saving records to the database when the dataset is large, e.g. an Excel sheet with 100,000 rows and 50 attributes: to save this dataset, DC runs 5 million operations instead of just 100,000.
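The arithmetic behind those figures can be checked with a one-line sketch: the current scheme writes one operation per (subject, attribute) value, while a row-per-record scheme writes one operation per subject.

```python
def write_ops(rows, attrs, per_value=True):
    """Insert operations needed to persist a rows x attrs table:
    one per (subject, attribute) value in the current scheme,
    or one per row when a record's attributes are saved together."""
    return rows * attrs if per_value else rows

# Figures from the description above:
print(write_ops(6, 8))                      # 48 for the 6x8 example
print(write_ops(6, 8, per_value=False))     # 6
print(write_ops(100_000, 50))               # 5,000,000
print(write_ops(100_000, 50, per_value=False))  # 100,000
```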
Error log
NA