Data_Pack level attribute(/"primitive-like") direct access interface (for user) design/requirements/considerations #924

J007X · 2023-03-09T12:30:57Z

Is your feature request related to a problem? Please describe.
Per discussion in earlier meetings and emails, a "Data_Pack" level attribute ("primitive-like") direct access interface (without using classes) is preferred for batch-like/mass retrieval. Because this interface will be exposed to users (like an API), more discussion is needed, so this ticket covers the high-level design, considerations, and requirements. It is also used to organize the sub tasks identified during the requirement/design phase.

Describe the solution you'd like
This data_pack level attribute (primitive-like) direct access interface is intended to provide higher performance for typical batch-like/mass retrieval scenarios, such as NLP pipelines using Forte (for example POS tagging and NER). It also adds the ability to access attributes as a range/batch for one or more tid(s), or by a specific type, so that the data can be accessed without using classes, thus avoiding the related performance overhead.

Describe alternatives you've considered
Several overall designs were considered (including discussion around cached data in data_pack). Per recent discussion with Hector, the preference now is to focus on the current data_store related implementation and first provide a basic interface, possibly expanding its capabilities later.

Some current method design/considerations and sub tasks

Use a specified list of attribute names and a type name to access the attributes/primitive-like data of the most frequently used data types in typical scenarios such as an NLP pipeline. (Name suggestion: get_attr_of_type, similar to the "get" method of data_pack but adding attr_names: List[str] and an optional attr_ids list as parameters.)
Use a specified list of attribute names (or list of attr_id) and a tid (or list of tid) to "range-select" attributes for access. (Name suggestion: get_attr_data; as suggested, this combines the tid/tids methods and the attr_name and attr_id options into one method.)
The return format for attributes can be a dict, for easy access by attribute name (and it can be returned together with the entry for compatibility/mixed usage scenarios, which could be common).
Write access is very likely to be needed in addition to read access, to further boost performance in batch mode.
Demo Python script
Documentation (in source code)
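
To make the two proposed methods easier to discuss, here is a minimal sketch of how they could behave. It is not Forte code: ToyPack, its add helper, the set_attr_data name, and the exact return shapes are all assumptions made up for illustration, and the attr_id options are left out. Only the names get_attr_of_type and get_attr_data come from the suggestions above.

```python
from typing import Any, Dict, Iterator, List, Sequence, Tuple, Union


class ToyPack:
    """Hypothetical stand-in for a data pack (NOT Forte's DataPack)."""

    def __init__(self) -> None:
        # tid -> (type name, attribute dict); plain containers only, no entry classes.
        self._entries: Dict[int, Tuple[str, Dict[str, Any]]] = {}
        self._next_tid = 0

    def add(self, type_name: str, **attrs: Any) -> int:
        """Illustration-only helper that stores an entry and returns its tid."""
        tid = self._next_tid
        self._next_tid += 1
        self._entries[tid] = (type_name, dict(attrs))
        return tid

    def get_attr_of_type(
        self, type_name: str, attr_names: List[str]
    ) -> Iterator[Dict[str, Any]]:
        """Yield one dict per entry of `type_name` with only the requested attributes."""
        for tid, (entry_type, attrs) in self._entries.items():
            if entry_type == type_name:
                row = {name: attrs.get(name) for name in attr_names}
                row["tid"] = tid  # keep the tid so callers can mix with entry-based code
                yield row

    def get_attr_data(
        self, tids: Union[int, Sequence[int]], attr_names: List[str]
    ) -> Dict[int, Dict[str, Any]]:
        """Return {tid: {attr_name: value}} for a single tid or a list of tids."""
        if isinstance(tids, int):
            tids = [tids]
        return {
            tid: {name: self._entries[tid][1].get(name) for name in attr_names}
            for tid in tids
        }

    def set_attr_data(self, tid: int, values: Dict[str, Any]) -> None:
        """Assumed write-access counterpart: update several attributes at once."""
        self._entries[tid][1].update(values)


pack = ToyPack()
t0 = pack.add("Token", text="Forte", pos="NNP")
t1 = pack.add("Token", text="runs", pos="VBZ")
e0 = pack.add("EntityMention", text="Forte", ner_type="ORG")

pos_rows = list(pack.get_attr_of_type("Token", ["text", "pos"]))
picked = pack.get_attr_data([t0, e0], ["text"])
pack.set_attr_data(t1, {"pos": "VB"})
```

Returning plain dicts keyed by attribute name matches the return-format item above. Whether the tid (or the entry itself) should be included in each row is one of the open questions for this ticket.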

Additional context
This is a higher-level interface for users, unlike the lower-level interfaces in Data_Store (which provide the related services).

hunterhector · 2023-03-09T13:50:32Z

Some suggestions:

1. I suggest adding a subtask for the data_store level implementation. What are the functions to be implemented?
2. I suggest adding subtasks for a demo Python script and for documentation. These tasks are important: they not only help future users, they also help us assess how easy the interface is to use.
3. The get_data function could be a good starting point. It returns data in primitive "dict" format and takes a "request" in the method interface, so extending this function would allow you to achieve several of the goals you mentioned.
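
As a rough illustration of the request-driven style described in the last point, the sketch below uses a toy function over in-memory records. It does not call Forte; toy_get_data, the RECORDS layout, and the request shape (type name mapped to a list of attribute names) are assumptions chosen only to show the idea of asking for attributes through a request and getting plain dicts back.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Hypothetical toy data: each record is (type name, attribute dict).
RECORDS: List[Tuple[str, Dict[str, Any]]] = [
    ("Sentence", {"begin": 0, "end": 10, "text": "Forte runs"}),
    ("Token", {"begin": 0, "end": 5, "text": "Forte", "pos": "NNP"}),
    ("Token", {"begin": 6, "end": 10, "text": "runs", "pos": "VBZ"}),
]


def toy_get_data(
    context_type: str, request: Dict[str, List[str]]
) -> Iterator[Dict[str, Any]]:
    """For each context record, gather the requested attributes of records inside its span."""
    for ctx_type, ctx in RECORDS:
        if ctx_type != context_type:
            continue
        out: Dict[str, Any] = {"context": ctx["text"]}
        for type_name, attr_names in request.items():
            # Records of the requested type that fall within the context span.
            covered = [
                attrs
                for rec_type, attrs in RECORDS
                if rec_type == type_name
                and ctx["begin"] <= attrs["begin"]
                and attrs["end"] <= ctx["end"]
            ]
            # One list per requested attribute, aligned across the covered records.
            out[type_name] = {
                name: [attrs.get(name) for attrs in covered] for name in attr_names
            }
        yield out


result = list(toy_get_data("Sentence", {"Token": ["text", "pos"]}))
```

In this toy version each requested attribute comes back as a list aligned across the covered records, which suits batch consumers such as a POS tagger.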

J007X · 2023-03-09T14:42:00Z

Hi Hector, for 1), the sub tasks for data_store are in another design ticket, #922 (created for the lower-level service interfaces that are not exposed), to keep things more clearly focused, as these 2 layers could change independently. Please check it out (its method considerations are also slightly improved per discussion).

J007X linked a pull request Mar 22, 2023 that will close this issue

Data pack attribute interface implementation #926

Open
