-
Notifications
You must be signed in to change notification settings - Fork 80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Utilities to assist in retrieving data from tables #2857
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I found this PR very tough to review in a thorough way.
- There is a ton of boilerplate that obscures the signal I'm trying to see.
- The overall high-level design for the Java code is not apparent, which makes it hard to follow.
- I have a feeling that the java could be simplified in some significant way, but I'm at a loss for saying what this would look like, since I'm still not fully grasping the high-level view.
QueryUtil/build.gradle
Outdated
@@ -0,0 +1,33 @@ | |||
plugins { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I THINK that there has been a drift to sub project folders being lower case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
QueryUtil
is a vague name that is just going to lead this to become a dumping ground. It needs a more purposeful name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe QueryUtil
should be dropped and just keep dataadaper
?
Also, python has the experimental
package. Should Java have the same thing? @rcaudy
|
||
import java.util.Objects; | ||
|
||
public class ChunkRetrievalUtil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do all of the classes need to be public? Changing them would reducer our API surface area in a good way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
turned down this & a few others
@@ -0,0 +1,101 @@ | |||
package io.deephaven.queryutil.dataadapter.datafetch.bulk.gen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is just example code and not real code, should it go in test or somewhere else so that it doesn't clutter the codebase? @rcaudy
@@ -0,0 +1,177 @@ | |||
package io.deephaven.queryutil.dataadapter.datafetch.bulk.gen; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is generated, it should have a header indicating how it was generated, how to regenerate it, not to edit it, etc. Same for all related files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not generated; it handles generation
.../java/io/deephaven/queryutil/dataadapter/datafetch/bulk/gen/TableDataRetrieverGenerator.java
Outdated
Show resolved
Hide resolved
# source = Table(_JTstTools.testRefreshingTable( | ||
# _JTstTools.i(2, 4).copy().toTracking(), | ||
# _JTableTools.intCol("Sym", 'AAPL', 'AAPL', None), | ||
# _JTableTools.doubleCol("Price", 1.1, 1.2, NULL_DOUBLE), | ||
# _JTableTools.longCol("Size", 10_000_000_000, 1_000, NULL_LONG), | ||
# _JTableTools.col("Exch", "ARCA", 'NASDAQ', None), | ||
# )) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kill?
keyed_record_adapter: KeyedRecordAdapter[ | ||
List[str], StockTradeAlternateConstructor] = make_record_adapter_with_constructor( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unneeded type hint?
# source = Table(_JTstTools.testRefreshingTable( | ||
# _JTstTools.i(2, 4).copy().toTracking(), | ||
# _JTableTools.intCol("Sym", 'AAPL', 'AAPL', None), | ||
# _JTableTools.doubleCol("Price", 1.1, 1.2, NULL_DOUBLE), | ||
# _JTableTools.longCol("Size", 10_000_000_000, 1_000, NULL_LONG), | ||
# _JTableTools.col("Exch", "ARCA", 'NASDAQ', None), | ||
# )) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kill?
def create_stock_trade(sym: str, exch: str, price: float, size: int): | ||
return StockTrade(sym, price, size, exch) | ||
|
||
keyed_record_adapter: KeyedRecordAdapter[List[str], StockTrade] = make_record_adapter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unneeded type hint?
|
||
|
||
if __name__ == '__main__': | ||
unittest.main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In PartitionedTables, @rcaudy has resolved the 1- vs many-key values problem. This whole PR should strive for the same for a user experience.
…strings/lists of strings as keys.)
…tional notification awareness to snapshotting in KeyedRecordAdapter.
…eListener to InstrumentedTableUpdateListenerAdapter for simplicity.)
34b1b0b
to
c7f8223
Compare
# Conflicts: # Integrations/build.gradle
…cord_adapter # Conflicts: # Integrations/build.gradle
…cord_adapter # Conflicts: # Util/src/main/java/io/deephaven/util/type/TypeUtils.java # engine/table/src/test/java/io/deephaven/engine/util/TestTypeUtils.java
…cord_adapter # Conflicts: # py/server/deephaven/table.py
I have not looked at this PR recently, so I don't remember the contents. DHC has added functionality to help users extract data. I'm noting it here in case it makes this PR no longer needed. #5595 |
Adds utilities to make it easier to pull data from a table.
There is a
KeyedRecordAdapter
that leverages the ToMapListener to allow pulling data by key (e.g. USym):And a
TableToRecordListener
that lets you listen to updates as objects instead of just row keys:The mapping of columns to fields in whatever object is handled by a RecordAdapterDescriptor:
The RecordAdapterDescriptor can be used to create a
SingleRowRecordAdapter
or aMultiRowRecordAdapter
. A single-row adapter creates an object (MyTradeHolder
orJsonNode
or whatever) and reads data from the columns directly into that object. A multi-row adapter has more steps but is intended to be efficient and chunk-oriented. It will:MyTradeHolder[]
,JsonNode[]
, whatever) and fill it with new empty records.A record adapter instance can be generated based on the descriptor. For example,
com.illumon.iris.db.util.dataadapter.rec.json.JsonRecordAdapterUtil#createJsonRecordAdapterDescriptor(t, "Col1", "Col2", "Col3").createMultiRowRecordAdapter(t)
would figure out the types of "Col1"/"Col2"/"Col3" and generate a class that populates ObjectNodes with those fields.There is also a draft Python version that lets you create data structures by passing data to either a function or straight to some type's constructor:
Random notes:
MultiRowRecordAdapter
is responsible for the real work of pulling data out of tables; bothKeyedRecordAdapter
andTableToRecordListener
use that. (KeyedRecordAdapter
also usesSingleRowRecordAdapter
when fetching single keys.)keyed_record_adapter.py
/PyKeyedRecordAdapter.java
) only supports strings as keys and has no listening component.