Mapping aggregations to relational execution environments #104
AttilaMihaly started this conversation in General
The way aggregations are handled in FP is fundamentally different from how they are handled in relational execution environments. In FP, aggregations are implemented as functions that take a collection as input and return a single value as output. Relational queries, on the other hand, always return a relation, so they cannot directly return a single value. The workaround is that they return a collection with a single element (in other words, a relation with a single row).
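To make the contrast concrete, here is a minimal sketch using Python's built-in `sum` for the FP side and an in-memory SQLite database for the relational side. The `orders` table and the sample amounts are made up for illustration only.

```python
import sqlite3

# Hypothetical sample data, used only for this illustration.
amounts = [10, 20, 30]

# FP style: the aggregation is a function from a collection to a single value.
fp_total = sum(amounts)

# Relational style: the same aggregation is a query, and a query always
# yields a relation. Here that relation has exactly one row and one column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (amount INTEGER)")
conn.executemany("INSERT INTO orders (amount) VALUES (?)", [(a,) for a in amounts])
sql_rows = conn.execute("SELECT SUM(amount) AS total FROM orders").fetchall()
conn.close()

print(fp_total)  # a plain number
print(sql_rows)  # a list holding one single-column row
```

The caller of the query has to unwrap the single row (`sql_rows[0][0]`) to get back to the value that the FP version returns directly.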
Most of the time, though, aggregation is combined with grouping. In this case the original collection is grouped by some key and then an aggregation is applied to each group separately. This operation returns another collection instead of a single value, so it fits nicely into the relational model. In FP, on the other hand, these operations are usually chained one after another rather than applied in a single operation: a function groups the collection, and then the aggregation is applied to each group using a `map` operation.
The bottom line is that the challenge with aggregations is caused by the mismatch between FP, where the focus is on individual values, and relational algebra, which is centered around relations (collections of values). A software developer bridges the gap by transforming business problems into a form that fits the execution environment. To solve our challenge we need to formalize those transformations.
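The grouped case described above can be sketched the same way. On the FP side, grouping and aggregation are two chained steps; on the relational side, a single `GROUP BY` query does both. The `group_by` helper, the `orders` data, and the table layout are assumptions made up for this example, not part of any particular library or of the proposal itself.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, amount) pairs.
orders = [("east", 10), ("west", 20), ("east", 30)]


def group_by(key, items):
    """Illustrative helper: split items into lists keyed by key(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


# FP style, step 1: group the collection by some key.
grouped = group_by(lambda order: order[0], orders)

# FP style, step 2: map an aggregation over each group.
fp_totals = {
    region: sum(amount for _, amount in rows)
    for region, rows in grouped.items()
}

# Relational style: grouping and aggregation happen in one operation,
# and the result is again a relation (one row per group).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders (region, amount) VALUES (?, ?)", orders)
sql_totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(fp_totals)
print(sql_totals)
```

Both sides compute the same per-group totals. A formalized transformation would need to recognize the group-then-map chain on the FP side and turn it into the single grouped query on the relational side.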