
Limitation of Streams in the past [General discussion] #23

Open
perellonieto opened this issue Aug 22, 2017 · 1 comment

Comments

@perellonieto
Member

perellonieto commented Aug 22, 2017

What are the implications of using Streams of data with a timestamp in the future (or at the present moment)?

Currently, if you ask for a Stream to be computed for the current time, it raises the following exception:

File "some_path/site-packages/hyperstream/stream/stream_instance.py", line 39, in __new__
    raise ValueError("Timestamp {} should not be in the future!".format(timestamp))
ValueError: Timestamp 2017-08-22 14:05:00.308326+00:00 should not be in the future!
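
For reference, here is a minimal self-contained sketch of the kind of guard that produces this error. It is illustrative only, not the actual code in stream_instance.py; the namedtuple layout and the utcnow helper are assumptions.

```python
from collections import namedtuple
from datetime import datetime, timedelta, timezone


def utcnow():
    """Current time as a timezone-aware UTC datetime."""
    return datetime.now(timezone.utc)


class StreamInstance(namedtuple("StreamInstance", ["timestamp", "value"])):
    """Illustrative (timestamp, value) pair that rejects future timestamps."""

    def __new__(cls, timestamp, value):
        # The guard under discussion: timestamps beyond "now" are refused.
        if timestamp > utcnow():
            raise ValueError(
                "Timestamp {} should not be in the future!".format(timestamp))
        return super(StreamInstance, cls).__new__(cls, timestamp, value)


# Reproduces the error above:
# StreamInstance(utcnow() + timedelta(hours=1), value=42)
```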

I can think of three cases where it could be interesting to allow Streams in the future:

  1. Asking a classifier tool to train from now until 1 hour in the future (a rough sketch follows this list).
    • It would be nice to tell a classifier: given the data arriving from a specific source stream, keep training until the specified time.
    • E.g. real-time data from stock exchanges arriving on a real-time stream that keeps yielding values. The model could consume this data whenever it becomes available and train on it.
  2. Some dataset where the timestamps are in the future.
    • I am not sure how plausible this scenario is.
    • But I can imagine someone wanting to use a stream that outputs data carrying some particular future timestamps.
  3. Asking a tool for predictions in the future.
    • A model that makes predictions for the future, given that it has already been trained on past data.
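
For case 1, a rough sketch of what "keep training until a future time" could look like, assuming a hypothetical fetch_latest_batch() source and scikit-learn's SGDClassifier.partial_fit rather than any HyperStream API:

```python
import time
from datetime import datetime, timedelta, timezone

import numpy as np
from sklearn.linear_model import SGDClassifier


def fetch_latest_batch():
    """Hypothetical stand-in for a real-time source stream (e.g. stock ticks)."""
    X = np.random.randn(32, 4)           # 32 new feature vectors
    y = (X[:, 0] > 0).astype(int)        # toy labels
    return X, y


model = SGDClassifier()
end_time = datetime.now(timezone.utc) + timedelta(hours=1)  # train until 1 hour from now

while datetime.now(timezone.utc) < end_time:
    X, y = fetch_latest_batch()
    # Incremental update each time new data arrives, instead of waiting
    # until all the data up to end_time exists before the tool can run.
    model.partial_fit(X, y, classes=[0, 1])
    time.sleep(1)  # poll interval; a push-based stream would remove this
```
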
@tdiethe
Member

tdiethe commented Sep 20, 2017

Interesting cases:
1. If the classifier is incremental/online, then this isn't a problem, as it can just be called with the last weights/latent variable estimates as a parameter (probably stored in a parameter stream; see the sketch below). If it's an offline algorithm that requires a batch of data but is designed to iterate over the data (e.g. SGD), then it could indeed make sense to do this kind of thing rather than wait until all the data is available.
2.+3. I hadn't thought of the forecasting case, but yes, this makes sense too.
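
A minimal sketch of the call pattern in point 1, with a hypothetical online_update function standing in for the tool; the parameter stream is only implied by the weights-in/weights-out signature, and nothing here is HyperStream API:

```python
import numpy as np


def online_update(weights, X, y, lr=0.01):
    """One incremental pass of a plain linear SGD update (squared error).

    `weights` is whatever was stored after the previous call (e.g. in a
    parameter stream); the return value is what would be written back.
    """
    if weights is None:
        weights = np.zeros(X.shape[1])
    for xi, yi in zip(X, y):
        error = xi.dot(weights) - yi
        weights = weights - lr * error * xi   # per-instance gradient step
    return weights


# Each invocation reads the last estimate and writes the new one, so the
# tool never needs the full (partly future) batch of data at once.
weights = None
for _ in range(3):                 # three arrivals of new data
    X = np.random.randn(16, 4)
    y = X @ np.array([1.0, -2.0, 0.5, 0.0])
    weights = online_update(weights, X, y)
```
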

Perhaps you could come up with a couple of simple use cases ... e.g. predicting tomorrow's weather (it can be a dumb predictor that just predicts the same as today), and see what the consequence of removing that constraint would be.
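
A toy version of that use case, assuming a hypothetical persistence_forecast function (again, no HyperStream API involved):

```python
from datetime import datetime, timedelta, timezone


def persistence_forecast(today_timestamp, today_temperature):
    """Dumb forecaster: tomorrow's prediction is just today's observation.

    The interesting part is the returned timestamp, one day in the future --
    exactly the kind of StreamInstance the current check would reject.
    """
    return today_timestamp + timedelta(days=1), today_temperature


now = datetime.now(timezone.utc)
ts, temp = persistence_forecast(now, 21.5)
print("Predicted {}C for {}".format(temp, ts))
```
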

To be honest, I think the main reason the check is there at the moment is just to protect the user from unintentional mistakes, since that was the most common cause of errors.

It's worth noting that by default I think all of the channels are valid up to now(), so you might be able to write to a stream but not read from it.
