Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data Structure in influxdb and timescaledb are different #71

Open
komal-lunkad opened this issue Sep 22, 2019 · 7 comments
Open

Data Structure in influxdb and timescaledb are different #71

komal-lunkad opened this issue Sep 22, 2019 · 7 comments
Labels
question Further information is requested

Comments

@komal-lunkad
Copy link

In my influx measurement, the timestamp tag data type is timestamp and that in timescale it is timestamp with timezone.
Also , there us one more field which is of type string seperated by commas and I want to convert it into an array in timescale.

Please help me in which file should I make the required chanages and how to run the code for the same.

@atanasovskib
Copy link
Contributor

Hello @komal-lunkad. Yes Outflux exports the time column from InfluxDB as timestamptz type. Currently Outflux doesn't allow you to influence the schema much during migration or define your custom transformations. But you can do the following.

  1. Run the migrate command and transfer all your data. (Let's say it creates a hyperatable a with a time column that's timestamptz, and a TEXT column b with values like a,b,c
  2. Modify the time column in TimescaleDB ALTER TABLE a ALTER COLUMN time TYPE timestamp;
  3. Create a new column of type array, and populate it with the string_to_array function built in PostgreSQL: ALTER TABLE a ADD COLUMN b_arr text[]; UPDATE a SET b_arr = string_to_array(b, ',');

@atanasovskib atanasovskib added the question Further information is requested label Sep 23, 2019
@komal-lunkad
Copy link
Author

Thank you for the answer @blagojts.
Currently, we already have the table defined on production with the fields holding the above specified data types. So could you please help with how can we make through the code like if we can change any file or so in the code?

@atanasovskib
Copy link
Contributor

atanasovskib commented Sep 23, 2019

You should take a look at the Transformer interface and the jsoncombiner implementation https://github.com/timescale/outflux/tree/develop/internal/transformation

You can implement your own transformer that will run between extraction from InfluxDB and ingestion in TimescaleDB. The Prepare method returns the desired data set definition after the data goes through the transformer. The Start method is what transforms the actual rows.

Then you just need to invoke it when the migration pipes are created

func (p *pipeService) createTransformers(pipeID string, infConn influx.Client, measure string, inputDb string, conf *MigrationConfig) ([]transformation.Transformer, error) {
depending on some config flags

You might add support for array data types, and that's a bit more tricky (not complex, just have to modify multiple files). Start with https://github.com/timescale/outflux/blob/develop/internal/idrf/data_type.go

IDRF stands for intermediate data representation format and its what Outflux uses to keep track of the data types and which columns are present in a data set. After you add the array data type in the enum you might need to update all functions referencing it

@polmonso
Copy link

@blagojts We have tables in our influxdb with the default timestamp and want to migrate to timescaldb to a table with timestamptz. You said that outflux exports the time column as timestamptz, but that's not what we got when we migrated, we got timestamp.

We have other tables in timescaledb already in timestamptz so we are trying to find out how to best integrate the two to do operations mixing them.

In the code of outflux you do distinguish between timestamp and timestamptz, how can we tell the migration to use timestamptz?

@AlbertoRasillo
Copy link

Following up @polmonso comment, we got this error message:

existing table in target db is not compatible with required. Required column time of type IDRFTimestamp is not compatible with existing type Timestamptz

@polmonso
Copy link

bump?

@ctron
Copy link

ctron commented Aug 12, 2021

I ran into the same issue. I was able to resolve it by letting outflux create the schema, and then alter the time column type afterwards using:

ALTER TABLE temperatures ALTER COLUMN time TYPE TIMESTAMP WITH TIME ZONE USING time AT TIME ZONE 'UTC';

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

5 participants