Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate historical data into A&R system #237

Closed
5 of 7 tasks
Tracked by #269 ...
kgorana opened this issue Oct 20, 2023 · 8 comments
Closed
5 of 7 tasks
Tracked by #269 ...

Migrate historical data into A&R system #237

kgorana opened this issue Oct 20, 2023 · 8 comments
Assignees
Labels

Comments

@kgorana
Copy link

kgorana commented Oct 20, 2023

Description

There are roughly 167,000 records to be brought over. We want to move away from the spreadsheet as the source of truth so it is important that this migration captures the spreadsheet in its entirety.

The migration will have to parse the spreadsheet on a row-by-row basis and create legacy record objects for each row. For some rows, new parks and subareas will also need to be created if they are not yet in the new system.

The record activity type (Frontcountry Camping, Day Use, etc) will have to be inferred from the populated columns in each row. Columns that do not fit into an existing activity type will be captured in a new activity, Legacy Data.

Columns that belong into an existing activity type but do not exist within the new A&R system will be brought over as new legacy attributes in the migrated records. For example: Net Revenue totals are not saved in our new database, but are shown in historical data records. These historical totals should be captured in their own fields, eg legacy_frontcountryCampingPartiesNetRevenue.

Attached are spreadsheets containing field maps between new and historical data.

Handling legacy data in the FE and API will be handled in future tickets. The focus of this ticket is simply getting legacy data into the db.

Acceptance Criteria:

  • All records brought over from historical data spreadsheet
    • Legacy fields added to records where necessary
    • Legacy fields listed in master spreadsheet for tracking purposes
  • New legacy parks and subareas created as necessary
  • New activity type Legacy Data created to capture columns that dont belong to an existing activity
    • config object not needed for Legacy Data - there will never be a case where this record will be empty.
  • All legacy records, parks, and subareas flagged with isLegacy: true boolean and are locked.
  • Create LEGACY PARK & LEGACY SUBAREA (Similar to TEST PARK/TEST SUBAREA) so QA can test legacy items.

Notes

  • Migration progress is visible and can resume in event of interruption
    • May even need to back out of migration altogether
  • The spreadsheet is so large, it may be easier to read as a csv as opposed to xlsx.
  • Proposed Legacy Data object:
    • a corresponding config object probably will not be necessary, since config objects are not associated with a particular date - but this should be inspected. Some calculations may depend on modifiers that are typically stored in the config attribute of a record.
{
  pk: HIST::Legacy Data,
  sk: 202209,
  parkName: LEGACY PARK,
  orcs: HIST,
  date: 202209,
  subAreaId: HIST,
  activity: Legacy Data,
  isLegacy: true,
  legacy_additionalNotes,
  legacy_dataSource
}

Below is a finalized spreadsheet of mapped fields & legacy object models.

historical_data_comparison - Master.xlsx

Linked issues

blocks

created by

is blocked by

is cloned by

@kgorana
Copy link
Author

kgorana commented Oct 20, 2023

Cameron Pettit
April 3, 2023 at 10:39 AM

@JLWade Nope I think Recreation Services Team has done her part for now - there were some failed rows when migrating to dev that I assumed could be fixed before the migration to test/prod, so I sent those to Sally and she’s returned them to me ready to go - I’ve just been caught up with other stuff. I’m just getting back to this stuff now, integrating her changes as we speak. I’ll update the attached spreadsheet with Sally’s changes so we can use it as the master record of which data was used for our prod migrations (we don’t really care to save a record for dev/test).

Once I’ve updated it I will kick off the migration to test so we can keep moving this along.


Jessica Wade
April 3, 2023 at 10:29 AM

@cameronpettit thinking we can see if Sally can help us get some of the prep work done for this ticket - is there anything specific that needs to be done for the other historical data? Do we need sally’s eyes on the attached spreadsheet?


Jessica Wade
December 14, 2022 at 8:47 AM

@janemoxd thats exactly what Cam suggested yesterday too :slight_smile: BRS-971: Migrate historical data from 2017-2019 into A&R system on TEST was cloned for that


Jane Mountain
December 14, 2022 at 8:29 AM

Just adding a note here that a (possibly) sensible approach to this would be to import the data needed for variance triggers and validating by POs (last couple of years), get all of that set up and working first.

Then later, we could import the older data that sounds like it won’t be used very much?


Jessica Wade
November 14, 2022 at 3:51 PM

Before this can be done we need answers to:
what is the purpose of having all this historical data in the A&R system? (2000-2021)
do we need to migrate ALL of it? Or can we split it and only bring over some
is there another BI tool that can be used to analyze this data?
how are you using the data after export?
having comparison data is more important than storage of data
issue with brining it all in is accuracy and lots of areas for error
can we archive any data?
answers from meeting with Jane, Cam, Jeff and Sally on Dec 13, 2022 and put into BRS-971


Jessica Wade
November 7, 2022 at 4:04 PM

even better!


Cameron Pettit
November 7, 2022 at 4:00 PM

Yes, the new system should allow us to make new records for any point in history, we aren’t limited to the system launch day and afterwards. Adding a record (created by the new system) for missing data is something we can already do.


Jessica Wade
November 7, 2022 at 3:51 PM

That makes sense - thank you! It would likely only be for missing data so sounds like we could figure something out.


Cameron Pettit
November 7, 2022 at 3:45 PM

Our stance was to lock all legacy data on import. If I remember correctly, this is because the legacy data is too large/complex to verify that every single entry can be mapped to our new system. If we don’t differentiate that the legacy data was created by some system other than our own, it opens up a lot of potential for issues when we are trying to manipulate the data down the line.

I’d be hesitant to allow any editing to historical records - because then it’s not a historical record anymore. If edits need to be made to a historical record, then a new record should be created with the new system, and then that historical record could be disposed. So theres ways to achieve what you’re asking, but I think we should steer clear of the ‘editing historical records’ path.


Jessica Wade
November 7, 2022 at 3:14 PM

@cameronpettit can you refresh my memory - do these legacy records need to be “locked” upon import? I remember that being something we talked about - and if so, is there a way to unlock them in specific instances?
Use case for this is that some data from the UseDirect era (2019, 2020, 2021) might require POs to enter data that was omitted during that time.

@Dianadec
Copy link
Contributor

Hearing that this would be a complex piece of work.

@LindsayMacfarlane
Copy link

LindsayMacfarlane commented Oct 15, 2024

October 11, 2024 prioritization session: A&R team identified this ticket/issue as something that can wait to be implemented (medium -priority).

Linked to ticket 348 - Old data isn't showing export.

Mural board: https://app.mural.co/t/bcparks2575/m/bcparks2575/1728427370222/4c9714df914c7ec8bdcf20a9ee1999cc9e664a27?wid=0-1728427872927

@LindsayMacfarlane
Copy link

This ticket no longer appears to be blocked. Given this is a high priority for A&R, thinking we can pull this one in and action.

@Dianadec
Copy link
Contributor

Dianadec commented Nov 20, 2024

Standup notes:

  • Cam: previous migration was a lot of work
  • Recommends doing the migration in chunks of one year at a time.
  • Get A&R to partition out the years for us - have them review it

@davidclaveau
Copy link
Collaborator

This is ready to be deployed to TEST, @manuji!

Once we're deployed to TEST, I'll run the migrations for the historical datasets and get all of the data from 2000 - 2016 into the TEST database.

For testing, the data will have the same functionality as legacy data in A&R currently, so the data won't appear in Export reports but will show in the Enter Data section of the app with that historical tag attached to it.

Let me know when you deploy and I'll get started on that right away 😄

@manuji
Copy link

manuji commented Dec 20, 2024

Tested on TEST: Passed

  • Verified that the data in excel sheets have been uploaded to the test environment by spot checking data from each excel sheet.
  • Passing to PO review.

@LindsayMacfarlane
Copy link

Thank you @davidclaveau and to you @manuji for testing. I have not specifically tested and approve :)

@sallyhaynes @nicolebottoms - FYI this is currently in TEST.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Development

No branches or pull requests

6 participants