Data Cleaning Methods

Chris Erickson edited this page Jun 26, 2023 · 3 revisions

Some columns have associated helper functions to assist with data cleaning. However, some column attributes (Deck Archetype, Match Format, Match Type) cannot be derived from GameLog files and must be revised manually using either Missing Match Data or Revise Record(s).

Missing Match Data

Data => Input Missing Match Data

- Matches import with empty P1/P2_Arch, P1/P2_Subarch, (Limited)_Format, Match_Type columns by default.
- Cycle through Matches with empty columns and fill them in manually.
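
As a rough illustration of which Matches this step cycles through, the sketch below filters a list of match records for ones with unfilled columns. It is not the application's own code: the column names in `REQUIRED_COLUMNS`, the treatment of `''` and `'NA'` as "empty", and the helper names `missing_columns` and `matches_needing_input` are all assumptions made for the example.

```python
# Hypothetical column names based on the list above; the real schema may differ.
REQUIRED_COLUMNS = ["P1_Arch", "P2_Arch", "P1_Subarch", "P2_Subarch", "Format", "Match_Type"]

# Assumption: an unfilled column holds an empty string, 'NA', or None.
EMPTY_VALUES = {"", "NA", None}


def missing_columns(match, columns=REQUIRED_COLUMNS):
    """Return the names of the required columns that are still unfilled."""
    return [c for c in columns if match.get(c) in EMPTY_VALUES]


def matches_needing_input(matches):
    """Return only the matches that have at least one unfilled column."""
    return [m for m in matches if missing_columns(m)]


matches = [
    {"Match_ID": 1, "P1_Arch": "Aggro", "P2_Arch": "Control", "P1_Subarch": "Burn",
     "P2_Subarch": "UW Control", "Format": "Modern", "Match_Type": "League"},
    {"Match_ID": 2, "P1_Arch": "", "P2_Arch": "Control", "P1_Subarch": "NA",
     "P2_Subarch": "UW Control", "Format": "Modern", "Match_Type": ""},
]

for m in matches_needing_input(matches):
    print(m["Match_ID"], missing_columns(m))
```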

Missing Game_Winner

Data => Input Missing Game_Winner Data

- The 'Games.Game_Winner' column will be set to 'NA' if the game's winner could not be determined.
- Cycle through affected Games and manually select a Game_Winner based on trailing Game Actions.
- All tables will be automatically updated accordingly.
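
To make the effect of this step concrete, here is a minimal sketch of setting a missing Game_Winner and recomputing the match score from the Games rows. The record layout (`Match_ID`, `Game_Num`, and `'P1'`/`'P2'` as winner values) and the helpers `set_game_winner` and `match_score` are hypothetical; only the `Game_Winner` column and its `'NA'` placeholder come from this page.

```python
def set_game_winner(games, match_id, game_num, winner):
    """Fill in the winner of one game; only 'P1' or 'P2' are accepted here."""
    if winner not in ("P1", "P2"):
        raise ValueError("winner must be 'P1' or 'P2'")
    for game in games:
        if game["Match_ID"] == match_id and game["Game_Num"] == game_num:
            game["Game_Winner"] = winner


def match_score(games, match_id):
    """Count games won by each player; games still marked 'NA' count for neither."""
    p1 = sum(1 for g in games if g["Match_ID"] == match_id and g["Game_Winner"] == "P1")
    p2 = sum(1 for g in games if g["Match_ID"] == match_id and g["Game_Winner"] == "P2")
    return p1, p2


games = [
    {"Match_ID": 7, "Game_Num": 1, "Game_Winner": "P1"},
    {"Match_ID": 7, "Game_Num": 2, "Game_Winner": "NA"},  # winner could not be determined
    {"Match_ID": 7, "Game_Num": 3, "Game_Winner": "P1"},
]

score_before = match_score(games, 7)
set_game_winner(games, 7, 2, "P2")
score_after = match_score(games, 7)
```

Deriving the score from the Games rows, rather than storing it separately, is one simple way to keep dependent values consistent after a manual correction.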

Best Guess Deck Names

Data => Apply Best Guess for Deck Names

- The 'Matches.P1/P2_Subarch' columns will be set to 'NA' by default after importing.
- Import sample decklists and apply best guess deck names in the 'Matches.P1/P2_Subarch' columns.
- Sample decklists from YYYY-MM to YYYY-MM are included and will be updated at the end of every month.

- Clicking 'Apply to All' will overwrite any existing P1/P2_Subarch values.
- Click 'Apply to Unknowns' if you do not wish to overwrite your previous changes to these columns.
- Matches with Format set to Draft/Sealed/Cube will have the deck name set to the colors played (e.g., WU, RG).
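
The difference between 'Apply to All' and 'Apply to Unknowns' can be sketched as follows. This is an illustration only: the matching rule used here (pick the sample deck that shares the most card names) is a stand-in assumption, not necessarily how the application guesses, and the `P1_Cards`/`P2_Cards` fields and both function names are hypothetical.

```python
def best_guess(cards_played, sample_decks):
    """Return the sample deck name with the largest card overlap, or 'NA' if none overlap."""
    played = set(cards_played)
    best_name, best_overlap = "NA", 0
    for name, decklist in sample_decks.items():
        overlap = len(played & set(decklist))
        if overlap > best_overlap:
            best_name, best_overlap = name, overlap
    return best_name


def apply_best_guess(matches, sample_decks, overwrite):
    """overwrite=True mimics 'Apply to All'; overwrite=False mimics 'Apply to Unknowns'."""
    for match in matches:
        for player in ("P1", "P2"):
            column = f"{player}_Subarch"
            if not overwrite and match[column] != "NA":
                continue  # keep values the user has already entered
            match[column] = best_guess(match[f"{player}_Cards"], sample_decks)
    return matches


sample_decks = {
    "Burn": ["Lightning Bolt", "Lava Spike"],
    "Control": ["Counterspell", "Opt"],
}
matches = [{
    "P1_Subarch": "My Brew", "P1_Cards": ["Lightning Bolt"],
    "P2_Subarch": "NA", "P2_Cards": ["Counterspell", "Opt"],
}]

apply_best_guess(matches, sample_decks, overwrite=False)
unknowns_result = (matches[0]["P1_Subarch"], matches[0]["P2_Subarch"])

apply_best_guess(matches, sample_decks, overwrite=True)
all_result = (matches[0]["P1_Subarch"], matches[0]["P2_Subarch"])
```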

Associated Draft_IDs

Data => Apply Associated Draft_IDs to Limited Matches

- Choose whether to cycle through all Limited Matches or only those with Draft_ID set to 'NA'.
- This will cycle through Matches that have been set to Booster Draft or Cube.
- The cards played will be compared against cards picked in each Draft to find Applicable Draft_IDs.
- Choose from the list of Applicable Draft_IDs to apply the Draft_ID to the Match.
- Match results will automatically be applied to the match result columns in the 'Drafts' Table.
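
The comparison of cards played against cards picked could look roughly like the sketch below. It assumes a Draft is applicable when every non-basic-land card played in the Match appears among that Draft's picks; that rule, the exclusion of basic lands, and the function name `applicable_draft_ids` are assumptions for illustration and may not match the application's actual logic.

```python
# Basic lands are added during deck building rather than picked, so this sketch ignores them.
BASIC_LANDS = {"Plains", "Island", "Swamp", "Mountain", "Forest"}


def applicable_draft_ids(cards_played, drafts):
    """Return the Draft_IDs whose picks contain every non-basic card played."""
    played = set(cards_played) - BASIC_LANDS
    return [draft_id for draft_id, picks in drafts.items() if played <= set(picks)]


# Hypothetical data: Draft_ID mapped to the cards picked in that draft.
drafts = {
    "D1": ["Shock", "Opt", "Duress"],
    "D2": ["Shock", "Negate"],
}

candidates = applicable_draft_ids(["Shock", "Opt", "Island"], drafts)
```

When more than one Draft_ID qualifies, the user still chooses from the resulting list, which mirrors the selection step described above.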

Revise Record(s) Button

- Selected row(s) in the 'Matches' table can be manually revised.
- If multiple rows are selected, the revision will apply to all selected rows.
- This is only applicable to rows in the 'Matches' or 'Drafts' tables.

Remove Record(s) Button

- Selected row(s) in the 'Matches' table can be removed from your database.
- All associated Games and Plays data will also be removed.
- Removed Matches can be ignored, meaning they will not be included in future imports.
- This is only applicable to rows in the 'Matches' or 'Drafts' tables.

Input Options File

- Control the dropdown menu options available when making revisions.
- Add or delete options under their respective header.
- Each option MUST be on its own line.
- Do not alter the pre-existing headers in this file.