Match structure of all data points so each DataSeries contain the same keys #1600

Merged
merged 1 commit into from
Nov 7, 2023

Conversation

envex
Collaborator

@envex envex commented Nov 6, 2023

What does this implement/fix?

https://6062ad4a2d14cd0021539c1b-jcjslbrrgq.chromatic.com/?path=/story/polaris-viz-charts-barchart-playground--mis-matched-data

Now that we're allowing consumers to create any type of report, we were running into an issue where the data being fed into BarChart didn't have the same keys in each DataSeries, which would cause the chart to crash.

For example, data would come in like:

[
  {
    name: 'Canada',
    data: [
      {key: 'Mice', value: 13.28},
      {key: 'Dogs', value: 23.43},
      {key: 'Cats', value: 6.64},
      {key: 'Birds', value: 54.47},
    ],
  },
  {
    name: 'United States',
    data: [
      {key: 'Lizards', value: 350.13},
      {key: 'Turtles', value: 223.43},
      {key: 'Mice', value: 15.38},
      {key: 'Snakes', value: 122.68},
      {key: 'Dogs', value: 31.54},
      {key: 'Birds', value: 94.84},
    ],
  },
  {
    name: 'China',
    data: [
      {key: 'Snakes', value: 0},
      {key: 'Dogs', value: 0},
    ],
  },
]

Now we're going to fill in the data so each DataSeries has a key/value pair for every key in the entire data set, not just the keys in that single DataSeries.
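
A minimal sketch of that kind of fill, for illustration only. The `DataPoint` and `DataSeries` shapes and the `fillMissingDataPoints` name are hypothetical, not polaris-viz's actual API, and representing a missing point as `{key, value: null}` is an assumption taken from the snippet quoted in the review comments.

```typescript
// Hypothetical shapes for illustration; the real polaris-viz types may differ.
interface DataPoint {
  key: string;
  value: number | null;
}

interface DataSeries {
  name: string;
  data: DataPoint[];
}

function fillMissingDataPoints(dataSeries: DataSeries[]): DataSeries[] {
  // Union of every key across all series, in first-seen order.
  const allKeys = new Set<string>();
  for (const {data} of dataSeries) {
    for (const {key} of data) {
      allKeys.add(key);
    }
  }
  const orderedKeys = Array.from(allKeys);

  return dataSeries.map(({name, data}) => {
    // Index this series by key so each lookup is a Map read.
    const byKey = new Map<string, DataPoint>();
    for (const point of data) {
      byKey.set(point.key, point);
    }

    return {
      name,
      // Assumption: a missing point is represented as {key, value: null}.
      data: orderedKeys.map((key) => byKey.get(key) ?? {key, value: null}),
    };
  });
}

const filled = fillMissingDataPoints([
  {name: 'Canada', data: [{key: 'Mice', value: 13.28}, {key: 'Dogs', value: 23.43}]},
  {name: 'China', data: [{key: 'Snakes', value: 0}, {key: 'Dogs', value: 0}]},
]);

// Every series now lists Mice, Dogs, Snakes in the same order.
console.log(JSON.stringify(filled, null, 2));
```

In this sketch a real value of 0 is kept as 0; only keys that are absent from a series become null.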

github-actions bot commented Nov 6, 2023

size-limit report 📦

| Path | Size | Loading time (3g) | Running time (snapdragon) | Total time |
| --- | --- | --- | --- | --- |
| polaris-viz-core-cjs | 61.3 KB (0%) | 1.3 s (0%) | 2.3 s (+52.11% 🔺) | 3.5 s |
| polaris-viz-cjs | 211.24 KB (+0.06% 🔺) | 4.3 s (+0.06% 🔺) | 4.3 s (+41.14% 🔺) | 8.5 s |
| polaris-viz-esm | 173.61 KB (+0.06% 🔺) | 3.5 s (+0.06% 🔺) | 2.1 s (+1.75% 🔺) | 5.5 s |
| polaris-viz-css | 4.57 KB (0%) | 92 ms (0%) | 429 ms (-13.55% 🔽) | 520 ms |
| polaris-viz-esnext | 178.71 KB (+0.07% 🔺) | 3.6 s (+0.07% 🔺) | 2.4 s (-1.31% 🔽) | 6 s |

@envex envex force-pushed the envex/fill-mismatched-data branch from 67d0342 to 8ecfc59 Compare November 7, 2023 13:32
@envex envex changed the title from "Playing around with mismatched data" to "Match structure of all data points so each DataSeries contain the same keys" Nov 7, 2023
]);
});

it('loops through a large data set in less than 10ms', () => {
Collaborator Author

I'm not sure if this is even worth testing, but the original implementation below took between 80 and 100ms. The new implementation takes around 4ms.

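// Collect every key that appears in any series.
const allKeys = new Set<string>();

for (const {data} of dataSeries) {
  for (const {key} of data) {
    allKeys.add(`${key}`);
  }
}

// For each series, emit one point per key. data.find() rescans the series
// for every key, which is likely why this version was slow on large data sets.
return dataSeries.map(({name, data}) => ({
  name,
  data: [...allKeys].map((key) => {
    const matchedValue = data.find((item) => item.key === key);
    return matchedValue ? matchedValue : {key, value: null};
  }),
}));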
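
The new implementation is not shown in this thread. One common way to get a speedup of this kind is to replace the per-key `find` scan with a `Map` lookup; the sketch below compares the two, assuming that approach. The helper names and the data size are hypothetical, and the timings it prints will vary by machine.

```typescript
interface DataPoint {
  key: string;
  value: number | null;
}

// Build a synthetic series; the size used below is arbitrary, not the one in the PR's test.
function makePoints(count: number): DataPoint[] {
  const points: DataPoint[] = [];
  for (let i = 0; i < count; i++) {
    points.push({key: `key-${i}`, value: i});
  }
  return points;
}

// Scans the array once per key: on the order of keys.length * points.length comparisons.
function lookupWithFind(points: DataPoint[], keys: string[]): (number | null)[] {
  return keys.map((key) => {
    const match = points.find((point) => point.key === key);
    return match ? match.value : null;
  });
}

// Builds an index once, then each key costs a single Map read.
function lookupWithMap(points: DataPoint[], keys: string[]): (number | null)[] {
  const byKey = new Map<string, number | null>();
  for (const point of points) {
    byKey.set(point.key, point.value);
  }
  return keys.map((key) => {
    const value = byKey.get(key);
    return value === undefined ? null : value;
  });
}

// Runs fn once and logs the elapsed wall-clock time.
function timed<T>(label: string, fn: () => T): T {
  const start = Date.now();
  const result = fn();
  console.log(`${label}: ${Date.now() - start}ms`);
  return result;
}

const samplePoints = makePoints(5000);
// Look the keys up in reverse order so find() cannot always stop early.
const sampleKeys = samplePoints.map(({key}) => key).reverse();

const viaFind = timed('find', () => lookupWithFind(samplePoints, sampleKeys));
const viaMap = timed('Map', () => lookupWithMap(samplePoints, sampleKeys));
```

Both functions return the same values; only the lookup cost differs.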

@envex envex marked this pull request as ready for review November 7, 2023 13:37
Comment on lines 4 to 6
const areAnyDataLengthsDifferent = dataSeries.some(
  ({data}) => data.length !== dataSeries[0].data.length,
);
Member

Relying on this check to see if we need to fill in missing points is going to miss cases where every DataSeries has the same length but the keys are different. I think we'll want to check that the keys are the same across each series instead.

For example, this is a valid case where we'd want to continue, but the check would return early:

[
  {
    name: 'Canada',
    data: [
      {key: 'Cats', value: 6.64},
      {key: 'Birds', value: 54.47},
    ],
  },
  {
    name: 'United States',
    data: [
      {key: 'Lizards', value: 350.13},
      {key: 'Turtles', value: 223.43},
    ],
  },
];
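
To illustrate the gap described above, here is a minimal sketch comparing a length-only check with a key-set check. `haveSameLengths` and `haveSameKeys` are hypothetical helpers, not functions from polaris-viz, and the key-set check compares sets only, so it ignores the order of keys within a series.

```typescript
interface DataPoint {
  key: string;
  value: number | null;
}

interface DataSeries {
  name: string;
  data: DataPoint[];
}

// Length-only check, mirroring the early-return condition under review.
function haveSameLengths(dataSeries: DataSeries[]): boolean {
  return dataSeries.every(
    ({data}) => data.length === dataSeries[0].data.length,
  );
}

// Hypothetical helper: true only when every series has exactly the same set of keys.
function haveSameKeys(dataSeries: DataSeries[]): boolean {
  if (dataSeries.length === 0) {
    return true;
  }
  const reference = new Set(dataSeries[0].data.map(({key}) => key));
  return dataSeries.every(({data}) => {
    const keys = new Set(data.map(({key}) => key));
    // Same size plus every key present in the reference means the sets are equal.
    return (
      keys.size === reference.size && data.every(({key}) => reference.has(key))
    );
  });
}

const sameLengthDifferentKeys: DataSeries[] = [
  {
    name: 'Canada',
    data: [
      {key: 'Cats', value: 6.64},
      {key: 'Birds', value: 54.47},
    ],
  },
  {
    name: 'United States',
    data: [
      {key: 'Lizards', value: 350.13},
      {key: 'Turtles', value: 223.43},
    ],
  },
];

console.log(haveSameLengths(sameLengthDifferentKeys)); // true: the length check sees no mismatch
console.log(haveSameKeys(sameLengthDifferentKeys)); // false: the keys do not match
```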

Collaborator Author

Right, I thought about that but I'm not sure how often we'd hit this case.

I was originally doing an equality check against the next array inside this .some(), but it added about 10ms to the run time.

I wonder if we should just always run the code and not try and bail early.

> I wonder if we should just always run the code and not try and bail early.

Agreed, it's probably worth just running it every time. Extraneous keys would be hard to check for too:

[
  {
    name: 'Canada',
    data: [
      {key: 'Cats', value: 6.64},
      {key: 'Birds', value: 54.47},
    ],
  },
  {
    name: 'United States',
    data: [
      {key: 'Cats', value: 350.13},
      {key: 'Birds', value: 223.43},
      {key: 'Snakes', value: 100.00},
    ],
  },
];

Member

Hmm ... I always just assume that if there is a data edge case, there are going to be merchants out there who will see it 🤷‍♀️

I'm team "let's not bail early"; we can always come back and optimize it later. Curious what others think, though.

Collaborator Author

Sounds good, I removed the early bailout.

@envex envex force-pushed the envex/fill-mismatched-data branch from 160a053 to 0e54a1c Compare November 7, 2023 15:52

@bencmilton bencmilton left a comment

Looks good to me 🚀

@envex envex merged commit 5d29795 into main Nov 7, 2023
4 checks passed
@envex envex deleted the envex/fill-mismatched-data branch September 12, 2024 18:45