SPIKE: Performance Measurement #10432

Closed · 33 tasks
cholly75 opened this issue Jul 26, 2024 · 4 comments

cholly75 (Collaborator) commented Jul 26, 2024

This is a spike to inform potential future stories about ways to improve performance in DAWSON. Although we currently have some metrics we can observe in AWS, we believe we can enhance the user experience in DAWSON by identifying areas of opportunity on the front end, instrumenting DAWSON to measure performance, and establishing standards and goals for future development.

The objectives of this spike:

  • Identify common/key design and implementation patterns (good and bad) that influence performance
  • Identify potential avenues for instrumenting DAWSON to measure performance at the individual-user and overall-system levels (see the sketch after this list)
    • 3rd party libraries
    • Custom application code
    • Other
  • Identify key areas of opportunity for performance gains (scope: user-impacting performance)
  • Present and discuss findings with the wider team
  • Timebox the spike to 2 weeks
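
As a concrete illustration of the front-end/3rd-party-library avenue, here is a minimal sketch of capturing user-impacting metrics in the browser with the open-source web-vitals library. This is not DAWSON code: the /performance-metrics endpoint and payload shape are made up for illustration, and any real endpoint would need to live on Court infrastructure per the notes below.

```ts
// Sketch: collect user-impacting Web Vitals (LCP, INP, CLS, TTFB) and post
// them to a Court-hosted endpoint. Endpoint path and payload are hypothetical.
import { onCLS, onINP, onLCP, onTTFB, type Metric } from 'web-vitals';

function reportMetric(metric: Metric): void {
  const body = JSON.stringify({
    name: metric.name,   // 'LCP', 'INP', 'CLS', or 'TTFB'
    value: metric.value, // milliseconds (CLS is unitless)
    id: metric.id,       // unique per page load
    page: window.location.pathname,
  });

  // sendBeacon survives page unloads; fall back to fetch if unavailable.
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/performance-metrics', body);
  } else {
    fetch('/performance-metrics', { method: 'POST', body, keepalive: true });
  }
}

onCLS(reportMetric);
onINP(reportMetric);
onLCP(reportMetric);
onTTFB(reportMetric);
```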

Pre-Conditions

Acceptance Criteria

Notes

  • Any potential solutions should avoid storing measured data on 3rd party/non-Court infrastructure
  • Self-hosted 3rd party apps may be acceptable as long as data stays on the Court side
  • Need to figure out what current measurement avenues are
  • Need to figure out where to store documentation

Tasks

Test Cases

Story Definition of Ready (updated on 12/23/22)

The following criteria must be met in order for the user story to be picked up by the Flexion development team.
The user story must:

  • Be framed in terms of a business/user need, with the value addressed
  • Include acceptance criteria
  • Be refined
  • Have its preconditions satisfied

Process:
Flexion developers and designers will test whether the story meets the acceptance criteria and test cases in Flexion dev and staging environments (“standard testing”). If additional acceptance criteria or testing scenarios are discovered while the story is in progress, a new story should be created, added to the backlog, and prioritized by the product owner.

Definition of Done (Updated 5-19-22)

Product Owner

UX

  • Business test scenarios have been refined to meet all acceptance criteria
  • Usability has been validated
  • Wiki has been updated (if applicable)
  • Story has been tested on a mobile device (for external users only)

Engineering

  • Automated test scripts have been written, including visual tests for newly added PDFs.
  • Field-level and page-level validation errors (front-end and server-side) are integrated and functioning.
  • Verify that the docket record language is identical for internal and external users.
  • New screens have been added to pa11y scripts.
  • All new functionality verified to work with keyboard and macOS voiceover https://www.apple.com/voiceover/info/guide/_1124.html.
  • READMEs, other appropriate docs, and swagger/APIs fully updated.
  • UI should be touch-optimized and responsive for external users only (functions on supported mobile devices and is optimized for screen sizes as required).
  • Interactors should validate entities before calling persistence methods.
  • Code refactored for clarity and to remove any known technical debt.
  • If new docket entries have been added as seed data to efcms-local.json, 3 local s3 files corresponding to that docketEntryId have been added to web-api/storage/fixtures/s3/noop-documents-local-us-east-1
  • Acceptance criteria for the story have been met.
  • If there are special instructions in order to deploy into the next environment, add them as a comment in the story.
  • If the work completed for the story requires a reindex without a migration, or any other special deploy steps, apply these changes to the following flexion branches:
    • experimental1
    • experimental2
    • experimental3
    • experimental4
    • experimental5
    • experimental6
    • develop
  • Reviewed by UX on a deployed environment.
  • Reviewed by PO on a deployed environment. Can be deployed to the Court's test environment if prod-like data is required. Otherwise deployed to any experimental environment.
  • Deployed to the Court's staging environment.
katiecissell commented Jul 30, 2024

Pre-refinement/refinement notes:

  • If we self-host a third-party service instead of using a hosted one, is that OK? Yes, as long as the third party doesn't get our data.
  • Do we want to restrict this to user-impacting performance only, or other kinds as well? Fine with narrowing to user-impacting performance.
  • Could we go over what metrics/tools we have today? Yes, though we may not have anything beyond Kibana/OpenSearch, and those don't log everything.
  • Where will what we find/decide on in this ticket be documented? Google Drive, perhaps?
  • How will we enforce/monitor any ideas/decisions made during this spike? Depends on the nature of the decision; we'll have to talk about it.
  • When is the spike considered done? Timebox to two weeks.
  • Do we only want to measure moving forward, or do we need to collect data retroactively? Generally, use whatever data we have; if we think more will be useful, we'll figure out a way to get it.

En-8 (Collaborator) commented Jul 31, 2024

I've had a pretty positive experience in the past using OpenTelemetry, which may be worth considering here (https://opentelemetry.io/docs/what-is-opentelemetry/). Some of it may be redundant with metrics we already collect, but even so it may inform ways of thinking about performance measurement, observability, and metrics.
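
For reference, here is a minimal sketch of what Node-side OpenTelemetry tracing looks like in TypeScript. It is not tied to DAWSON's current setup: the collector URL assumes a self-hosted OTLP collector so data stays on Court infrastructure, and the timedOperation helper is a made-up example of wrapping work in a span.

```ts
// Sketch: export trace spans to a self-hosted OTLP collector (URL is a placeholder).
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { trace } from '@opentelemetry/api';

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces', // hypothetical Court-hosted collector
  }),
});
sdk.start();

const tracer = trace.getTracer('dawson-api');

// Wrap any async operation in a span so its duration is recorded and exported.
export async function timedOperation<T>(name: string, fn: () => Promise<T>): Promise<T> {
  return tracer.startActiveSpan(name, async span => {
    try {
      return await fn();
    } finally {
      span.end();
    }
  });
}
```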

cruzjone-flexion self-assigned this Aug 1, 2024
zachrog self-assigned this Aug 13, 2024
zachrog (Collaborator) commented Aug 15, 2024

What performance questions are we trying to answer?

  • What actions are taking the longest to complete?
  • What sequences are taking the longest to complete?
  • Over time, can I see increases or decreases in the duration of sequences?
    • Especially for seeing the impact of performance optimizations (a rough browser-side timing sketch follows this list).
  • How long do websocket requests take to complete? (time from socket open to completion notification)
  • How long are API requests taking to complete?
  • How long does it take to make DB queries and OpenSearch queries?
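
A rough browser-side sketch of how sequence durations could be captured with the standard Performance API; the sequence names, the console reporting, and where the marks get called are all placeholders rather than anything wired into DAWSON today.

```ts
// Sketch: time a named front-end sequence with performance.mark/measure.
// console.info stands in for whatever reporting backend the spike settles on.
export function markSequenceStart(sequenceName: string): void {
  performance.mark(`${sequenceName}:start`);
}

export function markSequenceEnd(sequenceName: string): void {
  performance.mark(`${sequenceName}:end`);
  performance.measure(sequenceName, `${sequenceName}:start`, `${sequenceName}:end`);

  const entries = performance.getEntriesByName(sequenceName, 'measure');
  const latest = entries[entries.length - 1];
  if (latest) {
    console.info(`${sequenceName} took ${latest.duration.toFixed(1)}ms`);
  }

  // Clear buffers so repeated sequences don't accumulate entries.
  performance.clearMarks(`${sequenceName}:start`);
  performance.clearMarks(`${sequenceName}:end`);
  performance.clearMeasures(sequenceName);
}
```

Usage would be something like markSequenceStart('someSequence') when the sequence begins and markSequenceEnd('someSequence') when its completion notification arrives (names here are placeholders).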

zachrog (Collaborator) commented Aug 26, 2024

After talking with the team, we have decided to try both AWS RUM and custom Kibana logging for performance measurements. All of the work for this spike is on three branches (a rough RUM client-initialization sketch follows the list):

  • AWS RUM: rum-experiment
  • Custom Kibana logging: 10432-performance-to-info-cluster
  • Cron Lambda w/ slack notification: 10432-report-cron-lambda
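
For anyone picking this up later, client-side RUM initialization generally looks like the sketch below, using the public aws-rum-web package. The IDs, region, and telemetry choices are placeholders and not necessarily what is on the rum-experiment branch.

```ts
// Sketch: initialize the CloudWatch RUM web client (aws-rum-web).
// Application ID, identity pool ID, and region are placeholders.
import { AwsRum, AwsRumConfig } from 'aws-rum-web';

const config: AwsRumConfig = {
  sessionSampleRate: 1, // sample every session while experimenting
  identityPoolId: 'us-east-1:00000000-0000-0000-0000-000000000000',
  endpoint: 'https://dataplane.rum.us-east-1.amazonaws.com',
  telemetries: ['performance', 'errors', 'http'],
  allowCookies: true,
  enableXRay: false,
};

export const awsRum = new AwsRum(
  '00000000-0000-0000-0000-000000000000', // RUM app monitor ID (placeholder)
  '1.0.0',                                // application version
  'us-east-1',                            // AWS region
  config,
);
```

Since RUM data lands in CloudWatch within the Court's own AWS account, this appears compatible with the data-residency note in the issue body.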

zachrog closed this as completed Aug 26, 2024