SPIKE: Performance Measurement #10432

Closed · 33 tasks
cholly75 opened this issue Jul 26, 2024 · 4 comments

cholly75 (Collaborator) commented Jul 26, 2024

This is a spike to inform potential future stories about ways to improve performance in DAWSON. Although we currently have some metrics we can observe in AWS, we believe we can enhance the user experience in DAWSON by identifying areas of opportunity on the front end, instrumenting DAWSON to measure performance, and establishing standards and goals for future development.

The objectives of this spike:

  • Identify common/key design and implementation patterns (good and bad) that influence performance
  • Identify potential avenues for instrumenting DAWSON to measure performance at the individual-user and overall-system levels (see the sketch after this list)
    • 3rd party libraries
    • Custom application code
    • Other
  • Identify key areas of opportunity for performance gains (scope: user-impacting performance)
  • Present and discuss findings with the wider team
  • Timebox the spike to 2 weeks
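
As a concrete illustration of the front-end/3rd-party-library avenue, here is a minimal sketch of capturing user-impacting metrics in the browser with the open-source web-vitals library. This is not DAWSON code: the /performance-metrics endpoint and payload shape are made up for illustration, and any real endpoint would need to live on Court infrastructure per the notes below.

```ts
// Sketch: collect user-impacting Web Vitals (LCP, INP, CLS, TTFB) and post
// them to a Court-hosted endpoint. Endpoint path and payload are hypothetical.
import { onCLS, onINP, onLCP, onTTFB, type Metric } from 'web-vitals';

function reportMetric(metric: Metric): void {
  const body = JSON.stringify({
    name: metric.name,   // 'LCP', 'INP', 'CLS', or 'TTFB'
    value: metric.value, // milliseconds (CLS is unitless)
    id: metric.id,       // unique per page load
    page: window.location.pathname,
  });

  // sendBeacon survives page unloads; fall back to fetch if unavailable.
  if (navigator.sendBeacon) {
    navigator.sendBeacon('/performance-metrics', body);
  } else {
    fetch('/performance-metrics', { method: 'POST', body, keepalive: true });
  }
}

onCLS(reportMetric);
onINP(reportMetric);
onLCP(reportMetric);
onTTFB(reportMetric);
```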

Pre-Conditions

Acceptance Criteria

Notes

  • Any potential solutions should avoid storing measured data on 3rd party/non-Court infrastructure
  • Self-hosted 3rd party apps may be acceptable as long as data stays on the Court side
  • Need to figure out what current measurement avenues are
  • Need to figure out where to store documentation

Tasks

Test Cases

Story Definition of Ready (updated on 12/23/22)

The following criteria must be met in order for the user story to be picked up by the Flexion development team.
The user story must:

  • Be framed in terms of a business/user need, with the value addressed
  • Include acceptance criteria
  • Be refined
  • Have its preconditions satisfied

Process:
Flexion developers and designers will test whether the story meets the acceptance criteria and test cases in Flexion dev and staging environments (“standard testing”). If additional acceptance criteria or testing scenarios are discovered while the story is in progress, a new story should be created, added to the backlog, and prioritized by the product owner.

Definition of Done (Updated 5-19-22)

Product Owner

UX

  • Business test scenarios have been refined to meet all acceptance criteria
  • Usability has been validated
  • Wiki has been updated (if applicable)
  • Story has been tested on a mobile device (for external users only)

Engineering

  • Automated test scripts have been written, including visual tests for newly added PDFs.
  • Field-level and page-level validation errors (front-end and server-side) are integrated and functioning.
  • Verify that the docket record language is identical for internal and external users.
  • New screens have been added to pa11y scripts.
  • All new functionality verified to work with keyboard and macOS voiceover https://www.apple.com/voiceover/info/guide/_1124.html.
  • READMEs, other appropriate docs, and swagger/APIs fully updated.
  • UI should be touch-optimized and responsive for external users only (functions on supported mobile devices and is optimized for screen sizes as required).
  • Interactors should validate entities before calling persistence methods.
  • Code refactored for clarity and to remove any known technical debt.
  • If new docket entries have been added as seed data to efcms-local.json, 3 local s3 files corresponding to that docketEntryId have been added to web-api/storage/fixtures/s3/noop-documents-local-us-east-1
  • Acceptance criteria for the story have been met.
  • If there are special instructions in order to deploy into the next environment, add them as a comment in the story.
  • If the work completed for the story requires a reindex without a migration, or any other special deploy steps, apply these changes to the following flexion branches:
    • experimental1
    • experimental2
    • experimental3
    • experimental4
    • experimental5
    • experimental6
    • develop
  • Reviewed by UX on a deployed environment.
  • Reviewed by PO on a deployed environment. Can be deployed to the Court's test environment if prod-like data is required. Otherwise deployed to any experimental environment.
  • Deployed to the Court's staging environment.
katiecissell commented Jul 30, 2024

Pre-refinement/refinement notes:

  • If we self-host a third-party service instead of using a hosted one, is that OK? Yes, as long as the third party doesn't get our data.
  • Do we want to restrict this to user-impacting performance only, or other kinds as well? Fine with narrowing to user-impacting performance.
  • Could we go over what metrics/tools we have today? Yes, though we may not have anything beyond Kibana/OpenSearch, and those don't log everything.
  • Where will what we find/decide on in this ticket be documented? Google Drive, perhaps?
  • How will we enforce/monitor any ideas/decisions made during this spike? Depends on the nature of the decision; we'll have to talk about it.
  • When is the spike considered done? Timebox to two weeks.
  • Do we only want to measure moving forward, or do we need to collect data retroactively? Generally, use whatever data we have; if we think more will be useful, we'll figure out a way to get it.

En-8 (Collaborator) commented Jul 31, 2024

I've had a pretty positive experience in the past using OpenTelemetry, which may be worth considering here (https://opentelemetry.io/docs/what-is-opentelemetry/). Some of it may be redundant with metrics we already collect, but even so it may inform ways of thinking about performance measurement, observability, and metrics.
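
For reference, here is a minimal sketch of what Node-side OpenTelemetry tracing looks like in TypeScript. It is not tied to DAWSON's current setup: the collector URL assumes a self-hosted OTLP collector so data stays on Court infrastructure, and the timedOperation helper is a made-up example of wrapping work in a span.

```ts
// Sketch: export trace spans to a self-hosted OTLP collector (URL is a placeholder).
import { NodeSDK } from '@opentelemetry/sdk-node';
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
import { trace } from '@opentelemetry/api';

const sdk = new NodeSDK({
  traceExporter: new OTLPTraceExporter({
    url: 'http://localhost:4318/v1/traces', // hypothetical Court-hosted collector
  }),
});
sdk.start();

const tracer = trace.getTracer('dawson-api');

// Wrap any async operation in a span so its duration is recorded and exported.
export async function timedOperation<T>(name: string, fn: () => Promise<T>): Promise<T> {
  return tracer.startActiveSpan(name, async span => {
    try {
      return await fn();
    } finally {
      span.end();
    }
  });
}
```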

cruzjone-flexion self-assigned this Aug 1, 2024
zachrog self-assigned this Aug 13, 2024
zachrog (Collaborator) commented Aug 15, 2024

What performance questions are we trying to answer?

  • What actions are taking the longest to complete?
  • What sequences are taking the longest to complete?
  • Over time, can I see increases or decreases in the duration of sequences?
    • Especially for seeing the impact of performance optimizations (a rough browser-side timing sketch follows this list).
  • How long do websocket requests take to complete? (time from socket open to completion notification)
  • How long are API requests taking to complete?
  • How long does it take to make DB queries and OpenSearch queries?
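
A rough browser-side sketch of how sequence durations could be captured with the standard Performance API; the sequence names, the console reporting, and where the marks get called are all placeholders rather than anything wired into DAWSON today.

```ts
// Sketch: time a named front-end sequence with performance.mark/measure.
// console.info stands in for whatever reporting backend the spike settles on.
export function markSequenceStart(sequenceName: string): void {
  performance.mark(`${sequenceName}:start`);
}

export function markSequenceEnd(sequenceName: string): void {
  performance.mark(`${sequenceName}:end`);
  performance.measure(sequenceName, `${sequenceName}:start`, `${sequenceName}:end`);

  const entries = performance.getEntriesByName(sequenceName, 'measure');
  const latest = entries[entries.length - 1];
  if (latest) {
    console.info(`${sequenceName} took ${latest.duration.toFixed(1)}ms`);
  }

  // Clear buffers so repeated sequences don't accumulate entries.
  performance.clearMarks(`${sequenceName}:start`);
  performance.clearMarks(`${sequenceName}:end`);
  performance.clearMeasures(sequenceName);
}
```

Usage would be something like markSequenceStart('someSequence') when the sequence begins and markSequenceEnd('someSequence') when its completion notification arrives (names here are placeholders).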

zachrog (Collaborator) commented Aug 26, 2024

After talking with the team, we have decided to try both AWS RUM and custom Kibana logging for performance measurements. All of the work for this spike is on three branches (a rough RUM client-initialization sketch follows the list):

  • AWS RUM: rum-experiment
  • Custom Kibana logging: 10432-performance-to-info-cluster
  • Cron Lambda w/ slack notification: 10432-report-cron-lambda
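
For anyone picking this up later, client-side RUM initialization generally looks like the sketch below, using the public aws-rum-web package. The IDs, region, and telemetry choices are placeholders and not necessarily what is on the rum-experiment branch.

```ts
// Sketch: initialize the CloudWatch RUM web client (aws-rum-web).
// Application ID, identity pool ID, and region are placeholders.
import { AwsRum, AwsRumConfig } from 'aws-rum-web';

const config: AwsRumConfig = {
  sessionSampleRate: 1, // sample every session while experimenting
  identityPoolId: 'us-east-1:00000000-0000-0000-0000-000000000000',
  endpoint: 'https://dataplane.rum.us-east-1.amazonaws.com',
  telemetries: ['performance', 'errors', 'http'],
  allowCookies: true,
  enableXRay: false,
};

export const awsRum = new AwsRum(
  '00000000-0000-0000-0000-000000000000', // RUM app monitor ID (placeholder)
  '1.0.0',                                // application version
  'us-east-1',                            // AWS region
  config,
);
```

Since RUM data lands in CloudWatch within the Court's own AWS account, this appears compatible with the data-residency note in the issue body.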

zachrog closed this as completed Aug 26, 2024