Provide an anomaly detection mechanism in the dashboard #227

Open
GoutsmitSam opened this issue Nov 7, 2024 · 4 comments

Comments

@GoutsmitSam
Contributor

GoutsmitSam commented Nov 7, 2024

Scenario

  • Alert the user if an unusual number of errors and/or messages occurs during a given time period.
  • Alerting can be either a direct visualisation in the dashboard or a new type of Custom Alert within the Dashboard (or both).

Input:

  • Usual number of errors and messages per flow per period (to refine)

Output

  • Visualisation in the dashboard
  • New type of Custom Alert
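
As a rough illustration of the scenario above, the "usual number" of errors per flow per period could serve as a baseline against which the current period is compared. The sketch below is only one possible interpretation: the function name `is_unusual`, the threshold of 3 standard deviations, and the sample counts are all assumptions, not something specified in this issue.

```python
from statistics import mean, stdev

def is_unusual(history, current, threshold=3.0):
    """Return True if `current` deviates from the historical mean
    by more than `threshold` sample standard deviations."""
    if len(history) < 2:
        return False  # not enough data to estimate a spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly constant history: any different value is unusual.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical hourly error counts for a single flow (made-up numbers).
usual_errors = [2, 3, 1, 2, 4, 3, 2, 3]

print(is_unusual(usual_errors, 3))   # False: within the normal range
print(is_unusual(usual_errors, 15))  # True: far above the baseline
```

A mean/standard-deviation baseline is easy to explain in a dashboard, but it ignores daily or weekly patterns, so it would likely need refinement once the input data is better understood.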

Image

@GoutsmitSam
Contributor Author

07/11: During the meeting with Ruud and Perrine, two broad options were discussed for how this could be set up.

Some more research is needed to determine which one would be best in terms of features and price.

@grik001
Collaborator

grik001 commented Nov 12, 2024

I have attached two files: MergedParentFlow and WorkFlowEvent.

For the current alerts, we push the flowId and timestamp to Log Analytics. A heartbeat signal is sent every x minutes to indicate that the flow is active. Other logs are sent only when an error is detected in the ImportJob, for each flow individually.
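
To make the heartbeat idea concrete, here is a minimal sketch of how missing heartbeats could be spotted from a list of timestamps. The interval "x" is not given in this thread, so the 5-minute value in the example is purely hypothetical, as is the function name `missed_heartbeats`.

```python
from datetime import datetime, timedelta

def missed_heartbeats(heartbeats, interval, tolerance=timedelta(0)):
    """Return (previous, next) timestamp pairs whose gap exceeds
    interval + tolerance, i.e. where at least one heartbeat is missing."""
    ordered = sorted(heartbeats)
    limit = interval + tolerance
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]

# Hypothetical heartbeats for one flow; 5 minutes stands in for "x".
start = datetime(2024, 11, 12, 9, 0)
beats = [start + timedelta(minutes=m) for m in (0, 5, 10, 25, 30)]

for previous, following in missed_heartbeats(beats, timedelta(minutes=5)):
    print(f"No heartbeat between {previous:%H:%M} and {following:%H:%M}")
```

In practice a small `tolerance` would probably be needed to absorb ingestion delay.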

We have highlighted the most important fields in both files, but we recommend reviewing the entire schema in detail. Please don’t hesitate to reach out if you have any questions.

MergedParentFlow:

This is the parent flow, which in the database is a single document in the Flows collection (a single row on the dashboard's Flow page). We currently use ONLY the merged flows for both heartbeat and error detection.

StatusID - used to identify if the flow has failed, completed or is active
ChainID - used as a unique identifier for the document and to group Workflow events together
EventTimestamp - the start time that the flow was triggered
FlowId - an array of matching FlowIds
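
One way the highlighted fields could feed anomaly detection is by counting failed parent flows per flow per time bucket. The sketch below assumes a simplified in-memory shape for the documents; the actual schema is in the attached files, and the numeric value used for a failed `StatusID` is a placeholder, since the real status codes are not listed in this thread.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime

# ASSUMPTION: placeholder value; the real StatusID codes are not given here.
FAILED_STATUS = 3

@dataclass
class MergedParentFlow:
    chain_id: str              # ChainID
    status_id: int             # StatusID
    event_timestamp: datetime  # EventTimestamp
    flow_ids: list             # FlowId (array of matching FlowIds)

def failures_per_flow_per_hour(docs):
    """Count failed parent flows, keyed by (flow_id, start of the hour)."""
    counts = Counter()
    for doc in docs:
        if doc.status_id != FAILED_STATUS:
            continue
        bucket = doc.event_timestamp.replace(minute=0, second=0, microsecond=0)
        for flow_id in doc.flow_ids:
            counts[(flow_id, bucket)] += 1
    return counts

# Made-up sample documents for illustration only.
docs = [
    MergedParentFlow("c1", 3, datetime(2024, 11, 12, 9, 5), ["flow-a"]),
    MergedParentFlow("c2", 3, datetime(2024, 11, 12, 9, 40), ["flow-a", "flow-b"]),
    MergedParentFlow("c3", 1, datetime(2024, 11, 12, 9, 50), ["flow-a"]),
    MergedParentFlow("c4", 3, datetime(2024, 11, 12, 10, 15), ["flow-b"]),
]

for (flow_id, hour), n in sorted(failures_per_flow_per_hour(docs).items()):
    print(flow_id, hour.isoformat(), n)
```

The same bucketing would apply to message counts by dropping the status filter.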

WorkFlowEvent:

This represents the chain of Logic Apps associated with a specific parent flow. We’ve included this sample in case you would like to base your anomaly detection on more granular data.

EventTimestamp - the start time that the LogicApp was triggered
StatusID - used to identify if the specific LogicApp has failed, completed or is active

Feel free to ping us if you have further questions.

WorkFlowEvent.txt
MergedParentFlow.txt

@GoutsmitSam
Contributor Author

@ruud-wichers-schreur @PerrineDeBrabant Do you have enough information on the data we could deliver for this anomaly detection, or would you need more specific information, in order to start investigating whether ADX or AI Studio would be the best way forward with this?

@ruud-wichers-schreur

ruud-wichers-schreur commented Nov 13, 2024

I will read this in more detail later this week, but since you already query Log Analytics with KQL, we could possibly also use KQL's built-in machine learning operators, functions and plugins for time series analysis, anomaly detection, forecasting, and root cause analysis.

The approach would be to first build a time series and then find anomalies in it.

https://learn.microsoft.com/en-us/azure/azure-monitor/logs/kql-machine-learning-azure-monitor
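
The two steps mentioned above (build a time series, then find anomalies in it) can be sketched outside KQL to show the idea. In KQL this would presumably map to the `make-series` operator followed by a function such as `series_decompose_anomalies()`; the Python below is not that implementation. It is a simplified stand-in that applies Tukey fences directly to the raw counts and ignores trend and seasonality. The function names and sample data are hypothetical.

```python
from datetime import datetime, timedelta
from statistics import quantiles

def make_series(timestamps, start, end, step):
    """Count events per fixed-width bin in [start, end); empty bins are 0."""
    series = [0] * ((end - start) // step)
    for ts in timestamps:
        if start <= ts < end:
            series[(ts - start) // step] += 1
    return series

def tukey_anomalies(series, k=1.5):
    """Return indices of points outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(series, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, value in enumerate(series) if value < low or value > high]

# Step 1: bin made-up error timestamps into hourly counts.
start = datetime(2024, 11, 12, 0, 0)
step = timedelta(hours=1)
events = [start + timedelta(minutes=m) for m in (5, 50, 121)]
print(make_series(events, start, start + 3 * step, step))  # [2, 0, 1]

# Step 2: flag unusual bins in a (made-up) longer series of hourly counts.
hourly_errors = [2, 3, 2, 4, 3, 2, 20, 3]
print(tukey_anomalies(hourly_errors))  # [6]
```

Filling empty bins with zero matters here: a period with no errors is a data point, not a gap.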
