
AWS S3 CloudWatch connector can only be used with a single CloudWatch log group #9056

Closed
paulschwarzenberger opened this issue Sep 20, 2023 · 13 comments
Labels: Connector (Connector specialty review needed) · enhancement (New feature or request)

Comments

paulschwarzenberger commented Sep 20, 2023

Describe the bug
The Sentinel AWS S3 CloudWatch connector uses a Sentinel data table for CloudWatch, which only has two columns that can be populated by the ingested event: timestamp and message. This works fine for a single CloudWatch log group, but most AWS customers have 100s or 1000s of CloudWatch log groups. There's currently no way to tell which AWS account the event is coming from or which log group or log stream is generating the event.

It's therefore not practical to create meaningful alerts or incidents, e.g. a particular event occurred in the production AWS account.

For this to be a workable solution for enterprise customers, 3 additional columns should be added to the CloudWatch data table: AWS Account ID, CloudWatch log group name, CloudWatch log stream name.

To Reproduce
Steps to reproduce the behavior:

  1. Follow the Microsoft instructions to set up a connection from an AWS CloudWatch log group to Sentinel
  2. Add a second Lambda function for a second log group
  3. Run a query against the AWSCloudWatch table in Sentinel
  4. View the resultant logs
  5. There is no way of telling which AWS account, log group or log stream is generating the event

Expected behavior

  • AWS Account ID, CloudWatch log group name, CloudWatch log stream name should be added as extra CloudWatch data table columns
  • Documentation and code examples for log ingestion from AWS should be updated to be consistent with this

Screenshots

CloudWatch events ingested to Sentinel:
[screenshot: 230920-sentinel-cloudwatch-table]

Additional context
The documentation and example code do not describe an efficient architecture for an enterprise: a separate Lambda function would be required for every CloudWatch log group. The approach recommended by AWS is to use CloudWatch Log Subscription Filters and a Kinesis stream delivering to a single S3 bucket, and then a single Lambda function to transform the data into the correct format for Sentinel. Please contact me via LinkedIn if you'd like a demo of how we've implemented this.
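As a minimal boto3 sketch of the subscription-filter side, attaching an empty-pattern filter to every log group in an account might look like the following; it assumes the Firehose delivery stream and the IAM role that lets CloudWatch Logs write to it already exist, and the ARNs and filter name below are placeholders:

import boto3

logs = boto3.client("logs")

# Placeholders - substitute the ARNs of your own delivery stream and the IAM role
# that allows CloudWatch Logs to put records into it.
FIREHOSE_ARN = "arn:aws:firehose:eu-west-2:111111111111:deliverystream/cloudwatch-to-sentinel"
CWL_ROLE_ARN = "arn:aws:iam::111111111111:role/cloudwatch-to-firehose"

# Attach an empty-pattern subscription filter (forward every event) to each
# log group, all pointing at the same Firehose delivery stream.
paginator = logs.get_paginator("describe_log_groups")
for page in paginator.paginate():
    for group in page["logGroups"]:
        logs.put_subscription_filter(
            logGroupName=group["logGroupName"],
            filterName="to-sentinel",
            filterPattern="",
            destinationArn=FIREHOSE_ARN,
            roleArn=CWL_ROLE_ARN,
        )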

However, this is a peripheral issue; the most important point is the need for three additional columns in the CloudWatch data table.

@github-actions
Contributor

Thank you for submitting an Issue to the Azure Sentinel GitHub repo! You should expect an initial response to your Issue from the team within 5 business days. Note that this response may be delayed during holiday periods. For urgent, production-affecting issues please raise a support ticket via the Azure Portal.

@v-amolpatil added the Connector (Connector specialty review needed) label on Sep 20, 2023
@v-sudkharat
Contributor

Hello @paulschwarzenberger, thanks for flagging this issue; we will get back to you on it soon. Thanks!

@v-sudkharat added the enhancement (New feature or request) label on Sep 26, 2023
@v-sudkharat
Contributor

Hi @paulschwarzenberger, we are checking this issue with the team and will share an update with you. Thanks!

@bobsyourmom

bobsyourmom commented Oct 5, 2023

The script also has hardcoded timestamps.
Shouldn't it be written to use time offsets?
e.g.:

from datetime import datetime, timedelta

END_TIME_UTC = datetime.utcnow()
START_TIME_UTC = END_TIME_UTC - timedelta(minutes=10)

@v-sudkharat
Contributor

v-sudkharat commented Oct 9, 2023

Hi @paulschwarzenberger, as this is an enhancement request, we have received an update from the data collection team: CloudWatch logs have a flexible schema (for example, the message field contains the informative content). We would like to extract the log group, log stream, and account name and direct them to specific columns, but it seems difficult to establish a universal method of extracting this information directly from the message itself, given that each customer's CloudWatch logs may vary.
So, could you please configure a Data Collection Rule for those additional columns and share your feedback with us?
Thanks!

@v-sudkharat
Contributor

Hi @paulschwarzenberger, hope you are doing well. We are waiting for your response to the comment above. Thanks!

@paulschwarzenberger
Author

In our opinion, Microsoft Sentinel should provide a connector solution for AWS CloudWatch logs which is highly scalable and doesn't require custom configuration by each customer.

As I described, the current solution isn't usable in practice because it only works with a single CloudWatch log group whereas all AWS customers have many CloudWatch log groups, often 100s or 1,000s. Your example Lambda function isn't currently usable, because it's not practical to have a separate Lambda function for every single log group.

The AWS recommended approach for CloudWatch log aggregation is to use a central S3 bucket, Firehose data stream(s) and CloudWatch Log Subscription Filters as detailed in this link.

I suggest that you update the Sentinel AWS S3 connector for CloudWatch to ingest logs delivered to S3 in this manner. An example log record from a Firehose data stream is:
{
  "messageType": "DATA_MESSAGE",
  "owner": "012345678901",
  "logGroup": "TestLogGroup",
  "logStream": "TestLogStream",
  "subscriptionFilters": [ "app-logs-to-s3-sentinel-dev" ],
  "logEvents": [
    {
      "id": "37793097787735394733383965267607845781987661504445874176",
      "timestamp": 1694701116545,
      "message": "{\"cat\":\"client\",\"outcome\":\"accepted\",\"client\":\"10.10.10.10\",\"gid\":\"sg-0c67f0fty50ehh6bn\",\"instance\":\"i-0115ccccf78956ddc\",\"timestamp\":\"2023-09-14T14:18:36.545938648Z\"}"
    }
  ]
}

The log format above, produced by Kinesis Data Firehose delivery streams for CloudWatch Log Subscription Filters, is consistent and predictable, so it is well suited to mapping values into a standard Sentinel data table.

In our implementation, we use Kinesis Data Firehose delivery streams from CloudWatch Log Subscription Filters, delivering to an S3 bucket for pre-processed CloudWatch data; this triggers a Lambda function that transforms the data and copies it to another S3 bucket, which is integrated with Sentinel via the AWS S3 CloudWatch connector.

An alternative architecture might be to use a data transformation Lambda within the Firehose delivery stream; however, I haven't tested that.

The Lambda transform function used in our implementation is based on your example, and extracts the fields from the logEvents list to populate the CloudWatch data table.
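As a rough sketch of that kind of transform (not our exact code): it assumes the pre-processed bucket triggers the Lambda via S3 event notifications, that each object contains newline-delimited, uncompressed JSON records in the subscription-filter format shown above, and that SENTINEL_BUCKET is a placeholder for the bucket monitored by the AWS S3 connector.

import json
import os
import urllib.parse

import boto3

s3 = boto3.client("s3")

# Placeholder: the bucket monitored by the Sentinel AWS S3 connector.
SENTINEL_BUCKET = os.environ["SENTINEL_BUCKET"]


def flatten(record):
    # Expand one subscription-filter record into per-event rows that keep
    # owner (AWS account ID), logGroup and logStream alongside timestamp/message.
    return [
        {
            "timestamp": event["timestamp"],
            "message": event["message"],
            "owner": record["owner"],
            "logGroup": record["logGroup"],
            "logStream": record["logStream"],
        }
        for event in record.get("logEvents", [])
    ]


def handler(event, context):
    # Triggered by S3 event notifications on the pre-processed bucket.
    for notification in event["Records"]:
        bucket = notification["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(notification["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")

        rows = []
        for line in body.splitlines():
            if not line.strip():
                continue
            record = json.loads(line)
            if record.get("messageType") == "DATA_MESSAGE":
                rows.extend(flatten(record))

        if rows:
            s3.put_object(
                Bucket=SENTINEL_BUCKET,
                Key=key,
                Body="\n".join(json.dumps(row) for row in rows).encode("utf-8"),
            )

Depending on how the delivery stream is configured, gzip handling and record framing would need to be adjusted accordingly.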

I'd like to see the Sentinel team modify the CloudWatch data table to add columns for owner (AWS account ID), logGroup, logStream, and optionally subscriptionFilters, and to ingest the logs from a Kinesis Firehose delivery stream rather than directly from CloudWatch Logs. This would provide a usable, practical, fully scalable solution for AWS customers who wish to use Sentinel as their SIEM across 100s or 1,000s of log groups. The benefits of this approach are that it can be used "out of the box" with no custom configuration on the Azure side, it's highly scalable, and it's straightforward to add new AWS accounts and new CloudWatch log groups.

I'd be happy to show you example infrastructure on a call, and I can provide a copy of our Lambda transform code if that would be helpful.

@v-sudkharat
Contributor

Hi @paulschwarzenberger, as this is an enhancement request, we are reaching out to the data collection team. Once we receive an update from them, we will let you know. Thanks!

@v-sudkharat
Contributor

Hi @paulschwarzenberger, could you please share your email address with us? Thanks!

@paulschwarzenberger
Author

No problem, it's paul@celidor.net

@v-sudkharat
Contributor

Hi @paulschwarzenberger, thanks for sharing your email address with us. We will pass it on to our data collection team so they can reach out to you for the information they require. Thanks!
As this is an enhancement request, we are closing this issue. If you still need support on it, feel free to re-open it at any time. Thank you for your co-operation.

@mjubb

mjubb commented Nov 3, 2023

Just want to +1 Paul's excellent description of the problem set and suitable solutions.

@mehmettaskiner

@v-sudkharat Just wondering if there are any updates on the Firehose solution suggested by @paulschwarzenberger.
