forked from opensearch-project/OpenSearch
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add Python module to implement Amazon Security Lake integration (#186)
* Migrate from #147
* Update amazon-security-lake integration
  - Improved documentation.
  - Python code has been moved to `wazuh-indexer/integrations/amazon-security-lake/src`.
  - Development environment now uses OpenSearch 2.12.0.
  - The `wazuh.integration.security.lake` container now displays logs by watching Logstash's log file.
  - [**NEEDS FIX**] As a temporary solution, the `INDEXER_USERNAME` and `INDEXER_PASSWORD` values have been added as environment variables to the `wazuh.integration.security.lake` container. These values should be set at Dockerfile level, but that isn't working, probably due to a permission-denied error when the `setup.sh` script is invoked.
  - [**NEEDS FIX**] As a temporary solution, the output file of the `indexer-to-file` pipeline has been moved to `/var/log/logstash/indexer-to-file`. The previous path, `/usr/share/logstash/pipeline/indexer-to-file.json`, results in permission denied.
  - [**NEEDS FIX**] As a temporary solution, the `input.opensearch.query` has been replaced with `match_all`, as the previous query does not return any data, probably due to the use of time filters (`gt: now-1m`).
  - Standard output enabled for `/usr/share/logstash/pipeline/indexer-to-file.json`.
  - [**NEEDS FIX**] ECS compatibility disabled: `echo "pipeline.ecs_compatibility: disabled" >> /etc/logstash/logstash.yml` (to be included automatically).
  - Python3 environment path added to the `indexer-to-integrator` pipeline.
* Disable ECS compatibility (auto)
  - Adds `pipeline.ecs_compatibility: disabled` at Dockerfile level.
  - Removes `INDEXER_USERNAME` and `INDEXER_PASSWORD` as environment variables on the `wazuh.integration.security.lake` container.
* Add @timestamp field to sample alerts
* Fix Logstash pipelines
* Add working indexer-to-s3 pipeline
* Add working Python script up to S3 upload
* Add latest changes
* Remove duplicated line
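The `input.opensearch.query` note in the commit message contrasts a time-filtered query with `match_all`. As a rough illustration only, the two request bodies can be written as Python dictionaries in OpenSearch query DSL. The helper names and the `@timestamp` field are assumptions made for this sketch; the commit message does not show the actual pipeline query.

```python
import json


def recent_events_query(field="@timestamp", window="now-1m"):
    """Range query keeping documents whose `field` is newer than `window`."""
    # `field` and `window` defaults are assumptions, not values from the pipeline.
    return {"query": {"range": {field: {"gt": window}}}}


def match_all_query():
    """Query that matches every document in the index."""
    return {"query": {"match_all": {}}}


if __name__ == "__main__":
    print(json.dumps(recent_events_query()))
    print(json.dumps(match_all_query()))
```

A range filter only returns documents that carry the filtered field with a recent enough value, so static sample data tends to fall outside a one-minute window, whereas `match_all` returns everything.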
Showing 20 changed files with 1,225 additions and 1,146 deletions.
@@ -1,3 +1,4 @@
 pyarrow>=10.0.1
 parquet-tools>=0.2.15
-pydantic==2.6.1
+pydantic==2.6.1
+boto3==1.34.46
This file was deleted.
@@ -0,0 +1 @@
+import parquet.parquet
File renamed without changes.
File renamed without changes.
@@ -0,0 +1,122 @@
+#!/env/bin/python3
+# vim: bkc=yes bk wb
+
+import sys
+import os
+import datetime
+import transform
+import pyarrow as pa
+import pyarrow.parquet as pq
+import logging
+import boto3
+from botocore.exceptions import ClientError
+
+# NOTE work in progress
+def upload_file(table, file_name, bucket, object_name=None):
+    """Upload a file to an S3 bucket
+    :param table: PyArrow table with events data
+    :param file_name: File to upload
+    :param bucket: Bucket to upload to
+    :param object_name: S3 object name. If not specified then file_name is used
+    :return: True if file was uploaded, else False
+    """
+
+    client = boto3.client(
+        service_name='s3',
+        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
+        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
+        region_name=os.environ['AWS_REGION'],
+        endpoint_url='http://s3.ninja:9000',
+    )
+
+    # If S3 object_name was not specified, use file_name
+    if object_name is None:
+        object_name = os.path.basename(file_name)
+
+    # Upload the file
+    try:
+        client.put_object(Bucket=bucket, Key=file_name, Body=open(file_name, 'rb'))
+    except ClientError as e:
+        logging.error(e)
+        return False
+    return True
+
+
+def main():
+    '''Main function'''
+    # Get current timestamp
+    timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
+
+    # Generate filenames
+    filename_raw = f"/tmp/integrator-raw-{timestamp}.json"
+    filename_ocsf = f"/tmp/integrator-ocsf-{timestamp}.json"
+    filename_parquet = f"/tmp/integrator-ocsf-{timestamp}.parquet"
+
+    # 1. Extract data
+    # ================
+    raw_data = []
+    for line in sys.stdin:
+        raw_data.append(line)
+
+        # Echo piped data
+        with open(filename_raw, "a") as fd:
+            fd.write(line)
+
+    # 2. Transform data
+    # ================
+    # a. Transform to OCSF
+    ocsf_data = []
+    for line in raw_data:
+        try:
+            event = transform.converter.from_json(line)
+            ocsf_event = transform.converter.to_detection_finding(event)
+            ocsf_data.append(ocsf_event.model_dump())
+
+            # Temporal disk storage
+            with open(filename_ocsf, "a") as fd:
+                fd.write(str(ocsf_event) + "\n")
+        except AttributeError as e:
+            print("Error transforming line to OCSF")
+            print(event)
+            print(e)
+
+    # b. Encode as Parquet
+    try:
+        table = pa.Table.from_pylist(ocsf_data)
+        pq.write_table(table, filename_parquet)
+    except AttributeError as e:
+        print("Error encoding data to parquet")
+        print(e)
+
+    # 3. Load data (upload to S3)
+    # ================
+    if upload_file(table, filename_parquet, os.environ['AWS_BUCKET']):
+        # Remove /tmp files
+        pass
+
+
+def _test():
+    ocsf_event = {}
+    with open("./wazuh-event.sample.json", "r") as fd:
+        # Load from file descriptor
+        for raw_event in fd:
+            try:
+                event = transform.converter.from_json(raw_event)
+                print("")
+                print("-- Wazuh event Pydantic model")
+                print("")
+                print(event.model_dump())
+                ocsf_event = transform.converter.to_detection_finding(event)
+                print("")
+                print("-- Converted to OCSF")
+                print("")
+                print(ocsf_event.model_dump())
+
+            except KeyError as e:
+                raise (e)
+
+
+if __name__ == '__main__':
+    main()
+    # _test()
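In the script above, `upload_file` computes `object_name` but passes `file_name` as the S3 key, and the file handle opened for `Body` is never closed. One possible revision is sketched here; it is not the project's implementation. It assumes the caller passes in an already-built client exposing `put_object` (as a boto3 S3 client does), and it catches a broad `Exception` only to keep the sketch free of third-party imports, whereas the committed code catches `botocore.exceptions.ClientError`.

```python
import logging
import os


def resolve_object_name(file_name, object_name=None):
    """Return the S3 key: the explicit name, else the file's base name."""
    if object_name is not None:
        return object_name
    return os.path.basename(file_name)


def upload_file(client, file_name, bucket, object_name=None):
    """Upload `file_name` to `bucket`; return True on success, else False.

    `client` is assumed to be an S3 client (for example from boto3.client('s3')).
    """
    key = resolve_object_name(file_name, object_name)
    try:
        # The context manager closes the file even if the upload fails.
        with open(file_name, "rb") as body:
            client.put_object(Bucket=bucket, Key=key, Body=body)
    except Exception as error:  # assumption: narrow to ClientError in real code
        logging.error(error)
        return False
    return True
```

Passing the client in, rather than building it inside the function, also makes the helper easy to exercise with a stand-in object during development.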
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -0,0 +1 @@
+{"cluster":{"name":"wazuh-cluster","node":"wazuh-manager"},"agent":{"id":"003","ip":"10.0.0.180","name":"ip-10-0-0-180.us-west-1.compute.internal"},"@timestamp":"2024-03-14T12:57:05.730Z","data":{"audit":{"exe":"/usr/sbin/sshd","type":"NORMAL","cwd":"/home/wazuh","file":{"name":"/var/sample"},"success":"yes","command":"ssh"}},"@version":"1","manager":{"name":"wazuh-manager"},"location":"","decoder":{},"id":"1580123327.49031","predecoder":{},"timestamp":"2024-03-14T12:57:05.730+0000","rule":{"description":"Audit: Command: /usr/sbin/ssh","firedtimes":3,"level":3,"id":"80791","mail":false,"groups":["audit","audit_command"],"gdpr":["IV_30.1.g"]}}
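The sample event above is a single JSON document on one line, which matches how `main()` consumes events line by line from standard input. The sketch below reads such lines and pulls out a few fields. The helper names and the flat summary layout are hypothetical, chosen for illustration; this is not the OCSF Detection Finding mapping performed by `transform.converter`, which is not part of this listing.

```python
import io
import json
from datetime import datetime, timezone


def read_events(stream):
    """Yield one parsed event per non-empty line of `stream`."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def summarize(event):
    """Pick a few fields present in the sample event (hypothetical layout)."""
    stamp = datetime.strptime(event["@timestamp"], "%Y-%m-%dT%H:%M:%S.%fZ")
    rule = event["rule"]
    return {
        "time": stamp.replace(tzinfo=timezone.utc),
        "agent": event["agent"]["name"],
        "rule_id": rule["id"],
        "level": rule["level"],
        "groups": rule["groups"],
    }


if __name__ == "__main__":
    # A trimmed copy of the sample event, kept to the fields used here.
    sample = io.StringIO(
        '{"@timestamp": "2024-03-14T12:57:05.730Z", '
        '"agent": {"name": "ip-10-0-0-180.us-west-1.compute.internal"}, '
        '"rule": {"id": "80791", "level": 3, "groups": ["audit", "audit_command"]}}\n'
    )
    for event in read_events(sample):
        print(summarize(event))
```

Skipping blank lines keeps a trailing newline in the piped input from raising a JSON decoding error.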