This project automates dashboard generation for data visualization and analysis. It uses AWS services to build and refresh dashboards from datasets uploaded to an S3 bucket.
The core functionality of this project lies in its ability to automatically create and update AWS QuickSight dashboards when new CSV files are uploaded to a designated S3 bucket. This automation significantly reduces the manual effort required in dashboard creation and maintenance, enabling more efficient data analysis and decision-making.
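As a rough illustration of the upload-to-dashboard trigger, the sketch below pulls the uploaded CSV objects out of an S3 event notification and builds a QuickSight manifest for them. The event shape and the manifest keys (`fileLocations`, `globalUploadSettings`) follow the public AWS formats; the function names `csv_uris` and `build_manifest` are hypothetical and not taken from this project.

```python
import json
from urllib.parse import unquote_plus


def csv_uris(event: dict) -> list:
    """Return s3:// URIs for every CSV object named in an S3 event notification."""
    uris = []
    for record in event.get("Records", []):
        s3 = record.get("s3", {})
        bucket = s3.get("bucket", {}).get("name")
        # Object keys arrive URL-encoded in S3 events (spaces become '+').
        key = unquote_plus(s3.get("object", {}).get("key", ""))
        if bucket and key.lower().endswith(".csv"):
            uris.append(f"s3://{bucket}/{key}")
    return uris


def build_manifest(uris: list) -> str:
    """Serialize a QuickSight manifest that points at the given CSV files."""
    manifest = {
        "fileLocations": [{"URIs": uris}],
        "globalUploadSettings": {
            "format": "CSV",
            "delimiter": ",",
            "containsHeader": "true",  # assumption: uploaded files have a header row
        },
    }
    return json.dumps(manifest, indent=2)
```

A Lambda handler could call `build_manifest(csv_uris(event))` and write the result back to S3 for QuickSight to read.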
- Automatic Updates: Dashboards are created or updated automatically as new data is uploaded to S3.
- Flexibility: Supports various data formats, with a primary focus on CSV files.
- Course of Action (COA) Analysis: Enables easy comparison of different scenarios or strategies.
- What-If Analysis: Facilitates exploration of potential outcomes based on varying input parameters.
- AWS Infrastructure: Utilizes robust AWS services for reliable and scalable operations.
- Efficient Data Processing: Leverages AWS QuickSight's optimized data handling capabilities.
- Uses AWS S3 notifications to trigger Lambda functions.
- Suffix keys (file names with extensions) are used to limit notifications to the exact files of interest, reducing unnecessary API calls.
- Routes messages (S3 alerts) to AWS Step Functions or AWS SQS, depending on processing requirements.
- Creates and uploads a manifest file to AWS S3 for QuickSight to read.
- Throttling Management: APIs are throttled at 5 transactions per second (TPS) per user principal and 25 TPS per account. SQS helps manage these limits by controlling the rate of API calls.
- Queue Notifications: Receives notifications after a QuickSight manifest is created.
- Dataset Processing: Dataset processing happens in this stage.
- Message Processing: Lambda trigger #2 processes these messages one by one.
- State Management: Manages state and stores the dataset lifecycle, from creation to dashboard generation.
- Event-Driven Processing: Uses state changes and events in DynamoDB to trigger specialized processing by AWS Lambda.
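The event-driven step above can be sketched as a small dispatcher over DynamoDB Streams records. The record layout (`Records`, `eventName`, `dynamodb.NewImage`, typed attributes such as `{"S": ...}`) is the standard stream format; the attribute name `event_type` and the handler registry are assumptions made for illustration, since the project's table schema is not shown here.

```python
from typing import Callable, Dict, List

# Hypothetical attribute that carries the event name (e.g. DATASETS_SYNCED).
EVENT_ATTRIBUTE = "event_type"

Handler = Callable[[dict], str]


def unwrap(image: dict) -> dict:
    """Flatten DynamoDB-typed attributes ({"S": "x"}, {"N": "1"}) into plain values."""
    plain = {}
    for name, typed in image.items():
        (type_tag, value), = typed.items()
        if type_tag == "N":
            # Numbers are transmitted as strings in stream records.
            value = float(value) if "." in value else int(value)
        plain[name] = value
    return plain


def dispatch(event: dict, handlers: Dict[str, Handler]) -> List[str]:
    """Route each inserted or modified item to the handler registered for its event."""
    results = []
    for record in event.get("Records", []):
        if record.get("eventName") not in ("INSERT", "MODIFY"):
            continue  # deletions carry no new state to act on
        # NewImage is only present when the stream view type includes new images.
        image = record.get("dynamodb", {}).get("NewImage")
        if not image:
            continue
        item = unwrap(image)
        handler = handlers.get(item.get(EVENT_ATTRIBUTE))
        if handler:
            results.append(handler(item))
    return results
```

Keeping the routing table as a plain dictionary makes it straightforward to add one specialized Lambda code path per state change without touching the parsing logic.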
Lambda Functions
DynamoDB/DynamoDB Streams
- Monitors `aggregated_embark.csv` for row count differences.
- Calls the data refresh API to update the dataset for the final dashboard once the row count is stable after successive aggregations.
- Dataset Syncing: Listens for the `DATASETS_SYNCED` event.
- Dynamic Dashboard Creation: Creates dashboards from the dashboard definition file.
- Sheet and Visual Filters: Filters dashboard sheets and visuals based on actual datasets in the assessment.
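One way to picture the sheet and visual filtering is the sketch below, which prunes a dashboard definition down to the datasets that actually exist. It assumes the definition follows the QuickSight layout (`Sheets`, `Visuals`, `DataSetIdentifierDeclarations`, and nested `DataSetIdentifier` references); the function names are hypothetical, and the project's real filtering rules may differ.

```python
from copy import deepcopy
from typing import Any, Iterable, Set


def referenced_datasets(node: Any) -> Set[str]:
    """Collect every DataSetIdentifier value found anywhere inside a nested structure."""
    found = set()
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "DataSetIdentifier" and isinstance(value, str):
                found.add(value)
            else:
                found |= referenced_datasets(value)
    elif isinstance(node, list):
        for child in node:
            found |= referenced_datasets(child)
    return found


def filter_definition(definition: dict, available: Iterable[str]) -> dict:
    """Return a copy of the definition without visuals that need a missing dataset."""
    available = set(available)
    result = deepcopy(definition)  # leave the caller's definition untouched
    kept_sheets = []
    for sheet in result.get("Sheets", []):
        # A visual survives only if all datasets it references are available.
        sheet["Visuals"] = [
            visual for visual in sheet.get("Visuals", [])
            if referenced_datasets(visual) <= available
        ]
        if sheet["Visuals"]:  # drop sheets left with no visuals
            kept_sheets.append(sheet)
    result["Sheets"] = kept_sheets
    result["DataSetIdentifierDeclarations"] = [
        decl for decl in result.get("DataSetIdentifierDeclarations", [])
        if decl.get("Identifier") in available
    ]
    return result
```

Note that this sketch also drops sheets that contain no visuals at all, and it ignores definition-level filter groups and parameters, which a complete implementation would likely need to prune as well.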
(Lambdas #5-#15)
- Preprocessing: Manages dataset preprocessing for aggregations and joins.
- Aggregation and Joining: Performs joins and aggregates all Class of Supply Embark files.
- Sync Verification: Checks files in AWS S3 against datasets in DynamoDB for sync status.
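The sync verification step can be illustrated with a pure comparison of the two listings. This sketch assumes each dataset is registered in DynamoDB under the file name without its extension, which is a guess made for illustration; fetching the S3 keys and DynamoDB items is left to the caller.

```python
from pathlib import PurePosixPath
from typing import Iterable


def sync_status(s3_keys: Iterable[str], dataset_names: Iterable[str]) -> dict:
    """Compare CSV files found in S3 with the dataset names recorded in DynamoDB."""
    # Assumption: a dataset is named after its CSV file, minus path and extension.
    expected = {
        PurePosixPath(key).stem
        for key in s3_keys
        if key.lower().endswith(".csv")
    }
    registered = set(dataset_names)
    return {
        "missing_datasets": sorted(expected - registered),   # in S3, not yet in DynamoDB
        "orphaned_datasets": sorted(registered - expected),  # in DynamoDB, file gone from S3
        "synced": expected == registered,
    }
```

For example, `sync_status(["embark/aggregated_embark.csv"], ["aggregated_embark"])` reports the two sides as synced, and only then would a `DATASETS_SYNCED` event be appropriate to emit.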