Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature: add support to clarify.py for time series explainability jobs #4503

Merged
merged 65 commits into from
Mar 21, 2024

Conversation

rvasahu-amazon
Copy link
Contributor

@rvasahu-amazon rvasahu-amazon commented Mar 14, 2024

A successor to this PR.

Description of changes:

  • Added TimeSeriesDataConfig, TimeSeriesModelConfig, and AsymmetricShapleyValueConfig (child of ExplainabilityConfig) classes. Added unit tests for these new components.
  • Modified DataConfig, and ModelConfig to take TimeSeriesDataConfig and TimeSeriesModelConfig respectively as parameters. Added unit tests for these changes.
  • Modified _AnalysisConfigGenerator to support the modified DataConfig and ModelConfig plus the new AsymmetricShapleyValueConfig. Added unit tests for these changes.

Testing done:

Clarify unit test results:

============================================= 145 passed in 1.02s ==============================================

Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

  • I have read the CONTRIBUTING doc
  • I certify that the changes I am introducing will be backward compatible, and I have discussed concerns about this, if any, with the Python SDK team
  • I used the commit message format described in CONTRIBUTING
  • I have passed the region in to all S3 and STS clients that I've initialized as part of this change.
  • I have updated any necessary documentation, including READMEs and API docs (if appropriate)

Tests

  • I have added tests that prove my fix is effective or that my feature works (if appropriate)
  • I have added unit and/or integration tests as appropriate to ensure backward compatibility of the changes
  • I have checked that my tests are not configured for a specific region or account (if appropriate)
  • I have used unique_name_from_base to create resource names in integ tests (if appropriate)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

rvasahu-amazon and others added 30 commits October 18, 2023 15:51
…Clarify. These components are TimeSeriesDataConfig, TimeSeriesModelConfig, and AsymmetricSHAPConfig, alomg with unit tests for them.

fix: Modified DataConfig, ModelConfig, and _AnalysisConfigGenerator to support the new components and time series explainability.
documentation: added docstrings for the new components and their tests.
change: renamed TimeSeriesDataConfig.predictor_config to
time_series_model_config

change: modified default value of forecast_horizon to 1
change: removed default value for num_samples

change: changed default value for explanation_type

change: added more explicit type hint for explanation_type

documentation: added more information for parameters

change: reworked unit tests for AsymmetricSHAPConfig
change: _merge_explainability_configs reordered to put validation first

change: unit tests reworked
change: removed now-redundant check in time_series_case
change: updated ``TimeSeriesDataConfig`` unit tests to reflect above
change
…_shap config

change: add unit tests for _AnalysisConfigGenerator.explainability

fix: slightly modify AsymmetricSHAPConfig mock builder function

documentation: minor docstring update in tests
change: add a check to _AnalysisConfiGenerator.bias_and_explainability
to prevent time series components being provided, added unit tests for
this
documentation: reword `target_time_series` parameter description

documentation: remove TODOs
…d ``granularity``

update tests and documentation accordingly
@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 6d1312f
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: a8ad1d1
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@aws-byeldos
Copy link

let's improve the validation, so it checks if baseline is S3 uri

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: b535f53
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 222fb75
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: ba8ac77
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: ff58181
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 6274f5f
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 6274f5f
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mufaddal-rohawala
Copy link
Member

AWS CodeBuild CI Report

  • CodeBuild project: sagemaker-python-sdk-pr
  • Commit ID: 89e6100
  • Result: FAILED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@benieric benieric self-assigned this Mar 21, 2024
@benieric benieric merged commit ece48d4 into aws:master Mar 21, 2024
7 of 8 checks passed
malav-shastri pushed a commit to malav-shastri/sagemaker-python-sdk that referenced this pull request Jun 20, 2024
… jobs (aws#4503)

* feature: Added components to support time series explainability with Clarify. These components are TimeSeriesDataConfig, TimeSeriesModelConfig, and AsymmetricSHAPConfig, alomg with unit tests for them.
fix: Modified DataConfig, ModelConfig, and _AnalysisConfigGenerator to support the new components and time series explainability.
documentation: added docstrings for the new components and their tests.

* fix: removed field use_future_covariates and related unit tests from TimeSeriesModelConfig

* change: rename ``TimeSeriesDataConfig.analysis_config`` to ``time_series_data_config``

* change: validate DataConfig content_type and accept_type for TS exp.

change: renamed TimeSeriesDataConfig.predictor_config to
time_series_model_config

change: modified default value of forecast_horizon to 1

* change: reworked validation in AsymmetricSHAPConfig

change: removed default value for num_samples

change: changed default value for explanation_type

change: added more explicit type hint for explanation_type

documentation: added more information for parameters

change: reworked unit tests for AsymmetricSHAPConfig

* change: time series case no longer uses _merge_explainability_configs

change: _merge_explainability_configs reordered to put validation first

change: unit tests reworked

* fix: minor style changes to meet formatting reqs

* change: modified how time_series_case flag is set

change: removed now-redundant check in time_series_case

* fix: set time_series_case to False to prevent exception

* change: params for ``TimeSeriesDataConfig`` now must all be same type

change: updated ``TimeSeriesDataConfig`` unit tests to reflect above
change

* fix: schema entries for related_ts and item_metadata to keep list items same type

* change: remove forecast_horizon from TimeSeriesModelConfig

* fix: modified type hints in ``TimeSeriesDataConfig`` to match schema

* change: add errors when ts data or model config are given but no asym_shap config

change: add unit tests for _AnalysisConfigGenerator.explainability

fix: slightly modify AsymmetricSHAPConfig mock builder function

documentation: minor docstring update in tests

* change: remove flag ``time_series_case``, modify unit tests accordingly

change: add a check to _AnalysisConfiGenerator.bias_and_explainability
to prevent time series components being provided, added unit tests for
this

* change: rename `AsymmetricSHAPConfig` to `AsymmetricShapleyValueConfig`

* documentation: add description to ``AsymmetricShapleyValueConfig``

documentation: reword `target_time_series` parameter description

documentation: remove TODOs

* change: split ``explanation_type`` into ``explanation_ direction`` and ``granularity``

update tests and documentation accordingly

* fix: rename ``explanation_granularity`` to ``granularity``

* change: rename ``explanation_direction`` to ``direction``

* change: rename ``item_metadata`` to ``static_covariates``

change: add ``dataset_format`` as a parameter for time series cases

change: allow features jmespaths to be none for time series cases

change: add validation to prevent non-json dataset formats for time
series cases

test: update unit tests to reflect above changes

* fix: update clarify files to meet formatting reqs

* change: require headers for time series explainability

* feat: add (early version of) baseline config to asym shap val config

* refactor: change baseline config param names to keep with convention and be more cx friendly

* refactor: baseline config from list to dictionary where key is item_id value

* fix: set dataset_uri from s3_data_input_path

* fix: undo previous bug fix

what i believed was a bug was actually intended behaviour

* feat: add ``ITEM_RECORDS`` as a supported dataset format

change: remove ``headers`` as a requirement for time series

doc: add example dataset formats to ``TimeSeriesJSONDatasetFormat``

* doc: make docs for TimeSeriesJSONDatasetFormat sphinx-compliant

change: add ``item_records`` as a supported format to the schema

doc: add documentation for baseline

doc: remove references to deprecated ``forecast_horizon``

* feat: validation for asymmetric shapley value config baseline

doc: fix baseline doc to be sphinx-compliant

* feat: add validation for static covariates in tsx baseline

* fix: add call to validate baseline static covariates method

* fix: check if baseline dict is s3 uri in scv validation function

* fix: replace all added asserts with ValueError

---------

Co-authored-by: Mufaddal Rohawala <89424143+mufaddal-rohawala@users.noreply.github.com>
jiapinw pushed a commit to jiapinw/sagemaker-python-sdk that referenced this pull request Jun 25, 2024
… jobs (aws#4503)

* feature: Added components to support time series explainability with Clarify. These components are TimeSeriesDataConfig, TimeSeriesModelConfig, and AsymmetricSHAPConfig, alomg with unit tests for them.
fix: Modified DataConfig, ModelConfig, and _AnalysisConfigGenerator to support the new components and time series explainability.
documentation: added docstrings for the new components and their tests.

* fix: removed field use_future_covariates and related unit tests from TimeSeriesModelConfig

* change: rename ``TimeSeriesDataConfig.analysis_config`` to ``time_series_data_config``

* change: validate DataConfig content_type and accept_type for TS exp.

change: renamed TimeSeriesDataConfig.predictor_config to
time_series_model_config

change: modified default value of forecast_horizon to 1

* change: reworked validation in AsymmetricSHAPConfig

change: removed default value for num_samples

change: changed default value for explanation_type

change: added more explicit type hint for explanation_type

documentation: added more information for parameters

change: reworked unit tests for AsymmetricSHAPConfig

* change: time series case no longer uses _merge_explainability_configs

change: _merge_explainability_configs reordered to put validation first

change: unit tests reworked

* fix: minor style changes to meet formatting reqs

* change: modified how time_series_case flag is set

change: removed now-redundant check in time_series_case

* fix: set time_series_case to False to prevent exception

* change: params for ``TimeSeriesDataConfig`` now must all be same type

change: updated ``TimeSeriesDataConfig`` unit tests to reflect above
change

* fix: schema entries for related_ts and item_metadata to keep list items same type

* change: remove forecast_horizon from TimeSeriesModelConfig

* fix: modified type hints in ``TimeSeriesDataConfig`` to match schema

* change: add errors when ts data or model config are given but no asym_shap config

change: add unit tests for _AnalysisConfigGenerator.explainability

fix: slightly modify AsymmetricSHAPConfig mock builder function

documentation: minor docstring update in tests

* change: remove flag ``time_series_case``, modify unit tests accordingly

change: add a check to _AnalysisConfiGenerator.bias_and_explainability
to prevent time series components being provided, added unit tests for
this

* change: rename `AsymmetricSHAPConfig` to `AsymmetricShapleyValueConfig`

* documentation: add description to ``AsymmetricShapleyValueConfig``

documentation: reword `target_time_series` parameter description

documentation: remove TODOs

* change: split ``explanation_type`` into ``explanation_ direction`` and ``granularity``

update tests and documentation accordingly

* fix: rename ``explanation_granularity`` to ``granularity``

* change: rename ``explanation_direction`` to ``direction``

* change: rename ``item_metadata`` to ``static_covariates``

change: add ``dataset_format`` as a parameter for time series cases

change: allow features jmespaths to be none for time series cases

change: add validation to prevent non-json dataset formats for time
series cases

test: update unit tests to reflect above changes

* fix: update clarify files to meet formatting reqs

* change: require headers for time series explainability

* feat: add (early version of) baseline config to asym shap val config

* refactor: change baseline config param names to keep with convention and be more cx friendly

* refactor: baseline config from list to dictionary where key is item_id value

* fix: set dataset_uri from s3_data_input_path

* fix: undo previous bug fix

what i believed was a bug was actually intended behaviour

* feat: add ``ITEM_RECORDS`` as a supported dataset format

change: remove ``headers`` as a requirement for time series

doc: add example dataset formats to ``TimeSeriesJSONDatasetFormat``

* doc: make docs for TimeSeriesJSONDatasetFormat sphinx-compliant

change: add ``item_records`` as a supported format to the schema

doc: add documentation for baseline

doc: remove references to deprecated ``forecast_horizon``

* feat: validation for asymmetric shapley value config baseline

doc: fix baseline doc to be sphinx-compliant

* feat: add validation for static covariates in tsx baseline

* fix: add call to validate baseline static covariates method

* fix: check if baseline dict is s3 uri in scv validation function

* fix: replace all added asserts with ValueError

---------

Co-authored-by: Mufaddal Rohawala <89424143+mufaddal-rohawala@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants