sagemaker.core.model_monitor.clarify_model_monitoring#

This module contains code related to Amazon SageMaker Explainability AI Model Monitoring.

These classes assist with suggesting baselines and creating monitoring schedules for monitoring bias metrics and feature attribution of SageMaker Endpoints.

Classes

BiasAnalysisConfig(bias_config[, headers, label])

Analysis configuration for ModelBiasMonitor.

ClarifyBaseliningConfig(analysis_config[, ...])

Data class that holds essential analysis configuration for a ClarifyBaseliningJob.

ClarifyBaseliningJob(processing_job)

Provides functionality to retrieve baseline-specific output from Clarify baselining job.

ClarifyModelMonitor([role, instance_count, ...])

Base class of Amazon SageMaker Explainability API model monitors.

ClarifyMonitoringExecution(...[, output_kms_key])

Provides functionality to retrieve monitoring-specific files output from executions.

ExplainabilityAnalysisConfig(...[, headers, ...])

Analysis configuration for ModelExplainabilityMonitor.

ModelBiasMonitor([role, instance_count, ...])

Amazon SageMaker model monitor to monitor bias metrics of an endpoint.

ModelExplainabilityMonitor([role, ...])

Amazon SageMaker model monitor to monitor feature attribution of an endpoint.

class sagemaker.core.model_monitor.clarify_model_monitoring.BiasAnalysisConfig(bias_config, headers=None, label=None)[source]#

Bases: object

Analysis configuration for ModelBiasMonitor.

class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyBaseliningConfig(analysis_config, features_attribute=None, inference_attribute=None, probability_attribute=None, probability_threshold_attribute=None)[source]#

Bases: object

Data class that holds essential analysis configuration for a ClarifyBaseliningJob.

class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyBaseliningJob(processing_job)[source]#

Bases: BaseliningJob

Provides functionality to retrieve baseline-specific output from Clarify baselining job.

baseline_statistics(**_)[source]#

Not implemented.

The class doesn’t support statistics.

Raises:

NotImplementedError

suggested_constraints(file_name=None, kms_key=None)[source]#

Returns a sagemaker.model_monitor.Constraints object representing the constraints JSON file generated by this baselining job.

Parameters:
  • file_name (str) – Ignored. Kept only to match the method signature of the superclass.

  • kms_key (str) – The KMS key to use when retrieving the file.

Returns:

The Constraints object representing the file that was generated by the job.

Return type:

sagemaker.model_monitor.Constraints

Raises:

UnexpectedStatusException – Raised if the job is not in a ‘Complete’ state.

class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyModelMonitor(role=None, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)[source]#

Bases: ModelMonitor

Base class of Amazon SageMaker Explainability API model monitors.

This class is an abstract base class. To monitor bias metrics or feature attribution of an endpoint, instantiate one of its subclasses.

get_latest_execution_logs(wait=False)[source]#

Gets the processing job logs for the most recent monitoring execution.

Parameters:

wait (bool) – Whether the call should wait until the job completes (default: False).

Raises:

ValueError – If no execution job or processing job for the last execution has run.

Returns: None

latest_monitoring_statistics(**_)[source]#

Not implemented.

The class doesn’t support statistics.

Raises:

NotImplementedError
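
Code that treats several monitor types uniformly may want to tolerate this exception rather than let it propagate. The following is a minimal sketch that assumes only the behaviour documented above; `try_latest_statistics` is a hypothetical helper and not part of the SDK.

```python
from typing import Any, Optional


def try_latest_statistics(monitor: Any) -> Optional[Any]:
    """Return the monitor's latest statistics, or None if it has none.

    Hypothetical helper: Clarify monitors raise NotImplementedError from
    latest_monitoring_statistics(), so the call is wrapped accordingly.
    """
    try:
        return monitor.latest_monitoring_statistics()
    except NotImplementedError:
        # Clarify-based monitors do not support statistics.
        return None
```

With a ModelBiasMonitor or ModelExplainabilityMonitor this helper would return None; other monitor types would pass their result through unchanged.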

list_executions()[source]#

Get the list of the latest monitoring executions in descending order of “ScheduledTime”.

Returns:

List of ClarifyMonitoringExecution in descending order of “ScheduledTime”.

Return type:

[sagemaker.model_monitor.ClarifyMonitoringExecution]
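
Because the list is sorted newest first, the most recent execution is simply the first element. A small sketch of that idea follows; `latest_execution` is a hypothetical helper name, and `monitor` stands for any object exposing list_executions() as documented above.

```python
from typing import Any, Optional


def latest_execution(monitor: Any) -> Optional[Any]:
    """Return the most recently scheduled execution, or None if there are none.

    Relies on the documented ordering: list_executions() returns executions
    in descending order of "ScheduledTime", so the first item is the newest.
    """
    executions = monitor.list_executions()
    return executions[0] if executions else None
```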

run_baseline(**_)[source]#

Not implemented.

‘.run_baseline()’ is only allowed for ModelMonitor objects. Please use suggest_baseline instead.

Raises:

NotImplementedError

class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyMonitoringExecution(sagemaker_session, job_name, inputs, output, output_kms_key=None)[source]#

Bases: MonitoringExecution

Provides functionality to retrieve monitoring-specific files output from executions.

model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

statistics(**_)[source]#

Not implemented.

The class doesn’t support statistics.

Raises:

NotImplementedError

class sagemaker.core.model_monitor.clarify_model_monitoring.ExplainabilityAnalysisConfig(explainability_config, model_config, headers=None, label_headers=None)[source]#

Bases: object

Analysis configuration for ModelExplainabilityMonitor.

class sagemaker.core.model_monitor.clarify_model_monitoring.ModelBiasMonitor(role=None, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)[source]#

Bases: ClarifyModelMonitor

Amazon SageMaker model monitor to monitor bias metrics of an endpoint.

Please see the __init__ method of its base class for how to instantiate it.

JOB_DEFINITION_BASE_NAME = 'model-bias-job-definition'#

classmethod attach(monitor_schedule_name, sagemaker_session=None)[source]#

Sets this object’s schedule name to the name provided.

This allows subsequent describe_schedule or list_executions calls to point to the given schedule.

Parameters:
  • monitor_schedule_name (str) – The name of the schedule to attach to.

  • sagemaker_session (sagemaker.core.helper.session_helper.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
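
The attach-then-query pattern described above might look like the sketch below. It assumes that attach() returns the attached monitor instance, which this reference does not state explicitly; `executions_for_schedule` and the schedule name in the usage comment are hypothetical.

```python
from typing import Any, List, Type


def executions_for_schedule(monitor_cls: Type[Any], schedule_name: str) -> List[Any]:
    """Attach to an existing schedule by name and list its executions.

    Assumption: attach() returns the attached monitor instance; the
    reference above does not state the return value explicitly.
    """
    monitor = monitor_cls.attach(monitor_schedule_name=schedule_name)
    return monitor.list_executions()


# Usage sketch (needs AWS credentials; "my-bias-schedule" is a placeholder):
# from sagemaker.core.model_monitor.clarify_model_monitoring import ModelBiasMonitor
# executions = executions_for_schedule(ModelBiasMonitor, "my-bias-schedule")
```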

create_monitoring_schedule(endpoint_input=None, ground_truth_input=None, analysis_config=None, output_s3_uri=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#

Creates a monitoring schedule.

Parameters:
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput. (default: None)

  • ground_truth_input (str) – S3 URI to ground truth dataset. (default: None)

  • analysis_config (str or BiasAnalysisConfig) – URI to the analysis_config for the bias job, or a BiasAnalysisConfig object. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails. (default: None)

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output” (default: None)

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file. (default: None)

  • monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp. (default: None)

  • schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily. (default: None)

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs. (default: True)

  • batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform (default: None)

  • data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)

  • data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
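
The data analysis window is given as an offset from the current time, such as “-PT1H” for one hour ago. The sketch below builds keyword arguments for a one-time run over a recent window. The helper names are hypothetical, and two details are assumptions rather than facts taken from this page: that the string "NOW" is the schedule expression for a one-time run, and that "-PT0H" is accepted as an end time meaning the present moment.

```python
from typing import Any, Dict


def hours_ago(hours: int) -> str:
    """Format an offset of whole hours before now, e.g. 1 -> "-PT1H"."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return f"-PT{hours}H"


def one_time_bias_schedule_kwargs(
    endpoint_name: str, ground_truth_s3_uri: str, window_hours: int
) -> Dict[str, Any]:
    """Build keyword arguments for a one-time run over the last `window_hours`."""
    return {
        "endpoint_input": endpoint_name,
        "ground_truth_input": ground_truth_s3_uri,
        # Assumption: "NOW" is the expression that requests a one-time schedule.
        "schedule_cron_expression": "NOW",
        "data_analysis_start_time": hours_ago(window_hours),
        # Assumption: a zero-hour offset denotes the current time.
        "data_analysis_end_time": hours_ago(0),
        "enable_cloudwatch_metrics": True,
    }


# Usage sketch (monitor is an already constructed ModelBiasMonitor):
# monitor.create_monitoring_schedule(
#     **one_time_bias_schedule_kwargs("my-endpoint", "s3://my-bucket/ground-truth", 1)
# )
```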

delete_monitoring_schedule()[source]#

Deletes the monitoring schedule and its job definition.

classmethod monitoring_type()[source]#

Type of the monitoring job.

suggest_baseline(data_config, bias_config, model_config, model_predicted_label_config=None, wait=False, logs=False, job_name=None, kms_key=None)[source]#

Suggests baselines for use with Amazon SageMaker Model Monitoring Schedules.

Parameters:
  • data_config (DataConfig) – Config of the input/output data.

  • bias_config (BiasConfig) – Config of sensitive groups.

  • model_config (ModelConfig) – Config of the model and its endpoint to be created.

  • model_predicted_label_config (ModelPredictedLabelConfig) – Config of how to extract the predicted label from the model output.

  • wait (bool) – Whether the call should wait until the job completes (default: False).

  • logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: False).

  • job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • kms_key (str) – The ARN of the KMS key that is used to encrypt the user code file (default: None).

Returns:

The ProcessingJob object representing the baselining job.

Return type:

sagemaker.processing.ProcessingJob
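
A typical sequence is to run a baselining job and then create a schedule that reuses its configuration. The sketch below assumes `monitor` is an already constructed ModelBiasMonitor and that the config objects were built elsewhere; `baseline_then_schedule` is a hypothetical helper, not an SDK function.

```python
from typing import Any


def baseline_then_schedule(
    monitor: Any,
    data_config: Any,
    bias_config: Any,
    model_config: Any,
    endpoint_name: str,
    ground_truth_s3_uri: str,
) -> Any:
    """Run a bias baselining job, then schedule monitoring that reuses its config."""
    # wait=True blocks until the baselining job completes; logs=True is only
    # meaningful together with wait=True.
    job = monitor.suggest_baseline(
        data_config=data_config,
        bias_config=bias_config,
        model_config=model_config,
        wait=True,
        logs=True,
    )
    # analysis_config is left as None so that the configuration of the
    # latest baselining job is reused.
    monitor.create_monitoring_schedule(
        endpoint_input=endpoint_name,
        ground_truth_input=ground_truth_s3_uri,
    )
    return job
```

Leaving analysis_config unset keeps the schedule consistent with the baseline; pass an explicit URI or BiasAnalysisConfig only when the monitoring configuration should differ from it.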

update_monitoring_schedule(endpoint_input=None, ground_truth_input=None, analysis_config=None, output_s3_uri=None, constraints=None, schedule_cron_expression=None, enable_cloudwatch_metrics=None, role=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#

Updates the existing monitoring schedule.

If more options than schedule_cron_expression are to be updated, a new job definition will be created to hold them. The old job definition will not be deleted.
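
Read literally, the rule above means that only an update limited to schedule_cron_expression avoids a new job definition. The sketch below encodes that reading. `creates_new_job_definition` is a hypothetical helper, and whether every single option (for example data_analysis_start_time) really triggers a new job definition is an assumption not confirmed by this page.

```python
from typing import Any


def creates_new_job_definition(**updates: Any) -> bool:
    """Return True if the update touches anything besides the cron expression.

    Options left as None are treated as "not updated".
    """
    changed = {name for name, value in updates.items() if value is not None}
    return bool(changed - {"schedule_cron_expression"})
```

Since old job definitions are not deleted, a check like this can help decide when stale definitions may need separate cleanup.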

Parameters:
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • ground_truth_input (str) – S3 URI to ground truth dataset.

  • analysis_config (str or BiasAnalysisConfig) – URI to the analysis_config for the bias job, or a BiasAnalysisConfig object. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.

  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • env (dict) – Environment variables to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

  • batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform

  • data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)

  • data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)

class sagemaker.core.model_monitor.clarify_model_monitoring.ModelExplainabilityMonitor(role=None, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)[source]#

Bases: ClarifyModelMonitor

Amazon SageMaker model monitor to monitor feature attribution of an endpoint.

Please see the __init__ method of its base class for how to instantiate it.

JOB_DEFINITION_BASE_NAME = 'model-explainability-job-definition'#

classmethod attach(monitor_schedule_name, sagemaker_session=None)[source]#

Sets this object’s schedule name to the name provided.

This allows subsequent describe_schedule or list_executions calls to point to the given schedule.

Parameters:
  • monitor_schedule_name (str) – The name of the schedule to attach to.

  • sagemaker_session (sagemaker.core.helper.session_helper.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.

create_monitoring_schedule(endpoint_input=None, analysis_config=None, output_s3_uri=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#

Creates a monitoring schedule.

Parameters:
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput. (default: None)

  • analysis_config (str or ExplainabilityAnalysisConfig) – URI to the analysis_config for the explainability job, or an ExplainabilityAnalysisConfig object. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.

  • batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform

  • data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)

  • data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)

delete_monitoring_schedule()[source]#

Deletes the monitoring schedule and its job definition.

classmethod monitoring_type()[source]#

Type of the monitoring job.

suggest_baseline(data_config, explainability_config, model_config, model_scores=None, wait=False, logs=False, job_name=None, kms_key=None)[source]#

Suggests baselines for use with Amazon SageMaker Model Monitoring Schedules.

Parameters:
  • data_config (DataConfig) – Config of the input/output data.

  • explainability_config (ExplainabilityConfig) – Config of the specific explainability method. Currently, only SHAP is supported.

  • model_config (ModelConfig) – Config of the model and its endpoint to be created.

  • model_scores (int or str or ModelPredictedLabelConfig) – Index or JMESPath expression to locate the predicted scores in the model output. This is not required if the model output is a single score. Alternatively, it can be an instance of ModelPredictedLabelConfig to provide more parameters like label_headers.

  • wait (bool) – Whether the call should wait until the job completes (default: False).

  • logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: False).

  • job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.

  • kms_key (str) – The ARN of the KMS key that is used to encrypt the user code file (default: None).

Returns:

The ProcessingJob object representing the baselining job.

Return type:

sagemaker.processing.ProcessingJob
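
The model_scores argument accepts either an index or a JMESPath expression, depending on the shape of the model output. The sketch below shows a thin wrapper that starts a baselining job without blocking. `suggest_explainability_baseline` is a hypothetical helper, `monitor` stands for an already constructed ModelExplainabilityMonitor, and the config objects are assumed to be built elsewhere.

```python
from typing import Any, Union


def suggest_explainability_baseline(
    monitor: Any,
    data_config: Any,
    explainability_config: Any,
    model_config: Any,
    model_scores: Union[int, str, None] = None,
) -> Any:
    """Start an explainability baselining job without blocking."""
    return monitor.suggest_baseline(
        data_config=data_config,
        explainability_config=explainability_config,  # currently a SHAP config
        model_config=model_config,
        # An int is an index into the model output, a str is a JMESPath
        # expression; None is fine when the output is a single score.
        model_scores=model_scores,
        wait=False,
    )


# Usage sketch: model_scores=1 selects the second item of the model output,
# while a string such as "predictions[*].score" (a placeholder path) would be
# treated as a JMESPath expression.
```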

update_monitoring_schedule(endpoint_input=None, analysis_config=None, output_s3_uri=None, constraints=None, schedule_cron_expression=None, enable_cloudwatch_metrics=None, role=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#

Updates the existing monitoring schedule.

If more options than schedule_cron_expression are to be updated, a new job definition will be created to hold them. The old job definition will not be deleted.

Parameters:
  • endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.

  • analysis_config (str or ExplainabilityAnalysisConfig) – URI to the analysis_config for the explainability job, or an ExplainabilityAnalysisConfig object. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.

  • output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”

  • constraints (sagemaker.model_monitor.Constraints or str) – If provided it will be used for monitoring the endpoint. It can be a Constraints object or an S3 uri pointing to a constraints JSON file.

  • schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.

  • enable_cloudwatch_metrics (bool) – Whether to publish cloudwatch metrics as part of the baselining or monitoring jobs.

  • role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.

  • instance_count (int) – The number of instances to run the jobs with.

  • instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.

  • volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).

  • volume_kms_key (str) – A KMS key for the job’s volume.

  • output_kms_key (str) – The KMS key id for the job’s outputs.

  • max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600

  • env (dict) – Environment variables to be passed to the job.

  • network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.

  • batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform

  • data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)

  • data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)