sagemaker.core.model_monitor.clarify_model_monitoring#
This module contains code related to Amazon SageMaker Explainability AI Model Monitoring.
These classes assist with suggesting baselines and creating monitoring schedules for monitoring bias metrics and feature attribution of SageMaker Endpoints.
Classes
BiasAnalysisConfig | Analysis configuration for ModelBiasMonitor.
ClarifyBaseliningConfig | Data class to hold some essential analysis configuration of ClarifyBaseliningJob.
ClarifyBaseliningJob | Provides functionality to retrieve baseline-specific output from Clarify baselining job.
ClarifyModelMonitor | Base class of Amazon SageMaker Explainability API model monitors.
ClarifyMonitoringExecution | Provides functionality to retrieve monitoring-specific files output from executions.
ExplainabilityAnalysisConfig | Analysis configuration for ModelExplainabilityMonitor.
ModelBiasMonitor | Amazon SageMaker model monitor to monitor bias metrics of an endpoint.
ModelExplainabilityMonitor | Amazon SageMaker model monitor to monitor feature attribution of an endpoint.
- class sagemaker.core.model_monitor.clarify_model_monitoring.BiasAnalysisConfig(bias_config, headers=None, label=None)[source]#
Bases: object
Analysis configuration for ModelBiasMonitor.
- class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyBaseliningConfig(analysis_config, features_attribute=None, inference_attribute=None, probability_attribute=None, probability_threshold_attribute=None)[source]#
Bases: object
Data class to hold some essential analysis configuration of ClarifyBaseliningJob.
- class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyBaseliningJob(processing_job)[source]#
Bases: BaseliningJob
Provides functionality to retrieve baseline-specific output from a Clarify baselining job.
- baseline_statistics(**_)[source]#
Not implemented.
The class doesn’t support statistics.
- Raises:
NotImplementedError – Always raised, because this class does not support statistics.
- suggested_constraints(file_name=None, kms_key=None)[source]#
Returns a sagemaker.model_monitor.Constraints object representing the constraints JSON file generated by this baselining job.
- Parameters:
file_name (str) – Ignored. Kept only so that the signature matches the method in the superclass.
kms_key (str) – The KMS key to use when retrieving the file.
- Returns:
The Constraints object representing the file that was generated by the job.
- Return type:
sagemaker.model_monitor.Constraints
- Raises:
UnexpectedStatusException – Raised if the job is not in a ‘Complete’ state.
- class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyModelMonitor(role=None, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)[source]#
Bases: ModelMonitor
Base class of Amazon SageMaker Explainability API model monitors.
This class is an abstract base class. Instantiate one of its subclasses to monitor bias metrics or feature attribution of an endpoint.
- get_latest_execution_logs(wait=False)[source]#
Gets the processing job logs for the most recent monitoring execution.
- Parameters:
wait (bool) – Whether the call should wait until the job completes (default: False).
- Raises:
ValueError – If no execution or processing job has run for the last execution.
- Returns:
None
- latest_monitoring_statistics(**_)[source]#
Not implemented.
The class doesn’t support statistics.
- Raises:
NotImplementedError – Always raised, because this class does not support statistics.
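Because the statistics accessors on the Clarify classes raise NotImplementedError, code that handles several monitor types generically may want to guard the call. The following is a minimal sketch under that assumption. StandInClarifyMonitor is a hypothetical stand-in that mimics only the documented behavior, so the example runs without the SDK or AWS credentials, and statistics_or_none is a hypothetical helper name.

```python
class StandInClarifyMonitor:
    """Hypothetical stand-in that mimics only the documented behavior."""

    def latest_monitoring_statistics(self, **_):
        # Mirrors the documented contract: statistics are not supported.
        raise NotImplementedError("Clarify monitors do not support statistics.")


def statistics_or_none(monitor):
    """Return the monitor's latest statistics, or None if unsupported."""
    try:
        return monitor.latest_monitoring_statistics()
    except NotImplementedError:
        return None


result = statistics_or_none(StandInClarifyMonitor())
```

The same guard would apply to baseline_statistics on ClarifyBaseliningJob, which is documented with the same behavior.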
- class sagemaker.core.model_monitor.clarify_model_monitoring.ClarifyMonitoringExecution(sagemaker_session, job_name, inputs, output, output_kms_key=None)[source]#
Bases: MonitoringExecution
Provides functionality to retrieve monitoring-specific files output from executions.
- model_config: ClassVar[ConfigDict] = {'arbitrary_types_allowed': True, 'extra': 'forbid', 'protected_namespaces': (), 'validate_assignment': True}#
Configuration for the model; it should be a dictionary conforming to pydantic.config.ConfigDict.
- class sagemaker.core.model_monitor.clarify_model_monitoring.ExplainabilityAnalysisConfig(explainability_config, model_config, headers=None, label_headers=None)[source]#
Bases: object
Analysis configuration for ModelExplainabilityMonitor.
- class sagemaker.core.model_monitor.clarify_model_monitoring.ModelBiasMonitor(role=None, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)[source]#
Bases: ClarifyModelMonitor
Amazon SageMaker model monitor to monitor bias metrics of an endpoint.
Please see the __init__ method of its base class for how to instantiate it.
- JOB_DEFINITION_BASE_NAME = 'model-bias-job-definition'#
- classmethod attach(monitor_schedule_name, sagemaker_session=None)[source]#
Sets this object’s schedule name to the name provided.
This allows subsequent describe_schedule or list_executions calls to point to the given schedule.
- Parameters:
monitor_schedule_name (str) – The name of the schedule to attach to.
sagemaker_session (sagemaker.core.helper.session_helper.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
- create_monitoring_schedule(endpoint_input=None, ground_truth_input=None, analysis_config=None, output_s3_uri=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#
Creates a monitoring schedule.
- Parameters:
endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput. (default: None)
ground_truth_input (str) – S3 URI to ground truth dataset. (default: None)
analysis_config (str or BiasAnalysisConfig) – URI to the analysis_config for the bias job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails. (default: None)
output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output” (default: None)
constraints (sagemaker.model_monitor.Constraints or str) – If provided, it will be used for monitoring the endpoint. It can be a Constraints object or an S3 URI pointing to a constraints JSON file. (default: None)
monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp. (default: None)
schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily. (default: None)
enable_cloudwatch_metrics (bool) – Whether to publish CloudWatch metrics as part of the baselining or monitoring jobs. (default: True)
batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform. (default: None)
data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
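The parameters above can be collected into a keyword dictionary before the call. The following is a minimal sketch, not the SDK's own code. The endpoint name, S3 URIs and schedule name are placeholder values, and check_schedule_kwargs is a hypothetical helper that only verifies the keys against the signature documented above. Using "NOW" as the cron expression for a one-time schedule, and "-PT0H" as the end offset, are assumptions based on the parameter descriptions.

```python
# Parameter names copied from the documented create_monitoring_schedule signature.
DOCUMENTED_PARAMS = {
    "endpoint_input", "ground_truth_input", "analysis_config", "output_s3_uri",
    "constraints", "monitor_schedule_name", "schedule_cron_expression",
    "enable_cloudwatch_metrics", "batch_transform_input",
    "data_analysis_start_time", "data_analysis_end_time",
}


def check_schedule_kwargs(kwargs):
    """Hypothetical helper: reject keys that are not in the documented signature."""
    unknown = sorted(set(kwargs) - DOCUMENTED_PARAMS)
    if unknown:
        raise TypeError(f"Unknown parameter(s): {', '.join(unknown)}")
    return kwargs


# All values below are placeholders, not real resources.
schedule_kwargs = check_schedule_kwargs({
    "endpoint_input": "my-endpoint",
    "ground_truth_input": "s3://my-bucket/ground-truth",
    "output_s3_uri": "s3://my-bucket/bias-monitor/output",
    "monitor_schedule_name": "my-bias-schedule",
    "schedule_cron_expression": "NOW",    # assumed value for a one-time schedule
    "data_analysis_start_time": "-PT1H",  # placeholder offset
    "data_analysis_end_time": "-PT0H",    # placeholder offset
})

# With a configured ModelBiasMonitor instance named `monitor` (assumed):
# monitor.create_monitoring_schedule(**schedule_kwargs)
```

Checking the keys up front surfaces a misspelled parameter before any request is sent.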
- suggest_baseline(data_config, bias_config, model_config, model_predicted_label_config=None, wait=False, logs=False, job_name=None, kms_key=None)[source]#
Suggests baselines for use with Amazon SageMaker Model Monitoring Schedules.
- Parameters:
data_config (DataConfig) – Config of the input/output data.
bias_config (BiasConfig) – Config of sensitive groups.
model_config (ModelConfig) – Config of the model and its endpoint to be created.
model_predicted_label_config (ModelPredictedLabelConfig) – Config of how to extract the predicted label from the model output.
wait (bool) – Whether the call should wait until the job completes (default: False).
logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: False).
job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.
kms_key (str) – The ARN of the KMS key that is used to encrypt the user code file (default: None).
- Returns:
The ProcessingJob object representing the baselining job.
- Return type:
sagemaker.processing.ProcessingJob
- update_monitoring_schedule(endpoint_input=None, ground_truth_input=None, analysis_config=None, output_s3_uri=None, constraints=None, schedule_cron_expression=None, enable_cloudwatch_metrics=None, role=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#
Updates the existing monitoring schedule.
If more options than schedule_cron_expression are to be updated, a new job definition will be created to hold them. The old job definition will not be deleted.
- Parameters:
endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.
ground_truth_input (str) – S3 URI to ground truth dataset.
analysis_config (str or BiasAnalysisConfig) – URI to the analysis_config for the bias job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.
output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”
constraints (sagemaker.model_monitor.Constraints or str) – If provided, it will be used for monitoring the endpoint. It can be a Constraints object or an S3 URI pointing to a constraints JSON file.
schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.
enable_cloudwatch_metrics (bool) – Whether to publish CloudWatch metrics as part of the baselining or monitoring jobs.
role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.
instance_count (int) – The number of instances to run the jobs with.
instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.
volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key (str) – A KMS key for the job’s volume.
output_kms_key (str) – The KMS key id for the job’s outputs.
max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600
env (dict) – Environment variables to be passed to the job.
network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
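batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform.
data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)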
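The rule stated for this method (updating anything beyond schedule_cron_expression creates a new job definition) can be expressed as a small predicate. This is a sketch of the documented behavior read literally, not the SDK's implementation, and needs_new_job_definition is a hypothetical name.

```python
def needs_new_job_definition(**updates):
    """Return True if any option other than schedule_cron_expression is set."""
    return any(
        value is not None
        for name, value in updates.items()
        if name != "schedule_cron_expression"
    )


# Changing only the cron expression: no new job definition expected.
only_cron = needs_new_job_definition(schedule_cron_expression="cron(0 * ? * * *)")

# Changing the instance type as well: a new job definition is expected.
with_instance = needs_new_job_definition(
    schedule_cron_expression="cron(0 * ? * * *)",
    instance_type="ml.m5.xlarge",
)
```

Since the old job definition is not deleted, a check like this can help decide when cleanup of stale definitions might be worth scheduling.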
- class sagemaker.core.model_monitor.clarify_model_monitoring.ModelExplainabilityMonitor(role=None, instance_count=1, instance_type='ml.m5.xlarge', volume_size_in_gb=30, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, base_job_name=None, sagemaker_session=None, env=None, tags=None, network_config=None)[source]#
Bases: ClarifyModelMonitor
Amazon SageMaker model monitor to monitor feature attribution of an endpoint.
Please see the __init__ method of its base class for how to instantiate it.
- JOB_DEFINITION_BASE_NAME = 'model-explainability-job-definition'#
- classmethod attach(monitor_schedule_name, sagemaker_session=None)[source]#
Sets this object’s schedule name to the name provided.
This allows subsequent describe_schedule or list_executions calls to point to the given schedule.
- Parameters:
monitor_schedule_name (str) – The name of the schedule to attach to.
sagemaker_session (sagemaker.core.helper.session_helper.Session) – Session object which manages interactions with Amazon SageMaker APIs and any other AWS services needed. If not specified, one is created using the default AWS configuration chain.
- create_monitoring_schedule(endpoint_input=None, analysis_config=None, output_s3_uri=None, constraints=None, monitor_schedule_name=None, schedule_cron_expression=None, enable_cloudwatch_metrics=True, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#
Creates a monitoring schedule.
- Parameters:
endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput. (default: None)
analysis_config (str or ExplainabilityAnalysisConfig) – URI to the analysis_config for the explainability job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.
output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”
constraints (sagemaker.model_monitor.Constraints or str) – If provided, it will be used for monitoring the endpoint. It can be a Constraints object or an S3 URI pointing to a constraints JSON file.
monitor_schedule_name (str) – Schedule name. If not specified, the processor generates a default job name, based on the image name and current timestamp.
schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.
enable_cloudwatch_metrics (bool) – Whether to publish CloudWatch metrics as part of the baselining or monitoring jobs.
batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform.
data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
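The “-PT1H” examples read as ISO 8601 durations with a leading minus sign, that is, offsets relative to the time the schedule runs. Under that assumption, a minimal parser for the hours and minutes subset might look like the sketch below. parse_offset is a hypothetical helper and not part of the SDK, and "-PT0H" is used only as an illustrative end offset.

```python
import re
from datetime import timedelta

# Accepts an optional minus sign, then "PT", then hours and/or minutes.
_OFFSET = re.compile(r"^(?P<sign>-?)PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?$")


def parse_offset(text):
    """Parse strings such as '-PT1H' or '-PT30M' into a timedelta."""
    match = _OFFSET.match(text)
    if not match or (match["h"] is None and match["m"] is None):
        raise ValueError(f"Unsupported offset: {text!r}")
    delta = timedelta(hours=int(match["h"] or 0), minutes=int(match["m"] or 0))
    return -delta if match["sign"] else delta


start = parse_offset("-PT1H")   # one hour before the run time
end = parse_offset("-PT0H")     # the run time itself
window = end - start            # length of the analysis window
```

Computing the window length this way makes it easy to reject a start offset that falls after the end offset before creating the schedule.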
- suggest_baseline(data_config, explainability_config, model_config, model_scores=None, wait=False, logs=False, job_name=None, kms_key=None)[source]#
Suggest baselines for use with Amazon SageMaker Model Monitoring Schedules.
- Parameters:
data_config (DataConfig) – Config of the input/output data.
explainability_config (ExplainabilityConfig) – Config of the specific explainability method. Currently, only SHAP is supported.
model_config (ModelConfig) – Config of the model and its endpoint to be created.
model_scores (int or str or ModelPredictedLabelConfig) – Index or JMESPath expression to locate the predicted scores in the model output. This is not required if the model output is a single score. Alternatively, it can be an instance of ModelPredictedLabelConfig to provide more parameters like label_headers.
wait (bool) – Whether the call should wait until the job completes (default: False).
logs (bool) – Whether to show the logs produced by the job. Only meaningful when wait is True (default: False).
job_name (str) – Processing job name. If not specified, the processor generates a default job name, based on the image name and current timestamp.
kms_key (str) – The ARN of the KMS key that is used to encrypt the user code file (default: None).
- Returns:
The ProcessingJob object representing the baselining job.
- Return type:
sagemaker.processing.ProcessingJob
- update_monitoring_schedule(endpoint_input=None, analysis_config=None, output_s3_uri=None, constraints=None, schedule_cron_expression=None, enable_cloudwatch_metrics=None, role=None, instance_count=None, instance_type=None, volume_size_in_gb=None, volume_kms_key=None, output_kms_key=None, max_runtime_in_seconds=None, env=None, network_config=None, batch_transform_input=None, data_analysis_start_time=None, data_analysis_end_time=None)[source]#
Updates the existing monitoring schedule.
If more options than schedule_cron_expression are to be updated, a new job definition will be created to hold them. The old job definition will not be deleted.
- Parameters:
endpoint_input (str or sagemaker.model_monitor.EndpointInput) – The endpoint to monitor. This can either be the endpoint name or an EndpointInput.
analysis_config (str or ExplainabilityAnalysisConfig) – URI to the analysis_config for the explainability job. If None, the configuration of the latest baselining job is reused; if there is no baselining job, the call fails.
output_s3_uri (str) – S3 destination of the constraint_violations and analysis result. Default: “s3://<default_session_bucket>/<job_name>/output”
constraints (sagemaker.model_monitor.Constraints or str) – If provided, it will be used for monitoring the endpoint. It can be a Constraints object or an S3 URI pointing to a constraints JSON file.
schedule_cron_expression (str) – The cron expression that dictates how often this job runs. See sagemaker.model_monitor.CronExpressionGenerator for valid expressions. Default: Daily.
enable_cloudwatch_metrics (bool) – Whether to publish CloudWatch metrics as part of the baselining or monitoring jobs.
role (str) – An AWS IAM role. The Amazon SageMaker jobs use this role.
instance_count (int) – The number of instances to run the jobs with.
instance_type (str) – Type of EC2 instance to use for the job, for example, ‘ml.m5.xlarge’.
volume_size_in_gb (int) – Size in GB of the EBS volume to use for storing data during processing (default: 30).
volume_kms_key (str) – A KMS key for the job’s volume.
output_kms_key (str) – The KMS key id for the job’s outputs.
max_runtime_in_seconds (int) – Timeout in seconds. After this amount of time, Amazon SageMaker terminates the job regardless of its current status. Default: 3600
env (dict) – Environment variables to be passed to the job.
network_config (sagemaker.network.NetworkConfig) – A NetworkConfig object that configures network isolation, encryption of inter-container traffic, security group IDs, and subnets.
batch_transform_input (sagemaker.model_monitor.BatchTransformInput) – Inputs to run the monitoring schedule on the batch transform.
data_analysis_start_time (str) – Start time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
data_analysis_end_time (str) – End time for the data analysis window for the one time monitoring schedule (NOW), e.g. “-PT1H” (default: None)
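For schedule_cron_expression, the helpers below are a stdlib-only sketch of the kind of strings that sagemaker.model_monitor.CronExpressionGenerator is understood to return for hourly and daily schedules. The exact string format is an assumption here and should be checked against that class. The function names are local to this example.

```python
def hourly():
    """Assumed expression format for an hourly schedule."""
    return "cron(0 * ? * * *)"


def daily(hour=0):
    """Assumed expression format for a daily schedule starting at `hour` (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return f"cron(0 {hour} ? * * *)"


# Only the cron expression is changed in this update.
update_kwargs = {"schedule_cron_expression": daily(hour=6)}

# With an attached ModelExplainabilityMonitor named `monitor` (assumed):
# monitor.update_monitoring_schedule(**update_kwargs)
```

In real code, calling the generator class directly is preferable to hand-writing the strings, since it keeps the expressions within the set the service accepts.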