sagemaker.mlops.workflow.retry#
Retry policies and exception types for pipeline workflow steps.
Classes
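| RetryPolicy | RetryPolicy base class |
| SageMakerJobExceptionTypeEnum | SageMaker Job ExceptionType enum. |
| SageMakerJobStepRetryPolicy | RetryPolicy for exception thrown by SageMaker Job. |
| StepExceptionTypeEnum | Step ExceptionType enum. |
| StepRetryPolicy | RetryPolicy for a retryable step. |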
- class sagemaker.mlops.workflow.retry.RetryPolicy(backoff_rate: float = 2.0, interval_seconds: int = 1, max_attempts: int | None = None, expire_after_mins: int | None = None)[source]#
Bases: Entity
RetryPolicy base class
- backoff_rate#
The multiplier by which the retry interval increases during each attempt (default: 2.0)
- Type:
float
- interval_seconds#
An integer that represents the number of seconds before the first retry attempt (default: 1)
- Type:
int
- max_attempts#
A positive integer that represents the maximum number of retry attempts. (default: None)
- Type:
int
- expire_after_mins#
A positive integer that represents the maximum number of minutes after which no further retry attempts are made (default: None)
- Type:
int
- backoff_rate: float#
- expire_after_mins: int#
- interval_seconds: int#
- max_attempts: int#
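The way interval_seconds, backoff_rate and max_attempts interact can be sketched as a geometric schedule. This is a minimal illustration, assuming the wait before retry n (counting from 0) is interval_seconds * backoff_rate ** n. The helper name retry_delays is hypothetical and not part of the library, and the timing the pipeline service actually applies may differ.

```python
from typing import List


def retry_delays(
    interval_seconds: int = 1,
    backoff_rate: float = 2.0,
    max_attempts: int = 3,
) -> List[float]:
    """Return the assumed wait, in seconds, before each retry attempt.

    Assumption: the first retry waits interval_seconds and every later
    retry waits backoff_rate times longer than the previous one.
    """
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# With the documented defaults (interval 1 s, rate 2.0) and four attempts,
# the assumed waits double each time.
delays = retry_delays(interval_seconds=1, backoff_rate=2.0, max_attempts=4)
print(delays)       # [1.0, 2.0, 4.0, 8.0]
print(sum(delays))  # 15.0 seconds of waiting in total, excluding run time
```

Under the same assumption, a backoff_rate of 1.0 gives a constant interval between attempts.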
- class sagemaker.mlops.workflow.retry.SageMakerJobExceptionTypeEnum(*args, value=<object object>, **kwargs)[source]#
Bases: Enum
SageMaker Job ExceptionType enum.
- CAPACITY_ERROR = 'SageMaker.CAPACITY_ERROR'#
- INTERNAL_ERROR = 'SageMaker.JOB_INTERNAL_ERROR'#
- RESOURCE_LIMIT = 'SageMaker.RESOURCE_LIMIT'#
- class sagemaker.mlops.workflow.retry.SageMakerJobStepRetryPolicy(exception_types: List[SageMakerJobExceptionTypeEnum] | None = None, failure_reason_types: List[SageMakerJobExceptionTypeEnum] | None = None, backoff_rate: float = 2.0, interval_seconds: int = 1, max_attempts: int | None = None, expire_after_mins: int | None = None)[source]#
Bases: RetryPolicy
RetryPolicy for exceptions thrown by a SageMaker job.
- exception_types#
The SageMaker exceptions to match for this policy. These are the exceptions thrown when the job is created synchronously, for instance the resource limit exception.
- Type:
List[SageMakerJobExceptionTypeEnum]
- failure_reason_types#
The SageMaker failure reason types to match for this policy. The failure reason type appears in the FailureReason field of the Describe response and indicates the runtime failure reason for a job.
- Type:
List[SageMakerJobExceptionTypeEnum]
- backoff_rate#
The multiplier by which the retry interval increases during each attempt (default: 2.0)
- Type:
float
- interval_seconds#
An integer that represents the number of seconds before the first retry attempt (default: 1)
- Type:
int
- max_attempts#
A positive integer that represents the maximum number of retry attempts. (default: None)
- Type:
int
- expire_after_mins#
A positive integer that represents the maximum number of minutes after which no further retry attempts are made (default: None)
- Type:
int
- backoff_rate: float#
- exception_type_list: List[SageMakerJobExceptionTypeEnum]#
- expire_after_mins: int#
- interval_seconds: int#
- max_attempts: int#
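The relationship between exception_types, failure_reason_types and the exception_type_list attribute can be sketched without the library. This is an illustration only: JobExceptionType is a local stand-in that copies the enum values documented above, and build_exception_type_list and policy_matches are hypothetical helpers. It assumes the two argument lists are concatenated into exception_type_list and that matching is done on the enum's string value.

```python
from enum import Enum
from typing import List, Optional


class JobExceptionType(Enum):
    """Local stand-in mirroring the documented SageMakerJobExceptionTypeEnum values."""

    CAPACITY_ERROR = "SageMaker.CAPACITY_ERROR"
    INTERNAL_ERROR = "SageMaker.JOB_INTERNAL_ERROR"
    RESOURCE_LIMIT = "SageMaker.RESOURCE_LIMIT"


def build_exception_type_list(
    exception_types: Optional[List[JobExceptionType]] = None,
    failure_reason_types: Optional[List[JobExceptionType]] = None,
) -> List[JobExceptionType]:
    """Concatenate both lists (assumed behaviour), treating None as empty."""
    merged: List[JobExceptionType] = []
    merged.extend(exception_types or [])
    merged.extend(failure_reason_types or [])
    return merged


def policy_matches(exception_type_list: List[JobExceptionType], error_code: str) -> bool:
    """Return True if error_code equals the value of any listed type."""
    return any(e.value == error_code for e in exception_type_list)


merged = build_exception_type_list(
    exception_types=[JobExceptionType.RESOURCE_LIMIT],         # raised at job creation
    failure_reason_types=[JobExceptionType.CAPACITY_ERROR],    # reported at runtime
)
print(policy_matches(merged, "SageMaker.CAPACITY_ERROR"))      # True
print(policy_matches(merged, "SageMaker.JOB_INTERNAL_ERROR"))  # False
```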
- class sagemaker.mlops.workflow.retry.StepExceptionTypeEnum(*args, value=<object object>, **kwargs)[source]#
Bases: Enum
Step ExceptionType enum.
- SERVICE_FAULT = 'Step.SERVICE_FAULT'#
- THROTTLING = 'Step.THROTTLING'#
- class sagemaker.mlops.workflow.retry.StepRetryPolicy(exception_types: List[StepExceptionTypeEnum], backoff_rate: float = 2.0, interval_seconds: int = 1, max_attempts: int | None = None, expire_after_mins: int | None = None)[source]#
Bases: RetryPolicy
RetryPolicy for a retryable step. By default, the pipeline service retries sagemaker.mlops.workflow.retry.StepExceptionTypeEnum.SERVICE_FAULT and sagemaker.mlops.workflow.retry.StepExceptionTypeEnum.THROTTLING regardless of pipeline step type. However, for a step defined as retryable, you can override this default by specifying a StepRetryPolicy.
- exception_types#
The exception types to match for this policy.
- Type:
List[StepExceptionTypeEnum]
- backoff_rate#
The multiplier by which the retry interval increases during each attempt (default: 2.0)
- Type:
float
- interval_seconds#
An integer that represents the number of seconds before the first retry attempt (default: 1)
- Type:
int
- max_attempts#
A positive integer that represents the maximum number of retry attempts. (default: None)
- Type:
int
- expire_after_mins#
A positive integer that represents the maximum number of minutes after which no further retry attempts are made (default: None)
- Type:
int
- backoff_rate: float#
- expire_after_mins: int#
- interval_seconds: int#
- max_attempts: int#
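To make the StepRetryPolicy fields concrete, the sketch below assembles them into a plain dictionary of the kind that could appear in a pipeline definition. It does not call the library. The function step_retry_policy_request is hypothetical, and the key names (ExceptionType, BackoffRate, IntervalSeconds, MaxAttempts, ExpireAfterMin) are assumptions rather than something this page specifies. The exception type strings are the documented StepExceptionTypeEnum values.

```python
from typing import Any, Dict, List, Optional


def step_retry_policy_request(
    exception_types: List[str],
    backoff_rate: float = 2.0,
    interval_seconds: int = 1,
    max_attempts: Optional[int] = None,
    expire_after_mins: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an illustrative retry-policy dictionary (key names are assumed)."""
    request: Dict[str, Any] = {
        "ExceptionType": list(exception_types),
        "BackoffRate": backoff_rate,
        "IntervalSeconds": interval_seconds,
    }
    # Only include the limit that was actually set.
    if max_attempts is not None:
        request["MaxAttempts"] = max_attempts
    if expire_after_mins is not None:
        request["ExpireAfterMin"] = expire_after_mins
    return request


# Override the default handling of throttling and service faults for one step.
policy = step_retry_policy_request(
    exception_types=["Step.SERVICE_FAULT", "Step.THROTTLING"],
    interval_seconds=5,
    max_attempts=3,
)
print(policy)
```

The example sets max_attempts only; whether max_attempts and expire_after_mins may be combined is not stated here, so the sketch leaves unset limits out of the result.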