sagemaker.mlops.workflow.retry

Retry policies for pipeline workflow steps.

Classes

RetryPolicy([backoff_rate, ...])

RetryPolicy base class

SageMakerJobExceptionTypeEnum(*args[, value])

SageMaker Job ExceptionType enum.

SageMakerJobStepRetryPolicy([...])

RetryPolicy for exception thrown by SageMaker Job.

StepExceptionTypeEnum(*args[, value])

Step ExceptionType enum.

StepRetryPolicy(exception_types[, ...])

RetryPolicy for a retryable step.

class sagemaker.mlops.workflow.retry.RetryPolicy(backoff_rate: float = 2.0, interval_seconds: int = 1, max_attempts: int | None = None, expire_after_mins: int | None = None)[source]#

Bases: Entity

RetryPolicy base class

backoff_rate#

The multiplier by which the retry interval increases during each attempt (default: 2.0)

Type:

float

interval_seconds#

An integer that represents the number of seconds before the first retry attempt (default: 1)

Type:

int

max_attempts#

A positive integer that represents the maximum number of retry attempts. (default: None)

Type:

int

expire_after_mins#

A positive integer that represents the maximum number of minutes after which no further retry attempts are made (default: None)

Type:

int
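The attributes above can be illustrated with a small, self-contained sketch. It assumes the wait before the n-th retry is interval_seconds multiplied by backoff_rate once per earlier attempt, which is one reading of the descriptions above and not a confirmed description of the pipeline service. The helper name retry_delays is hypothetical and is not part of the SDK.

```python
from typing import List


def retry_delays(interval_seconds: int = 1,
                 backoff_rate: float = 2.0,
                 max_attempts: int = 4) -> List[float]:
    """Return the assumed wait (in seconds) before each retry attempt."""
    delays: List[float] = []
    delay = float(interval_seconds)  # wait before the first retry
    for _ in range(max_attempts):
        delays.append(delay)
        delay *= backoff_rate  # assumption: interval grows by this multiplier
    return delays


print(retry_delays(interval_seconds=1, backoff_rate=2.0, max_attempts=4))
# [1.0, 2.0, 4.0, 8.0]
```

With the default backoff_rate of 2.0, each wait is double the previous one; a backoff_rate of 1.0 would keep the interval constant.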

backoff_rate: float#
expire_after_mins: int#
interval_seconds: int#
max_attempts: int#
to_request() Dict[str, Any] | List[Dict[str, Any]][source]#

Get the request structure for workflow service calls.
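As a rough illustration of what such a request structure might look like, the sketch below builds a plain dictionary from the four policy attributes. The key names (BackoffRate, IntervalSeconds, MaxAttempts, ExpireAfterMin) are assumptions based on the retry policy fields of the pipeline definition JSON; the helper sketch_retry_request is hypothetical and does not reproduce the SDK's implementation or its validation.

```python
from typing import Any, Dict, Optional


def sketch_retry_request(backoff_rate: float = 2.0,
                         interval_seconds: int = 1,
                         max_attempts: Optional[int] = None,
                         expire_after_mins: Optional[int] = None) -> Dict[str, Any]:
    """Hypothetical helper: build a dict shaped like a retry policy request."""
    # Key names are assumptions, not taken from this page.
    request: Dict[str, Any] = {
        "BackoffRate": backoff_rate,
        "IntervalSeconds": interval_seconds,
    }
    # Optional limits are only included when they are set.
    if max_attempts is not None:
        request["MaxAttempts"] = max_attempts
    if expire_after_mins is not None:
        request["ExpireAfterMin"] = expire_after_mins
    return request


print(sketch_retry_request(max_attempts=3))
# {'BackoffRate': 2.0, 'IntervalSeconds': 1, 'MaxAttempts': 3}
```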

validate_backoff_rate(_, value)[source]#

Validate the type of the input backoff rate

validate_expire_after_mins(_, value)[source]#

Validate the input expire-after minutes

validate_interval_seconds(_, value)[source]#

Validate the input interval seconds

validate_max_attempts(_, value)[source]#

Validate the input max attempts

class sagemaker.mlops.workflow.retry.SageMakerJobExceptionTypeEnum(*args, value=<object object>, **kwargs)[source]#

Bases: Enum

SageMaker Job ExceptionType enum.

CAPACITY_ERROR = 'SageMaker.CAPACITY_ERROR'#
INTERNAL_ERROR = 'SageMaker.JOB_INTERNAL_ERROR'#
RESOURCE_LIMIT = 'SageMaker.RESOURCE_LIMIT'#
class sagemaker.mlops.workflow.retry.SageMakerJobStepRetryPolicy(exception_types: List[SageMakerJobExceptionTypeEnum] | None = None, failure_reason_types: List[SageMakerJobExceptionTypeEnum] | None = None, backoff_rate: float = 2.0, interval_seconds: int = 1, max_attempts: int | None = None, expire_after_mins: int | None = None)[source]#

Bases: RetryPolicy

RetryPolicy for exception thrown by SageMaker Job.

exception_types#

The SageMaker exceptions to match for this policy. These are the exceptions thrown when the job is created synchronously, for instance the resource limit exception.

Type:

List[SageMakerJobExceptionTypeEnum]

failure_reason_types#

The SageMaker failure reason types to match for this policy. The failure reason type appears in the FailureReason field of the Describe response and indicates the runtime failure reason for a job.

Type:

List[SageMakerJobExceptionTypeEnum]

backoff_rate#

The multiplier by which the retry interval increases during each attempt (default: 2.0)

Type:

float

interval_seconds#

An integer that represents the number of seconds before the first retry attempt (default: 1)

Type:

int

max_attempts#

A positive integer that represents the maximum number of retry attempts. (default: None)

Type:

int

expire_after_mins#

A positive integer that represents the maximum number of minutes after which no further retry attempts are made (default: None)

Type:

int
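To get a feel for how a time limit interacts with the backoff settings, the sketch below lists the waits that fit inside an expire_after_mins budget. It rests on two loudly stated assumptions: that the budget is counted from the first failure, and that job run time is ignored. Neither is confirmed by this page, and the helper delays_within_budget is hypothetical.

```python
from typing import List


def delays_within_budget(interval_seconds: int,
                         backoff_rate: float,
                         expire_after_mins: int) -> List[float]:
    """Return the assumed retry waits whose running total fits the budget."""
    # Guard against inputs that would never exhaust the budget in this sketch.
    if interval_seconds <= 0 or backoff_rate < 1.0:
        raise ValueError("sketch requires interval_seconds > 0 and backoff_rate >= 1.0")
    budget_seconds = expire_after_mins * 60
    delays: List[float] = []
    elapsed = 0.0
    delay = float(interval_seconds)
    while elapsed + delay <= budget_seconds:
        delays.append(delay)
        elapsed += delay
        delay *= backoff_rate  # assumption: exponential growth of the wait
    return delays


print(delays_within_budget(interval_seconds=1, backoff_rate=2.0, expire_after_mins=1))
# [1.0, 2.0, 4.0, 8.0, 16.0]
```

Under these assumptions, a one-minute budget with the default settings leaves room for five retries, because the sixth wait (32 seconds) would push the total past 60 seconds.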

backoff_rate: float#
exception_type_list: List[SageMakerJobExceptionTypeEnum]#
expire_after_mins: int#
interval_seconds: int#
max_attempts: int#
to_request() Dict[str, Any] | List[Dict[str, Any]][source]#

Gets the request structure for retry policy.
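The exception_type_list attribute suggests that exception_types and failure_reason_types end up in a single list. The sketch below shows one way that merge could feed a request dictionary. It uses a local stand-in enum that mirrors the documented values so it runs without the SDK installed; the merge behavior, the ExceptionType key, and the helper sketch_job_retry_request are assumptions rather than the library's confirmed implementation.

```python
from enum import Enum
from typing import Any, Dict, List, Optional


class SageMakerJobExceptionTypeEnum(Enum):
    """Local stand-in mirroring the enum values documented on this page."""
    CAPACITY_ERROR = "SageMaker.CAPACITY_ERROR"
    INTERNAL_ERROR = "SageMaker.JOB_INTERNAL_ERROR"
    RESOURCE_LIMIT = "SageMaker.RESOURCE_LIMIT"


def sketch_job_retry_request(
    exception_types: Optional[List[SageMakerJobExceptionTypeEnum]] = None,
    failure_reason_types: Optional[List[SageMakerJobExceptionTypeEnum]] = None,
    max_attempts: int = 3,
) -> Dict[str, Any]:
    """Hypothetical helper: merge both lists into one request dict."""
    # Assumption: both lists are concatenated into one exception type list.
    merged = list(exception_types or []) + list(failure_reason_types or [])
    return {
        "BackoffRate": 2.0,
        "IntervalSeconds": 1,
        "MaxAttempts": max_attempts,
        "ExceptionType": [e.value for e in merged],  # assumed key name
    }


request = sketch_job_retry_request(
    exception_types=[SageMakerJobExceptionTypeEnum.RESOURCE_LIMIT],
    failure_reason_types=[SageMakerJobExceptionTypeEnum.CAPACITY_ERROR],
)
print(request["ExceptionType"])
# ['SageMaker.RESOURCE_LIMIT', 'SageMaker.CAPACITY_ERROR']
```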

class sagemaker.mlops.workflow.retry.StepExceptionTypeEnum(*args, value=<object object>, **kwargs)[source]#

Bases: Enum

Step ExceptionType enum.

SERVICE_FAULT = 'Step.SERVICE_FAULT'#
THROTTLING = 'Step.THROTTLING'#
class sagemaker.mlops.workflow.retry.StepRetryPolicy(exception_types: List[StepExceptionTypeEnum], backoff_rate: float = 2.0, interval_seconds: int = 1, max_attempts: int | None = None, expire_after_mins: int | None = None)[source]#

Bases: RetryPolicy

RetryPolicy for a retryable step. By default, the pipeline service retries sagemaker.mlops.workflow.retry.StepExceptionTypeEnum.SERVICE_FAULT and sagemaker.mlops.workflow.retry.StepExceptionTypeEnum.THROTTLING regardless of pipeline step type. For a step defined as retryable, however, you can override this default behavior by specifying a StepRetryPolicy.
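A minimal sketch of defining such overrides follows. To stay runnable without the SDK installed, it declares local stand-ins that mirror the enum values and the constructor signature documented on this page; with the SDK available, the real classes would be imported from sagemaker.mlops.workflow.retry instead. The variable names and the chosen limits are illustrative only.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class StepExceptionTypeEnum(Enum):
    """Local stand-in mirroring the documented enum values."""
    SERVICE_FAULT = "Step.SERVICE_FAULT"
    THROTTLING = "Step.THROTTLING"


@dataclass
class StepRetryPolicy:
    """Local stand-in mirroring the documented constructor signature."""
    exception_types: List[StepExceptionTypeEnum]
    backoff_rate: float = 2.0
    interval_seconds: int = 1
    max_attempts: Optional[int] = None
    expire_after_mins: Optional[int] = None


# Retry throttling errors up to three times, starting with a 5-second wait.
throttle_policy = StepRetryPolicy(
    exception_types=[StepExceptionTypeEnum.THROTTLING],
    interval_seconds=5,
    max_attempts=3,
)

# Keep retrying service faults until 10 minutes have passed.
fault_policy = StepRetryPolicy(
    exception_types=[StepExceptionTypeEnum.SERVICE_FAULT],
    expire_after_mins=10,
)

print(throttle_policy.max_attempts, fault_policy.expire_after_mins)
# 3 10
```

Using one policy per exception type, as here, lets throttling and service faults be bounded differently; how a step accepts the policies is outside the scope of this page.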

exception_types#

The exception types to match for this policy

Type:

List[StepExceptionTypeEnum]

backoff_rate#

The multiplier by which the retry interval increases during each attempt (default: 2.0)

Type:

float

interval_seconds#

An integer that represents the number of seconds before the first retry attempt (default: 1)

Type:

int

max_attempts#

A positive integer that represents the maximum number of retry attempts. (default: None)

Type:

int

expire_after_mins#

A positive integer that represents the maximum number of minutes after which no further retry attempts are made (default: None)

Type:

int

backoff_rate: float#
expire_after_mins: int#
interval_seconds: int#
max_attempts: int#
to_request() Dict[str, Any] | List[Dict[str, Any]][source]#

Gets the request structure for retry policy.