sagemaker.core.inference_config

Configuration classes for SageMaker inference endpoints.

This module provides configuration classes for SageMaker inference endpoints: asynchronous endpoints, serverless endpoints, and the compute resource requirements of a model.

Classes

AsyncInferenceConfig([output_path, ...])

Configuration object for async inference endpoints.

ResourceRequirements([requests, limits])

Configures the compute resources for a Model.

ServerlessInferenceConfig([...])

Configuration object for serverless inference endpoints.

class sagemaker.core.inference_config.AsyncInferenceConfig(output_path=None, max_concurrent_invocations_per_instance=None, kms_key_id=None, notification_config=None, failure_path=None)

Bases: object

Configuration object for async inference endpoints.

This object specifies the configuration of an async endpoint. Use it when creating an async endpoint and making async inference requests.
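As a rough illustration of what these constructor parameters describe, the sketch below maps them onto the `AsyncInferenceConfig` structure accepted by the SageMaker `CreateEndpointConfig` API. The helper `async_config_request` is hypothetical and is not part of this module. It requires `output_path` for simplicity even though the class defaults it to `None`, and the bucket names are placeholders.

```python
from typing import Any, Dict, Optional


def async_config_request(
    output_path: str,
    max_concurrent_invocations_per_instance: Optional[int] = None,
    kms_key_id: Optional[str] = None,
    notification_config: Optional[Dict[str, Any]] = None,
    failure_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build an AsyncInferenceConfig request dict."""
    # Output settings: where results (and optionally failures) are written.
    output_config: Dict[str, Any] = {"S3OutputPath": output_path}
    if failure_path is not None:
        output_config["S3FailurePath"] = failure_path
    if kms_key_id is not None:
        output_config["KmsKeyId"] = kms_key_id
    if notification_config is not None:
        output_config["NotificationConfig"] = notification_config

    request: Dict[str, Any] = {"OutputConfig": output_config}
    # Client settings are only included when a concurrency cap is given.
    if max_concurrent_invocations_per_instance is not None:
        request["ClientConfig"] = {
            "MaxConcurrentInvocationsPerInstance": max_concurrent_invocations_per_instance
        }
    return request


example_async = async_config_request(
    output_path="s3://example-bucket/async-output",  # placeholder bucket
    max_concurrent_invocations_per_instance=4,
    failure_path="s3://example-bucket/async-failures",  # placeholder bucket
)
```

Optional fields are left out of the request when they are not set, which keeps the resulting dict limited to what the caller chose explicitly.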

class sagemaker.core.inference_config.ResourceRequirements(requests: Dict[str, int] | None = None, limits: Dict[str, int] | None = None)

Bases: object

Configures the compute resources for a Model.

get_compute_resource_requirements() → dict

Returns a dict of resource requirements.
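The page does not list the keys accepted in `requests` and `limits` or the shape of the returned dict. The sketch below is one plausible mapping, not the library's implementation. The input keys (`memory`, `num_cpus`, `num_accelerators`) are assumptions. The output keys follow the `ComputeResourceRequirements` structure of the SageMaker `CreateInferenceComponent` API. The function `compute_resource_requirements` is a hypothetical stand-in.

```python
from typing import Dict, Optional


def compute_resource_requirements(
    requests: Optional[Dict[str, int]] = None,
    limits: Optional[Dict[str, int]] = None,
) -> Dict[str, int]:
    """Hypothetical stand-in: translate requests/limits into API-style keys."""
    requests = requests or {}
    limits = limits or {}

    # Sanity check added for this sketch: requested memory must fit the limit.
    if "memory" in requests and "memory" in limits:
        if requests["memory"] > limits["memory"]:
            raise ValueError("requested memory exceeds the memory limit")

    result: Dict[str, int] = {}
    if "memory" in requests:  # assumed key, memory in MB
        result["MinMemoryRequiredInMb"] = requests["memory"]
    if "memory" in limits:  # assumed key, memory in MB
        result["MaxMemoryRequiredInMb"] = limits["memory"]
    if "num_cpus" in requests:  # assumed key
        result["NumberOfCpuCoresRequired"] = requests["num_cpus"]
    if "num_accelerators" in requests:  # assumed key
        result["NumberOfAcceleratorDevicesRequired"] = requests["num_accelerators"]
    return result


example_resources = compute_resource_requirements(
    requests={"num_cpus": 2, "memory": 1024},
    limits={"memory": 2048},
)
```

Keeping `requests` and `limits` as separate dicts mirrors the constructor signature: `requests` describes the minimum a model needs, while `limits` caps what it may consume.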

class sagemaker.core.inference_config.ServerlessInferenceConfig(memory_size_in_mb: int = 2048, max_concurrency: int = 5, provisioned_concurrency: int | None = None)

Bases: object

Configuration object for serverless inference endpoints.

This object specifies the configuration of a serverless endpoint. Use it when creating a serverless endpoint and making serverless inference requests.
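To make the three parameters concrete, the sketch below maps them onto the `ServerlessConfig` structure of the SageMaker `CreateEndpointConfig` API, using the defaults from the signature above. The helper `serverless_config_request` is hypothetical. The validation rules (memory in 1 GB steps from 1024 to 6144 MB, provisioned concurrency no greater than maximum concurrency) reflect service limits as commonly documented and should be treated as assumptions rather than behavior of this class.

```python
from typing import Dict, Optional

# Assumed service limit: memory sizes in 1 GB increments from 1 GB to 6 GB.
ALLOWED_MEMORY_SIZES_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def serverless_config_request(
    memory_size_in_mb: int = 2048,
    max_concurrency: int = 5,
    provisioned_concurrency: Optional[int] = None,
) -> Dict[str, int]:
    """Hypothetical helper: build a ServerlessConfig request dict."""
    if memory_size_in_mb not in ALLOWED_MEMORY_SIZES_MB:
        raise ValueError(f"memory_size_in_mb must be one of {ALLOWED_MEMORY_SIZES_MB}")
    if provisioned_concurrency is not None and provisioned_concurrency > max_concurrency:
        raise ValueError("provisioned_concurrency must not exceed max_concurrency")

    request = {
        "MemorySizeInMB": memory_size_in_mb,
        "MaxConcurrency": max_concurrency,
    }
    # Provisioned concurrency is optional and omitted when not requested.
    if provisioned_concurrency is not None:
        request["ProvisionedConcurrency"] = provisioned_concurrency
    return request


default_serverless = serverless_config_request()
warm_serverless = serverless_config_request(
    memory_size_in_mb=4096, max_concurrency=10, provisioned_concurrency=2
)
```

Calling the helper with no arguments yields the same values as the class defaults, 2048 MB of memory and a maximum concurrency of 5.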