sagemaker.serve.predictor_async#
Placeholder docstring
Classes
AsyncPredictor – Make async prediction requests to an Amazon SageMaker endpoint.
- class sagemaker.serve.predictor_async.AsyncPredictor(predictor, name=None)[source]#
Bases: object
Make async prediction requests to an Amazon SageMaker endpoint.
- delete_endpoint(delete_endpoint_config=True)[source]#
Delete the Amazon SageMaker endpoint backing this async predictor.
This also deletes the endpoint configuration attached to it if delete_endpoint_config is True.
- Parameters:
delete_endpoint_config (bool, optional) – Flag to indicate whether to delete endpoint configuration together with endpoint. Defaults to True. If True, both endpoint and endpoint configuration will be deleted. If False, only endpoint will be deleted.
- disable_data_capture()[source]#
Disables data capture by updating DataCaptureConfig.
This function updates the DataCaptureConfig for the Predictor’s associated Amazon SageMaker Endpoint to disable data capture. For a more customized experience, refer to update_data_capture_config instead.
- enable_data_capture()[source]#
Enables data capture by updating DataCaptureConfig.
This function updates the DataCaptureConfig for the Predictor’s associated Amazon SageMaker Endpoint to enable data capture. For a more customized experience, refer to update_data_capture_config instead.
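One way to pair the two toggles is a context manager that turns capture on for a block of work and off again afterwards. This is only a sketch: `data_capture_enabled` is a hypothetical helper name, and it assumes capture should always be disabled on exit, whatever the state was before.

```python
from contextlib import contextmanager


@contextmanager
def data_capture_enabled(async_predictor):
    """Enable data capture inside a `with` block (hypothetical helper)."""
    async_predictor.enable_data_capture()
    try:
        yield async_predictor
    finally:
        # Runs even if the body raises, so capture is not left on by accident.
        async_predictor.disable_data_capture()
```

Each toggle updates the endpoint's DataCaptureConfig, so this pattern suits longer sessions rather than individual requests.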
- endpoint_context()[source]#
Retrieves the lineage context object representing the endpoint.
Examples
predictor = Predictor()
context = predictor.endpoint_context()
models = context.models()
- Returns:
The context for the endpoint.
- Return type:
ContextEndpoint
- list_monitors()[source]#
Generates ModelMonitor objects (or DefaultModelMonitors).
Objects are generated based on the schedule(s) associated with the endpoint that this predictor refers to.
- Returns:
A list of ModelMonitor (or DefaultModelMonitor) objects.
- Return type:
[sagemaker.model_monitor.model_monitoring.ModelMonitor]
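Because the returned list can mix ModelMonitor and DefaultModelMonitor objects, a quick tally by class can help when inspecting an endpoint. The sketch below relies only on `list_monitors()` and the Python type of each returned object; `monitor_type_counts` is a hypothetical helper name.

```python
from collections import Counter


def monitor_type_counts(async_predictor):
    """Count the monitors attached to the endpoint, grouped by class name."""
    monitors = async_predictor.list_monitors()
    # type(m).__name__ avoids assuming any particular attribute on the monitors.
    return Counter(type(m).__name__ for m in monitors)
```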
- predict(data=None, input_path=None, initial_args=None, inference_id=None, waiter_config=<sagemaker.serve.async_inference.waiter_config.WaiterConfig object>)[source]#
Wait and return the Async Inference result from the specified endpoint.
- Parameters:
data (object) – Input data for which you want the model to provide inference. If a serializer was specified in the encapsulated Predictor object, the result of the serializer is sent as input data. Otherwise the data must be a sequence of bytes, and the predict method then sends the bytes in the request body as is.
input_path (str) – Amazon S3 URI that contains the input data for which you want the model to provide async inference. (Default: None).
initial_args (dict[str,str]) – Optional. Default arguments for the boto3 invoke_endpoint_async call. (Default: None).
inference_id (str) – If you provide a value, it is added to the captured data when you enable data capture on the endpoint. (Default: None).
waiter_config (sagemaker.async_inference.waiter_config.WaiterConfig) – Configuration for the waiter. (Default: {“Delay”: 15 seconds, “MaxAttempts”: 60}).
- Raises:
ValueError – If neither input data nor an input Amazon S3 path is provided.
- Returns:
Inference for the given input. If a deserializer was specified when creating the Predictor, the result of the deserializer is returned. Otherwise the response returns the sequence of bytes as is.
- Return type:
object
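Two details of the description above can be made concrete: the documented ValueError condition and the rough time budget implied by the default waiter settings. The functions below are illustrative stand-ins with hypothetical names, not the SDK's own implementation, and the time bound ignores the duration of each polling request.

```python
def check_async_inputs(data=None, input_path=None):
    """Mirror the documented rule: at least one input source must be given."""
    if data is None and input_path is None:
        raise ValueError("Provide either data or input_path.")


def max_wait_seconds(delay=15, max_attempts=60):
    """Approximate upper bound on polling time: attempts times delay."""
    return delay * max_attempts


print(max_wait_seconds())  # 900
```

With the defaults, predict may therefore block for roughly 15 minutes before the waiter gives up; a smaller delay or attempt count shortens that budget.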
- predict_async(data=None, input_path=None, initial_args=None, inference_id=None)[source]#
Return the Async Inference output Amazon S3 path from the specified endpoint.
- Parameters:
data (object) – Input data for which you want the model to provide inference. If a serializer was specified in the encapsulated Predictor object, the result of the serializer is sent as input data. Otherwise the data must be a sequence of bytes, and the predict method then uploads the data to the input_s3_path. If input_s3_path is None, upload the data to
input_path (str) – Amazon S3 URI that contains the input data for which you want the model to provide async inference. (Default: None).
initial_args (dict[str,str]) – Optional. Default arguments for the boto3 invoke_endpoint_async call. (Default: None).
inference_id (str) – If you provide a value, it is added to the captured data when you enable data capture on the endpoint. (Default: None).
- Raises:
ValueError – If neither input data nor an input Amazon S3 path is provided.
- Returns:
Inference response for the given input. It provides a method to check the result in the Amazon S3 output path.
- Return type:
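
Since predict_async returns without waiting for the result, several requests can be submitted back to back and checked later. The following sketch assumes the inputs already sit in Amazon S3; `submit_all` and `id_prefix` are hypothetical names, and the response objects are simply collected without assuming anything about their attributes.

```python
def submit_all(async_predictor, input_paths, id_prefix="req"):
    """Submit one async request per S3 URI and return the response objects."""
    responses = []
    for index, uri in enumerate(input_paths):
        response = async_predictor.predict_async(
            input_path=uri,
            # A distinct inference_id makes each request identifiable in
            # captured data when data capture is enabled on the endpoint.
            inference_id=f"{id_prefix}-{index}",
        )
        responses.append(response)
    return responses
```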
- update_data_capture_config(data_capture_config)[source]#
Updates the DataCaptureConfig for the Predictor’s associated Amazon SageMaker Endpoint.
Update is done using the provided DataCaptureConfig.
- Parameters:
data_capture_config (sagemaker.model_monitor.DataCaptureConfig) – The DataCaptureConfig to update the predictor’s endpoint to use.
- update_endpoint(initial_instance_count=None, instance_type=None, accelerator_type=None, model_name=None, tags=None, kms_key=None, data_capture_config_dict=None, wait=True)[source]#
Update the existing endpoint with the provided attributes.
This creates a new EndpointConfig in the process. If initial_instance_count, instance_type, accelerator_type, or model_name is specified, then a new ProductionVariant configuration is created; values from the existing configuration are not preserved if any of those parameters are specified.
- Parameters:
initial_instance_count (int) – The initial number of instances to run in the endpoint. This is required if instance_type, accelerator_type, or model_name is specified. Otherwise, the values from the existing endpoint configuration’s ProductionVariants are used.
instance_type (str) – The EC2 instance type to deploy the endpoint to. This is required if initial_instance_count or accelerator_type is specified. Otherwise, the values from the existing endpoint configuration’s ProductionVariants are used.
accelerator_type (str) – The type of Elastic Inference accelerator to attach to the endpoint, e.g. “ml.eia1.medium”. If not specified, and initial_instance_count, instance_type, and model_name are also None, the values from the existing endpoint configuration’s ProductionVariants are used. Otherwise, no Elastic Inference accelerator is attached to the endpoint.
model_name (str) – The name of the model to be associated with the endpoint. This is required if initial_instance_count, instance_type, or accelerator_type is specified and if there is more than one model associated with the endpoint. Otherwise, the existing model for the endpoint is used.
tags (list[dict[str, str]]) – The list of tags to add to the endpoint config. If not specified, the tags of the existing endpoint configuration are used. If any of the existing tags are reserved AWS ones (i.e. begin with “aws”), they are not carried over to the new endpoint configuration.
kms_key (str) – The KMS key that is used to encrypt the data on the storage volume attached to the instance hosting the endpoint. If not specified, the KMS key of the existing endpoint configuration is used.
data_capture_config_dict (dict) – The endpoint data capture configuration for use with Amazon SageMaker Model Monitoring. If not specified, the data capture configuration of the existing endpoint configuration is used.
wait (bool) – Whether to wait for the endpoint update to finish.
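The "required if" rules for initial_instance_count and instance_type can be checked before calling update_endpoint. The function below is a sketch of those two documented rules only; `check_update_args` is a hypothetical name, it is not the SDK's own validation, and it does not cover the model_name rule, which depends on how many models the endpoint has.

```python
def check_update_args(initial_instance_count=None, instance_type=None,
                      accelerator_type=None, model_name=None):
    """Return a list of problems with a planned update_endpoint call."""
    problems = []
    # initial_instance_count is required once any variant field is set.
    if initial_instance_count is None and any(
        value is not None
        for value in (instance_type, accelerator_type, model_name)
    ):
        problems.append("initial_instance_count is required")
    # instance_type is required with an instance count or an accelerator.
    if instance_type is None and any(
        value is not None
        for value in (initial_instance_count, accelerator_type)
    ):
        problems.append("instance_type is required")
    return problems
```

An empty list means the combination satisfies both rules; calling it with no arguments corresponds to reusing the existing ProductionVariants.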