sagemaker.core.serializers.base#

Implements base methods for serializing data for an inference endpoint.

Classes

BaseSerializer()

Abstract base class for creation of new serializers.

CSVSerializer([content_type])

Serialize data of various formats to a CSV-formatted string.

DataSerializer([content_type])

Serialize data in any file by extracting raw bytes from the file.

IdentitySerializer([content_type])

Serialize data by returning data without modification.

JSONLinesSerializer([content_type])

Serialize data to a JSON Lines formatted string.

JSONSerializer([content_type])

Serialize data to a JSON formatted string.

LibSVMSerializer([content_type])

Serialize data of various formats to a LibSVM-formatted string.

NumpySerializer([dtype, content_type])

Serialize data to a buffer using the .npy format.

RecordSerializer([content_type])

Serialize a NumPy array for an inference request.

SimpleBaseSerializer([content_type])

Abstract base class for creation of new serializers.

SparseMatrixSerializer([content_type])

Serialize a sparse matrix to a buffer using the .npz format.

StringSerializer([content_type])

Encode a string as UTF-8 bytes.

TorchTensorSerializer([content_type])

Serialize a torch.Tensor to a buffer by converting the tensor to a NumPy array and calling NumpySerializer.

class sagemaker.core.serializers.base.BaseSerializer[source]#

Bases: ABC

Abstract base class for creation of new serializers.

Provides a skeleton for customization: subclasses must override the serialize method and the CONTENT_TYPE class attribute.

abstract property CONTENT_TYPE#

The MIME type of the data sent to the inference endpoint.

abstract serialize(data)[source]#

Serialize data into the media type specified by CONTENT_TYPE.

Parameters:

data (object) – Data to be serialized.

Returns:

Serialized data used for a request.

Return type:

object
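
The subclassing contract described above can be sketched without the SDK installed. The example below is a minimal, standalone illustration: `BaseSerializerSketch` is a stand-in that mirrors the two documented members, and `UpperCaseTextSerializer` is a hypothetical subclass. Neither name is part of the library; with the SDK available you would presumably inherit from BaseSerializer itself.

```python
from abc import ABC, abstractmethod


class BaseSerializerSketch(ABC):
    """Stand-in mirroring the documented interface (not the SDK class)."""

    @property
    @abstractmethod
    def CONTENT_TYPE(self):
        """The MIME type of the data sent to the inference endpoint."""

    @abstractmethod
    def serialize(self, data):
        """Serialize data into the media type specified by CONTENT_TYPE."""


class UpperCaseTextSerializer(BaseSerializerSketch):
    """Hypothetical serializer that sends upper-cased plain text."""

    # A plain class attribute is enough to override the abstract property.
    CONTENT_TYPE = "text/plain"

    def serialize(self, data):
        return str(data).upper().encode("utf-8")


serializer = UpperCaseTextSerializer()
payload = serializer.serialize("hello")
```

Because both members are abstract, instantiating the base class directly raises TypeError; only a subclass that supplies both can be created.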

class sagemaker.core.serializers.base.CSVSerializer(content_type='text/csv')[source]#

Bases: SimpleBaseSerializer

Serialize data of various formats to a CSV-formatted string.

serialize(data)[source]#

Serialize data of various formats to a CSV-formatted string.

Parameters:

data (object) – Data to be serialized. Can be a NumPy array, list, file, Pandas DataFrame, or buffer.

Returns:

The data serialized as a CSV-formatted string.

Return type:

str
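
To illustrate what a CSV-formatted string looks like for list input, here is a minimal sketch built on the standard-library csv module. `rows_to_csv` is a hypothetical helper, not the serializer's implementation, and the SDK's exact quoting and line-ending behavior may differ.

```python
import csv
import io


def rows_to_csv(rows):
    """Join a list of rows (each a list of values) into one CSV string."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    # Drop the trailing newline so the payload ends with the last value.
    return buffer.getvalue().rstrip("\n")


body = rows_to_csv([[1, 2.5, "a"], [4, 5.0, "b,c"]])
```

Values that contain a comma are quoted by the csv module, so each row still has the same number of fields when it is parsed on the receiving side.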

class sagemaker.core.serializers.base.DataSerializer(content_type='file-path/raw-bytes')[source]#

Bases: SimpleBaseSerializer

Serialize data in any file by extracting raw bytes from the file.

serialize(data)[source]#

Serialize file data to raw bytes.

Parameters:

data (object) – Data to be serialized. The data can be a string representing a file path or the raw bytes from a file.

Returns:

The data serialized as raw bytes from the input.

Return type:

bytes
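
The two accepted input forms (a file path or bytes already in memory) can be sketched as follows. `to_raw_bytes` is a hypothetical helper that only approximates the documented behavior, and raising ValueError for other types is an assumption of this sketch.

```python
import os
import tempfile


def to_raw_bytes(data):
    """Return bytes unchanged; treat a str as a file path and read the file."""
    if isinstance(data, bytes):
        return data
    if isinstance(data, str):
        with open(data, "rb") as handle:
            return handle.read()
    # Assumed error handling; the SDK's behavior is not specified here.
    raise ValueError(f"Unsupported input type: {type(data)!r}")


# Write a small temporary file so the path branch can be exercised.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as tmp:
    tmp.write(b"\x00\x01payload")
    path = tmp.name

from_path = to_raw_bytes(path)
from_bytes = to_raw_bytes(b"\x00\x01payload")
os.remove(path)
```

Both calls yield the same payload, which is the point of accepting either form.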

class sagemaker.core.serializers.base.IdentitySerializer(content_type='application/octet-stream')[source]#

Bases: SimpleBaseSerializer

Serialize data by returning data without modification.

This serializer may be useful if, for example, you’re sending raw bytes such as from an image file’s .read() method.

serialize(data)[source]#

Return data without modification.

Parameters:

data (object) – Data to be serialized.

Returns:

The unmodified data.

Return type:

object

class sagemaker.core.serializers.base.JSONLinesSerializer(content_type='application/jsonlines')[source]#

Bases: SimpleBaseSerializer

Serialize data to a JSON Lines formatted string.

serialize(data)[source]#

Serialize data of various formats to a JSON Lines formatted string.

Parameters:

data (object) – Data to be serialized. The data can be a string, iterable of JSON serializable objects, or a file-like object.

Returns:

The data serialized as a string containing newline-separated JSON values.

Return type:

str
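
As a rough sketch of the three input forms listed above, the following standalone function produces JSON Lines text with the standard-library json module. `to_json_lines` is a hypothetical name, and passing strings and file contents through unchanged is an assumption about how already-formatted input is treated.

```python
import json


def to_json_lines(data):
    """Serialize a string, file-like object, or iterable to JSON Lines."""
    if isinstance(data, str):
        return data  # assumed to already be JSON Lines text
    if hasattr(data, "read"):
        return data.read()  # file-like object: pass its contents through
    # Iterable of JSON-serializable objects: one JSON value per line.
    return "\n".join(json.dumps(item) for item in data)


body = to_json_lines([{"id": 1, "text": "a"}, {"id": 2, "text": "b"}])
```

Each line of the result is an independent JSON document, so a consumer can parse the payload one record at a time.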

class sagemaker.core.serializers.base.JSONSerializer(content_type='application/json')[source]#

Bases: SimpleBaseSerializer

Serialize data to a JSON formatted string.

serialize(data)[source]#

Serialize data of various formats to a JSON formatted string.

Parameters:

data (object) – Data to be serialized.

Returns:

The data serialized as a JSON string.

Return type:

str

class sagemaker.core.serializers.base.LibSVMSerializer(content_type='text/libsvm')[source]#

Bases: SimpleBaseSerializer

Serialize data of various formats to a LibSVM-formatted string.

The data must already be in LIBSVM file format: <label> <index1>:<value1> <index2>:<value2> …

It is suitable for sparse datasets since it does not store zero-valued features.

serialize(data)[source]#

Serialize data of various formats to a LibSVM-formatted string.

Parameters:

data (object) – Data to be serialized. Can be a string or a file-like object.

Returns:

The data serialized as a LibSVM-formatted string.

Return type:

str

Raises:

ValueError – If the input format cannot be handled.
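
Since the serializer expects input that is already in LibSVM format, it can help to see how such a string might be prepared. The sketch below uses a hypothetical helper, `format_libsvm_row`, that takes a label and a mapping of feature index to value; it is not part of the SDK, and the index numbering is left to the caller.

```python
def format_libsvm_row(label, features):
    """Format one example as '<label> <index>:<value> ...', skipping zeros."""
    pairs = [
        f"{index}:{value}"
        for index, value in sorted(features.items())
        if value != 0
    ]
    return " ".join([str(label), *pairs])


rows = [
    format_libsvm_row(1, {1: 0.5, 3: 0.0, 7: 2}),
    format_libsvm_row(0, {2: 1.5}),
]
body = "\n".join(rows)
```

Feature 3 in the first row is zero and is therefore omitted, which is what makes the format compact for sparse data.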

class sagemaker.core.serializers.base.NumpySerializer(dtype=None, content_type='application/x-npy')[source]#

Bases: SimpleBaseSerializer

Serialize data to a buffer using the .npy format.

serialize(data)[source]#

Serialize data to a buffer using the .npy format.

Parameters:

data (object) – Data to be serialized. Can be a NumPy array, list, file, or buffer.

Returns:

A buffer containing data serialized in the .npy format.

Return type:

io.BytesIO

class sagemaker.core.serializers.base.RecordSerializer(content_type='application/x-recordio-protobuf')[source]#

Bases: SimpleBaseSerializer

Serialize a NumPy array for an inference request.

serialize(data)[source]#

Serialize a NumPy array into a buffer containing RecordIO records.

Parameters:

data (numpy.ndarray) – The data to serialize.

Returns:

A buffer containing the data serialized as records.

Return type:

io.BytesIO

class sagemaker.core.serializers.base.SimpleBaseSerializer(content_type='application/json')[source]#

Bases: BaseSerializer

Abstract base class for creation of new serializers.

This class extends the API of BaseSerializer with more user-friendly options for setting the Content-Type header, for cases where the content type can be provided at initialization and freely updated afterwards.

property CONTENT_TYPE#

The data MIME type set in the Content-Type header on prediction endpoint requests.
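
The idea of a content type that is supplied at init and can be changed later can be sketched with a small stand-in class. `SimpleSerializerSketch` and `CompactJSONSerializer` are hypothetical names, and the instance attribute `content_type` is an assumption of this sketch rather than a documented member.

```python
import json


class SimpleSerializerSketch:
    """Stand-in showing a content type set at init and read via a property."""

    def __init__(self, content_type="application/json"):
        # Assumed attribute name; stored so it can be changed later.
        self.content_type = content_type

    @property
    def CONTENT_TYPE(self):
        return self.content_type


class CompactJSONSerializer(SimpleSerializerSketch):
    """Hypothetical subclass that only needs to implement serialize."""

    def serialize(self, data):
        return json.dumps(data, separators=(",", ":"))


serializer = CompactJSONSerializer(content_type="application/x-custom+json")
before = serializer.CONTENT_TYPE
serializer.content_type = "application/json"  # update after construction
after = serializer.CONTENT_TYPE
body = serializer.serialize({"a": [1, 2]})
```

With this pattern a subclass does not have to hard-code its MIME type, so one serializer class can serve endpoints that expect different Content-Type values.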

class sagemaker.core.serializers.base.SparseMatrixSerializer(content_type='application/x-npz')[source]#

Bases: SimpleBaseSerializer

Serialize a sparse matrix to a buffer using the .npz format.

serialize(data)[source]#

Serialize a sparse matrix to a buffer using the .npz format.

Sparse matrices can be in the csc, csr, bsr, dia or coo formats.

Parameters:

data (scipy.sparse.spmatrix) – The sparse matrix to serialize.

Returns:

A buffer containing the serialized sparse matrix.

Return type:

io.BytesIO

class sagemaker.core.serializers.base.StringSerializer(content_type='text/plain')[source]#

Bases: SimpleBaseSerializer

Encode a string as UTF-8 bytes.

serialize(data)[source]#

Encode a string as UTF-8 bytes.

Parameters:

data (object) – Data to be serialized.

Returns:

The string encoded as UTF-8 bytes.

Return type:

bytes
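
The encoding step amounts to a single call on the string. The sketch below uses a hypothetical helper, `encode_text`; rejecting non-string input with ValueError is an assumption of this example.

```python
def encode_text(data):
    """Encode a str as UTF-8 bytes; reject anything else."""
    if not isinstance(data, str):
        # Assumed error handling for non-string input.
        raise ValueError(f"Expected str, got {type(data).__name__}")
    return data.encode("utf-8")


payload = encode_text("café")
```

Non-ASCII characters occupy more than one byte in UTF-8, so the payload can be longer than the string's character count.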

class sagemaker.core.serializers.base.TorchTensorSerializer(content_type='tensor/pt')[source]#

Bases: SimpleBaseSerializer

Serialize a torch.Tensor to a buffer by converting the tensor to a NumPy array and calling NumpySerializer.

serialize(data)[source]#

Serialize a torch.Tensor to a buffer.

Parameters:

data (object) – Data to be serialized. The data must be of torch.Tensor type.

Returns:

The tensor serialized as raw bytes.

Return type:

bytes