SageMaker V3 JumpStart Model Example

This notebook demonstrates how to use the SageMaker V3 ModelBuilder to build a JumpStart model, deploy it to an endpoint, and run inference against it.

Prerequisites

Note: Ensure you have sagemaker and ipywidgets installed in your environment. The ipywidgets package is required to monitor endpoint deployment progress in Jupyter notebooks.
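
As a quick sanity check before running the cells below, the following sketch reports which of the two packages named above cannot be found in the current environment. The helper name missing_packages is hypothetical and is not part of the SageMaker SDK; it only uses the standard library.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Packages the prerequisites note asks for.
REQUIRED = ["sagemaker", "ipywidgets"]

missing = missing_packages(REQUIRED)
if missing:
    print(f"Missing packages: {', '.join(missing)}. Install them before continuing.")
else:
    print("All required packages are available.")
```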

# Import required libraries
import json
import uuid

from sagemaker.serve.model_builder import ModelBuilder
from sagemaker.core.jumpstart.configs import JumpStartConfig
from sagemaker.core.resources import EndpointConfig
from sagemaker.train.configs import Compute

Step 1: Configure JumpStart Model

We’ll use the Hugging Face Falcon 7B (bf16) model from JumpStart for this example.

# Configuration
MODEL_ID = "huggingface-llm-falcon-7b-bf16"
MODEL_NAME_PREFIX = "js-v3-example-model"
ENDPOINT_NAME_PREFIX = "js-v3-example-endpoint"

# Generate unique identifiers
unique_id = str(uuid.uuid4())[:8]
model_name = f"{MODEL_NAME_PREFIX}-{unique_id}"
endpoint_name = f"{ENDPOINT_NAME_PREFIX}-{unique_id}"

print(f"Model name: {model_name}")
print(f"Endpoint name: {endpoint_name}")
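
If you change the prefixes, it can help to validate the generated names before any resource is created. The sketch below wraps the same prefix-plus-UUID scheme in a hypothetical helper, make_resource_name, and checks the result against naming limits that are assumed here (at most 63 characters, alphanumerics separated by hyphens). Confirm the exact constraints in the SageMaker API reference for your resource type.

```python
import re
import uuid

# Assumption: names may contain only alphanumerics and hyphens, must start and
# end with an alphanumeric, and may be at most 63 characters long.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
MAX_NAME_LENGTH = 63


def make_resource_name(prefix, suffix_length=8):
    """Build '<prefix>-<random hex>' and validate it against the assumed limits."""
    suffix = uuid.uuid4().hex[:suffix_length]
    name = f"{prefix}-{suffix}"
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"Name is {len(name)} characters; limit is {MAX_NAME_LENGTH}: {name}")
    if NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"Name contains unsupported characters: {name}")
    return name


print(make_resource_name("js-v3-example-model"))
print(make_resource_name("js-v3-example-endpoint"))
```

Failing early with a ValueError is cheaper than discovering an invalid name after the build or deploy call has started.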

Step 2: Create ModelBuilder from JumpStart Config

The ModelBuilder can automatically configure itself from a JumpStart model ID.

# Initialize model_builder object with JumpStart configuration
compute = Compute(instance_type="ml.g5.2xlarge")
jumpstart_config = JumpStartConfig(model_id=MODEL_ID)
model_builder = ModelBuilder.from_jumpstart_config(jumpstart_config=jumpstart_config, compute=compute)

print("ModelBuilder created successfully from JumpStart config!")

Step 3: Build the Model

Build the model artifacts and prepare for deployment.

# Build the model
core_model = model_builder.build(model_name=model_name)
print(f"Model Successfully Created: {core_model.model_name}")

Step 4: Deploy the Model

Deploy the model to a SageMaker endpoint for real-time inference.

# Deploy the model to an endpoint
core_endpoint = model_builder.deploy(endpoint_name=endpoint_name)
print(f"Endpoint Successfully Created: {core_endpoint.endpoint_name}")
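
Endpoint creation can take several minutes. If you ever need to wait for an endpoint from a script rather than from the notebook, a generic polling loop like the sketch below may be useful. It is not part of the SageMaker SDK: wait_for_status and the get_status callable are hypothetical, and how you obtain the current status is left to you. The status strings "Creating", "InService", and "Failed" are SageMaker endpoint status values.

```python
import time


def wait_for_status(get_status, target="InService", failed=("Failed",),
                    timeout_s=1800, poll_s=30, sleep=time.sleep):
    """Poll get_status() until it returns `target`, a failed status, or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError(f"Endpoint entered status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Timed out waiting for {target!r}; last status was {status!r}")
        sleep(poll_s)


# Demonstration with a canned sequence of statuses instead of a real endpoint.
statuses = iter(["Creating", "Creating", "InService"])
print(wait_for_status(lambda: next(statuses), poll_s=0))
```

Passing the sleep function as a parameter keeps the loop easy to test without real delays.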

Step 5: Test the Endpoint

Send a test request to the deployed endpoint.

# Test the endpoint with a sample query
test_data = {"inputs": "What are falcons?", "parameters": {"max_new_tokens": 32}}

result = core_endpoint.invoke(
    body=json.dumps(test_data),
    content_type="application/json"
)

# Decode and display the result
prediction = json.loads(result.body.read().decode('utf-8'))
print(f"Model Response: {prediction}")
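
The exact shape of prediction depends on the serving container. Text-generation containers for Hugging Face models commonly return a list such as [{"generated_text": "..."}], but that shape is an assumption here and is not confirmed by this notebook. A minimal sketch of a defensive accessor, using the hypothetical helper extract_generated_text:

```python
def extract_generated_text(prediction):
    """Return the generated text from a parsed response, or None if the shape is unexpected."""
    # Assumed shapes: [{"generated_text": "..."}] or {"generated_text": "..."}.
    if isinstance(prediction, list) and prediction:
        prediction = prediction[0]
    if isinstance(prediction, dict):
        return prediction.get("generated_text")
    return None


# Sample payload written by hand for illustration; it is not real model output.
sample = [{"generated_text": "Falcons are birds of prey."}]
print(extract_generated_text(sample))
```

Returning None for an unexpected shape lets the caller fall back to printing the raw prediction instead of raising a KeyError.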

Step 6: Clean Up Resources

Clean up the created resources to avoid ongoing charges.

# Clean up resources
# Note: this lookup assumes the endpoint config was created with the same name as the endpoint
core_endpoint_config = EndpointConfig.get(endpoint_config_name=core_endpoint.endpoint_name)

# Delete the model, then the endpoint, then the endpoint config
core_model.delete()
core_endpoint.delete()
core_endpoint_config.delete()

print("All resources successfully deleted!")

Summary

This notebook demonstrated:

  1. Creating a ModelBuilder from JumpStart configuration

  2. Building a model from JumpStart

  3. Deploying to a SageMaker endpoint

  4. Making inference requests

  5. Cleaning up resources

With the V3 ModelBuilder, a JumpStart model ID and a compute configuration are all that is needed to build and deploy a JumpStart model.