SageMaker V3 JumpStart Model Example
This notebook shows how to use the SageMaker V3 ModelBuilder to build a JumpStart model, deploy it to a real-time endpoint, and run inference against it.
Prerequisites
Note: Ensure you have sagemaker and ipywidgets installed in your environment. The ipywidgets package is required to monitor endpoint deployment progress in Jupyter notebooks.
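As a quick way to confirm the prerequisites before running the rest of the notebook, the following sketch checks whether each package can be found in the current environment. The helper name `missing_packages` is hypothetical and not part of the SageMaker SDK; it only uses the standard library.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(["sagemaker", "ipywidgets"])
if missing:
    print(f"Missing packages: {', '.join(missing)}")
else:
    print("All required packages are installed.")
```

If anything is reported missing, install it with your usual package manager before continuing.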
# Import required libraries
import json
import uuid
from sagemaker.serve.model_builder import ModelBuilder
from sagemaker.core.jumpstart.configs import JumpStartConfig
from sagemaker.core.resources import EndpointConfig
from sagemaker.train.configs import Compute
Step 1: Configure JumpStart Model
This example uses the Hugging Face Falcon 7B model (bf16 variant) from JumpStart. A short random suffix keeps the model and endpoint names unique across runs.
# Configuration
MODEL_ID = "huggingface-llm-falcon-7b-bf16"
MODEL_NAME_PREFIX = "js-v3-example-model"
ENDPOINT_NAME_PREFIX = "js-v3-example-endpoint"
# Generate unique identifiers
unique_id = str(uuid.uuid4())[:8]
model_name = f"{MODEL_NAME_PREFIX}-{unique_id}"
endpoint_name = f"{ENDPOINT_NAME_PREFIX}-{unique_id}"
print(f"Model name: {model_name}")
print(f"Endpoint name: {endpoint_name}")
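The naming logic above can also be wrapped in a small helper that validates the result before any resource is created. This is a minimal sketch: `make_unique_name` is a hypothetical helper, and the constraint it checks (at most 63 characters, alphanumerics and hyphens, no leading or trailing hyphen) is an assumption based on the documented limits for SageMaker model and endpoint names.

```python
import re
import uuid

# Assumed naming rule: alphanumerics and hyphens only, starting and ending
# with an alphanumeric character, at most 63 characters in total.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9](-*[a-zA-Z0-9])*")
MAX_NAME_LENGTH = 63


def make_unique_name(prefix, suffix_length=8):
    """Append a short random hex suffix to prefix and validate the result."""
    suffix = uuid.uuid4().hex[:suffix_length]
    name = f"{prefix}-{suffix}"
    if len(name) > MAX_NAME_LENGTH or not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"Invalid resource name: {name!r}")
    return name


print(make_unique_name("js-v3-example-model"))
print(make_unique_name("js-v3-example-endpoint"))
```

Failing early on a bad name is cheaper than discovering the problem after the build or deploy call has started.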
Step 2: Create ModelBuilder from JumpStart Config
ModelBuilder can configure itself from a JumpStart model ID. The Compute object specifies the instance type, here ml.g5.2xlarge.
# Initialize model_builder object with JumpStart configuration
compute = Compute(instance_type="ml.g5.2xlarge")
jumpstart_config = JumpStartConfig(model_id=MODEL_ID)
model_builder = ModelBuilder.from_jumpstart_config(jumpstart_config=jumpstart_config, compute=compute)
print("ModelBuilder created successfully from JumpStart config!")
Step 3: Build the Model
Build the model artifacts and prepare for deployment.
# Build the model
core_model = model_builder.build(model_name=model_name)
print(f"Model Successfully Created: {core_model.model_name}")
Step 4: Deploy the Model
Deploy the model to a SageMaker endpoint for real-time inference.
# Deploy the model to an endpoint
core_endpoint = model_builder.deploy(endpoint_name=endpoint_name)
print(f"Endpoint Successfully Created: {core_endpoint.endpoint_name}")
Step 5: Test the Endpoint
Send a test request to the deployed endpoint.
# Test the endpoint with a sample query
test_data = {"inputs": "What are falcons?", "parameters": {"max_new_tokens": 32}}
result = core_endpoint.invoke(
body=json.dumps(test_data),
content_type="application/json"
)
# Decode and display the result
prediction = json.loads(result.body.read().decode('utf-8'))
print(f"Model Response: {prediction}")
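The request and response handling above can be factored into two small helpers. This is a sketch under stated assumptions: `build_payload` and `extract_text` are hypothetical names, and the response shape (a list of dicts with a `generated_text` key, or a single such dict) is an assumption about text-generation containers rather than something this notebook confirms. An `io.BytesIO` object stands in for `result.body`, and the response content is made up for illustration.

```python
import io
import json


def build_payload(prompt, max_new_tokens=32):
    """Serialize a prompt into the JSON request format used above."""
    return json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}})


def extract_text(body):
    """Read a streaming body and return the generated text, if present.

    Assumes a list of {"generated_text": ...} dicts or a single such dict.
    Any other shape is returned unchanged so it can be inspected.
    """
    prediction = json.loads(body.read().decode("utf-8"))
    if isinstance(prediction, list) and prediction:
        prediction = prediction[0]
    if isinstance(prediction, dict) and "generated_text" in prediction:
        return prediction["generated_text"]
    return prediction


# Stand-in for result.body with an invented response, for illustration only.
fake_body = io.BytesIO(json.dumps([{"generated_text": "Falcons are birds of prey."}]).encode("utf-8"))
print(build_payload("What are falcons?"))
print(extract_text(fake_body))
```

Returning unrecognized responses unchanged keeps the helper usable if the deployed container formats its output differently.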
Step 6: Clean Up Resources
Clean up the created resources to avoid ongoing charges.
# Clean up resources
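# Look up the endpoint config before deleting anything.
# Note: this lookup assumes the endpoint config shares the endpoint's name.
core_endpoint_config = EndpointConfig.get(endpoint_config_name=core_endpoint.endpoint_name)
# Delete the model, then the endpoint, then the endpoint config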
core_model.delete()
core_endpoint.delete()
core_endpoint_config.delete()
print("All resources successfully deleted!")
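If one of the delete calls raises, the remaining resources would be left running. A minimal sketch of a more defensive cleanup follows. `delete_all` is a hypothetical helper, not part of the SDK, and it assumes only that each resource exposes a `delete()` method, as the objects above do.

```python
def delete_all(resources):
    """Call delete() on each (label, resource) pair, continuing past failures.

    Returns a list of (label, exception) pairs for deletions that failed.
    """
    failures = []
    for label, resource in resources:
        try:
            resource.delete()
            print(f"Deleted {label}")
        except Exception as exc:
            # Record the failure and keep going so later resources are still removed.
            failures.append((label, exc))
            print(f"Failed to delete {label}: {exc}")
    return failures
```

With the objects from this notebook, it could be called as `delete_all([("model", core_model), ("endpoint", core_endpoint), ("endpoint config", core_endpoint_config)])`. Any entries in the returned list should then be checked by hand.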
Summary
This notebook demonstrated:
Creating a ModelBuilder from JumpStart configuration
Building a model from JumpStart
Deploying to a SageMaker endpoint
Making inference requests
Cleaning up resources
The V3 ModelBuilder makes it easy to work with JumpStart models with minimal configuration.