Deploys an endpoint for a foundation model supported by Hugging Face LLM Inference Containers for Amazon SageMaker.
The module uses AWS Generative AI CDK Constructs.
Input parameters:
- hugging-face-model-id - ID of the Hugging Face model
- instance-type - inference container instance type
- deep-learning-container-image - container image repository and tag
- vpc-id - VPC ID
- subnet-ids - VPC subnet IDs
- hugging-face-token-secret-name - ID of the AWS secret with the Hugging Face access token
Outputs:
- EndpointArn - endpoint ARN.
- RoleArn - IAM role ARN.
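Another module in the same deployment can read these outputs through moduleMetadata, the same mechanism the example manifest uses for the networking values. The fragment is a minimal sketch under assumptions: this module is deployed under the name hugging-face-mistral-endpoint, the group name fmops is hypothetical, and so is the consuming parameter name endpoint-arn.

```yaml
# Parameter of a hypothetical downstream module that needs the endpoint ARN.
parameters:
  - name: endpoint-arn            # hypothetical parameter name
    valueFrom:
      moduleMetadata:
        group: fmops              # assumed group; use the group this module is deployed in
        name: hugging-face-mistral-endpoint
        key: EndpointArn          # output key listed above
```

RoleArn can be referenced the same way by changing the key.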
Example manifest:
name: hugging-face-mistral-endpoint
path: modules/fmops/sagemaker-hugging-face-endpoint
targetAccount: primary
parameters:
  - name: hugging-face-model-id
    value: mistralai/Mistral-7B-Instruct-v0.1
  - name: instance-type
    value: ml.g5.2xlarge
  - name: deep-learning-container-image
    value: huggingface-pytorch-tgi-inference:2.0.1-tgi1.1.0-gpu-py39-cu118-ubuntu20.04
  - name: vpc-id
    valueFrom:
      moduleMetadata:
        group: networking
        name: networking
        key: VpcId
  - name: subnet-ids
    valueFrom:
      moduleMetadata:
        group: networking
        name: networking
        key: PrivateSubnetIds
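
The example manifest does not set hugging-face-token-secret-name. For a model that requires a Hugging Face access token, the parameter could be added to the same parameters list as sketched here. The secret name my-hugging-face-token is a hypothetical placeholder, not a value defined by this module.

```yaml
parameters:
  # ... the parameters from the example manifest stay unchanged ...
  - name: hugging-face-token-secret-name
    # Hypothetical secret ID; replace it with the ID of your own AWS secret
    # that stores the Hugging Face access token.
    value: my-hugging-face-token
```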