Large model inference

DJLServing can host large language models and foundation models that do not fit on a single GPU. We maintain a collection of deep learning containers (DLCs) designed specifically for large model inference (LMI) on SageMaker. You can explore the available deep learning containers here. The AWS DLC for LMI documentation describes the libraries available for use with these DLCs.
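For context, serving a model that is too large for one GPU with DJLServing is typically configured through a `serving.properties` file that enables tensor parallelism. The sketch below is illustrative only; the model ID and parallel degree are assumed values, not taken from this page:

```
# serving.properties (illustrative sketch, not from this page)
engine=Python
# Hypothetical model ID; replace with your own model
option.model_id=mistralai/Mistral-7B-Instruct-v0.2
# Shard the model across 4 GPUs via tensor parallelism (assumed value)
option.tensor_parallel_degree=4
```

The container reads this file at startup and partitions the model weights across the specified number of GPUs, which is how models exceeding single-GPU memory can be served.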

To learn more about Large Model Inference with DJLServing on SageMaker, please see our dedicated docs here.