Large model inference¶
DJLServing can host large language models and foundation models that do not fit into a single GPU. We maintain a collection of deep learning containers (DLC) specifically designed for large model inference, and you can explore the available deep learning containers here. The AWS DLC for LMI documentation describes the libraries available in these DLCs.
LMI Container Configurations¶
Beyond the standard DJLServing configurations, large model inference involves additional settings. The LMI configuration document organizes these configurations by the engines present in our DLCs.
These configurations can be specified in two ways: through the serving.properties file, or via environment variables in the Docker environment. For a comprehensive guide on specifying these configurations, refer to the LMI environment variable instruction document.
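As an illustrative sketch (the exact keys depend on the engine and container version; consult the LMI configuration document for the authoritative list), a serving.properties file typically pairs an engine selection with `option.*` entries, and each `option.x` key generally has an uppercase `OPTION_X` environment-variable equivalent:

```properties
# serving.properties — illustrative example; verify keys against the
# LMI configuration document for your container version.
engine=Python
option.model_id=TheBloke/Llama-2-7B-fp16
option.tensor_parallel_degree=4
```

The same settings could instead be supplied to the container as environment variables (e.g. `OPTION_MODEL_ID`, `OPTION_TENSOR_PARALLEL_DEGREE`), which is convenient when you cannot bake a properties file into the model artifact.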
Depending on your model architecture, model size, and the instance type in use, you may need to adjust certain configurations to optimize instance resource utilization and avoid Out of Memory (OOM) errors. The documents below provide recommended configurations for popular models, tailored to the specific library you are using, such as DeepSpeed, TensorRT-LLM, Transformers-NeuronX, or LMI Dist.
- LMI Dist tuning guide
- TensorRT-LLM tuning guide
- Transformers-NeuronX tuning guide
- DeepSpeed tuning guide
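To make the OOM trade-off concrete, a hypothetical tuning sketch is shown below: the parameter names are illustrative (the tuning guides above list the exact keys per engine), but the pattern is common — spread the model across more GPUs with a higher tensor-parallel degree, and cap concurrent requests to bound activation memory:

```properties
# Illustrative tuning sketch only — parameter names and values are
# assumptions; consult the engine-specific tuning guide for real keys.

# Shard model weights across 4 GPUs to fit a model too large for one device.
option.tensor_parallel_degree=4

# Limit the number of concurrently batched requests; lowering this reduces
# peak activation/KV-cache memory at the cost of throughput.
option.max_rolling_batch_size=8
```

If you still hit OOM at these settings, the usual levers are a larger instance type, a higher tensor-parallel degree, a smaller batch size, or a quantized variant of the model.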