Skip to content

Inference Recommendation for Large Model Inference solution

This package includes instance, configuration recommendation for large models using LMI containers.

Installation

To install this package, you can simply do

pip install git+https://github.com/deepjavalibrary/djl-demo.git#subdirectory=aws/sagemaker/llm-instance-recommend

Usage

You can provide a huggingface model id and get corresponding recommendations:

from lmi_recommender.instance_recommender import instance_recommendation
from lmi_recommender.ir_strategy_maker import decision_maker, decision_pretty_print
from lmi_recommender.lmi_config_recommender import lmi_config_recommender

fp16_recommendation = instance_recommendation("OpenAssistant/llama2-13b-orca-8k-3319")
lmi_config_recommender(fp16_recommendation)
gpu_recommendation, neuron_recommendation = decision_maker(fp16_recommendation)
decision_pretty_print(gpu_recommendation)