Accelerating LLM inference with post-training weight and activation quantization using AWQ and GPTQ on Amazon SageMaker AI | Amazon Web Services (AWS)