In this session, Hugging Face will detail their approach to accelerating Transformer machine learning models through research and hardware optimization, with the aim of achieving 1 millisecond latency on commodity hardware and enabling any company to deploy these large language models in their production infrastructure at scale.