Two teams from Professor Jaejin Lee’s laboratory won top awards at the “Samsung Computer Engineering Challenge 2023”. The challenge was hosted by Samsung SAIT and held from August 21 to October 20, 2023. Team H (Heehoon Kim, Junyeol Ryu) received the grand prize (1st place) and Team ShongShong2 (Jinpyo Kim, Daeyoung Park, Junsik Shin) received the excellence award (2nd place). The teams received their awards at Samsung AI Forum 2023.
With large language models (LLMs) now widely used across various fields, accelerating their inference is extremely important. In line with this trend, the task of the challenge was to accelerate Llama-30B inference on the HellaSwag dataset using four NVIDIA V100 GPUs.
The key was to fully utilize GPU compute capacity and hide the communication overhead of model parallelism. The grand prize winning team, Team H, proposed multiple novel techniques:
- A batch scheduling algorithm that minimizes redundant computation on padding
- Pipeline optimizations, such as fine-grained batching in the prefill phase, to maximize pipeline utilization
- Computation optimizations, such as efficient use of the KV cache and custom GPU kernels
- A custom communication routine that reduces GPU communication overhead
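To illustrate why batch scheduling matters for padding, the following is a minimal Python sketch, not Team H's actual algorithm. It assumes that every batch is padded to the length of its longest sequence and compares batching in arrival order with batching after sorting by length. The function names `padded_tokens` and `schedule_by_length` and the sample lengths are hypothetical.

```python
from typing import List


def padded_tokens(lengths: List[int], batch_size: int) -> int:
    """Count padding tokens when consecutive sequences are grouped into
    batches and each batch is padded to its longest sequence."""
    total = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        longest = max(batch)
        total += sum(longest - n for n in batch)
    return total


def schedule_by_length(lengths: List[int]) -> List[int]:
    """Illustrative scheduling policy (an assumption, not the winning
    method): sort sequences so each batch holds similar lengths."""
    return sorted(lengths)


if __name__ == "__main__":
    # Hypothetical prompt lengths in tokens, mixing short and long inputs.
    lengths = [12, 97, 15, 90, 11, 95, 14, 92]
    print("arrival order:", padded_tokens(lengths, batch_size=4))
    print("sorted order: ", padded_tokens(schedule_by_length(lengths), batch_size=4))
```

Grouping sequences of similar length means fewer padding tokens are processed per batch, which is the kind of redundant computation the first technique targets.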
Team H achieved an inference time of 372.2 seconds, a 7.63x speedup over the baseline, while preserving accuracy close to the 82.8% reported in the Llama paper. “Participating in the competition allowed me to consider my personal research topics from different perspectives, providing a highly motivating experience,” said Jinpyo Kim, a member of Team ShongShong2.
The two teams were awarded 10 million Korean won and 5 million Korean won, respectively.