AI agents are eating the world. As AI inference workloads surge across datacenters, modern architectures must efficiently handle diverse tensor contraction patterns. Traditional designs—relying heavily on fixed-size matrix multiplication engines—struggle to deliver the scalability and flexibility needed for today’s models.
RNGD (pronounced "Renegade"), FuriosaAI's second-generation tensor contraction processor, introduces a novel architecture built to meet these challenges. Its coarse-grained processing elements (PEs) can be dynamically configured as a single large compute unit or as multiple independent units, adapting to a wide range of tensor shapes and sizes. This flexibility enables efficient utilization across varying inference workloads.
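The idea of grouping PEs to match the workload can be sketched in software. The following is a hypothetical illustration only: the function name, the PE count, and the fusion threshold are assumptions for the sake of the example, not RNGD's actual configuration interface.

```python
# Hypothetical sketch of shape-adaptive PE grouping (illustrative
# assumptions throughout; not the real hardware interface).
from dataclasses import dataclass

@dataclass
class PEConfig:
    fused: bool      # True: all PEs act as one large compute unit
    num_units: int   # number of independent units when not fused

def configure_pes(m: int, n: int, k: int, total_pes: int = 8) -> PEConfig:
    """Pick a PE grouping for an (m, k) x (k, n) contraction.

    A large contraction can keep a single fused compute array busy;
    many small contractions map better onto independent PEs in parallel.
    """
    flops = 2 * m * n * k
    if flops >= 1 << 24:  # assumed threshold for saturating a fused array
        return PEConfig(fused=True, num_units=1)
    return PEConfig(fused=False, num_units=total_pes)

# A large GEMM fuses the PEs; a small one splits them into parallel units.
print(configure_pes(4096, 4096, 4096))  # PEConfig(fused=True, num_units=1)
print(configure_pes(32, 32, 64))        # PEConfig(fused=False, num_units=8)
```

The point of the sketch is the dispatch decision itself: one knob (here, an arithmetic-intensity threshold) selects between a fused configuration and independent units, which is the flexibility the paragraph above describes.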
RNGD incorporates several key architectural innovations to maximize performance and efficiency, including a circuit switch-based fetch network, input broadcasting, and buffer-based reuse mechanisms that reduce memory bandwidth pressure and improve data locality. These features collectively enable high throughput and energy-efficient computation, making RNGD a compelling solution for sustainable AI inference at scale.
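A small counting exercise shows why buffer-based reuse and input broadcasting reduce memory bandwidth pressure. The tile size and matrix dimensions below are arbitrary assumptions chosen for illustration; the sketch models generic tiled matrix multiplication, not RNGD's specific fetch network.

```python
# Illustrative comparison (assumed sizes, not RNGD specifics): off-chip
# operand fetches for C = A @ B, with and without on-chip reuse.

def fetches_naive(m: int, n: int, k: int) -> int:
    # Every multiply-accumulate re-reads its A and B operands from DRAM.
    return 2 * m * n * k

def fetches_with_reuse(m: int, n: int, k: int, tile: int) -> int:
    # Each A tile and B tile is fetched once into an on-chip buffer and
    # broadcast/reused across all partial products within the tile.
    tile_steps = (m // tile) * (n // tile) * (k // tile)
    return tile_steps * 2 * tile * tile  # one A tile + one B tile per step

m = n = k = 1024
tile = 64
print(fetches_naive(m, n, k) / fetches_with_reuse(m, n, k, tile))  # prints 64.0
```

With square tiles, the fetch reduction equals the tile edge length, which is why larger on-chip buffers translate directly into lower DRAM bandwidth demand and better data locality.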