[SNU CSE Industry Seminar Series] Accelerating inference efficiency at scale

Name: 백준호
Host: Prof. 유승주
Date: 2025/06/05, 2:00 PM - 3:00 PM
Location: Building 301, Room 203
Summary

AI agents are eating the world. As AI inference workloads surge across datacenters, modern architectures must efficiently handle diverse tensor contraction patterns. Traditional designs—relying heavily on fixed-size matrix multiplication engines—struggle to deliver the scalability and flexibility needed for today’s models.

RNGD (pronounced "Renegade"), FuriosaAI's second-generation tensor contraction processor, introduces a novel architecture built to meet these challenges. Its coarse-grained processing elements (PEs) can be dynamically configured as a single large compute unit or as multiple independent units, adapting to a wide range of tensor shapes and sizes. This flexibility sustains high utilization across varying inference workloads.
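As a rough illustration of this configurability, the toy Python sketch below shows how a scheduler might choose between one fused wide unit and several independent units based on contraction size. All names (PE_COUNT, PEConfig, choose_pe_config) and thresholds here are hypothetical, not RNGD's actual interface.

```python
# Toy sketch only: illustrates the *idea* of coarse-grained PEs that fuse
# into one wide unit or run independently. PE_COUNT, PEConfig, and the
# thresholds below are hypothetical, not RNGD's actual interface.
from dataclasses import dataclass

PE_COUNT = 8  # hypothetical number of coarse-grained PEs


@dataclass
class PEConfig:
    groups: int          # independent compute units running in parallel
    pes_per_group: int   # PEs fused into each unit


def choose_pe_config(m: int, n: int, k: int) -> PEConfig:
    """Pick a configuration for an (m x k) @ (k x n) contraction.

    Large problems fuse all PEs into one wide unit; small problems
    split the PEs so several contractions can run side by side.
    """
    work = m * n * k
    if work >= 1 << 24:   # big GEMM: fuse everything into one unit
        return PEConfig(groups=1, pes_per_group=PE_COUNT)
    if work >= 1 << 18:   # medium: a couple of mid-sized units
        return PEConfig(groups=2, pes_per_group=PE_COUNT // 2)
    return PEConfig(groups=PE_COUNT, pes_per_group=1)  # many small units


# e.g. a small decode-step GEMM vs. a large prefill GEMM
print(choose_pe_config(8, 128, 128))        # -> many independent units
print(choose_pe_config(4096, 4096, 4096))   # -> one fused wide unit
```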

RNGD incorporates several key architectural innovations to maximize performance and efficiency, including a circuit-switch-based fetch network, input broadcasting, and buffer-based reuse mechanisms that reduce memory bandwidth pressure and improve data locality. These features collectively enable high-throughput, energy-efficient computation, making RNGD a compelling solution for sustainable AI inference at scale.
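To see why input broadcasting and buffer-based reuse ease bandwidth pressure, consider a back-of-the-envelope model (our assumptions, not RNGD-specific figures): when a matmul's output columns are split across several units, every unit needs the same A operand, so broadcasting a single fetch of A removes a factor proportional to the number of units from external-memory traffic.

```python
# Back-of-the-envelope traffic model (assumed shapes and unit count, not
# RNGD data): output columns of an (M x K) @ (K x N) matmul are split
# across `units` compute units, so each unit needs all of A but only its
# own slice of B.

def operand_traffic_bytes(M, N, K, units, broadcast_a, dtype_bytes=2):
    a = M * K * dtype_bytes   # shared operand, needed by every unit
    b = K * N * dtype_bytes   # disjoint slices, fetched once either way
    # Without broadcasting, each unit fetches its own copy of A from
    # external memory; with a broadcast fetch plus on-chip buffer reuse,
    # A crosses the external-memory interface only once.
    return (a if broadcast_a else units * a) + b


M = N = K = 4096
naive = operand_traffic_bytes(M, N, K, units=8, broadcast_a=False)
bcast = operand_traffic_bytes(M, N, K, units=8, broadcast_a=True)
print(f"traffic reduction: {naive / bcast:.2f}x")  # 4.50x for these shapes
```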

Speaker Introduction

CEO of FuriosaAI