GPU-Initiated Networking: Redefining the AI NIC for Large-Scale LLM Training and Inference

Name: Dr. Junho Suh
Affiliation: KT Cloud
Host: Prof. Jin-Soo Kim
Date: 2026/3/13, 10:00 AM - 11:30 AM
Location: Bldg. 301, Room 101
Abstract

The scaling of distributed AI requires a fundamental shift in how GPUs and AI Network Interface Cards (NICs/DPUs) interact over the PCIe bus. Relying on the host CPU to manage network queues introduces latency jitter that cripples both Mixture-of-Experts (MoE) training throughput and autoregressive inference generation. This talk deconstructs the mechanisms used to establish true CPU-bypass, device-driven communication, in which the GPU itself initiates network operations. Attendees will gain a low-level understanding of how AI NICs execute background DMA operations to mask network latency behind local compute in both training and inference.
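The latency-masking idea in the abstract can be illustrated with a toy simulation: when a device-initiated transfer runs in the background, a step costs roughly max(compute, network) instead of their sum. This is a conceptual sketch only; the function names, sleep durations, and use of a thread as a stand-in for NIC-driven DMA are illustrative assumptions, not the actual hardware mechanism the talk covers.

```python
import threading
import time

# Illustrative durations (assumptions, not measurements).
COMPUTE_S = 0.2   # time spent in the local "GPU kernel"
NETWORK_S = 0.2   # time for the "NIC's background DMA transfer"

def nic_dma_transfer():
    """Stand-in for a NIC-driven DMA that completes without CPU involvement."""
    time.sleep(NETWORK_S)

def local_compute():
    """Stand-in for the GPU kernel that runs while the transfer is in flight."""
    time.sleep(COMPUTE_S)

def serial_step():
    # Host-managed style: compute, then wait for the network.
    start = time.perf_counter()
    local_compute()
    nic_dma_transfer()
    return time.perf_counter() - start

def overlapped_step():
    # Device-initiated style: kick off the transfer, overlap compute with it,
    # then wait for completion. Total is ~max(compute, network), not the sum.
    start = time.perf_counter()
    t = threading.Thread(target=nic_dma_transfer)
    t.start()
    local_compute()
    t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"serial:     {serial_step():.2f}s")
    print(f"overlapped: {overlapped_step():.2f}s")
```

With equal compute and network times, the overlapped step finishes in roughly half the serial step's time, which is the best case for this kind of masking; when one side dominates, the benefit shrinks toward the longer of the two.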

About the Speaker

Junho Suh

Ph.D. in Computer Science and Engineering with 10+ years of SDN experience, focused on optimizing networking stacks across hardware and software in distributed systems for high-performance workloads. Expert in programmable data plane technologies, including Tofino ASIC, SmartNIC, and DPDK. Proven ability to make critical technical design decisions and to deliver practical solutions on time through research, development, and technical leadership. Connected with peers from academia, industry, and open-source communities.

2025-Present: KT Cloud

2023-2025: Intel / Technical Lead in IPU team

2020-2023: Barefoot Networks / Software Engineer

2015-2020: SK Telecom / Research Engineer