Projects

Ongoing Projects

Efficient Hyperscale LLM Inference System based on Scale-out Context Memory

We investigate scale-out context memory architectures for large language models, treating context as a first-class system resource rather than a static data object. Our research spans memory systems, storage, networking, runtime scheduling, and power management to enable efficient management of massive context workloads across heterogeneous resources. The project aims to establish the foundation for future hyperscale AI inference platforms supporting long-context reasoning, personalization, and agent-based AI services.

Ethernet-based GPU Cluster Network Fabric

We are developing an Ethernet-based network fabric for GPU clusters, together with optimization technologies that maximize network efficiency in large-scale GPU cluster environments. This three-year project is carried out in collaboration with Acryl Co., Ltd., Yonsei University, and Sungkyunkwan University.

Key research directions include:

  • Network fabric design for large-scale GPU clusters
  • Communication and transport optimization for distributed AI workloads
  • Performance analysis, monitoring, and evaluation of cluster network efficiency