Scalable, Resilient Federated Learning

Jan 1, 2025

The Scalable, Resilient Federated Learning (SRFL) project targets federated learning systems that remain scalable and resilient across heterogeneous compute and network environments. The project includes:

  • FedDES, a discrete-event-based framework for simulating the performance of federated learning systems.
  • FedMECA, a memory-efficient and concurrent aggregation approach for scalable federated learning.
  • Long-haul RDMA studies for geo-distributed federated learning, including simulation, modeling, and real-world testbed validation.

Related publications: