GetBatch replaces thousands of individual GET requests with a single batch operation
The new API achieves up to 15x throughput improvement for small objects
In a production training workload, P95 batch retrieval latency drops by 2x
P99 per-object tail latency is reduced by 3.7x compared to individual GET requests
The API targets per-request overhead, which often dominates data transfer time in distributed ML training pipelines
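Why small objects stand to gain the most can be seen from a back-of-the-envelope cost model. The sketch below is an illustrative assumption, not the paper's methodology: the function names, the 1 ms overhead, and the 1 GB/s bandwidth are hypothetical, requests are treated as strictly serial, and the model is not meant to reproduce the reported figures.

```python
def individual_gets_seconds(n_objects, object_bytes,
                            per_request_overhead_s, bandwidth_bytes_per_s):
    """Total time when every object pays its own fixed request overhead."""
    transfer_s = object_bytes / bandwidth_bytes_per_s
    return n_objects * (per_request_overhead_s + transfer_s)


def single_batch_seconds(n_objects, object_bytes,
                         per_request_overhead_s, bandwidth_bytes_per_s):
    """Total time when one fixed overhead is shared by the whole batch."""
    transfer_s = n_objects * object_bytes / bandwidth_bytes_per_s
    return per_request_overhead_s + transfer_s


def speedup(n_objects, object_bytes,
            per_request_overhead_s=1e-3,    # assumed value, not measured
            bandwidth_bytes_per_s=1e9):     # assumed value, not measured
    """Ratio of modeled individual-GET time to modeled single-batch time."""
    args = (n_objects, object_bytes,
            per_request_overhead_s, bandwidth_bytes_per_s)
    return individual_gets_seconds(*args) / single_batch_seconds(*args)


# Small objects: the fixed overhead dominates, so batching helps a lot.
small = speedup(n_objects=1000, object_bytes=10_000)
# Large objects: transfer time dominates, so batching helps only a little.
large = speedup(n_objects=1000, object_bytes=10_000_000)
```

In this model the speedup shrinks toward 1 as objects grow, because transfer time is paid either way and only the fixed per-request cost is amortized.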
📖 Full Retelling
Researchers Alex Aizman, Abhishek Gaikwad, and Piotr Żelasko introduced GetBatch, an object store API designed to make machine learning data loading more efficient, in a paper submitted to arXiv on February 25, 2026. The work targets the overhead of issuing thousands of individual GET requests during training: a single training step may require thousands of samples drawn from shards distributed across a storage cluster, and according to the authors the resulting per-request overhead often dominates data transfer time.
GetBatch elevates batch retrieval to a first-class storage operation, replacing the independent GET operations with a single deterministic, fault-tolerant streaming execution. The authors report up to 15x throughput improvement for small objects and, in a production training workload, a 2x reduction in P95 batch retrieval latency and a 3.7x reduction in P99 per-object tail latency compared to individual GET requests. If these gains carry over to other deployments, they could shorten training runs in which data loading is the bottleneck.
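The contrast between per-object GETs and a single batch call can be made concrete with a toy in-memory store. This is a minimal sketch under stated assumptions, not the GetBatch API itself: `ToyObjectStore`, `get`, `get_batch`, and the `requests_served` counter are hypothetical names, the counter only stands in for per-request overhead, and yielding `None` for a missing object is just one simple reading of "fault-tolerant".

```python
from typing import Dict, Iterator, List, Optional, Tuple


class ToyObjectStore:
    """In-memory stand-in for an object store that counts requests served."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self._objects = dict(objects)
        self.requests_served = 0  # proxy for per-request overhead

    def get(self, key: str) -> bytes:
        """Conventional GET: one request per object."""
        self.requests_served += 1
        return self._objects[key]

    def get_batch(self, keys: List[str]) -> Iterator[Tuple[str, Optional[bytes]]]:
        """Hypothetical batch call: one request, results streamed lazily.

        Results come back in the order the keys were requested
        (deterministic). A missing object yields (key, None) instead of
        aborting the rest of the stream.
        """
        self.requests_served += 1
        return ((key, self._objects.get(key)) for key in keys)


keys = [f"sample-{i}" for i in range(1000)]
store = ToyObjectStore({key: key.encode() for key in keys})

# One request per sample, as in a conventional data loader.
one_by_one = [store.get(key) for key in keys]
gets_used = store.requests_served

# The same samples through a single batch request.
store.requests_served = 0
batched = [data for _, data in store.get_batch(keys)]
batch_used = store.requests_served
```

Both paths return the same bytes in the same order; what differs is the number of requests the store has to serve, which is the quantity the batch operation is meant to collapse.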
🏷️ Themes
Machine Learning, Distributed Computing, Data Optimization, Storage Efficiency
arXiv:2602.22434 [cs.DC], submitted on 25 Feb 2026
Title: GetBatch: Distributed Multi-Object Retrieval for ML Data Loading
Authors: Alex Aizman, Abhishek Gaikwad, Piotr Żelasko
Abstract: Machine learning training pipelines consume data in batches. A single training step may require thousands of samples drawn from shards distributed across a storage cluster. Issuing thousands of individual GET requests incurs per-request overhead that often dominates data transfer time. To solve this problem, we introduce GetBatch - a new object store API that elevates batch retrieval to a first-class storage operation, replacing independent GET operations with a single deterministic, fault-tolerant streaming execution. GetBatch achieves up to 15x throughput improvement for small objects and, in a production training workload, reduces P95 batch retrieval latency by 2x and P99 per-object tail latency by 3.7x compared to individual GET requests.
Comments: 11 pages, 3 figures, 2 tables. Preprint
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Artificial Intelligence (cs.AI); Databases (cs.DB); Machine Learning (cs.LG)
Cite as: arXiv:2602.22434 [cs.DC] (or arXiv:2602.22434v1 [cs.DC] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.22434 (arXiv-issued DOI via DataCite, pending registration)
Submission history: From Abhishek Prakash Gaikwad. [v1] Wed, 25 Feb 2026 21:45:18 UTC (667 KB)