Overview
A concise summary of the motivation, contributions, and key takeaways.
Deploying Vision-Language-Action (VLA) models in real-world robotics exposes a core multi-task learning challenge: task interference. When multiple tasks are jointly fine-tuned in a single stage, gradients from different tasks can conflict, causing negative transfer and reducing per-task performance. Yet maintaining a separate full checkpoint per task is often prohibitive in storage and deployment cost. To address this dilemma, we present CORAL, a backbone- and embodiment-agnostic framework designed primarily to mitigate multi-task interference while remaining naturally extensible to a continuous stream of new tasks. CORAL freezes a single pre-trained VLA backbone and attaches one lightweight Low-Rank Adaptation (LoRA) expert per task; at runtime, a dynamic inference engine, the CORAL Manager, routes language instructions to the appropriate expert and swaps experts on the fly with zero additional inference overhead. This strict parameter isolation avoids complex gating networks and prevents parameter-level cross-task interference by construction; as an added capability, it also enables introducing new tasks sequentially without the catastrophic forgetting caused by parameter overwriting. We validate CORAL on a real-world Galaxea R1 dual-arm mobile manipulator and three simulation benchmarks (LIBERO, WidowX, Google Robot), where CORAL resolves fine-grained instructional ambiguity and substantially outperforms joint training, yielding a practical and scalable system for lifelong multi-task robot learning.
Main contributions
A Scalable System for Lifelong Robot Learning
We propose CORAL, a backbone- and embodiment-agnostic solution that resolves the persistent tension among generalization, specialization, and scaling efficiency in real-world VLA deployment.
Multi-Task Scaling without Interference
By routing distinct tasks to dedicated, strictly isolated task experts, CORAL resolves fine-grained instructional ambiguity and significantly outperforms joint fine-tuning. Because the experts are parameter-disjoint, cross-expert interference is ruled out by construction.
Breaking the Storage Barrier
A single expert can be trained using only task-specific demonstrations and achieves performance comparable to full fine-tuning, while being hundreds of times smaller than a full model—leaving substantial capacity for adaptation to new tasks.
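As a back-of-envelope illustration of this storage gap, the snippet below compares the dense parameters touched by full fine-tuning with the size of one LoRA expert. All dimensions (hidden width, layer count, rank) are assumed placeholders for illustration, not the actual CORAL backbone configuration:

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    # A LoRA adapter replaces a dense d_in x d_out weight update with
    # two low-rank factors: A (d_in x rank) and B (rank x d_out).
    return d_in * rank + rank * d_out

hidden = 4096            # assumed transformer width
n_adapted_mats = 4 * 32  # e.g. q/k/v/o projections over 32 layers (assumption)
rank = 8                 # assumed LoRA rank

full = n_adapted_mats * hidden * hidden                    # dense weights touched
expert = n_adapted_mats * lora_params(hidden, hidden, rank)

print(f"full fine-tune params touched: {full:,}")
print(f"one LoRA expert params:        {expert:,}")
print(f"compression factor:            {full // expert}x")
```

Under these assumed dimensions, one expert is 256x smaller than the dense weights it adapts, consistent in order of magnitude with the "hundreds of times smaller" claim.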
Method
A two-stage pipeline that disentangles general embodiment learning from task-specific specialization.
The CORAL pipeline consists of two training stages followed by a dynamic inference engine. First, we perform embodiment-aware general pre-training to build a frozen base model that captures broad control patterns and visual-linguistic grounding. Second, for each task we fine-tune a lightweight, task-specific LoRA expert while keeping the base model completely frozen. During inference, the CORAL Manager dynamically swaps these compact LoRA experts on the fly, enabling scalable multi-task deployment with zero additional inference overhead.
Embodiment-Aware General Pre-training
We first train or fine-tune the base policy model on diverse data spanning all available initial tasks. This phase allows the model to deeply understand the robot's general control patterns, kinematics, and the common visual-linguistic structure of the environment. The resulting base model is then frozen permanently, serving as a shared foundation upon which all lightweight, task-specific LoRA experts are built.
Lightweight Task-Specific LoRA Experts
For each task, both initial and newly emerging, we train an independent, lightweight LoRA expert while keeping the base model completely frozen. Training is kept intentionally brief (limited to a small number of optimization steps), acting as implicit regularization that prevents overfitting and ensures the expert improves task-specific success rates without degrading the broad generalization inherited from the base model.
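The frozen-base-plus-adapter arithmetic can be sketched in a few lines of NumPy. Shapes, the rank, and the zero initialization of B are illustrative assumptions; the real experts adapt transformer projection matrices rather than a single linear map:

```python
import numpy as np

# Minimal sketch of a LoRA-adapted linear layer: the base weight W is
# frozen and shared; each task expert contributes only its low-rank
# factor pair (A, B).
rng = np.random.default_rng(0)
d, r = 64, 4                          # illustrative width and rank

W = rng.normal(size=(d, d))           # frozen base weight (shared by all tasks)
A = rng.normal(size=(d, r)) * 0.01    # expert-specific, trainable
B = np.zeros((r, d))                  # zero init: a fresh expert is a no-op update

def forward(x, W, A, B, scale=1.0):
    # y = x W + scale * (x A) B  -- only A and B differ across experts
    return x @ W + scale * (x @ A) @ B

x = rng.normal(size=(1, d))
# With B = 0 the adapted output equals the frozen base output, so a
# freshly initialized expert cannot perturb the base policy.
assert np.allclose(forward(x, W, A, B), x @ W)
```

Swapping experts amounts to replacing only (A, B) while W stays untouched, which is what makes strict parameter isolation possible.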
Dynamic Expert Routing (CORAL Manager)
At inference time, the CORAL Manager handles dynamic loading, switching, and unloading of LoRA experts on a single frozen base model. Unlike standard Mixture-of-Experts architectures with learned routing networks, CORAL exploits the fact that each language instruction inherently identifies its task, enabling explicit routing without gating complexity. Expert switching completes within 100 ms, and inference itself incurs zero additional overhead.
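A toy sketch of this instruction-keyed routing is shown below. The class name, substring matching rule, and expert file names are all invented for illustration; the actual Manager additionally handles loading and unloading the LoRA weights themselves:

```python
# Toy sketch in the spirit of the CORAL Manager: because the language
# instruction identifies its task, a plain lookup replaces a learned
# gating network.
class ExpertManager:
    def __init__(self):
        self._registry = {}   # task name -> location of its LoRA expert
        self._active = None   # task name of the currently attached expert

    def register(self, task: str, expert_weights: str):
        self._registry[task] = expert_weights

    def route(self, instruction: str):
        # Explicit routing: match the instruction against registered tasks.
        for task, weights in self._registry.items():
            if task in instruction.lower():
                return task, weights
        raise KeyError(f"no expert registered for: {instruction!r}")

    def activate(self, instruction: str) -> str:
        task, weights = self.route(instruction)
        if task != self._active:
            # In the real system: unload the previous LoRA expert and
            # attach the new one to the frozen base model.
            self._active = task
        return self._active

mgr = ExpertManager()
mgr.register("open door", "door_expert.safetensors")
mgr.register("press elevator button", "elevator_expert.safetensors")
print(mgr.activate("Open door to the lab"))  # prints: open door
```

Because routing is a dictionary lookup rather than a forward pass through a gating network, it adds no model computation to the control loop.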

Evaluation
Comprehensive results across simulation benchmarks and real-world robotic settings.
Simulation Benchmarks
LIBERO Benchmark
CORAL sets a new state of the art on the LIBERO benchmark. Applied to SimVLA, it reaches a 99.3% overall average success rate, outperforming heavily pre-trained baselines such as X-VLA. Integrating CORAL with π0.5 yields a 98.4% success rate, a +1.5% absolute improvement over the standard π0.5 model, with the largest gains on the most challenging long-horizon tasks.
WidowX Robot Tasks
CORAL demonstrates strong real-to-sim transfer in high-fidelity simulated environments, achieving a 97.9% average success rate on the Simpler-Bridge (WidowX) tasks. It outperforms large-scale baseline models, reaching 100% success on tasks such as Spoon and Carrot.
Google Robot Tasks
On the demanding Simpler-Fractal (Google Robot) benchmark, CORAL reaches an 84.9% average success rate across variant aggregation scores, surpassing previous leading models such as X-VLA and RT-2-X by substantial margins.
Real-World Deployment
Cross-Scene Zero-Shot Generalization
We evaluate on 8 complex real-world tasks in held-out, previously unseen scenes, emphasizing dexterous, fine-grained bimanual manipulation. CORAL significantly improves the base model's cross-scene robustness by activating task-specific LoRA experts.
New Capability Acquisition
CORAL is evaluated on acquiring entirely new, out-of-domain capabilities: Open Door (3 variants) and Press Elevator Button (5 variants). It achieves performance comparable to resource-intensive independent full fine-tuning while requiring only a small fraction of the storage footprint.
Qualitative examples
We deploy CORAL on a real Galaxea R1 dual-arm mobile manipulator to acquire entirely new capabilities. Through lightweight LoRA experts, the robot smoothly executes intricate, multi-stage manipulation tasks, such as operating different types of doors and interacting with various elevator buttons, demonstrating real-time, zero-overhead inference in the physical world.
Citation
BibTeX entry for this work.
BibTeX
@article{luo2026coral,
  title   = {CORAL: Scalable Multi-Task Robot Learning via LoRA Experts},
  author  = {Luo, Yuankai and Chen, Woping and Liang, Tong and Li, Zhenguo},
  journal = {arXiv preprint arXiv:2603.09298},
  year    = {2026}
}