2/20/2026 | USA | technology | ✓ Verified - arxiv.org

Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization

#deep reinforcement learning #portfolio allocation #mean‑variance optimization #Sharpe ratio #maximum drawdown #backtesting #financial engineering #machine learning #quantitative finance #risk management

📌 Key Takeaways

Authors compare deep reinforcement learning (DRL) portfolio agents to mean‑variance optimization (MVO) on historical data.
Backtests reveal DRL delivers higher Sharpe ratios, lower maximum drawdowns, and stronger absolute returns compared to MVO.
The paper details how to make DRL work for portfolio optimization in practice and notes the adjustments MVO needs for a fair comparison.
The study was presented at FinPlan‑23, part of the ICAPS 2023 conference, and submitted to arXiv on 19 Feb 2026.
The authors are Srijan Sood, Kassiani Papasotiriou, Marius Vaiciulis and Tucker Balch.

📖 Full Retelling

In *Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization*, Srijan Sood, Kassiani Papasotiriou, Marius Vaiciulis and Tucker Balch compare two approaches to building portfolios. They test a model-free deep reinforcement learning (DRL) agent against the classic mean-variance optimization (MVO) framework on historical market data. The work was presented at the FinPlan-23 workshop of the 33rd International Conference on Automated Planning and Scheduling (ICAPS 2023) and was submitted to arXiv on 19 Feb 2026 under the identifier 2602.17098. The paper walks through how to make DRL viable for practical portfolio management and describes the adjustments MVO needs so that the comparison is fair. In the reported backtests, the DRL agent performs strongly across several metrics, with higher Sharpe ratios, lower maximum drawdowns, and stronger absolute returns than MVO.

🏷️ Themes

Quantitative finance, Portfolio management, Deep reinforcement learning, Mean‑variance optimization, Risk‑adjusted performance metrics, Research methodology, Applied machine learning

Deep Analysis

Why It Matters

The study shows that, in backtests on historical data, a DRL agent can outperform traditional MVO in portfolio allocation, which makes it a candidate data-driven alternative for financial professionals. It also documents the practical adjustments DRL needs before it can be used in real markets.

Context & Background

DRL has gained traction in finance for dynamic asset allocation.
Mean-variance optimization is one of the methods most commonly used by financial professionals for risk-return trade-offs.
The paper compares DRL agents against MVO using backtests on historical data.
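
For readers unfamiliar with MVO, the idea summarized in the abstract (estimate expected returns and covariances from historical series, then optimize an objective) can be sketched as follows. This is a minimal, unconstrained maximum-Sharpe example in pure Python, not the paper's formulation: the function names are hypothetical, the closed-form tangency solution is an assumption chosen for brevity, and a practical setup would normally add constraints such as long-only weights and use a quadratic programming solver.

```python
def mean_and_cov(returns_by_asset):
    """Sample means and covariance matrix from equal-length per-asset return series."""
    n_obs = len(returns_by_asset[0])
    k = len(returns_by_asset)
    means = [sum(series) / n_obs for series in returns_by_asset]
    cov = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            cov[i][j] = sum(
                (returns_by_asset[i][t] - means[i]) * (returns_by_asset[j][t] - means[j])
                for t in range(n_obs)
            ) / (n_obs - 1)  # sample covariance
    return means, cov


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is numerically singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def tangency_weights(means, cov, risk_free=0.0):
    """Unconstrained max-Sharpe weights: solve cov w = (means - rf), then scale to sum to 1."""
    excess = [mu - risk_free for mu in means]
    raw = solve(cov, excess)
    total = sum(raw)
    if total <= 1e-12:
        # Rescaling by a non-positive sum would not give the max-Sharpe portfolio.
        raise ValueError("raw weights do not sum to a positive number")
    return [w / total for w in raw]


if __name__ == "__main__":
    # Toy inputs: two uncorrelated assets (made-up numbers, for illustration only).
    print(tangency_weights([0.10, 0.05], [[0.04, 0.0], [0.0, 0.01]]))
```

Because the weights are unconstrained, this sketch can produce short positions; that is one reason a side-by-side comparison with a constrained DRL agent typically requires adjusting the MVO setup.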

What Happens Next

Future work may involve integrating DRL models into live trading platforms and testing robustness across market regimes. The abstract does not state whether the authors will release code or datasets.

Frequently Asked Questions

What is the main advantage of DRL over MVO according to the paper?

DRL agents achieve higher Sharpe ratios and lower drawdowns in backtests, indicating better risk-adjusted returns.
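
To make the two metrics concrete, here is a minimal sketch of how a Sharpe ratio and a maximum drawdown are commonly computed from a series of per-period returns. The function names, the 252-period annualization, and the use of simple (not log) returns are assumptions for illustration; the abstract does not state the exact conventions the authors used.

```python
import math


def sharpe_ratio(returns, risk_free_per_period=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period simple returns."""
    excess = [r - risk_free_per_period for r in returns]
    n = len(excess)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(excess) / n
    # Sample variance (n - 1 in the denominator).
    var = sum((x - mean) ** 2 for x in excess) / (n - 1)
    if var == 0:
        raise ValueError("returns have zero variance")
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough loss of the compounded equity curve, as a positive fraction."""
    equity = 1.0
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst


if __name__ == "__main__":
    toy = [0.10, -0.50, 0.20]  # made-up returns, for illustration only
    print(max_drawdown(toy))
```

A higher Sharpe ratio means more excess return per unit of volatility, and a lower maximum drawdown means a shallower worst-case loss from a previous peak.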

Did the authors provide the code for their DRL model?

The abstract does not say. The arXiv listing describes a 9-page PDF with figures but does not confirm code availability.

How does the study address practical deployment of DRL?

It discusses necessary adjustments for MVO and outlines how to make DRL work in practice, including data preprocessing and reward design.
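
The abstract does not specify the agent's action space or reward, so the following is only a generic illustration of what reward design can look like in this setting, not the authors' method. It assumes the agent emits one unbounded score per asset, maps the scores to long-only weights with a softmax, and uses the next-period log return of the portfolio as the reward. All names are hypothetical.

```python
import math


def softmax_weights(scores):
    """Map unbounded per-asset scores to long-only weights that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def log_return_reward(weights, asset_returns):
    """Reward = log of portfolio growth over one period, given simple asset returns.

    With long-only weights that sum to 1 and asset returns above -100%,
    the argument of the logarithm is always positive.
    """
    portfolio_return = sum(w * r for w, r in zip(weights, asset_returns))
    return math.log(1.0 + portfolio_return)


if __name__ == "__main__":
    w = softmax_weights([0.0, 0.0])            # equal scores give equal weights
    print(w, log_return_reward(w, [0.10, -0.10]))
```

Other reward choices, such as risk-adjusted rewards, are equally plausible; which one the paper uses should be checked against the full text.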

Original Source

arXiv:2602.17098 (q-fin), Quantitative Finance > Portfolio Management
[Submitted on 19 Feb 2026]

Title: Deep Reinforcement Learning for Optimal Portfolio Allocation: A Comparative Study with Mean-Variance Optimization

Authors: Srijan Sood, Kassiani Papasotiriou, Marius Vaiciulis, Tucker Balch

Abstract: Portfolio Management is the process of overseeing a group of investments, referred to as a portfolio, with the objective of achieving predetermined investment goals. Portfolio optimization is a key component that involves allocating the portfolio assets so as to maximize returns while minimizing risk taken. It is typically carried out by financial professionals who use a combination of quantitative techniques and investment expertise to make decisions about the portfolio allocation. Recent applications of Deep Reinforcement Learning have shown promising results when used to optimize portfolio allocation by training model-free agents on historical market data. Many of these methods compare their results against basic benchmarks or other state-of-the-art DRL agents but often fail to compare their performance against traditional methods used by financial professionals in practical settings. One of the most commonly used methods for this task is Mean-Variance Portfolio Optimization, which uses historical time series information to estimate expected asset returns and covariances, which are then used to optimize for an investment objective. Our work is a thorough comparison between model-free DRL and MVO for optimal portfolio allocation. We detail the specifics of how to make DRL for portfolio optimization work in practice, also noting the adjustments needed for MVO. Backtest results demonstrate strong performance of the DRL agent across many metrics, including Sharpe ratio, ...

Source

arxiv.org