In recent years, machine learning has been touted as a game changer for investment management. The authors of “Machine Learning and Fund Characteristics Help to Select Mutual Funds With Positive Alpha,” published in the December 2023 issue of the Journal of Financial Economics, claimed that machine-learning methods could identify long-only mutual fund portfolios earning significant out-of-sample annual alphas of 2.4% net of all costs. For believers in active management, this was the financial equivalent of finding the Holy Grail.
What New Research Uncovered About Machine Learning
Two years later, a new set of researchers performed a replication analysis of the 2023 study for their own paper, “Does Machine Learning Really Help to Select Mutual Funds With Positive Alpha?” These researchers found that the original results were driven by a coding error that inadvertently gave the algorithms access to future information—a classic case of look-ahead bias.
The error was technical but consequential. When constructing portfolio returns, the original code updated portfolio weights using next month’s returns rather than the current month’s returns—essentially peeking into the future when making investment decisions, which is impossible in real-world investing. After correcting this error, the impressive outperformance vanished entirely. The annual returns dropped by 1.37 to 1.42 percentage points for the best-performing algorithms, and none remained statistically significant. The authors also identified a survivorship bias in the original study.
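To make the mechanics concrete, here is a minimal Python sketch of how this kind of look-ahead bias can arise. It is an illustration only, not the original study's code: the function names, the two-fund toy data, and the buy-and-hold weight-drift rule are all assumptions chosen for the example.

```python
from typing import List, Sequence


def drift(weights: Sequence[float], rets: Sequence[float]) -> List[float]:
    """Let buy-and-hold weights drift with one month of returns, then renormalize."""
    grown = [w * (1.0 + r) for w, r in zip(weights, rets)]
    total = sum(grown)
    return [g / total for g in grown]


def portfolio_returns(returns: Sequence[Sequence[float]],
                      look_ahead: bool = False) -> List[float]:
    """returns[t][i] is the return of fund i in month t (hypothetical data)."""
    n = len(returns[0])
    weights = [1.0 / n] * n  # start equal-weighted
    out = []
    for t, month in enumerate(returns):
        # Portfolio return for month t uses the weights held entering month t.
        out.append(sum(w * r for w, r in zip(weights, month)))
        if t + 1 < len(returns):
            # Correct: drift with the month that just ended.
            # Look-ahead: drift with next month's returns, which are not yet known.
            source = returns[t + 1] if look_ahead else month
            weights = drift(weights, source)
    return out


# Toy data: two funds, three months (made up for illustration).
toy = [[0.00, 0.00], [0.10, -0.10], [-0.10, 0.10]]
honest = portfolio_returns(toy)
biased = portfolio_returns(toy, look_ahead=True)
print(sum(honest), sum(biased))
```

In this toy data the look-ahead version reports a higher total return than the correct one, because each month's weights have already been tilted toward the funds that are about to do well.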
What Actually Works in Machine Learning (and What Doesn’t)
The researchers didn’t stop at identifying the error. They conducted a complete independent replication using fresh data from 1980 to 2024 and revealed several important insights.
First, machine learning cannot identify mutual fund portfolios that beat the market on a long-only basis. Whether using sophisticated techniques like random forests and gradient boosting or simpler linear models, none of the approaches generated statistically significant positive returns when investing only in top-predicted funds.
This finding held across:
- Different time periods (through 2024).
- Various forecasting horizons (12 to 36 months).
- Multiple risk-adjustment models.
However, machine learning proved effective at identifying funds likely to underperform.
The algorithms consistently identified bottom-decile portfolios that delivered significantly negative returns—around negative 2% to negative 3% annually. Both sophisticated machine learning methods and simple linear regression successfully flagged the worst performers.
When researchers constructed long-short portfolios (buying predicted winners, shorting predicted losers), they found annual returns of 3.00% to 3.05% that were statistically significant. However, virtually all of this performance came from the short side—avoiding the losers.
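The decomposition described above can be sketched in a few lines of Python. This is a hedged illustration, not the researchers' procedure: the function name `long_short`, the equal weighting within each decile, and the made-up numbers are assumptions.

```python
from statistics import mean
from typing import Sequence, Tuple


def long_short(predicted: Sequence[float],
               realized: Sequence[float]) -> Tuple[float, float, float]:
    """Return (long leg, short leg, long-short spread) for top and bottom deciles."""
    # Rank funds from lowest to highest predicted performance.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i])
    k = max(1, len(order) // 10)  # number of funds in one decile
    bottom, top = order[:k], order[-k:]
    long_leg = mean(realized[i] for i in top)      # predicted winners
    short_leg = mean(realized[i] for i in bottom)  # predicted losers
    return long_leg, short_leg, long_leg - short_leg


# Hypothetical data: 20 funds, where only the two worst-ranked funds lose money.
predicted = [float(i) for i in range(20)]
realized = [-0.03, -0.03] + [0.0] * 18
long_leg, short_leg, spread = long_short(predicted, realized)
print(long_leg, short_leg, spread)
```

With these invented numbers the long leg earns nothing, so the whole spread comes from the short leg. That is the same pattern the replication reports, in miniature.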
Linear Models Hold Their Own
Surprisingly, simple linear models (ordinary least squares and elastic net regression) performed just as well as—and sometimes better than—sophisticated nonlinear machine learning methods over 12-month horizons.
The advantage of complex machine learning only emerged at longer forecasting horizons (36 months), where nonlinear methods maintained their predictive power, while linear models lost statistical significance.
Their findings led the authors to conclude: “Our results suggest that ML adds value primarily through the consistent avoidance of poorly performing funds rather than by identifying long-only outperforming ones.”
Key Takeaways for Investors
1. Beware the Winner-Picking Fantasy
The dream of using artificial intelligence to systematically identify outperforming mutual funds remains elusive. Even with sophisticated algorithms and comprehensive data, consistent long-only outperformance proved out of reach in the replication.
2. The Real Value: Screening Out Underperformers
The practical application of machine learning in fund selection lies not in picking winners, but in avoiding losers. Algorithms can effectively flag funds exhibiting characteristics associated with future underperformance:
- Negative past alpha t-statistics
- Poor value-added metrics
- Unfavorable risk factor exposures
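As a rough sketch of what such a screen might look like, the Python below flags funds on two of the characteristics listed above. Everything in it is hypothetical: the `Fund` fields, the zero cutoffs, and the sample funds are assumptions for illustration. Factor exposures are left out because the article does not define what counts as unfavorable.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fund:
    name: str
    alpha_tstat: float  # t-statistic of past alpha
    value_added: float  # past value added, in hypothetical units


def red_flags(fund: Fund,
              tstat_cutoff: float = 0.0,
              value_added_cutoff: float = 0.0) -> List[str]:
    """List the warning signs a fund trips; the cutoffs are illustrative assumptions."""
    flags = []
    if fund.alpha_tstat < tstat_cutoff:
        flags.append("negative past alpha t-statistic")
    if fund.value_added < value_added_cutoff:
        flags.append("poor value added")
    return flags


# Made-up funds for demonstration only.
funds = [
    Fund("Fund A", alpha_tstat=1.2, value_added=4.0),
    Fund("Fund B", alpha_tstat=-2.1, value_added=-3.5),
    Fund("Fund C", alpha_tstat=-0.4, value_added=1.0),
]
to_avoid = [f.name for f in funds if red_flags(f)]
print(to_avoid)
```

The screen only removes candidates. Consistent with the research, it makes no claim that the funds left over will outperform.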
3. Simple May Be Better
For investors with 12-month horizons, sophisticated machine learning offers little advantage over simpler statistical approaches. Basic regression models using fund characteristics like past alpha, expense ratios, and factor loadings can be nearly as effective—and far more interpretable.
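For a sense of how simple such a model can be, here is a minimal one-variable ordinary least squares fit in plain Python. It is a sketch under stated assumptions, not the specification used in either paper: the single predictor (past alpha t-statistic), the variable names, and the data are invented. A real test would fit on earlier data and predict on later data; the same toy sample is reused here only for brevity.

```python
from typing import Sequence, Tuple


def fit_ols(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Closed-form OLS for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: past alpha t-statistics and the following year's alpha.
past_tstat = [-2.0, -1.0, 0.0, 1.0, 2.0]
next_alpha = [-0.025, -0.015, -0.005, 0.005, 0.015]

intercept, slope = fit_ols(past_tstat, next_alpha)
predicted = [intercept + slope * t for t in past_tstat]

# Index of the fund with the lowest predicted alpha, i.e. the one to avoid.
worst = min(range(len(predicted)), key=lambda i: predicted[i])
print(intercept, slope, worst)
```

The interpretability advantage is visible in the output: the whole model is one intercept and one slope, so it is easy to see why a given fund was flagged.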
4. Time Horizon Considerations
For institutional investors with longer time horizons (around three years), more sophisticated machine learning methods performed better, though they identified only future underperformers, not outperformers. The researchers found that nonlinear techniques held an advantage over linear models only at the 36-month forecasting horizon.
The Bigger Picture
This research serves as an important reminder that in finance, extraordinary claims require extraordinary evidence. The original findings were remarkable because they contradicted decades of empirical research showing that active fund management rarely beats the market after fees. Nor has any strategy been shown to reliably identify the few winners ahead of time, despite many attempts (active share among them).
The corrected analysis aligns with this body of evidence: On average, actively managed mutual funds underperform, and while some tools can help identify likely underperformers, the search for systematically superior performers remains unsuccessful.
Practical Advice for Investors
Rather than chasing the promise of machine-learning-selected outperformers, consider these evidence-based strategies:
- Use screening tools to avoid red flags: Poor past performance, high expenses, unfavorable factor exposures, and young fund age are warning signs worth heeding.
- Focus on costs: Because predicting winners is so difficult, minimizing expense ratios becomes even more critical to net returns.
- Consider systematic (passive) alternatives: The difficulty of both identifying winning active managers and implementing effective active selection strategies strengthens the case for low-cost index funds and other systematic strategies such as those of Avantis, AQR, Bridgeway, and Dimensional.
- Be skeptical of backtests: This episode illustrates how easily coding errors or methodological choices can create illusory performance. Always look for independent replications and out-of-sample validation.
Machine Learning Isn’t Magic, Even in the Age of AI
While machine learning tools offer some value in flagging funds to avoid, they haven’t solved the fundamental challenge of active management: consistently identifying future outperformers.
For most investors, the lesson remains clear: Focus on controlling costs, maintaining diversification, and being deeply skeptical of anyone claiming to have discovered a systematic way to beat the market using active management strategies.
