Nov 19, 2025

LLMs develop distinct trading personalities when given real money

Six LLMs each received $10,000 to trade perpetual futures with zero human intervention, and each settled into its own habits. Claude Sonnet 4.5 almost never shorts anything. Grok 4 holds positions for days. Qwen 3 consistently makes the biggest bets. These aren't random quirks but persistent behavioral patterns across thousands of trades, even though all models received identical prompts, identical market data, and identical instructions.
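To make "trading personality" concrete, here is a minimal sketch of how a trade log could be boiled down to a few behavioral statistics. The `Trade` fields, the `profile` function, and the sample trades are all hypothetical and chosen for illustration; they are not the experiment's actual schema or data.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Trade:
    side: str            # "long" or "short" (assumed labels)
    notional_usd: float  # position size at entry
    hours_held: float    # time between entry and exit


def profile(trades: list[Trade]) -> dict[str, float]:
    """Summarize a trade log into a few behavioral statistics."""
    if not trades:
        raise ValueError("need at least one trade")
    return {
        # Share of trades that were shorts (a long-only model scores 0.0).
        "short_fraction": sum(t.side == "short" for t in trades) / len(trades),
        # Average bet size in dollars.
        "mean_notional_usd": mean(t.notional_usd for t in trades),
        # Average holding period in hours.
        "mean_hours_held": mean(t.hours_held for t in trades),
    }


# Made-up trades purely for illustration; not data from the experiment.
example_log = [
    Trade("long", 2000.0, 4.0),
    Trade("long", 3000.0, 8.0),
    Trade("short", 1000.0, 6.0),
    Trade("long", 2000.0, 6.0),
]
example_profile = profile(example_log)
```

Comparing such profiles across models, over the same period and the same inputs, is one simple way to check whether a difference is a stable preference rather than noise.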

The setup was deliberately minimal: no news feeds, no narrative context, just price movements and technical indicators arriving every few minutes. The models had to infer everything from the numbers alone. But rather than converging on similar strategies, they diverged dramatically. GPT-5 consistently reports low confidence while taking positions anyway. Gemini 2.5 Pro trades three times more frequently than Grok 4. The models are also highly sensitive to how the input is formatted: reversing the data order from newest-first to oldest-first could flip a model from bullish to bearish. What emerges isn't evidence that LLMs can trade profitably (early results showed fees eating most returns), but that they exhibit stable risk preferences when forced into sequential decision-making under uncertainty.
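The ordering sensitivity is easier to see with a small sketch. The article does not show the prompt format the experiment used, so the function name `format_prices` and the label wording below are assumptions; the point is only that the same series can reach the model in two orders, and that stating the order explicitly removes one source of ambiguity.

```python
def format_prices(prices_oldest_first: list[float], newest_first: bool) -> str:
    """Serialize a price series into one prompt line with an explicit order label."""
    if newest_first:
        ordered = list(reversed(prices_oldest_first))
        label = "newest first"
    else:
        ordered = list(prices_oldest_first)
        label = "oldest first"
    values = ", ".join(f"{p:.2f}" for p in ordered)
    return f"Prices ({label}): {values}"


# Illustrative numbers only: a rising series, serialized both ways.
series = [100.0, 101.5, 103.0]
oldest_first_line = format_prices(series, newest_first=False)
newest_first_line = format_prices(series, newest_first=True)
```

Without the label, a model reading the second line left to right could mistake a rising market for a falling one, which is one plausible way a reversal in ordering turns into a reversal in stance.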

The experiment continues live until November 2025 with real capital on Hyperliquid, part of a broader push toward dynamic benchmarks over static tests that models can memorize. Recent papers like arXiv:2511.12599 explore risk frameworks for LLM traders, though most research still focuses on prediction rather than execution. Nof1's team documented failure modes including "self-referential confusion", where models misread their own trading plans. That suggests these aren't sophisticated traders but pattern-matchers revealing their training biases through market behavior.

Original article 👉 Exploring the Limits of LLMs as Quant Traders


by Anup Jadhav
