Piment Noir Git Repositories - freqai-strategies.git/commit
fix(ReforceXY): reduce PBRS defaults to prevent reward exploitation
author Jérôme Benoit <jerome.benoit@piment-noir.org>
Tue, 30 Dec 2025 18:20:32 +0000 (19:20 +0100)
committer Jérôme Benoit <jerome.benoit@piment-noir.org>
Tue, 30 Dec 2025 18:20:32 +0000 (19:20 +0100)
commit 069e60cb56b69e8295a3711541551e7fe3f25b0d
tree f0495b930ee4a39633dd68758077e22357cd3e69
parent 3e7be4caf0c7114fa126984d913973d37f7f73f0

Disable hold potential by default and reduce additive ratios to prevent
the agent from exploiting shaping rewards with many short losing trades.

Changes:
- hold_potential_enabled: true -> false (disabled by default)
- hold_potential_ratio: 0.03125 -> 0.001 (reduced when enabled)
- entry_additive_ratio: 0.125 -> 0.0625 (halved)
- exit_additive_ratio: 0.125 -> 0.0625 (halved)
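
The changes listed above can be summarized as a before/after comparison. This is a minimal sketch: the key names and values come from this commit message, but the flat dict layout and the `changed_defaults` helper are hypothetical and are not the actual ReforceXY configuration schema.

```python
# Hypothetical illustration: keys and values are taken from the commit message;
# the dict layout and helper below are assumptions, not ReforceXY code.
OLD_DEFAULTS = {
    "hold_potential_enabled": True,
    "hold_potential_ratio": 0.03125,
    "entry_additive_ratio": 0.125,
    "exit_additive_ratio": 0.125,
}

NEW_DEFAULTS = {
    "hold_potential_enabled": False,  # hold potential is now opt-in
    "hold_potential_ratio": 0.001,    # only used when hold potential is enabled
    "entry_additive_ratio": 0.0625,   # halved
    "exit_additive_ratio": 0.0625,    # halved
}


def changed_defaults(old, new):
    """Return {key: (old_value, new_value)} for every key whose default changed."""
    return {key: (old[key], new[key]) for key in old if old[key] != new[key]}


if __name__ == "__main__":
    for key, (before, after) in changed_defaults(OLD_DEFAULTS, NEW_DEFAULTS).items():
        print(f"{key}: {before} -> {after}")
```

A user who wants the previous behavior would presumably override these keys explicitly in their own configuration rather than rely on the defaults.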

These conservative defaults encourage the agent to focus on actual PnL
rather than gaming intermediate shaping rewards.
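
To illustrate the kind of exploitation this commit guards against, here is a toy model. It is not the ReforceXY reward function. It assumes, purely for illustration, that each completed trade earns its realized PnL plus fixed entry and exit bonuses proportional to the additive ratios; the function names, the `base_factor` parameter, and the per-trade loss of 0.2 are all hypothetical.

```python
def shaped_trade_reward(pnl, entry_additive_ratio, exit_additive_ratio, base_factor=1.0):
    """Toy per-trade reward: realized PnL plus assumed entry and exit bonuses."""
    return pnl + (entry_additive_ratio + exit_additive_ratio) * base_factor


def episode_reward(n_trades, pnl_per_trade, entry_additive_ratio, exit_additive_ratio):
    """Sum the toy reward over n_trades identical trades."""
    return sum(
        shaped_trade_reward(pnl_per_trade, entry_additive_ratio, exit_additive_ratio)
        for _ in range(n_trades)
    )


# 100 short trades, each losing 0.2 (an arbitrary, assumed reward scale).
old_total = episode_reward(100, -0.2, 0.125, 0.125)    # previous additive ratios
new_total = episode_reward(100, -0.2, 0.0625, 0.0625)  # ratios from this commit

if __name__ == "__main__":
    # Under the old ratios the bonuses outweigh the loss, so churning pays off.
    print("old defaults, total reward positive:", old_total > 0)
    # Under the new ratios the same losing behavior is penalized overall.
    print("new defaults, total reward negative:", new_total < 0)
```

In this toy setting the sign of the episode reward flips once the bonuses are halved. Whether a given loss size is still exploitable in practice depends on the real reward scaling, which this sketch does not model.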
ReforceXY/reward_space_analysis/README.md
ReforceXY/reward_space_analysis/reward_space_analysis.py
ReforceXY/reward_space_analysis/tests/constants.py
ReforceXY/user_data/freqaimodels/ReforceXY.py