--- /dev/null
+# Test Docstring Template
+
+Use this template as a guide when writing or updating test docstrings.
+
+## Standard Format
+
+```python
+def test_feature_behavior_expected_outcome(self):
+ """Brief one-line summary of what this test verifies (imperative mood).
+
+ **Invariant:** [invariant-category-number] (if applicable)
+
+ Extended description providing context about:
+ - What behavior/property is being tested
+ - Why this test is important
+ - Edge cases or special conditions covered
+
+ **Setup:**
+ - Key parameters: profit_aim=X, risk_reward_ratio=Y
+ - Test scenarios: duration_ratios=[...], modes=[...]
+ - Sample size: N samples
+
+ **Assertions:**
+ - Primary check: What the main assertion verifies
+ - Secondary checks: Additional validations (if any)
+
+ **Tolerance rationale:** (if using custom tolerance)
+ - [TOLERANCE.TYPE]: Reason for this tolerance choice
+ Example: IDENTITY_RELAXED for accumulated errors across 5+ operations
+
+ **See also:**
+ - Related tests: test_other_related_feature
+ - Documentation: Section 3.2 in PBRS guide
+ """
+ # Test implementation
+ pass
+```
+
+## Quick Examples
+
+### Minimal (Simple Test)
+
+```python
+def test_transform_zero_input(self):
+ """All potential transforms should map zero to zero."""
+ for transform in TRANSFORM_MODES:
+ result = apply_transform(0.0, transform)
+ self.assertAlmostEqual(result, 0.0, places=12)
+```
+
+### Standard (Most Tests)
+
+```python
+def test_exit_factor_monotonic_attenuation(self):
+ """Exit factor must decrease monotonically with increasing duration ratio.
+
+ **Invariant:** robustness-exit-monotonic-115
+
+ Validates that for all attenuation modes (linear, sqrt, power, etc.),
+ the exit factor decreases as duration_ratio increases, ensuring
+ that longer-held positions receive progressively smaller rewards.
+
+ **Setup:**
+ - Attenuation modes: [linear, sqrt, power, half_life]
+ - Duration ratios: [0.0, 0.5, 1.0, 1.5, 2.0]
+ - PnL: 0.05, target: 0.10
+
+ **Assertions:**
+ - Strict monotonicity: factor[i] > factor[i+1] for all i
+ - Lower bound: All factors remain non-negative
+
+ **Tolerance rationale:**
+ - IDENTITY_RELAXED: Exit factor computation involves normalization,
+ kernel application, and optional transforms (3-5 operations)
+ """
+ # Test implementation
+ pass
+```
+
+### Complex (Multi-Part Test)
+
+```python
+def test_pbrs_terminal_state_comprehensive(self):
+ """PBRS terminal potential must be zero and shaping must recover last potential.
+
+ **Invariant:** pbrs-terminal-zero-201, pbrs-recovery-202
+
+ Comprehensive validation of PBRS terminal state behavior across all
+ exit potential modes (progressive_release, spike_cancel, canonical).
+ Ensures theoretical PBRS guarantees hold in practice.
+
+ **Background:**
+ PBRS theory (Ng et al., 1999) requires terminal potential = 0 to
+ maintain policy invariance. This test verifies implementation correctness.
+
+ **Test structure:**
+ 1. Part A: Terminal potential verification
+ - For each exit mode, compute next_potential at terminal state
+ - Assert: next_potential ≈ 0 within TOLERANCE.IDENTITY_RELAXED
+
+ 2. Part B: Shaping recovery verification
+ - Verify: reward_shaping ≈ -gamma * last_potential
+ - Checks proper potential recovery mechanism
+
+ 3. Part C: Cumulative drift analysis
+ - Track cumulative shaping over 100-episode sequence
+ - Assert: Bounded drift (no systematic bias accumulation)
+
+ **Setup:**
+ - Exit modes: [progressive_release, spike_cancel, canonical]
+ - Gamma values: [0.9, 0.95, 0.99]
+ - Episodes: 100 per configuration
+ - Sample size: 500 steps per episode
+
+ **Assertions:**
+ - Terminal potential: |next_potential| < TOLERANCE.IDENTITY_RELAXED
+ - Shaping recovery: |shaping + gamma*last_pot| < TOLERANCE.IDENTITY_RELAXED
+ - Cumulative drift: |sum(shaping)| < 10 * TOLERANCE.IDENTITY_RELAXED
+
+ **Tolerance rationale:**
+ - IDENTITY_RELAXED: PBRS calculations involve gamma discounting,
+ potential computations (hold/entry/exit), and reward shaping formula.
+ Each operation contributes ~1e-16 relative error; the 1e-09 bound
+ leaves ample headroom for 5-10 such operations.
+
+ **See also:**
+ - test_pbrs_spike_cancel_invariance: Focused spike_cancel test
+ - test_pbrs_progressive_release_decay: Decay mechanism validation
+ - docs/PBRS_THEORY.md: Mathematical foundations
+ """
+ # Part A implementation
+ for mode in EXIT_MODES:
+ # ...
+
+ # Part B implementation
+ # ...
+
+ # Part C implementation
+ # ...
+```
+
+## Section Guidelines
+
+### **One-Line Summary**
+
+- Use imperative mood: "Verify X does Y", "Check that A equals B"
+- Be specific: "Exit factor decreases monotonically" not "Test exit factor"
+- Focus on **what** is tested, not **how** it's tested
+
+### **Invariant** (if applicable)
+
+- Format: `**Invariant:** category-name-number`
+- Example: `**Invariant:** pbrs-terminal-zero-201`
+- See `tests/helpers/assertions.py` for invariant documentation
+
+### **Extended Description**
+
+- Explain **why** this test exists
+- Provide context about the feature being tested
+- Mention edge cases or special conditions
+
+### **Setup**
+
+- Key parameters and their values
+- Test scenarios (modes, ratios, sample sizes)
+- Any special configuration
+
+### **Assertions**
+
+- What each major assertion validates
+- Expected relationships or properties
+- Bounds and thresholds
+
+### **Tolerance Rationale**
+
+- **Required** if using non-default tolerance
+- Explain accumulated error sources
+- Justify the specific tolerance magnitude
+- See `constants.py` docstrings for available tolerances
+
+### **See Also**
+
+- Related tests
+- Relevant documentation
+- Theory references
+
+## Common Patterns
+
+### Property-Based Test
+
+```python
+def test_property_holds_for_all_inputs(self):
+ """Property X holds for all valid inputs in domain D.
+
+ Property-based test verifying [property] across [input space].
+ Uses parameterized inputs to ensure comprehensive coverage.
+ """
+```
+
+### Regression Test
+
+```python
+def test_bug_fix_issue_123_no_regression(self):
+ """Regression test for Issue #123: [brief description].
+
+ Ensures fix for [bug description] remains effective.
+ Bug manifested when [conditions]; this test reproduces those conditions.
+
+ **Fixed in:** PR #456 (commit abc1234)
+ """
+```
+
+### Integration Test
+
+```python
+def test_end_to_end_workflow_integration(self):
+ """End-to-end integration test for [workflow].
+
+ Validates complete workflow from [start] to [end], including:
+ - Component A: [responsibility]
+ - Component B: [responsibility]
+ - Component C: [responsibility]
+
+ **Integration points:**
+ - A → B: [interface/data flow]
+ - B → C: [interface/data flow]
+ """
+```
+
+## Anti-Patterns to Avoid
+
+### ❌ Vague One-Liners
+
+```python
+def test_reward_calculation(self):
+ """Test reward calculation.""" # Too vague!
+```
+
+### ❌ Implementation Details
+
+```python
+def test_uses_numpy_vectorization(self):
+ """Test uses numpy for speed.""" # Focus on behavior, not implementation
+```
+
+### ❌ Missing Tolerance Justification
+
+```python
+def test_complex_calculation(self):
+ """Test complex multi-step calculation."""
+ result = complex_function(...)
+ self.assertAlmostEqual(result, expected, delta=TOLERANCE.IDENTITY_RELAXED)
+ # ❌ Why IDENTITY_RELAXED? Should explain!
+```
+
+### ❌ Overly Technical Jargon
+
+```python
+def test_l2_norm_convergence(self):
+ """Test that the L2 norm of the gradient converges under SGD."""
+ # ❌ Unless testing ML internals, use domain language
+```
+
+## Checklist for New Tests
+
+- [ ] One-line summary is clear and specific
+- [ ] Invariant listed (if applicable)
+- [ ] Extended description explains why test exists
+- [ ] Setup section documents key parameters
+- [ ] Assertions section explains what's validated
+- [ ] Tolerance rationale provided (if custom tolerance used)
+- [ ] References to related tests/docs (if applicable)
+- [ ] Test name follows convention: `test_category_behavior_outcome`
+- [ ] Docstring uses proper markdown formatting
+- [ ] No implementation details leaked into description
+
+## References
+
+- **Test Naming:** Follow pytest conventions and project standards
+- **Invariant Documentation:** See `tests/helpers/assertions.py`
+- **Tolerance Selection:** See `tests/constants.py` for available tolerances
+- **Test Organization:** See `tests/README.md`
+
+---
+
+**Note:** Not all sections are required for every test. Simple tests can use the
+minimal format; complex tests should use the comprehensive format with all relevant sections.
self.assertFinite(value) # unittest-style assertion
```
+### Constants & Configuration
+
+All test constants are centralized in `tests/constants.py` using frozen
+dataclasses as a single source of truth:
+
+```python
+from tests.constants import TOLERANCE, SEEDS, PARAMS, EXIT_FACTOR
+
+# Use directly in tests
+assert abs(result - expected) < TOLERANCE.IDENTITY_RELAXED
+seed_all(SEEDS.FIXED_UNIT)
+```
+
+**Key constant groups:**
+
+- `TOLERANCE.*` - Numerical tolerances (documented in dataclass docstring)
+- `SEEDS.*` - Fixed random seeds for reproducibility
+- `PARAMS.*` - Standard test parameters (PnL, durations, ratios)
+- `EXIT_FACTOR.*` - Exit factor scenarios
+- `CONTINUITY.*` - Continuity check parameters
+- `STATISTICAL.*` - Statistical test thresholds
+
+**Never use magic numbers** - add new constants to `constants.py` instead.
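The frozen-dataclass pattern described above can be sketched as follows; the class and field names here are illustrative stand-ins, not the actual contents of `tests/constants.py`:

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class _ToleranceExample:
    """Illustrative stand-in for the centralized tolerance dataclass."""

    IDENTITY_STRICT: float = 1e-12
    IDENTITY_RELAXED: float = 1e-09
    GENERIC_EQ: float = 1e-08


TOLERANCE_EXAMPLE = _ToleranceExample()

# Reads work as usual; writes raise FrozenInstanceError, so a test
# cannot silently mutate shared constants.
assert TOLERANCE_EXAMPLE.GENERIC_EQ == 1e-08
try:
    TOLERANCE_EXAMPLE.GENERIC_EQ = 1.0
except FrozenInstanceError:
    pass
```

Freezing the dataclass is what makes a module-level singleton safe to share across test modules.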
+
+### Tolerance Selection
+
+Choose appropriate numerical tolerances to prevent flaky tests. All tolerance constants are defined and documented in `tests/constants.py` with their rationale.
+
+**Common tolerances:**
+
+- `IDENTITY_STRICT` (1e-12) - Machine-precision checks
+- `IDENTITY_RELAXED` (1e-09) - Multi-step operations with accumulated errors
+- `GENERIC_EQ` (1e-08) - General floating-point equality (default)
+
+Always document non-default tolerance choices with inline comments explaining the error accumulation model.
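The selection logic can be sketched as follows; the tolerance names and values mirror the list above, while the computations are generic stand-ins rather than project code:

```python
import math

IDENTITY_STRICT = 1e-12   # single machine-precision operation
IDENTITY_RELAXED = 1e-09  # multi-step pipelines with accumulated error

# One operation: the strict tolerance is appropriate.
assert abs(math.cos(0.0) - 1.0) < IDENTITY_STRICT

# Ten accumulated additions: document the relaxed choice inline.
total = sum(0.1 for _ in range(10))
# IDENTITY_RELAXED: 10 float additions accumulate rounding error,
# so a machine-precision comparison would be fragile here.
assert abs(total - 1.0) < IDENTITY_RELAXED
```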
+
+### Test Documentation
+
+All tests should follow the standardized docstring format in
+**`.docstring_template.md`**:
+
+- One-line summary (imperative mood)
+- Invariant reference (if applicable)
+- Extended description (what and why)
+- Setup (parameters, scenarios, sample sizes)
+- Assertions (what each validates)
+- Tolerance rationale (required for non-default tolerances)
+- See also (related tests/docs)
+
+**Template provides three complexity levels** (minimal, standard, complex) with
+examples for property-based tests, regression tests, and integration tests.
+
### Markers
Module-level markers are declared via `pytestmark`:
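For example (the marker name below is illustrative; use the marker matching the test's taxonomy directory):

```python
import pytest

# Apply the `components` marker to every test in this module.
pytestmark = pytest.mark.components

# Several markers can be combined as a list:
# pytestmark = [pytest.mark.components, pytest.mark.slow]
```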
2. Add a row in Coverage Mapping BEFORE writing the test.
3. Implement test in correct taxonomy directory; add marker if outside default
selection.
-4. Optionally declare inline ownership:
+4. Follow the docstring template in `.docstring_template.md`.
+5. Use constants from `tests/constants.py` - never use magic numbers.
+6. Document tolerance choices with inline comments explaining error accumulation.
+7. Optionally declare inline ownership:
```python
# Owns invariant: <id>
def test_<short_description>(...):
...
```
-5. Run duplication audit and coverage before committing.
+8. Run duplication audit and coverage before committing.
+
+## Maintenance Guidelines
+
+### Constant Management
+
+All test constants live in `tests/constants.py`:
+
+- Import constants directly: `from tests.constants import TOLERANCE, SEEDS`
+- Never use class attributes for constants (e.g., `self.TEST_*`)
+- Add new constants to appropriate dataclass in `constants.py`
+- Frozen dataclasses prevent accidental modification
+
+### Tolerance Documentation
+
+When using non-default tolerances (anything other than `GENERIC_EQ`), add an
+inline comment explaining the error accumulation:
+
+```python
+# IDENTITY_RELAXED: Exit factor involves normalization + kernel + transform
+assert abs(exit_factor - expected) < TOLERANCE.IDENTITY_RELAXED
+```
+
+### Test Documentation Standards
+
+- Follow `.docstring_template.md` for all new tests
+- Include invariant IDs in docstrings when applicable
+- Document Setup section with parameter choices and sample sizes
+- Explain non-obvious assertions in Assertions section
+- Always include tolerance rationale for non-default choices
## Duplication Audit
parsing/output, statistical routines, dependency or Python version upgrades, or
before publishing analysis reliant on invariants.
+## Additional Resources
+
+- **`.docstring_template.md`** - Standardized test documentation template with
+ examples for minimal, standard, and complex tests
+- **`constants.py`** - Single source of truth for all test constants (frozen
+ dataclasses with comprehensive documentation)
+- **`helpers/assertions.py`** - 20+ custom assertion functions for invariant
+ validation
+- **`test_base.py`** - Base class with common utilities (`make_ctx`,
+ `seed_all`, etc.)
+
---
This README is the single authoritative source for test coverage, invariant
write_complete_statistical_analysis,
)
+from ..constants import PARAMS, SEEDS, TOLERANCE
from ..test_base import RewardSpaceTestBase
pytestmark = pytest.mark.api
df = simulate_samples(
params=self.base_params(max_trade_duration_candles=40),
num_samples=20,
- seed=self.SEED_SMOKE_TEST,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.SMOKE_TEST,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=1.5,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
self.assertGreater(len(df), 0)
any_exit = df[df["reward_exit"] != 0].head(1)
breakdown = calculate_reward(
ctx,
self.DEFAULT_PARAMS,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
df_spot = simulate_samples(
params=self.base_params(max_trade_duration_candles=100),
num_samples=80,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="spot",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
short_positions_spot = (df_spot["position"] == float(Positions.Short.value)).sum()
self.assertEqual(short_positions_spot, 0, "Spot mode must not contain short positions")
df_margin = simulate_samples(
params=self.base_params(max_trade_duration_candles=100),
num_samples=80,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
for col in [
"pnl",
df1 = simulate_samples(
params=self.base_params(action_masking="true", max_trade_duration_candles=50),
num_samples=10,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="spot",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
self.assertIsInstance(df1, pd.DataFrame)
df2 = simulate_samples(
params=self.base_params(action_masking="false", max_trade_duration_candles=50),
num_samples=10,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="spot",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
self.assertIsInstance(df2, pd.DataFrame)
df_futures = simulate_samples(
params=self.base_params(max_trade_duration_candles=50),
num_samples=100,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="futures",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
short_positions = (df_futures["position"] == float(Positions.Short.value)).sum()
self.assertGreater(short_positions, 0, "Futures mode should allow short positions")
self.assertAlmostEqual(
_get_float_param({"k": " 17.5 "}, "k", 0.0),
17.5,
- places=6,
+ places=TOLERANCE.DECIMAL_PLACES_RELAXED,
msg="Whitespace trimmed numeric string should parse",
)
self.assertEqual(_get_float_param({"k": "1e2"}, "k", 0.0), 100.0)
test_data = simulate_samples(
params=self.base_params(max_trade_duration_candles=100),
num_samples=200,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
with tempfile.TemporaryDirectory() as tmp_dir:
output_path = Path(tmp_dir)
write_complete_statistical_analysis(
test_data,
output_path,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
- seed=self.SEED,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ seed=SEEDS.BASE,
real_df=None,
)
main_report = output_path / "statistical_analysis.md"
breakdown = calculate_reward(
context,
self.DEFAULT_PARAMS,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
breakdown = calculate_reward(
context,
self.DEFAULT_PARAMS,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=False,
)
+ breakdown.reward_shaping
+ breakdown.entry_additive
+ breakdown.exit_additive,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg="Total should equal invalid penalty plus shaping/additives",
)
context,
params,
base_factor=10000000.0,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
import pandas as pd
import pytest
+from ..constants import SEEDS
from ..test_base import RewardSpaceTestBase
# Pytest marker for taxonomy classification
"--num_samples",
"200",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
]
"--num_samples",
"200",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
"--skip_feature_analysis",
"--num_samples",
"150",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
"--risk_reward_ratio",
"--num_samples",
"180",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
]
"--num_samples",
"120",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
"--strict_diagnostics",
"--num_samples",
"120",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
"--params",
"--num_samples",
"120",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
"--max_trade_duration_candles",
"--num_samples",
"150",
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(out_dir),
# Enable PBRS shaping explicitly
from reward_space_analysis import apply_potential_shaping
+from ..constants import PARAMS
from ..test_base import RewardSpaceTestBase
pytestmark = pytest.mark.components
"""Additives enabled increase total reward; shaping impact limited."""
def test_additive_activation_deterministic_contribution(self):
+ """Enabling additives increases total reward while limiting shaping impact.
+
+ **Invariant:** report-additives-deterministic-092
+
+ Validates that when entry/exit additives are enabled, the total reward
+ increases deterministically, but the shaping component remains bounded.
+ This ensures additives provide meaningful reward contribution without
+ destabilizing PBRS shaping dynamics.
+
+ **Setup:**
+ - Base configuration: hold_potential enabled, additives disabled
+ - Test configuration: entry_additive and exit_additive enabled
+ - Additive parameters: scale=0.4, gain=1.0 for both entry/exit
+ - Context: base_reward=0.05, pnl=0.01, duration_ratio=0.2
+
+ **Assertions:**
+ - Total reward with additives > total reward without additives
+ - Shaping difference remains bounded: |s1 - s0| < 0.2
+ - Both total and shaping rewards are finite
+
+ **Tolerance rationale:**
+ - Custom bound 0.2 for shaping delta: additives must not cause
+ large shifts in the shaping component, preserving PBRS properties
+ """
base = self.base_params(
hold_potential_enabled=True,
entry_additive_enabled=False,
ctx = {
"base_reward": 0.05,
"current_pnl": 0.01,
- "pnl_target": self.TEST_PROFIT_AIM * self.TEST_RR,
+ "pnl_target": PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
"current_duration_ratio": 0.2,
"next_pnl": 0.012,
"next_duration_ratio": 0.25,
calculate_reward,
)
-from ..constants import PARAMS
+from ..constants import PARAMS, SCENARIOS, TOLERANCE
from ..helpers import (
RewardScenarioConfig,
ThresholdTestConfig,
"hold_potential_transform_pnl": "tanh",
"hold_potential_transform_duration": "tanh",
}
- val = _compute_hold_potential(0.5, self.TEST_PROFIT_AIM * self.TEST_RR, 0.3, params)
+ val = _compute_hold_potential(
+ 0.5, PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO, 0.3, params
+ )
self.assertFinite(val, name="hold_potential")
def test_hold_penalty_basic_calculation(self):
breakdown = calculate_reward(
context,
self.DEFAULT_PARAMS,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
self.assertLess(breakdown.hold_penalty, 0, "Hold penalty should be negative")
config = ValidationConfig(
- tolerance_strict=self.TOL_IDENTITY_STRICT,
- tolerance_relaxed=self.TOL_IDENTITY_RELAXED,
+ tolerance_strict=TOLERANCE.IDENTITY_STRICT,
+ tolerance_relaxed=TOLERANCE.IDENTITY_RELAXED,
exclude_components=["idle_penalty", "exit_component", "invalid_penalty"],
component_description="hold + shaping/additives",
)
config = ThresholdTestConfig(
max_duration=max_duration,
test_cases=threshold_test_cases,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
)
assert_hold_penalty_threshold_behavior(
self,
context_factory,
self.DEFAULT_PARAMS,
- self.TEST_BASE_FACTOR,
- self.TEST_PROFIT_AIM,
+ PARAMS.BASE_FACTOR,
+ PARAMS.PROFIT_AIM,
1.0,
config,
)
- For d1 < d2 < d3: penalty(d1) >= penalty(d2) >= penalty(d3)
- Progressive scaling beyond max_duration threshold
"""
- from ..constants import SCENARIOS
params = self.base_params(max_trade_duration_candles=100)
durations = list(SCENARIOS.DURATION_SCENARIOS)
breakdown = calculate_reward(
context,
params,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
def validate_idle_penalty(test_case, breakdown, description, tolerance):
test_case.assertLess(breakdown.idle_penalty, 0, "Idle penalty should be negative")
config = ValidationConfig(
- tolerance_strict=test_case.TOL_IDENTITY_STRICT,
+ tolerance_strict=TOLERANCE.IDENTITY_STRICT,
tolerance_relaxed=tolerance,
exclude_components=["hold_penalty", "exit_component", "invalid_penalty"],
component_description="idle + shaping/additives",
scenarios = [(context, self.DEFAULT_PARAMS, "idle_penalty_basic")]
config = RewardScenarioConfig(
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
risk_reward_ratio=1.0,
- tolerance_relaxed=self.TOL_IDENTITY_RELAXED,
+ tolerance_relaxed=TOLERANCE.IDENTITY_RELAXED,
)
assert_reward_calculation_scenarios(
self,
action=Actions.Long_exit,
)
params = self.base_params()
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
pnl_target_coefficient = _compute_pnl_target_coefficient(
- params, ctx.pnl, pnl_target, self.TEST_RR
+ params, ctx.pnl, pnl_target, PARAMS.RISK_REWARD_RATIO
)
efficiency_coefficient = _compute_efficiency_coefficient(params, ctx, ctx.pnl)
pnl_coefficient = pnl_target_coefficient * efficiency_coefficient
self.assertFinite(pnl_coefficient, name="pnl_coefficient")
- self.assertAlmostEqualFloat(pnl_coefficient, 1.0, tolerance=self.TOL_GENERIC_EQ)
+ self.assertAlmostEqualFloat(pnl_coefficient, 1.0, tolerance=TOLERANCE.GENERIC_EQ)
def test_max_idle_duration_candles_logic(self):
"""Test max idle duration candles parameter affects penalty magnitude.
"""
params_small = self.base_params(max_idle_duration_candles=50)
params_large = self.base_params(max_idle_duration_candles=200)
- base_factor = self.TEST_BASE_FACTOR
+ base_factor = PARAMS.BASE_FACTOR
context = self.make_ctx(
pnl=0.0,
trade_duration=0,
context,
params_small,
base_factor,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
large = calculate_reward(
context,
params_large,
- base_factor=self.TEST_BASE_FACTOR,
+ base_factor=PARAMS.BASE_FACTOR,
profit_aim=PARAMS.PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
duration_ratio=0.3,
context=context,
params=test_params,
- risk_reward_ratio=self.TEST_RR_HIGH,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO_HIGH,
)
self.assertFinite(factor, name=f"exit_factor[{mode}]")
self.assertGreater(factor, 0, f"Exit factor for {mode} should be positive")
context=context,
plateau_params=plateau_params,
grace=0.5,
- tolerance_strict=self.TOL_IDENTITY_STRICT,
- risk_reward_ratio=self.TEST_RR_HIGH,
+ tolerance_strict=TOLERANCE.IDENTITY_STRICT,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO_HIGH,
)
def test_idle_penalty_zero_when_pnl_target_zero(self):
scenarios = [(context, self.DEFAULT_PARAMS, "pnl_target_zero")]
config = RewardScenarioConfig(
- base_factor=self.TEST_BASE_FACTOR,
+ base_factor=PARAMS.BASE_FACTOR,
profit_aim=0.0,
- risk_reward_ratio=self.TEST_RR,
- tolerance_relaxed=self.TOL_IDENTITY_RELAXED,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ tolerance_relaxed=TOLERANCE.IDENTITY_RELAXED,
)
assert_reward_calculation_scenarios(
self,
"""
win_reward_factor = 3.0
beta = 0.5
- profit_aim = self.TEST_PROFIT_AIM
+ profit_aim = PARAMS.PROFIT_AIM
params = self.base_params(
win_reward_factor=win_reward_factor,
pnl_factor_beta=beta,
exit_linear_slope=0.0,
)
params.pop("base_factor", None)
- pnl_values = [profit_aim * m for m in (1.05, self.TEST_RR_HIGH, 5.0, 10.0)]
+ pnl_values = [profit_aim * m for m in (1.05, PARAMS.RISK_REWARD_RATIO_HIGH, 5.0, 10.0)]
ratios_observed: list[float] = []
for pnl in pnl_values:
context = self.make_ctx(
params,
base_factor=1.0,
profit_aim=profit_aim,
- risk_reward_ratio=self.TEST_RR,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
self.assertMonotonic(
ratios_observed,
non_decreasing=True,
- tolerance=self.TOL_IDENTITY_STRICT,
+ tolerance=TOLERANCE.IDENTITY_STRICT,
name="pnl_amplification_ratio",
)
asymptote = 1.0 + win_reward_factor
"""
params = self.base_params(max_idle_duration_candles=None, max_trade_duration_candles=100)
base_factor = PARAMS.BASE_FACTOR
- profit_aim = self.TEST_PROFIT_AIM
+ profit_aim = PARAMS.PROFIT_AIM
risk_reward_ratio = 1.0
base_context_kwargs = {
breakdown = calculate_reward(
context,
canonical_params,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
self.assertAlmostEqualFloat(
breakdown.reward_shaping,
expected_shaping,
- tolerance=self.TOL_IDENTITY_STRICT,
+ tolerance=TOLERANCE.IDENTITY_STRICT,
msg="reward_shaping should equal pbrs_delta + invariance_correction",
)
self.assertAlmostEqualFloat(
breakdown.invariance_correction,
0.0,
- tolerance=self.TOL_IDENTITY_STRICT,
+ tolerance=TOLERANCE.IDENTITY_STRICT,
msg="invariance_correction should be ~0 in canonical mode",
)
from reward_space_analysis import ALLOWED_TRANSFORMS, apply_transform
+from ..constants import TOLERANCE
from ..test_base import RewardSpaceTestBase
-pytestmark = pytest.mark.transforms
+pytestmark = pytest.mark.components
class TestTransforms(RewardSpaceTestBase):
next_val = transform_values[i + 1]
self.assertLessEqual(
current_val,
- next_val + self.TOL_IDENTITY_STRICT,
+ next_val + TOLERANCE.IDENTITY_STRICT,
f"{transform_name} not monotonic: values[{i}]={current_val:.6f} > values[{i + 1}]={next_val:.6f}",
)
next_val = transform_values[i + 1]
self.assertLessEqual(
current_val,
- next_val + self.TOL_IDENTITY_STRICT,
+ next_val + TOLERANCE.IDENTITY_STRICT,
f"clip not monotonic: values[{i}]={current_val:.6f} > values[{i + 1}]={next_val:.6f}",
)
self.assertAlmostEqualFloat(
result,
0.0,
- tolerance=self.TOL_IDENTITY_STRICT,
+ tolerance=TOLERANCE.IDENTITY_STRICT,
msg=f"{transform_name}(0.0) should equal 0.0",
)
self.assertAlmostEqualFloat(
pos_result,
-neg_result,
- tolerance=self.TOL_IDENTITY_STRICT,
+ tolerance=TOLERANCE.IDENTITY_STRICT,
msg=f"asinh({test_val}) should equal -asinh({-test_val})",
)
self.assertAlmostEqualFloat(
invalid_result,
expected_result,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg="Invalid transform should fall back to tanh",
)
self.assertFinite(approx_derivative, name=f"d/dx {transform_name}({x})")
self.assertGreaterEqual(
approx_derivative,
- -self.TOL_IDENTITY_STRICT, # Allow small numerical errors
+ -TOLERANCE.IDENTITY_STRICT, # Allow small numerical errors
f"Derivative of {transform_name} at x={x} should be non-negative",
)
comparisons, ensuring consistent precision requirements across all tests.
Attributes:
- IDENTITY_STRICT: Machine-precision tolerance for identity checks (1e-12)
- IDENTITY_RELAXED: Relaxed tolerance for approximate identity (1e-09)
- GENERIC_EQ: Generic equality tolerance for float comparisons (1e-08)
+ IDENTITY_STRICT: Machine-precision tolerance for strict identity checks (1e-12)
+ IDENTITY_RELAXED: Tolerance for multi-step operations with accumulated error (1e-09)
+ GENERIC_EQ: Default tolerance for general float equality (1e-08)
NUMERIC_GUARD: Minimum threshold to prevent division by zero (1e-18)
NEGLIGIBLE: Threshold below which values are considered negligible (1e-15)
RELATIVE: Relative tolerance for ratio/percentage comparisons (1e-06)
DISTRIB_SHAPE: Tolerance for distribution shape metrics (skew, kurtosis) (0.15)
+ DECIMAL_PLACES_STRICT: Decimal places for exact formula validation (12)
+ DECIMAL_PLACES_STANDARD: Decimal places for general calculations (9)
+ DECIMAL_PLACES_RELAXED: Decimal places for accumulated operations (6)
+ DECIMAL_PLACES_DATA_LOADING: Decimal places for data loading/casting tests (7)
+
+ # Additional tolerances for specific test scenarios
+ ALPHA_ATTENUATION_STRICT: Strict tolerance for alpha attenuation tests (5e-12)
+ ALPHA_ATTENUATION_RELAXED: Relaxed tolerance for alpha attenuation with tau != 1.0 (5e-09)
+ SHAPING_BOUND_TOLERANCE: Tolerance for bounded shaping checks (0.2)
"""
IDENTITY_STRICT: float = 1e-12
NEGLIGIBLE: float = 1e-15
RELATIVE: float = 1e-06
DISTRIB_SHAPE: float = 0.15
+ DECIMAL_PLACES_STRICT: int = 12
+ DECIMAL_PLACES_STANDARD: int = 9
+ DECIMAL_PLACES_RELAXED: int = 6
+ DECIMAL_PLACES_DATA_LOADING: int = 7
+
+ # Additional tolerances
+ ALPHA_ATTENUATION_STRICT: float = 5e-12
+ ALPHA_ATTENUATION_RELAXED: float = 5e-09
+ SHAPING_BOUND_TOLERANCE: float = 0.2
@dataclass(frozen=True)
Attributes:
EPS_SMALL: Small epsilon for tight continuity checks (1e-06)
EPS_LARGE: Larger epsilon for coarser continuity tests (1e-05)
+ BOUND_MULTIPLIER_LINEAR: Linear mode derivative bound multiplier (2.0)
+ BOUND_MULTIPLIER_SQRT: Sqrt mode derivative bound multiplier (2.0)
+ BOUND_MULTIPLIER_POWER: Power mode derivative bound multiplier (2.0)
+ BOUND_MULTIPLIER_HALF_LIFE: Half-life mode derivative bound multiplier (2.5)
"""
EPS_SMALL: float = 1e-06
EPS_LARGE: float = 1e-05
+ BOUND_MULTIPLIER_LINEAR: float = 2.0
+ BOUND_MULTIPLIER_SQRT: float = 2.0
+ BOUND_MULTIPLIER_POWER: float = 2.0
+ BOUND_MULTIPLIER_HALF_LIFE: float = 2.5
@dataclass(frozen=True)
# Report formatting seeds
REPORT_FORMAT_1: Seed for report formatting test 1 (234)
REPORT_FORMAT_2: Seed for report formatting test 2 (321)
+
+ # Additional seeds for various test scenarios
+ ALTERNATE_1: Alternate seed for robustness tests (555)
+ ALTERNATE_2: Alternate seed for variance tests (808)
"""
BASE: int = 42
REPORT_FORMAT_1: int = 234
REPORT_FORMAT_2: int = 321
+ # Additional seeds
+ ALTERNATE_1: int = 555
+ ALTERNATE_2: int = 808
+
@dataclass(frozen=True)
class TestParameters:
RISK_REWARD_RATIO_HIGH: High risk/reward ratio for stress tests (2.0)
PNL_STD: Standard deviation for PnL generation (0.02)
PNL_DUR_VOL_SCALE: Duration-based volatility scaling factor (0.001)
+
+ # Common test PnL values
+ PNL_SMALL: Small profit/loss value (0.02)
+ PNL_MEDIUM: Medium profit/loss value (0.05)
+ PNL_LARGE: Large profit/loss value (0.10)
+
+ # Common duration values
+ TRADE_DURATION_SHORT: Short trade duration in steps (50)
+ TRADE_DURATION_MEDIUM: Medium trade duration in steps (100)
+ TRADE_DURATION_LONG: Long trade duration in steps (200)
+
+ # Common additive parameters
+ ADDITIVE_SCALE_DEFAULT: Default additive scale factor (0.4)
+ ADDITIVE_GAIN_DEFAULT: Default additive gain (1.0)
"""
BASE_FACTOR: float = 90.0
PNL_STD: float = 0.02
PNL_DUR_VOL_SCALE: float = 0.001
+ # Common PnL values
+ PNL_SMALL: float = 0.02
+ PNL_MEDIUM: float = 0.05
+ PNL_LARGE: float = 0.10
+
+ # Common duration values
+ TRADE_DURATION_SHORT: int = 50
+ TRADE_DURATION_MEDIUM: int = 100
+ TRADE_DURATION_LONG: int = 200
+
+ # Additive parameters
+ ADDITIVE_SCALE_DEFAULT: float = 0.4
+ ADDITIVE_GAIN_DEFAULT: float = 1.0
+
@dataclass(frozen=True)
class TestScenarios:
self, variations, ctx, params, "exit_component", "increasing", config
)
"""
- from reward_space_analysis import calculate_reward
-
results = []
for param_variation in parameter_variations:
params = base_params.copy()
make_params, 1e-09
)
"""
- import numpy as np
-
for mode in attenuation_modes:
with test_case.subTest(mode=mode):
if mode == "plateau_linear":
def test_get_bool_param_none_and_invalid_literal():
+ """Verify _get_bool_param handles None and invalid literals correctly.
+
+ Tests edge case handling in boolean parameter parsing:
+ - None values should coerce to False
+ - Invalid string literals should trigger fallback to default value
+
+ **Setup:**
+ - Test cases: None value, invalid literal "not_a_bool"
+ - Default value: True
+
+ **Assertions:**
+ - None coerces to False (covers _to_bool None path)
+ - Invalid literal returns default (ValueError fallback path)
+ """
params_none = {"check_invariants": None}
# None should coerce to False (coverage for _to_bool None path)
assert _get_bool_param(params_none, "check_invariants", True) is False
def test_get_float_param_invalid_string_returns_nan():
+ """Verify _get_float_param returns NaN for invalid string input.
+
+ Tests error handling in float parameter parsing when given
+ a non-numeric string that cannot be converted to float.
+
+ **Setup:**
+ - Invalid string: "abc"
+ - Parameter: idle_penalty_scale
+ - Default value: 0.5
+
+ **Assertions:**
+ - Result is NaN (covers float conversion ValueError path)
+ """
params = {"idle_penalty_scale": "abc"}
val = _get_float_param(params, "idle_penalty_scale", 0.5)
assert math.isnan(val)
def test_calculate_reward_unrealized_pnl_hold_path():
+ """Verify unrealized PnL branch activates during hold action.
+
+ Tests that when hold_potential_enabled and unrealized_pnl are both True,
+ the reward calculation uses max/min unrealized profit to compute next_pnl
+ via the tanh transformation path.
+
+ **Setup:**
+ - Position: Long, Action: Neutral (hold)
+ - PnL: 0.01, max_unrealized_profit: 0.02, min_unrealized_profit: -0.01
+ - Parameters: hold_potential_enabled=True, unrealized_pnl=True
+ - Trade duration: 5 steps
+
+ **Assertions:**
+ - Both prev_potential and next_potential are finite
+ - At least one potential is non-zero (shaping should activate)
+ """
# Exercise unrealized_pnl branch during hold to cover next_pnl tanh path
context = RewardContext(
pnl=0.01,
from reward_space_analysis import load_real_episodes
+from ..constants import TOLERANCE
from ..test_base import RewardSpaceTestBase
pickle.dump(obj, f)
def test_top_level_dict_transitions(self):
+ """Load episodes from pickle with top-level dict containing transitions key."""
df = pd.DataFrame(
{
"pnl": [0.01],
self.assertEqual(len(loaded), 1)
def test_mixed_episode_list_warns_and_flattens(self):
+ """Load episodes from list with mixed structure (some with transitions, some without)."""
ep1 = {"episode_id": 1}
ep2 = {
"episode_id": 2,
loaded = load_real_episodes(p)
_ = w
self.assertEqual(len(loaded), 1)
- self.assertPlacesEqual(float(loaded.iloc[0]["pnl"]), 0.02, places=7)
+ self.assertPlacesEqual(
+ float(loaded.iloc[0]["pnl"]), 0.02, places=TOLERANCE.DECIMAL_PLACES_DATA_LOADING
+ )
def test_non_iterable_transitions_raises(self):
+ """Verify ValueError raised when transitions value is not iterable."""
bad = {"transitions": 123}
p = Path(self.temp_dir) / "bad.pkl"
self.write_pickle(bad, p)
load_real_episodes(p)
def test_enforce_columns_false_fills_na(self):
+ """Verify enforce_columns=False fills missing required columns with NaN."""
trans = [
{"pnl": 0.03, "trade_duration": 10, "idle_duration": 0, "position": 1.0, "action": 2.0}
]
self.assertTrue(loaded["reward"].isna().all())
def test_casting_numeric_strings(self):
+ """Verify numeric strings are correctly cast to numeric types during loading."""
trans = [
{
"pnl": "0.04",
loaded = load_real_episodes(p)
self.assertIn("pnl", loaded.columns)
self.assertIn(loaded["pnl"].dtype.kind, ("f", "i"))
- self.assertPlacesEqual(float(loaded.iloc[0]["pnl"]), 0.04, places=7)
+ self.assertPlacesEqual(
+ float(loaded.iloc[0]["pnl"]), 0.04, places=TOLERANCE.DECIMAL_PLACES_DATA_LOADING
+ )
def test_pickled_dataframe_loads(self):
+ """Verify pickled DataFrame loads correctly with all required columns."""
test_episodes = pd.DataFrame(
{
"pnl": [0.01, -0.02, 0.03],
import pytest
+from ..constants import SCENARIOS, SEEDS
from ..test_base import RewardSpaceTestBase
pytestmark = pytest.mark.integration
sys.executable,
str(Path(__file__).parent.parent.parent / "reward_space_analysis.py"),
"--num_samples",
- str(self.TEST_SAMPLES),
+ str(SCENARIOS.SAMPLE_SIZE_SMALL),
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(self.output_path),
]
sys.executable,
str(Path(__file__).parent.parent.parent / "reward_space_analysis.py"),
"--num_samples",
- str(self.TEST_SAMPLES),
+ str(SCENARIOS.SAMPLE_SIZE_SMALL),
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(self.output_path / "run1"),
]
sys.executable,
str(Path(__file__).parent.parent.parent / "reward_space_analysis.py"),
"--num_samples",
- str(self.TEST_SAMPLES),
+ str(SCENARIOS.SAMPLE_SIZE_SMALL),
"--seed",
- str(self.SEED),
+ str(SEEDS.BASE),
"--out_dir",
str(self.output_path / "run2"),
]
self.assertNotIn("top_features", manifest)
self.assertNotIn("reward_param_overrides", manifest)
self.assertNotIn("params", manifest)
- self.assertEqual(manifest["num_samples"], self.TEST_SAMPLES)
- self.assertEqual(manifest["seed"], self.SEED)
+ self.assertEqual(manifest["num_samples"], SCENARIOS.SAMPLE_SIZE_SMALL)
+ self.assertEqual(manifest["seed"], SEEDS.BASE)
with open(self.output_path / "run1" / "manifest.json", "r") as f:
manifest1 = json.load(f)
with open(self.output_path / "run2" / "manifest.json", "r") as f:
from reward_space_analysis import PBRS_INVARIANCE_TOL, write_complete_statistical_analysis
-from ..constants import SCENARIOS, SEEDS
+from ..constants import (
+ PARAMS,
+ SCENARIOS,
+ SEEDS,
+ TOLERANCE,
+)
from ..test_base import RewardSpaceTestBase
pytestmark = pytest.mark.integration
"""Helper: invoke write_complete_statistical_analysis into temp dir and return content."""
out_dir = self.output_path / "report_tmp"
# Ensure required columns present (action required for summary stats)
- # Ensure required columns present (action required for summary stats)
required_cols = [
"action",
"reward_invalid",
write_complete_statistical_analysis(
df=df,
output_dir=out_dir,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
- seed=self.SEED,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ seed=SEEDS.BASE,
real_df=real_df,
adjust_method="none",
strict_diagnostics=False,
"""Abs Σ Shaping Reward line present, formatted, uses constant not literal."""
df = pd.DataFrame(
{
- "reward_shaping": [self.TOL_IDENTITY_STRICT, -self.TOL_IDENTITY_STRICT],
+ "reward_shaping": [TOLERANCE.IDENTITY_STRICT, -TOLERANCE.IDENTITY_STRICT],
"reward_entry_additive": [0.0, 0.0],
"reward_exit_additive": [0.0, 0.0],
}
self.assertIsNotNone(m, "Abs Σ Shaping Reward line missing or misformatted")
val = float(m.group(1)) if m else None
if val is not None:
- self.assertLess(val, self.TOL_NEGLIGIBLE + self.TOL_IDENTITY_STRICT)
+ self.assertLess(val, TOLERANCE.NEGLIGIBLE + TOLERANCE.IDENTITY_STRICT)
self.assertNotIn(
- str(self.TOL_GENERIC_EQ),
+ str(TOLERANCE.GENERIC_EQ),
content,
"Tolerance constant value should appear, not raw literal",
)
# Ensure placeholder text absent
self.assertNotIn("_Not performed (no real episodes provided)._", content)
# Basic regex to find a feature row (pnl)
- import re as _re
-
- m = _re.search(r"\| pnl \| ([0-9]+\.[0-9]{4}) \| ([0-9]+\.[0-9]{4}) \|", content)
+ m = re.search(r"\| pnl \| ([0-9]+\.[0-9]{4}) \| ([0-9]+\.[0-9]{4}) \|", content)
self.assertIsNotNone(
m, "pnl feature row missing or misformatted in distribution shift table"
)
self.assertIn(metric, content, f"Missing metric in PBRS Metrics section: {metric}")
# Verify proper formatting (values should be formatted with proper precision)
- import re as _re
-
# Check for at least one properly formatted metric line
- m = _re.search(r"\| Mean Base Reward \| (-?[0-9]+\.[0-9]{6}) \|", content)
+ m = re.search(r"\| Mean Base Reward \| (-?[0-9]+\.[0-9]{6}) \|", content)
self.assertIsNotNone(m, "Mean Base Reward metric missing or misformatted")
calculate_reward,
)
+from ..constants import PARAMS, TOLERANCE
from ..test_base import RewardSpaceTestBase
pytestmark = pytest.mark.integration
breakdown = calculate_reward(
ctx,
self.DEFAULT_PARAMS,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=expected_component != "invalid_penalty",
)
self.assertAlmostEqualFloat(
breakdown.total,
comp_sum,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg=f"Total != sum components in {name}",
)
params.pop("base_factor", None)
base_factor = 100.0
profit_aim = 0.04
- rr = self.TEST_RR
+ rr = PARAMS.RISK_REWARD_RATIO
for pnl, label in [(0.02, "profit"), (-0.02, "loss")]:
with self.subTest(pnl=pnl, label=label):
#!/usr/bin/env python3
"""Tests for Potential-Based Reward Shaping (PBRS) mechanics."""
+import re
import unittest
import numpy as np
+import pandas as pd
import pytest
from reward_space_analysis import (
write_complete_statistical_analysis,
)
-from ..constants import SEEDS
+from ..constants import (
+ PARAMS,
+ PBRS,
+ SCENARIOS,
+ SEEDS,
+ STATISTICAL,
+ TOLERANCE,
+)
from ..helpers import (
assert_non_canonical_shaping_exceeds,
assert_pbrs_canonical_sum_within_tolerance,
# ---------------- Potential transform mechanics ---------------- #
def test_pbrs_progressive_release_decay_clamped(self):
- """Verifies progressive_release mode with decay>1 clamps potential to zero."""
+ """Verifies progressive_release mode decay clamps at terminal.
+
+ Tolerance rationale: IDENTITY_RELAXED used for PBRS terminal state checks
+ due to accumulated errors from gamma discounting and potential calculations.
+ """
params = self.DEFAULT_PARAMS.copy()
params.update(
{
)
current_pnl = 0.02
current_dur = 0.5
- profit_aim = self.TEST_PROFIT_AIM
+ profit_aim = PARAMS.PROFIT_AIM
prev_potential = _compute_hold_potential(
- current_pnl, profit_aim * self.TEST_RR, current_dur, params
+ current_pnl, profit_aim * PARAMS.RISK_REWARD_RATIO, current_dur, params
)
(
_total_reward,
) = apply_potential_shaping(
base_reward=0.0,
current_pnl=current_pnl,
- pnl_target=profit_aim * self.TEST_RR,
+ pnl_target=profit_aim * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=current_dur,
next_pnl=0.0,
next_duration_ratio=0.0,
last_potential=0.789,
params=params,
)
- self.assertAlmostEqualFloat(next_potential, 0.0, tolerance=self.TOL_IDENTITY_RELAXED)
+ self.assertAlmostEqualFloat(next_potential, 0.0, tolerance=TOLERANCE.IDENTITY_RELAXED)
self.assertAlmostEqualFloat(
- reward_shaping, -prev_potential, tolerance=self.TOL_IDENTITY_RELAXED
+ reward_shaping, -prev_potential, tolerance=TOLERANCE.IDENTITY_RELAXED
)
def test_pbrs_spike_cancel_invariance(self):
)
current_pnl = 0.015
current_dur = 0.4
- profit_aim = self.TEST_PROFIT_AIM
+ profit_aim = PARAMS.PROFIT_AIM
prev_potential = _compute_hold_potential(
- current_pnl, profit_aim * self.TEST_RR, current_dur, params
+ current_pnl, profit_aim * PARAMS.RISK_REWARD_RATIO, current_dur, params
)
gamma = _get_float_param(
params, "potential_gamma", DEFAULT_MODEL_REWARD_PARAMETERS.get("potential_gamma", 0.95)
) = apply_potential_shaping(
base_reward=0.0,
current_pnl=current_pnl,
- pnl_target=profit_aim * self.TEST_RR,
+ pnl_target=profit_aim * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=current_dur,
next_pnl=0.0,
next_duration_ratio=0.0,
params=params,
)
self.assertAlmostEqualFloat(
- next_potential, expected_next_potential, tolerance=self.TOL_IDENTITY_RELAXED
+ next_potential, expected_next_potential, tolerance=TOLERANCE.IDENTITY_RELAXED
)
- self.assertNearZero(reward_shaping, atol=self.TOL_IDENTITY_RELAXED)
+ self.assertNearZero(reward_shaping, atol=TOLERANCE.IDENTITY_RELAXED)
# ---------------- Invariance sum checks (simulate_samples) ---------------- #
def test_canonical_invariance_flag_and_sum(self):
"""Canonical mode + no additives -> invariant flags True and Σ shaping ≈ 0."""
- from ..constants import SCENARIOS
params = self.base_params(
exit_potential_mode="canonical",
df = simulate_samples(
params={**params, "max_trade_duration_candles": 100},
num_samples=SCENARIOS.SAMPLE_SIZE_MEDIUM,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
unique_flags = set(df["pbrs_invariant"].unique().tolist())
self.assertEqual(unique_flags, {True}, f"Unexpected invariant flags: {unique_flags}")
def test_non_canonical_flag_false_and_sum_nonzero(self):
"""Non-canonical mode -> invariant flags False and Σ shaping significantly non-zero."""
- from ..constants import SCENARIOS
params = self.base_params(
exit_potential_mode="progressive_release",
df = simulate_samples(
params={**params, "max_trade_duration_candles": 100},
num_samples=SCENARIOS.SAMPLE_SIZE_MEDIUM,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
unique_flags = set(df["pbrs_invariant"].unique().tolist())
self.assertEqual(unique_flags, {False}, f"Unexpected invariant flags: {unique_flags}")
"""Verifies entry/exit additives return zero when disabled."""
params_entry = {"entry_additive_enabled": False, "entry_additive_scale": 1.0}
val_entry = _compute_entry_additive(
- 0.5, self.TEST_PROFIT_AIM * self.TEST_RR, 0.3, params_entry
+ 0.5, PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO, 0.3, params_entry
)
self.assertEqual(float(val_entry), 0.0)
params_exit = {"exit_additive_enabled": False, "exit_additive_scale": 1.0}
val_exit = _compute_exit_additive(
- 0.5, self.TEST_PROFIT_AIM * self.TEST_RR, 0.3, params_exit
+ 0.5, PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO, 0.3, params_exit
)
self.assertEqual(float(val_exit), 0.0)
apply_potential_shaping(
base_reward=base_reward,
current_pnl=current_pnl,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=current_duration_ratio,
next_pnl=next_pnl,
next_duration_ratio=next_duration_ratio,
params["exit_additive_enabled"],
"Exit additive should be auto-disabled in canonical mode",
)
- self.assertPlacesEqual(next_potential, 0.0, places=12)
+ self.assertPlacesEqual(next_potential, 0.0, places=TOLERANCE.DECIMAL_PLACES_STRICT)
current_potential = _compute_hold_potential(
current_pnl,
- self.TEST_PROFIT_AIM * self.TEST_RR,
+ PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio,
{"hold_potential_enabled": True, "hold_potential_scale": 1.0},
)
- self.assertAlmostEqual(shaping, -current_potential, delta=self.TOL_IDENTITY_RELAXED)
+ self.assertAlmostEqual(shaping, -current_potential, delta=TOLERANCE.IDENTITY_RELAXED)
residual = total - base_reward - shaping
- self.assertAlmostEqual(residual, 0.0, delta=self.TOL_IDENTITY_RELAXED)
+ self.assertAlmostEqual(residual, 0.0, delta=TOLERANCE.IDENTITY_RELAXED)
self.assertTrue(np.isfinite(total))
def test_pbrs_invariance_internal_flag_set(self):
_t1, _s1, _n1, _pbrs_delta, _entry_additive, _exit_additive = apply_potential_shaping(
base_reward=0.0,
current_pnl=0.05,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=0.3,
next_pnl=0.0,
next_duration_ratio=0.0,
self.assertFalse(params["entry_additive_enabled"])
self.assertFalse(params["exit_additive_enabled"])
if terminal_next_potentials:
- self.assertTrue(
- all((abs(p) < self.PBRS_TERMINAL_TOL for p in terminal_next_potentials))
- )
+ self.assertTrue(all((abs(p) < PBRS.TERMINAL_TOL for p in terminal_next_potentials)))
max_abs = max((abs(v) for v in shaping_values)) if shaping_values else 0.0
- self.assertLessEqual(max_abs, self.PBRS_MAX_ABS_SHAPING)
+ self.assertLessEqual(max_abs, PBRS.MAX_ABS_SHAPING)
state_after = (params["entry_additive_enabled"], params["exit_additive_enabled"])
_t2, _s2, _n2, _pbrs_delta2, _entry_additive2, _exit_additive2 = apply_potential_shaping(
base_reward=0.0,
current_pnl=0.02,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=0.1,
next_pnl=0.0,
next_duration_ratio=0.0,
apply_potential_shaping(
base_reward=0.0,
current_pnl=0.0,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=0.0,
next_pnl=0.0,
next_duration_ratio=0.0,
params=params,
)
)
- self.assertPlacesEqual(next_potential, last_potential, places=12)
+ self.assertPlacesEqual(
+ next_potential, last_potential, places=TOLERANCE.DECIMAL_PLACES_STRICT
+ )
gamma_raw = DEFAULT_MODEL_REWARD_PARAMETERS.get("potential_gamma", 0.95)
gamma_fallback = 0.95 if gamma_raw is None else gamma_raw
try:
gamma = float(gamma_fallback)
except Exception:
gamma = 0.95
- self.assertLessEqual(abs(shaping - gamma * last_potential), self.TOL_GENERIC_EQ)
- self.assertPlacesEqual(total, shaping, places=12)
+ self.assertLessEqual(abs(shaping - gamma * last_potential), TOLERANCE.GENERIC_EQ)
+ self.assertPlacesEqual(total, shaping, places=TOLERANCE.DECIMAL_PLACES_STRICT)
def test_potential_gamma_nan_fallback(self):
"""Verifies potential_gamma=NaN fallback to default value."""
res_nan = apply_potential_shaping(
base_reward=0.1,
current_pnl=0.03,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=0.2,
next_pnl=0.035,
next_duration_ratio=0.25,
res_ref = apply_potential_shaping(
base_reward=0.1,
current_pnl=0.03,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=0.2,
next_pnl=0.035,
next_duration_ratio=0.25,
)
self.assertLess(
abs(res_nan[1] - res_ref[1]),
- self.TOL_IDENTITY_RELAXED,
+ TOLERANCE.IDENTITY_RELAXED,
"Unexpected shaping difference under gamma NaN fallback",
)
self.assertLess(
abs(res_nan[0] - res_ref[0]),
- self.TOL_IDENTITY_RELAXED,
+ TOLERANCE.IDENTITY_RELAXED,
"Unexpected total difference under gamma NaN fallback",
)
ctx_dur_ratio = 0.3
params_can = self.base_params(exit_potential_mode="canonical", **base_common)
prev_phi = _compute_hold_potential(
- ctx_pnl, self.TEST_PROFIT_AIM * self.TEST_RR, ctx_dur_ratio, params_can
+ ctx_pnl, PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO, ctx_dur_ratio, params_can
)
self.assertFinite(prev_phi, name="prev_phi")
next_phi_can = _compute_exit_potential(prev_phi, params_can)
self.assertAlmostEqualFloat(
next_phi_can,
0.0,
- tolerance=self.TOL_IDENTITY_STRICT,
+ tolerance=TOLERANCE.IDENTITY_STRICT,
msg="Canonical exit must zero potential",
)
canonical_delta = -prev_phi
self.assertAlmostEqualFloat(
canonical_delta,
-prev_phi,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg="Canonical delta mismatch",
)
params_spike = self.base_params(exit_potential_mode="spike_cancel", **base_common)
shaping_spike = gamma * next_phi_spike - prev_phi
self.assertNearZero(
shaping_spike,
- atol=self.TOL_IDENTITY_RELAXED,
+ atol=TOLERANCE.IDENTITY_RELAXED,
msg="Spike cancel should nullify shaping delta",
)
self.assertGreaterEqual(
- abs(canonical_delta) + self.TOL_IDENTITY_STRICT,
+ abs(canonical_delta) + TOLERANCE.IDENTITY_STRICT,
abs(shaping_spike),
"Canonical shaping magnitude should exceed spike_cancel",
)
potentials = rng.uniform(0.05, 0.85, size=220)
deltas = [gamma * p - p for p in potentials]
cumulative = float(np.sum(deltas))
- self.assertLess(cumulative, -self.TOL_NEGLIGIBLE)
- self.assertGreater(abs(cumulative), 10 * self.TOL_IDENTITY_RELAXED)
+ self.assertLess(cumulative, -TOLERANCE.NEGLIGIBLE)
+ self.assertGreater(abs(cumulative), 10 * TOLERANCE.IDENTITY_RELAXED)
# ---------------- Drift correction invariants (simulate_samples) ---------------- #
# Owns invariant: pbrs-canonical-drift-correction-106
def test_pbrs_106_canonical_drift_correction_zero_sum(self):
"""Invariant 106: canonical mode enforces near zero-sum shaping (drift correction)."""
- from ..constants import SCENARIOS
params = self.base_params(
exit_potential_mode="canonical",
df = simulate_samples(
params={**params, "max_trade_duration_candles": 100},
num_samples=SCENARIOS.SAMPLE_SIZE_MEDIUM,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
total_shaping = float(df["reward_shaping"].sum())
assert_pbrs_canonical_sum_within_tolerance(self, total_shaping, PBRS_INVARIANCE_TOL)
exit_additive_enabled=False,
potential_gamma=0.91,
)
- import pandas as pd
-
original_sum = pd.DataFrame.sum
def boom(self, *args, **kwargs): # noqa: D401
params={**params, "max_trade_duration_candles": 120},
num_samples=250,
seed=SEEDS.PBRS_INVARIANCE_2,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
finally:
pd.DataFrame.sum = original_sum
# Owns invariant (comparison path): pbrs-canonical-drift-correction-106
def test_pbrs_106_canonical_drift_correction_uniform_offset(self):
"""Canonical drift correction reduces Σ shaping below tolerance vs non-canonical."""
- from ..constants import SCENARIOS
params_can = self.base_params(
exit_potential_mode="canonical",
params={**params_can, "max_trade_duration_candles": 120},
num_samples=SCENARIOS.SAMPLE_SIZE_MEDIUM,
seed=SEEDS.PBRS_TERMINAL,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
params_non = self.base_params(
exit_potential_mode="retain_previous",
params={**params_non, "max_trade_duration_candles": 120},
num_samples=SCENARIOS.SAMPLE_SIZE_MEDIUM,
seed=SEEDS.PBRS_TERMINAL,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
total_can = float(df_can["reward_shaping"].sum())
total_non = float(df_non["reward_shaping"].sum())
- self.assertLess(abs(total_can), abs(total_non) + self.TOL_IDENTITY_RELAXED)
+ self.assertLess(abs(total_can), abs(total_non) + TOLERANCE.IDENTITY_RELAXED)
assert_pbrs_canonical_sum_within_tolerance(self, total_can, PBRS_INVARIANCE_TOL)
invariant_mask = df_can["pbrs_invariant"]
if bool(getattr(invariant_mask, "any", lambda: False)()):
corrected_values = df_can.loc[invariant_mask, "reward_shaping"].to_numpy()
mean_corrected = float(np.mean(corrected_values))
- self.assertLess(abs(mean_corrected), self.TOL_IDENTITY_RELAXED)
+ self.assertLess(abs(mean_corrected), TOLERANCE.IDENTITY_RELAXED)
spread = float(np.max(corrected_values) - np.min(corrected_values))
- self.assertLess(spread, self.PBRS_MAX_ABS_SHAPING)
+ self.assertLess(spread, PBRS.MAX_ABS_SHAPING)
# ---------------- Statistical shape invariance ---------------- #
m2 = np.mean(c**2)
m3 = np.mean(c**3)
m4 = np.mean(c**4)
- skew = m3 / (m2**1.5 + self.TOL_NUMERIC_GUARD)
- kurt = m4 / (m2**2 + self.TOL_NUMERIC_GUARD) - 3.0
+ skew = m3 / (m2**1.5 + TOLERANCE.NUMERIC_GUARD)
+ kurt = m4 / (m2**2 + TOLERANCE.NUMERIC_GUARD) - 3.0
return (float(skew), float(kurt))
s_base, k_base = _skew_kurt(base)
s_scaled, k_scaled = _skew_kurt(scaled)
- self.assertAlmostEqualFloat(s_base, s_scaled, tolerance=self.TOL_DISTRIB_SHAPE)
- self.assertAlmostEqualFloat(k_base, k_scaled, tolerance=self.TOL_DISTRIB_SHAPE)
+ self.assertAlmostEqualFloat(s_base, s_scaled, tolerance=TOLERANCE.DISTRIB_SHAPE)
+ self.assertAlmostEqualFloat(k_base, k_scaled, tolerance=TOLERANCE.DISTRIB_SHAPE)
# ---------------- Report classification / formatting ---------------- #
@pytest.mark.smoke
def test_pbrs_non_canonical_report_generation(self):
"""Synthetic invariance section: Non-canonical classification formatting."""
- import re
-
- import pandas as pd
-
- from reward_space_analysis import PBRS_INVARIANCE_TOL
df = pd.DataFrame(
{
self.assertIsNotNone(m_abs)
if m_abs:
val = float(m_abs.group(1))
- self.assertAlmostEqual(abs(total_shaping), val, places=12)
+ self.assertAlmostEqual(abs(total_shaping), val, places=TOLERANCE.DECIMAL_PLACES_STRICT)
def test_potential_gamma_boundary_values_stability(self):
"""Potential gamma boundary values (0 and ≈1) produce bounded shaping."""
apply_potential_shaping(
base_reward=0.0,
current_pnl=0.02,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=0.3,
next_pnl=0.025,
next_duration_ratio=0.35,
)
self.assertTrue(np.isfinite(shap))
self.assertTrue(np.isfinite(next_pot))
- self.assertLessEqual(abs(shap), self.PBRS_MAX_ABS_SHAPING)
+ self.assertLessEqual(abs(shap), PBRS.MAX_ABS_SHAPING)
def test_report_cumulative_invariance_aggregation(self):
"""Canonical telescoping term: small per-step mean drift, bounded increments."""
- from ..constants import SCENARIOS
params = self.base_params(
hold_potential_enabled=True,
apply_potential_shaping(
base_reward=0.0,
current_pnl=current_pnl,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=current_dur,
next_pnl=next_pnl,
next_duration_ratio=next_dur,
)
self.assertLessEqual(
max_abs_step,
- self.PBRS_MAX_ABS_SHAPING,
+ PBRS.MAX_ABS_SHAPING,
f"Unexpected large telescoping increment (max={max_abs_step})",
)
def test_report_explicit_non_invariance_progressive_release(self):
"""progressive_release cumulative shaping non-zero (release leak)."""
- from ..constants import SCENARIOS
params = self.base_params(
hold_potential_enabled=True,
rng = np.random.default_rng(321)
last_potential = 0.0
shaping_sum = 0.0
- from ..constants import STATISTICAL
for _ in range(SCENARIOS.MONTE_CARLO_ITERATIONS):
is_exit = rng.uniform() < STATISTICAL.EXIT_PROBABILITY_THRESHOLD
apply_potential_shaping(
base_reward=0.0,
current_pnl=float(rng.normal(0, 0.07)),
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=float(rng.uniform(0, 1)),
next_pnl=next_pnl,
next_duration_ratio=next_dur,
@pytest.mark.smoke
def test_pbrs_canonical_near_zero_report(self):
"""Invariant 116: canonical near-zero cumulative shaping classified in full report."""
- import re
-
- import numpy as np
- import pandas as pd
-
- from reward_space_analysis import PBRS_INVARIANCE_TOL
-
- from ..constants import SCENARIOS
small_vals = [1.0e-7, -2.0e-7, 3.0e-7] # sum = 2.0e-7 < tolerance
total_shaping = float(sum(small_vals))
write_complete_statistical_analysis(
df,
output_dir=out_dir,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
- seed=self.SEED,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ seed=SEEDS.BASE,
skip_feature_analysis=True,
skip_partial_dependence=True,
bootstrap_resamples=SCENARIOS.BOOTSTRAP_MINIMAL_ITERATIONS,
self.assertIsNotNone(m_abs)
if m_abs:
val_abs = float(m_abs.group(1))
- self.assertAlmostEqual(abs(total_shaping), val_abs, places=12)
+ self.assertAlmostEqual(
+ abs(total_shaping), val_abs, places=TOLERANCE.DECIMAL_PLACES_STRICT
+ )
# Non-owning smoke; ownership: robustness/test_robustness.py:35 (robustness-decomposition-integrity-101)
@pytest.mark.smoke
def test_pbrs_canonical_warning_report(self):
"""Canonical mode + no additives but |Σ shaping| > tolerance -> warning classification."""
- import pandas as pd
-
- from reward_space_analysis import PBRS_INVARIANCE_TOL
-
- from ..constants import SCENARIOS
shaping_vals = [1.2e-4, 1.3e-4, 8.0e-5, -2.0e-5, 1.4e-4] # sum = 4.5e-4 (> tol)
total_shaping = sum(shaping_vals)
write_complete_statistical_analysis(
df,
output_dir=out_dir,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
- seed=self.SEED,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ seed=SEEDS.BASE,
skip_feature_analysis=True,
skip_partial_dependence=True,
bootstrap_resamples=SCENARIOS.BOOTSTRAP_MINIMAL_ITERATIONS,
@pytest.mark.smoke
def test_pbrs_non_canonical_full_report_reason_aggregation(self):
"""Full report: Non-canonical classification aggregates mode + additives reasons."""
- import pandas as pd
-
- from ..constants import SCENARIOS
shaping_vals = [0.02, -0.005, 0.007]
entry_add_vals = [0.003, 0.0, 0.004]
write_complete_statistical_analysis(
df,
output_dir=out_dir,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
- seed=self.SEED,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ seed=SEEDS.BASE,
skip_feature_analysis=True,
skip_partial_dependence=True,
bootstrap_resamples=SCENARIOS.BOOTSTRAP_MINIMAL_ITERATIONS,
@pytest.mark.smoke
def test_pbrs_non_canonical_mode_only_reason(self):
"""Non-canonical exit mode with additives disabled -> reason excludes additive list."""
- import pandas as pd
-
- from reward_space_analysis import PBRS_INVARIANCE_TOL
-
- from ..constants import SCENARIOS
shaping_vals = [0.002, -0.0005, 0.0012]
total_shaping = sum(shaping_vals)
write_complete_statistical_analysis(
df,
output_dir=out_dir,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
- seed=self.SEED,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ seed=SEEDS.BASE,
skip_feature_analysis=True,
skip_partial_dependence=True,
bootstrap_resamples=SCENARIOS.BOOTSTRAP_MINIMAL_ITERATIONS,
# Owns invariant: pbrs-absence-shift-placeholder-118
def test_pbrs_absence_and_distribution_shift_placeholder(self):
"""Report generation without PBRS columns triggers absence + shift placeholder."""
- import pandas as pd
-
- from ..constants import SEEDS
n = 90
rng = np.random.default_rng(SEEDS.CANONICAL_SWEEP)
}
)
out_dir = self.output_path / "pbrs_absence_and_shift_placeholder"
+ # Import the module here so _compute_summary_stats can be temporarily replaced

import reward_space_analysis as rsa
- from ..constants import SCENARIOS
-
original_compute_summary_stats = rsa._compute_summary_stats
def _minimal_summary_stats(_df):
+ # Alias as _pd to avoid shadowing the module-level pandas import
import pandas as _pd
comp_share = _pd.Series([], dtype=float)
write_complete_statistical_analysis(
df,
output_dir=out_dir,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
- seed=self.SEED,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
+ seed=SEEDS.BASE,
skip_feature_analysis=True,
skip_partial_dependence=True,
bootstrap_resamples=SCENARIOS.BOOTSTRAP_MINIMAL_ITERATIONS // 2,
def test_get_max_idle_duration_candles_negative_or_zero_fallback(self):
"""Explicit mid<=0 fallback path returns derived default multiplier."""
- from reward_space_analysis import (
- DEFAULT_MODEL_REWARD_PARAMETERS,
- )
-
base = DEFAULT_MODEL_REWARD_PARAMETERS.copy()
base["max_trade_duration_candles"] = 64
base["max_idle_duration_candles"] = 0
@pytest.mark.robustness
def test_get_exit_factor_negative_plateau_grace_warning():
+ """Verify negative exit_plateau_grace triggers warning but returns valid factor.
+
+ **Setup:**
+ - Attenuation mode: linear with plateau
+ - exit_plateau_grace: -1.0 (invalid, should be non-negative)
+ - Duration ratio: 0.5
+
+ **Assertions:**
+ - Warning emitted (RewardDiagnosticsWarning)
+ - Factor is non-negative despite invalid parameter
+ """
params = {"exit_attenuation_mode": "linear", "exit_plateau": True, "exit_plateau_grace": -1.0}
pnl = 0.01
pnl_target = 0.03
@pytest.mark.robustness
def test_get_exit_factor_negative_linear_slope_warning():
+ """Verify negative exit_linear_slope triggers warning but returns valid factor.
+
+ **Setup:**
+ - Attenuation mode: linear
+ - exit_linear_slope: -5.0 (invalid, should be non-negative)
+ - Duration ratio: 2.0
+
+ **Assertions:**
+ - Warning emitted (RewardDiagnosticsWarning)
+ - Factor is non-negative despite invalid parameter
+ """
params = {"exit_attenuation_mode": "linear", "exit_linear_slope": -5.0}
pnl = 0.01
pnl_target = 0.03
@pytest.mark.robustness
def test_get_exit_factor_invalid_power_tau_relaxed():
+ """Verify invalid exit_power_tau (0.0) triggers warning in relaxed mode.
+
+ **Setup:**
+ - Attenuation mode: power
+ - exit_power_tau: 0.0 (invalid, should be positive)
+ - strict_validation: False (relaxed mode)
+ - Duration ratio: 1.5
+
+ **Assertions:**
+ - Warning emitted (RewardDiagnosticsWarning)
+ - Factor is positive (fallback to default tau)
+ """
params = {"exit_attenuation_mode": "power", "exit_power_tau": 0.0, "strict_validation": False}
pnl = 0.02
pnl_target = 0.03
@pytest.mark.robustness
def test_get_exit_factor_half_life_near_zero_relaxed():
+ """Verify near-zero exit_half_life triggers warning in relaxed mode.
+
+ **Setup:**
+ - Attenuation mode: half_life
+ - exit_half_life: 1e-12 (near zero, impractical)
+ - strict_validation: False (relaxed mode)
+ - Duration ratio: 2.0
+
+ **Assertions:**
+ - Warning emitted (RewardDiagnosticsWarning)
+ - Factor is non-zero (fallback to sensible value)
+ """
params = {
"exit_attenuation_mode": "half_life",
"exit_half_life": 1e-12,
@pytest.mark.robustness
def test_hold_penalty_short_duration_returns_zero():
+ """Verify hold penalty is zero when trade_duration is below max threshold.
+
+ **Setup:**
+ - Trade duration: 1 candle (short)
+ - Max trade duration: 128 candles
+ - Position: Long, Action: Neutral (hold)
+
+ **Assertions:**
+ - Penalty equals 0.0 (no penalty for short duration holds)
+ """
context = RewardContext(
pnl=0.0,
trade_duration=1, # shorter than default max trade duration (128)
simulate_samples,
)
+from ..constants import (
+ CONTINUITY,
+ EXIT_FACTOR,
+ PARAMS,
+ SEEDS,
+ TOLERANCE,
+)
from ..helpers import (
assert_diagnostic_warning,
assert_exit_factor_attenuation_modes,
),
dict(
ctx=self.make_ctx(
- pnl=self.TEST_PROFIT_AIM,
+ pnl=PARAMS.PROFIT_AIM,
trade_duration=60,
idle_duration=0,
max_unrealized_profit=0.05,
br = calculate_reward(
ctx_obj,
params,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
+ # Relaxed tolerance: Floating-point error accumulates across multiple reward
+ # component calculations (entry, hold, exit additives, and penalties)
assert_single_active_component_with_additives(
self,
br,
active_label,
- self.TOL_IDENTITY_RELAXED,
+ TOLERANCE.IDENTITY_RELAXED,
inactive_core=[
"exit_component",
"idle_penalty",
df = simulate_samples(
params=self.base_params(max_trade_duration_candles=50),
num_samples=200,
- seed=self.SEED,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.BASE,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
total_pnl = df["pnl"].sum()
exit_mask = df["reward_exit"] != 0
self.assertAlmostEqual(
total_pnl,
exit_pnl_sum,
- places=10,
+ places=TOLERANCE.DECIMAL_PLACES_STANDARD,
msg="PnL invariant violation: total PnL != sum of exit PnL",
)
non_zero_pnl_actions = set(np.unique(df[df["pnl"].abs() > np.finfo(float).eps]["action"]))
)
params = self.DEFAULT_PARAMS.copy()
+ # Relaxed tolerance: Exit factor calculations involve multiple steps
+ # (normalization, kernel application, potential transforms)
assert_exit_mode_mathematical_validation(
self,
context,
params,
- self.TEST_BASE_FACTOR,
- self.TEST_PROFIT_AIM,
- self.TEST_RR,
- self.TOL_IDENTITY_RELAXED,
+ PARAMS.BASE_FACTOR,
+ PARAMS.PROFIT_AIM,
+ PARAMS.RISK_REWARD_RATIO,
+ TOLERANCE.IDENTITY_RELAXED,
)
# Part 2: Monotonic attenuation validation
max_unrealized_profit=0.06,
min_unrealized_profit=0.0,
)
+ # Relaxed tolerance: Testing across multiple attenuation modes with different
+ # numerical characteristics (exponential, polynomial, rational functions)
assert_exit_factor_attenuation_modes(
self,
- base_factor=self.TEST_BASE_FACTOR,
+ base_factor=PARAMS.BASE_FACTOR,
pnl=test_pnl,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
context=test_context,
attenuation_modes=modes,
base_params_fn=self.base_params,
- tolerance_relaxed=self.TOL_IDENTITY_RELAXED,
- risk_reward_ratio=self.TEST_RR,
+ tolerance_relaxed=TOLERANCE.IDENTITY_RELAXED,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
)
def test_exit_factor_threshold_warning_and_non_capping(self):
baseline = calculate_reward(
context,
params,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR_HIGH,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO_HIGH,
short_allowed=True,
action_masking=True,
)
- amplified_base_factor = self.TEST_BASE_FACTOR * 200.0
+ amplified_base_factor = PARAMS.BASE_FACTOR * 200.0
amplified = calculate_reward(
context,
params,
base_factor=amplified_base_factor,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR_HIGH,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO_HIGH,
short_allowed=True,
action_masking=True,
)
"""Negative exit_linear_slope is sanitized to 1.0; resulting exit factors must match slope=1.0 within tolerance."""
base_factor = 100.0
pnl = 0.03
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.04, min_unrealized_profit=0.0
)
)
for dr in duration_ratios:
f_bad = _get_exit_factor(
- base_factor, pnl, pnl_target, dr, test_context, params_bad, self.TEST_RR
+ base_factor, pnl, pnl_target, dr, test_context, params_bad, PARAMS.RISK_REWARD_RATIO
)
f_ref = _get_exit_factor(
- base_factor, pnl, pnl_target, dr, test_context, params_ref, self.TEST_RR
+ base_factor, pnl, pnl_target, dr, test_context, params_ref, PARAMS.RISK_REWARD_RATIO
)
+ # Relaxed tolerance: Comparing exit factors computed with different slope values
+ # after sanitization; minor numerical differences expected
self.assertAlmostEqualFloat(
f_bad,
f_ref,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg=f"Sanitized slope mismatch at dr={dr} f_bad={f_bad} f_ref={f_ref}",
)
"""Power mode attenuation: ratio f(dr=1)/f(dr=0) must equal 1/(1+1)^alpha with alpha=-log(tau)/log(2)."""
base_factor = 200.0
pnl = 0.04
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.05, min_unrealized_profit=0.0
)
exit_attenuation_mode="power", exit_power_tau=tau, exit_plateau=False
)
f0 = _get_exit_factor(
- base_factor, pnl, pnl_target, 0.0, test_context, params, self.TEST_RR
+ base_factor, pnl, pnl_target, 0.0, test_context, params, PARAMS.RISK_REWARD_RATIO
)
f1 = _get_exit_factor(
- base_factor, pnl, pnl_target, duration_ratio, test_context, params, self.TEST_RR
+ base_factor,
+ pnl,
+ pnl_target,
+ duration_ratio,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO,
)
if 0.0 < tau <= 1.0:
alpha = -math.log(tau) / math.log(2.0)
)
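The expected power-mode ratio can be verified in isolation (a sketch assuming the attenuation form factor(dr) ∝ (1 + dr)^(−α); names are illustrative, not the library's API):

```python
import math

def power_ratio(tau: float, dr: float) -> float:
    # alpha is derived so that duration ratio 1 scales the factor by exactly tau:
    # (1 + 1) ** (-alpha) = 2 ** log2(tau) = tau
    alpha = -math.log(tau) / math.log(2.0)
    return (1.0 + dr) ** (-alpha)

# f(dr=1) / f(dr=0) equals tau by construction of alpha
for tau in (0.25, 0.5, 0.9):
    assert abs(power_ratio(tau, 1.0) - tau) < 1e-12
```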
def test_reward_calculation_extreme_parameters_stability(self):
- """Test reward calculation extreme parameters stability."""
+ """Reward calculation remains numerically stable with extreme parameter values.
+
+ Tests numerical stability and finite output when using extreme parameter
+ values (win_reward_factor=1000, base_factor=10000) that might cause
+ overflow or NaN propagation in poorly designed implementations.
+
+ **Setup:**
+ - Extreme parameters: win_reward_factor=1000.0, base_factor=10000.0
+ - Context: Long exit with pnl=0.05, duration=50, profit extrema=[0.02, 0.06]
+ - Configuration: short_allowed=True, action_masking=True
+
+ **Assertions:**
+ - Total reward is finite (not NaN, not Inf)
+
+ **Tolerance rationale:**
+ - Uses assertFinite which checks for non-NaN, non-Inf values only
+ """
extreme_params = self.base_params(win_reward_factor=1000.0, base_factor=10000.0)
context = RewardContext(
pnl=0.05,
context,
extreme_params,
base_factor=10000.0,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
self.assertFinite(br.total, name="breakdown.total")
def test_exit_attenuation_modes_enumeration(self):
- """Test exit attenuation modes enumeration."""
+ """All exit attenuation modes produce finite rewards without errors.
+
+ Smoke test ensuring each exit attenuation mode (including legacy modes)
+ executes successfully and produces finite reward components. This validates
+ that mode enumeration is complete and all modes are correctly implemented.
+
+ **Setup:**
+ - Modes tested: All values in ATTENUATION_MODES_WITH_LEGACY
+ - Context: Long exit with pnl=0.02, duration=50, profit extrema=[0.01, 0.03]
+ - Uses subTest for mode-specific failure isolation
+
+ **Assertions:**
+ - Exit component is finite for each mode
+ - Total reward is finite for each mode
+
+ **Tolerance rationale:**
+ - Uses assertFinite which checks for non-NaN, non-Inf values only
+ """
modes = ATTENUATION_MODES_WITH_LEGACY
for mode in modes:
with self.subTest(mode=mode):
br = calculate_reward(
ctx,
test_params,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
short_allowed=True,
action_masking=True,
)
"""Test parameter edge cases: tau extrema, plateau grace edges, slope zero."""
base_factor = 50.0
pnl = 0.02
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.03, min_unrealized_profit=0.0
)
params_hi = self.base_params(exit_attenuation_mode="power", exit_power_tau=0.999999)
params_lo = self.base_params(
- exit_attenuation_mode="power", exit_power_tau=self.MIN_EXIT_POWER_TAU
+ exit_attenuation_mode="power", exit_power_tau=EXIT_FACTOR.MIN_POWER_TAU
)
r = 1.5
hi_val = _get_exit_factor(
- base_factor, pnl, pnl_target, r, test_context, params_hi, self.TEST_RR
+ base_factor, pnl, pnl_target, r, test_context, params_hi, PARAMS.RISK_REWARD_RATIO
)
lo_val = _get_exit_factor(
- base_factor, pnl, pnl_target, r, test_context, params_lo, self.TEST_RR
+ base_factor, pnl, pnl_target, r, test_context, params_lo, PARAMS.RISK_REWARD_RATIO
)
self.assertGreater(
hi_val, lo_val, "Power mode: higher tau (≈1) should attenuate less than tiny tau"
exit_linear_slope=1.0,
)
val_g0 = _get_exit_factor(
- base_factor, pnl, pnl_target, 0.5, test_context, params_g0, self.TEST_RR
+ base_factor, pnl, pnl_target, 0.5, test_context, params_g0, PARAMS.RISK_REWARD_RATIO
)
val_g1 = _get_exit_factor(
- base_factor, pnl, pnl_target, 0.5, test_context, params_g1, self.TEST_RR
+ base_factor, pnl, pnl_target, 0.5, test_context, params_g1, PARAMS.RISK_REWARD_RATIO
)
self.assertGreater(
val_g1, val_g0, "Plateau grace=1.0 should delay attenuation vs grace=0.0"
exit_attenuation_mode="linear", exit_linear_slope=2.0, exit_plateau=False
)
val_lin0 = _get_exit_factor(
- base_factor, pnl, pnl_target, 1.0, test_context, params_lin0, self.TEST_RR
+ base_factor, pnl, pnl_target, 1.0, test_context, params_lin0, PARAMS.RISK_REWARD_RATIO
)
val_lin1 = _get_exit_factor(
- base_factor, pnl, pnl_target, 1.0, test_context, params_lin1, self.TEST_RR
+ base_factor, pnl, pnl_target, 1.0, test_context, params_lin1, PARAMS.RISK_REWARD_RATIO
)
self.assertGreater(
val_lin0, val_lin1, "Linear slope=0 should yield no attenuation vs slope>0"
exit_plateau_grace=0.3,
exit_linear_slope=0.0,
)
- base_factor = self.TEST_BASE_FACTOR
+ base_factor = PARAMS.BASE_FACTOR
pnl = 0.04
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.05, min_unrealized_profit=0.0
)
ratios = [0.3, 0.6, 1.0, 1.4]
values = [
- _get_exit_factor(base_factor, pnl, pnl_target, r, test_context, params, self.TEST_RR)
+ _get_exit_factor(
+ base_factor, pnl, pnl_target, r, test_context, params, PARAMS.RISK_REWARD_RATIO
+ )
for r in ratios
]
first = values[0]
for v in values[1:]:
+ # Relaxed tolerance: Exit factor should remain constant across all duration
+ # ratios when slope=0; minor error accumulates across repeated calculations
self.assertAlmostEqualFloat(
v,
first,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg=f"Plateau+linear slope=0 factor drift at ratio set {ratios} => {values}",
)
}
)
base_factor = 80.0
- profit_aim = self.TEST_PROFIT_AIM
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ profit_aim = PARAMS.PROFIT_AIM
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=profit_aim, trade_duration=50, max_unrealized_profit=0.04, min_unrealized_profit=0.0
)
ratios = [0.8, 1.0, 1.2, 1.4, 1.6]
vals = [
_get_exit_factor(
- base_factor, profit_aim, pnl_target, r, test_context, params, self.TEST_RR
+ base_factor,
+ profit_aim,
+ pnl_target,
+ r,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO,
)
for r in ratios
]
ref = vals[0]
for i, r in enumerate(ratios[:-1]):
+ # Relaxed tolerance: All values before the grace boundary should match;
+ # minor differences from repeated exit factor computations are expected
self.assertAlmostEqualFloat(
vals[i],
ref,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg=f"Unexpected attenuation before grace end at ratio {r}",
)
self.assertLess(vals[-1], ref, "Attenuation should begin after grace boundary")
"""Test plateau continuity at grace boundary."""
modes = list(ATTENUATION_MODES)
grace = 0.8
- eps = self.CONTINUITY_EPS_SMALL
- base_factor = self.TEST_BASE_FACTOR
+ eps = CONTINUITY.EPS_SMALL
+ base_factor = PARAMS.BASE_FACTOR
pnl = 0.01
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.02, min_unrealized_profit=0.0
)
}
)
left = _get_exit_factor(
- base_factor, pnl, pnl_target, grace - eps, test_context, params, self.TEST_RR
+ base_factor,
+ pnl,
+ pnl_target,
+ grace - eps,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO,
)
boundary = _get_exit_factor(
- base_factor, pnl, pnl_target, grace, test_context, params, self.TEST_RR
+ base_factor,
+ pnl,
+ pnl_target,
+ grace,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO,
)
right = _get_exit_factor(
- base_factor, pnl, pnl_target, grace + eps, test_context, params, self.TEST_RR
+ base_factor,
+ pnl,
+ pnl_target,
+ grace + eps,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO,
)
+ # Relaxed tolerance: Continuity check at plateau grace boundary;
+ # left and boundary values should be nearly identical
self.assertAlmostEqualFloat(
left,
boundary,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg=f"Left/boundary mismatch for mode {mode}",
)
self.assertLess(
)
diff = boundary - right
if mode == "linear":
- bound = base_factor * slope * eps * 2.0
+ bound = base_factor * slope * eps * CONTINUITY.BOUND_MULTIPLIER_LINEAR
elif mode == "sqrt":
- bound = base_factor * 0.5 * eps * 2.0
+ bound = base_factor * 0.5 * eps * CONTINUITY.BOUND_MULTIPLIER_SQRT
elif mode == "power":
alpha = -math.log(tau) / math.log(2.0)
- bound = base_factor * alpha * eps * 2.0
+ bound = base_factor * alpha * eps * CONTINUITY.BOUND_MULTIPLIER_POWER
elif mode == "half_life":
- bound = base_factor * (math.log(2.0) / half_life) * eps * 2.5
+ bound = (
+ base_factor
+ * (math.log(2.0) / half_life)
+ * eps
+ * CONTINUITY.BOUND_MULTIPLIER_HALF_LIFE
+ )
else:
bound = base_factor * eps * 5.0
self.assertLessEqual(
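The half_life bound above follows from the derivative of the decay at the boundary; a quick finite-difference check (a sketch with assumed values, not the test's exact parameters):

```python
import math

# Half-life decay just past the grace boundary: f(x) = base * 2 ** (-x / half_life).
# First-order bound: |f(0) - f(eps)| <= base * (ln 2 / half_life) * eps * slack.
base, half_life, eps = 60.0, 0.5, 1e-4

def f(x: float) -> float:
    return base * 2.0 ** (-x / half_life)

diff = f(0.0) - f(eps)
bound = base * (math.log(2.0) / half_life) * eps * 2.5  # slack factor 2.5
assert 0.0 < diff <= bound
```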
"""Verify attenuation difference scales approximately linearly with epsilon (first-order continuity heuristic)."""
mode = "linear"
grace = 0.6
- eps1 = self.CONTINUITY_EPS_LARGE
- eps2 = self.CONTINUITY_EPS_SMALL
+ eps1 = CONTINUITY.EPS_LARGE
+ eps2 = CONTINUITY.EPS_SMALL
base_factor = 80.0
pnl = 0.02
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR_HIGH
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO_HIGH
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.03, min_unrealized_profit=0.0
)
}
)
f_boundary = _get_exit_factor(
- base_factor, pnl, pnl_target, grace, test_context, params, self.TEST_RR_HIGH
+ base_factor, pnl, pnl_target, grace, test_context, params, PARAMS.RISK_REWARD_RATIO_HIGH
)
f1 = _get_exit_factor(
- base_factor, pnl, pnl_target, grace + eps1, test_context, params, self.TEST_RR_HIGH
+ base_factor,
+ pnl,
+ pnl_target,
+ grace + eps1,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
f2 = _get_exit_factor(
- base_factor, pnl, pnl_target, grace + eps2, test_context, params, self.TEST_RR_HIGH
+ base_factor,
+ pnl,
+ pnl_target,
+ grace + eps2,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
diff1 = f_boundary - f1
diff2 = f_boundary - f2
- ratio = diff1 / max(diff2, self.TOL_NUMERIC_GUARD)
+ # NUMERIC_GUARD: Prevent division by zero when computing scaling ratio
+ ratio = diff1 / max(diff2, TOLERANCE.NUMERIC_GUARD)
self.assertGreater(
ratio,
- self.EXIT_FACTOR_SCALING_RATIO_MIN,
+ EXIT_FACTOR.SCALING_RATIO_MIN,
f"Scaling ratio too small (ratio={ratio:.2f})",
)
self.assertLess(
ratio,
- self.EXIT_FACTOR_SCALING_RATIO_MAX,
+ EXIT_FACTOR.SCALING_RATIO_MAX,
f"Scaling ratio too large (ratio={ratio:.2f})",
)
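The first-order heuristic behind the scaling-ratio bounds can be illustrated with any smooth attenuation (a sketch with an assumed linear-mode-like form):

```python
# For differentiable f, f(g) - f(g + eps) ~ f'(g) * eps, so the ratio of
# differences at two epsilons approximates eps1 / eps2.
def f(x: float) -> float:
    return 80.0 / (1.0 + 1.2 * x)  # assumed linear-mode attenuation shape

grace, eps1, eps2 = 0.6, 1e-2, 1e-4
ratio = (f(grace) - f(grace + eps1)) / max(f(grace) - f(grace + eps2), 1e-12)
assert 50.0 < ratio < 150.0  # close to eps1 / eps2 = 100
```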
)
base_factor = 75.0
pnl = 0.05
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.06, min_unrealized_profit=0.0
)
duration_ratio,
test_context,
params,
- self.TEST_RR_HIGH,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
linear_params = self.base_params(exit_attenuation_mode="linear", exit_plateau=False)
f_linear = _get_exit_factor(
duration_ratio,
test_context,
linear_params,
- self.TEST_RR_HIGH,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
+ # Relaxed tolerance: Unknown exit mode should fall back to linear mode;
+ # verifying identical behavior between fallback and explicit linear
self.assertAlmostEqualFloat(
f_unknown,
f_linear,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg=f"Fallback linear mismatch unknown={f_unknown} linear={f_linear}",
)
exit_plateau_grace=-2.0,
exit_linear_slope=1.2,
)
- base_factor = self.TEST_BASE_FACTOR
+ base_factor = PARAMS.BASE_FACTOR
pnl = 0.03
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR_HIGH
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO_HIGH
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.04, min_unrealized_profit=0.0
)
duration_ratio,
test_context,
params,
- self.TEST_RR_HIGH,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
# Reference with grace=0.0 (since negative should clamp)
ref_params = self.base_params(
duration_ratio,
test_context,
ref_params,
- self.TEST_RR_HIGH,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
+ # Relaxed tolerance: Negative grace parameter should clamp to 0.0;
+ # verifying clamped behavior matches explicit grace=0.0 configuration
self.assertAlmostEqualFloat(
f_neg,
f_ref,
- tolerance=self.TOL_IDENTITY_RELAXED,
+ tolerance=TOLERANCE.IDENTITY_RELAXED,
msg=f"Negative grace clamp mismatch f_neg={f_neg} f_ref={f_ref}",
)
invalid_taus = [0.0, -0.5, 2.0, float("nan")]
base_factor = 120.0
pnl = 0.04
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.05, min_unrealized_profit=0.0
)
)
with assert_diagnostic_warning(["exit_power_tau"]):
f0 = _get_exit_factor(
- base_factor, pnl, pnl_target, 0.0, test_context, params, self.TEST_RR
+ base_factor,
+ pnl,
+ pnl_target,
+ 0.0,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO,
)
f1 = _get_exit_factor(
- base_factor, pnl, pnl_target, duration_ratio, test_context, params, self.TEST_RR
+ base_factor,
+ pnl,
+ pnl_target,
+ duration_ratio,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO,
)
- ratio = f1 / max(f0, self.TOL_NUMERIC_GUARD)
+ # NUMERIC_GUARD: Prevent division by zero when computing power mode ratio
+ ratio = f1 / max(f0, TOLERANCE.NUMERIC_GUARD)
self.assertAlmostEqual(
ratio,
expected_ratio_alpha1,
- places=9,
+ places=TOLERANCE.DECIMAL_PLACES_STANDARD,
msg=f"Alpha=1 fallback ratio mismatch tau={tau} ratio={ratio} expected={expected_ratio_alpha1}",
)
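The expected fallback ratio is simply the α = 1 attenuation (a sketch; the `duration_ratio` value is illustrative):

```python
# Invalid exit_power_tau falls back to alpha = 1, so the expected attenuation
# ratio reduces to f(dr) / f(0) = (1 + dr) ** -1.
duration_ratio = 1.5  # illustrative value
expected_ratio_alpha1 = 1.0 / (1.0 + duration_ratio)
assert abs(expected_ratio_alpha1 - 0.4) < 1e-12
```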
"""Invariant 105: Near-zero exit_half_life warns and returns factor≈base_factor (no attenuation)."""
base_factor = 60.0
pnl = 0.02
- pnl_target = self.TEST_PROFIT_AIM * self.TEST_RR_HIGH
+ pnl_target = PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO_HIGH
test_context = self.make_ctx(
pnl=pnl, trade_duration=50, max_unrealized_profit=0.03, min_unrealized_profit=0.0
)
params = self.base_params(exit_attenuation_mode="half_life", exit_half_life=hl)
with assert_diagnostic_warning(["exit_half_life", "close to 0"]):
_ = _get_exit_factor(
- base_factor, pnl, pnl_target, 0.0, test_context, params, self.TEST_RR_HIGH
+ base_factor,
+ pnl,
+ pnl_target,
+ 0.0,
+ test_context,
+ params,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
fdr = _get_exit_factor(
base_factor,
duration_ratio,
test_context,
params,
- self.TEST_RR_HIGH,
+ PARAMS.RISK_REWARD_RATIO_HIGH,
)
# Note: The expected value calculation needs adjustment since _get_exit_factor now computes
# pnl_target_coefficient and efficiency_coefficient internally
def test_feature_analysis_missing_reward_column():
+ """Verify feature analysis handles missing reward column gracefully.
+
+ **Setup:**
+ - DataFrame with reward column removed
+ - skip_partial_dependence: True
+
+ **Assertions:**
+ - importance_df is empty
+ - model_fitted is False
+ - n_features is 0
+ - model is None
+ """
df = _minimal_df().drop(columns=["reward"]) # remove reward
importance_df, stats, partial_deps, model = _perform_feature_analysis(
df, seed=SEEDS.FEATURE_EMPTY, skip_partial_dependence=True
def test_feature_analysis_empty_frame():
+ """Verify feature analysis handles empty DataFrame gracefully.
+
+ **Setup:**
+ - DataFrame with 0 rows
+ - skip_partial_dependence: True
+
+ **Assertions:**
+ - importance_df is empty
+ - n_features is 0
+ - model is None
+ """
df = _minimal_df(0) # empty
importance_df, stats, partial_deps, model = _perform_feature_analysis(
df, seed=SEEDS.FEATURE_EMPTY, skip_partial_dependence=True
def test_feature_analysis_single_feature_path():
+ """Verify feature analysis handles single feature DataFrame (stub path).
+
+ **Setup:**
+ - DataFrame with only 1 feature (pnl)
+ - skip_partial_dependence: True
+
+ **Assertions:**
+ - n_features is 1
+ - importance_mean is all NaN (stub path for single feature)
+ - model is None
+ """
df = pd.DataFrame({"pnl": np.random.normal(0, 1, 25), "reward": np.random.normal(0, 1, 25)})
importance_df, stats, partial_deps, model = _perform_feature_analysis(
df, seed=SEEDS.FEATURE_PRIME_11, skip_partial_dependence=True
def test_feature_analysis_nans_present_path():
+ """Verify feature analysis handles NaN values in features (stub path).
+
+ **Setup:**
+ - DataFrame with NaN values in trade_duration column
+ - 40 rows with alternating NaN values
+ - skip_partial_dependence: True
+
+ **Assertions:**
+ - model_fitted is False (NaN stub path)
+ - importance_mean is all NaN
+ - model is None
+ """
rng = np.random.default_rng(9)
df = pd.DataFrame(
{
def test_feature_analysis_model_fitting_failure(monkeypatch):
+ """Verify feature analysis handles model fitting failure gracefully.
+
+ Uses monkeypatch to force RandomForestRegressor.fit() to raise RuntimeError,
+ simulating model fitting failure.
+
+ **Setup:**
+ - Monkeypatch RandomForestRegressor.fit to raise RuntimeError
+ - DataFrame with 50 rows
+ - skip_partial_dependence: True
+
+ **Assertions:**
+ - model_fitted is False
+ - model is None
+ - importance_mean is all NaN
+ """
# Monkeypatch model fit to raise
from reward_space_analysis import RandomForestRegressor # type: ignore
def test_feature_analysis_permutation_failure_partial_dependence(monkeypatch):
+ """Verify feature analysis handles permutation_importance failure with partial dependence enabled.
+
+ Uses monkeypatch to force permutation_importance to raise RuntimeError,
+ while allowing partial dependence calculation to proceed.
+
+ **Setup:**
+ - Monkeypatch permutation_importance to raise RuntimeError
+ - DataFrame with 60 rows
+ - skip_partial_dependence: False
+
+ **Assertions:**
+ - model_fitted is True (model fits successfully)
+ - importance_mean is all NaN (permutation failed)
+ - partial_deps has at least 1 entry (PD still computed)
+ - model is not None
+ """
+
# Monkeypatch permutation_importance to raise while allowing partial dependence
def perm_boom(*a, **kw): # noqa: D401
raise RuntimeError("forced permutation failure")
def test_feature_analysis_success_partial_dependence():
+ """Verify feature analysis succeeds with partial dependence enabled.
+
+ Happy path test with sufficient data and all components working.
+
+ **Setup:**
+ - DataFrame with 70 rows
+ - skip_partial_dependence: False
+
+ **Assertions:**
+ - At least one non-NaN importance value
+ - model_fitted is True
+ - partial_deps has at least 1 entry
+ - model is not None
+ """
df = _minimal_df(70)
importance_df, stats, partial_deps, model = _perform_feature_analysis(
df, seed=SEEDS.FEATURE_PRIME_47, skip_partial_dependence=False
statistical_hypothesis_tests,
)
+from ..constants import (
+ PARAMS,
+ SCENARIOS,
+ SEEDS,
+ STAT_TOL,
+ STATISTICAL,
+ TOLERANCE,
+)
from ..helpers import assert_diagnostic_warning
from ..test_base import RewardSpaceTestBase
except ImportError:
self.skipTest("sklearn not available; skipping feature analysis invariance test")
# Use existing helper to get synthetic stats df (small for speed)
- df = self.make_stats_df(n=120, seed=self.SEED, idle_pattern="mixed")
+ df = self.make_stats_df(n=120, seed=SEEDS.BASE, idle_pattern="mixed")
importance_df, analysis_stats, partial_deps, model = _perform_feature_analysis(
- df, seed=self.SEED, skip_partial_dependence=True, rf_n_jobs=1, perm_n_jobs=1
+ df, seed=SEEDS.BASE, skip_partial_dependence=True, rf_n_jobs=1, perm_n_jobs=1
)
self.assertIsInstance(importance_df, pd.DataFrame)
self.assertIsInstance(analysis_stats, dict)
def test_statistics_binned_stats_invalid_bins_raises(self):
"""Invariant 110: _binned_stats must raise ValueError for <2 bin edges."""
- df = self.make_stats_df(n=50, seed=self.SEED)
+ df = self.make_stats_df(n=50, seed=SEEDS.BASE)
with self.assertRaises(ValueError):
_binned_stats(df, "idle_duration", "reward_idle", [0.0]) # single edge invalid
# Control: valid case should not raise and produce frame
def test_statistics_correlation_dropped_constant_columns(self):
"""Invariant 111: constant columns are listed in correlation_dropped and excluded."""
- df = self.make_stats_df(n=90, seed=self.SEED)
+ df = self.make_stats_df(n=90, seed=SEEDS.BASE)
# Force some columns constant
df.loc[:, "reward_hold"] = 0.0
df.loc[:, "idle_duration"] = 5.0
key = f"{feature}_{suffix}"
if key in metrics:
self.assertPlacesEqual(
- float(metrics[key]), 0.0, places=12, msg=f"Expected 0 for {key}"
+ float(metrics[key]),
+ 0.0,
+ places=TOLERANCE.DECIMAL_PLACES_STRICT,
+ msg=f"Expected 0 for {key}",
)
p_key = f"{feature}_ks_pvalue"
if p_key in metrics:
self.assertPlacesEqual(
- float(metrics[p_key]), 1.0, places=12, msg=f"Expected 1.0 for {p_key}"
+ float(metrics[p_key]),
+ 1.0,
+ places=TOLERANCE.DECIMAL_PLACES_STRICT,
+ msg=f"Expected 1.0 for {p_key}",
)
def test_statistics_distribution_shift_metrics(self):
if name.endswith(("_kl_divergence", "_js_distance", "_wasserstein")):
self.assertLess(
abs(val),
- self.TOL_GENERIC_EQ,
+ TOLERANCE.GENERIC_EQ,
f"Metric {name} expected ≈ 0 on identical distributions (got {val})",
)
elif name.endswith("_ks_statistic"):
- from ..constants import STAT_TOL
-
self.assertLess(
abs(val),
STAT_TOL.KS_STATISTIC_IDENTITY,
for key in ["reward_mean", "reward_std", "pnl_mean", "pnl_std"]:
if key in diagnostics:
self.assertAlmostEqualFloat(
- float(diagnostics[key]), 0.0, tolerance=self.TOL_IDENTITY_RELAXED
+ float(diagnostics[key]), 0.0, tolerance=TOLERANCE.IDENTITY_RELAXED
)
# Skewness & kurtosis fallback to INTERNAL_GUARDS['distribution_constant_fallback_moment'] (0.0)
for key in ["reward_skewness", "reward_kurtosis", "pnl_skewness", "pnl_kurtosis"]:
if key in diagnostics:
self.assertAlmostEqualFloat(
- float(diagnostics[key]), 0.0, tolerance=self.TOL_IDENTITY_RELAXED
+ float(diagnostics[key]), 0.0, tolerance=TOLERANCE.IDENTITY_RELAXED
)
# Q-Q plot r2 fallback value
qq_key = next((k for k in diagnostics if k.endswith("_qq_r2")), None)
if qq_key is not None:
self.assertAlmostEqualFloat(
- float(diagnostics[qq_key]), 1.0, tolerance=self.TOL_IDENTITY_RELAXED
+ float(diagnostics[qq_key]), 1.0, tolerance=TOLERANCE.IDENTITY_RELAXED
)
# All diagnostic values finite
for k, v in diagnostics.items():
def test_statistical_hypothesis_tests_api_integration(self):
"""Test statistical_hypothesis_tests API integration with synthetic data."""
- base = self.make_stats_df(n=200, seed=self.SEED, idle_pattern="mixed")
+ base = self.make_stats_df(n=200, seed=SEEDS.BASE, idle_pattern="mixed")
base.loc[:149, ["reward_idle", "reward_hold", "reward_exit"]] = 0.0
results = statistical_hypothesis_tests(base)
self.assertIsInstance(results, dict)
self.assertAlmostEqualFloat(
metrics[js_key],
metrics_swapped[js_key_swapped],
- tolerance=self.TOL_IDENTITY_STRICT,
- rtol=self.TOL_RELATIVE,
+ tolerance=TOLERANCE.IDENTITY_STRICT,
+ rtol=TOLERANCE.RELATIVE,
)
def test_stats_variance_vs_duration_spearman_sign(self):
"""trade_duration up => pnl variance up (rank corr >0)."""
- from ..constants import SCENARIOS, STAT_TOL
-
rng = np.random.default_rng(99)
n = 250
trade_duration = np.linspace(1, SCENARIOS.DURATION_LONG, n)
def test_stats_scaling_invariance_distribution_metrics(self):
"""Equal scaling keeps KL/JS ≈0."""
- from ..constants import SCENARIOS, STAT_TOL
-
df1 = self._shift_scale_df(SCENARIOS.SAMPLE_SIZE_MEDIUM)
scale = 3.5
df2 = df1.copy()
len(df_a) + len(df_b)
)
self.assertAlmostEqualFloat(
- m_concat, m_weighted, tolerance=self.TOL_IDENTITY_STRICT, rtol=self.TOL_RELATIVE
+ m_concat, m_weighted, tolerance=TOLERANCE.IDENTITY_STRICT, rtol=TOLERANCE.RELATIVE
)
def test_stats_bh_correction_null_false_positive_rate(self):
"""Null: low BH discovery rate."""
- from ..constants import SCENARIOS
rng = np.random.default_rng(1234)
n = SCENARIOS.SAMPLE_SIZE_MEDIUM
if flags:
rate = sum(flags) / len(flags)
self.assertLess(
- rate, self.BH_FP_RATE_THRESHOLD, f"BH null FP rate too high under null: {rate:.3f}"
+ rate,
+ STATISTICAL.BH_FP_RATE_THRESHOLD,
+ f"BH null FP rate too high under null: {rate:.3f}",
)
def test_stats_half_life_monotonic_series(self):
def test_stats_hypothesis_seed_reproducibility(self):
"""Seed reproducibility for statistical_hypothesis_tests + bootstrap."""
- df = self.make_stats_df(n=300, seed=self.SEED, idle_pattern="mixed")
- r1 = statistical_hypothesis_tests(df, seed=self.SEED_REPRODUCIBILITY)
- r2 = statistical_hypothesis_tests(df, seed=self.SEED_REPRODUCIBILITY)
+ df = self.make_stats_df(n=300, seed=SEEDS.BASE, idle_pattern="mixed")
+ r1 = statistical_hypothesis_tests(df, seed=SEEDS.REPRODUCIBILITY)
+ r2 = statistical_hypothesis_tests(df, seed=SEEDS.REPRODUCIBILITY)
self.assertEqual(set(r1.keys()), set(r2.keys()))
for k in r1:
for field in ("p_value", "significant"):
self.assertEqual(v1, v2, f"Mismatch for {k}:{field}")
metrics = ["reward", "pnl"]
ci_a = bootstrap_confidence_intervals(
- df, metrics, n_bootstrap=self.BOOTSTRAP_DEFAULT_ITERATIONS, seed=self.SEED_BOOTSTRAP
+ df, metrics, n_bootstrap=STATISTICAL.BOOTSTRAP_DEFAULT_ITERATIONS, seed=SEEDS.BOOTSTRAP
)
ci_b = bootstrap_confidence_intervals(
- df, metrics, n_bootstrap=self.BOOTSTRAP_DEFAULT_ITERATIONS, seed=self.SEED_BOOTSTRAP
+ df, metrics, n_bootstrap=STATISTICAL.BOOTSTRAP_DEFAULT_ITERATIONS, seed=SEEDS.BOOTSTRAP
)
for metric in metrics:
m_a, lo_a, hi_a = ci_a[metric]
m_b, lo_b, hi_b = ci_b[metric]
self.assertAlmostEqualFloat(
- m_a, m_b, tolerance=self.TOL_IDENTITY_STRICT, rtol=self.TOL_RELATIVE
+ m_a, m_b, tolerance=TOLERANCE.IDENTITY_STRICT, rtol=TOLERANCE.RELATIVE
)
self.assertAlmostEqualFloat(
- lo_a, lo_b, tolerance=self.TOL_IDENTITY_STRICT, rtol=self.TOL_RELATIVE
+ lo_a, lo_b, tolerance=TOLERANCE.IDENTITY_STRICT, rtol=TOLERANCE.RELATIVE
)
self.assertAlmostEqualFloat(
- hi_a, hi_b, tolerance=self.TOL_IDENTITY_STRICT, rtol=self.TOL_RELATIVE
+ hi_a, hi_b, tolerance=TOLERANCE.IDENTITY_STRICT, rtol=TOLERANCE.RELATIVE
)
def test_stats_distribution_metrics_mathematical_bounds(self):
"""Mathematical bounds and validity of distribution shift metrics."""
- self.seed_all(self.SEED)
+ self.seed_all(SEEDS.BASE)
df1 = pd.DataFrame(
{
- "pnl": np.random.normal(0, self.TEST_PNL_STD, 500),
+ "pnl": np.random.normal(0, PARAMS.PNL_STD, 500),
"trade_duration": np.random.exponential(30, 500),
"idle_duration": np.random.gamma(2, 5, 500),
}
def test_stats_heteroscedasticity_pnl_validation(self):
"""PnL variance increases with trade duration (heteroscedasticity)."""
- from ..constants import SCENARIOS
df = simulate_samples(
params=self.base_params(max_trade_duration_candles=100),
num_samples=SCENARIOS.SAMPLE_SIZE_LARGE + 200,
- seed=self.SEED_HETEROSCEDASTICITY,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.HETEROSCEDASTICITY,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
exit_data = df[df["reward_exit"] != 0].copy()
if len(exit_data) < SCENARIOS.SAMPLE_SIZE_TINY:
)
variance_by_bin = exit_data.groupby("duration_bin")["pnl"].var().dropna()
if "Q1" in variance_by_bin.index and "Q4" in variance_by_bin.index:
- from ..constants import STAT_TOL
-
self.assertGreater(
variance_by_bin["Q4"],
variance_by_bin["Q1"] * STAT_TOL.VARIANCE_RATIO_THRESHOLD,
def test_stats_statistical_functions_bounds_validation(self):
"""All statistical functions respect bounds."""
- df = self.make_stats_df(n=300, seed=self.SEED, idle_pattern="all_nonzero")
+ df = self.make_stats_df(n=300, seed=SEEDS.BASE, idle_pattern="all_nonzero")
diagnostics = distribution_diagnostics(df)
for col in ["reward", "pnl", "trade_duration", "idle_duration"]:
if f"{col}_skewness" in diagnostics:
self.assertPValue(
diagnostics[f"{col}_shapiro_pval"], msg=f"Shapiro p-value bounds for {col}"
)
- hypothesis_results = statistical_hypothesis_tests(df, seed=self.SEED)
+ hypothesis_results = statistical_hypothesis_tests(df, seed=SEEDS.BASE)
for test_name, result in hypothesis_results.items():
if "p_value" in result:
self.assertPValue(result["p_value"], msg=f"p-value bounds for {test_name}")
def test_stats_benjamini_hochberg_adjustment(self):
"""BH adjustment adds p_value_adj & significant_adj with valid bounds."""
- from ..constants import SCENARIOS
df = simulate_samples(
params=self.base_params(max_trade_duration_candles=100),
num_samples=SCENARIOS.SAMPLE_SIZE_LARGE - 200,
- seed=self.SEED_HETEROSCEDASTICITY,
- base_factor=self.TEST_BASE_FACTOR,
- profit_aim=self.TEST_PROFIT_AIM,
- risk_reward_ratio=self.TEST_RR,
+ seed=SEEDS.HETEROSCEDASTICITY,
+ base_factor=PARAMS.BASE_FACTOR,
+ profit_aim=PARAMS.PROFIT_AIM,
+ risk_reward_ratio=PARAMS.RISK_REWARD_RATIO,
max_duration_ratio=2.0,
trading_mode="margin",
- pnl_base_std=self.TEST_PNL_STD,
- pnl_duration_vol_scale=self.TEST_PNL_DUR_VOL_SCALE,
+ pnl_base_std=PARAMS.PNL_STD,
+ pnl_duration_vol_scale=PARAMS.PNL_DUR_VOL_SCALE,
)
results_adj = statistical_hypothesis_tests(
- df, adjust_method="benjamini_hochberg", seed=self.SEED_REPRODUCIBILITY
+ df, adjust_method="benjamini_hochberg", seed=SEEDS.REPRODUCIBILITY
)
self.assertGreater(len(results_adj), 0)
for name, res in results_adj.items():
p_adj = res["p_value_adj"]
self.assertPValue(p_raw)
self.assertPValue(p_adj)
- self.assertGreaterEqual(p_adj, p_raw - self.TOL_IDENTITY_STRICT)
+ self.assertGreaterEqual(p_adj, p_raw - TOLERANCE.IDENTITY_STRICT)
alpha = 0.05
self.assertEqual(res["significant_adj"], bool(p_adj < alpha))
if "effect_size_epsilon_sq" in res:
def test_bootstrap_confidence_intervals_bounds_ordering(self):
"""Test bootstrap confidence intervals return ordered finite bounds."""
- test_data = self.make_stats_df(n=100, seed=self.SEED)
+ test_data = self.make_stats_df(n=100, seed=SEEDS.BASE)
results = bootstrap_confidence_intervals(test_data, ["reward", "pnl"], n_bootstrap=100)
for metric, (mean, ci_low, ci_high) in results.items():
self.assertFinite(mean, name=f"mean[{metric}]")
def test_stats_bootstrap_shrinkage_with_sample_size(self):
"""Bootstrap CI half-width decreases with larger sample (~1/sqrt(n) heuristic)."""
- from ..constants import SCENARIOS
small = self._shift_scale_df(SCENARIOS.SAMPLE_SIZE_SMALL - 20)
large = self._shift_scale_df(SCENARIOS.SAMPLE_SIZE_LARGE)
)
width = hi - lo
self.assertGreater(width, 0.0)
- from ..constants import STAT_TOL
-
self.assertLessEqual(
width, STAT_TOL.CI_WIDTH_EPSILON, "Width should be small epsilon range"
)
)
from .constants import (
- CONTINUITY,
- EXIT_FACTOR,
+ PARAMS,
PBRS,
SCENARIOS,
SEEDS,
- STATISTICAL,
TOLERANCE,
)
@classmethod
def setUpClass(cls):
"""Set up class-level constants."""
- cls.SEED = SEEDS.BASE
cls.DEFAULT_PARAMS = DEFAULT_MODEL_REWARD_PARAMETERS.copy()
- cls.TEST_SAMPLES = SCENARIOS.SAMPLE_SIZE_TINY
- cls.TEST_BASE_FACTOR = 100.0
- cls.TEST_PROFIT_AIM = 0.03
- cls.TEST_RR = 1.0
- cls.TEST_RR_HIGH = 2.0
- cls.TEST_PNL_STD = 0.02
- cls.TEST_PNL_DUR_VOL_SCALE = 0.5
- # Seeds for different test contexts
- cls.SEED_SMOKE_TEST = SEEDS.SMOKE_TEST
- cls.SEED_REPRODUCIBILITY = SEEDS.REPRODUCIBILITY
- cls.SEED_BOOTSTRAP = SEEDS.BOOTSTRAP
- cls.SEED_HETEROSCEDASTICITY = SEEDS.HETEROSCEDASTICITY
- # Statistical test thresholds
- cls.BOOTSTRAP_DEFAULT_ITERATIONS = SCENARIOS.BOOTSTRAP_EXTENDED_ITERATIONS
- cls.BH_FP_RATE_THRESHOLD = STATISTICAL.BH_FP_RATE_THRESHOLD
- cls.EXIT_FACTOR_SCALING_RATIO_MIN = EXIT_FACTOR.SCALING_RATIO_MIN
- cls.EXIT_FACTOR_SCALING_RATIO_MAX = EXIT_FACTOR.SCALING_RATIO_MAX
+ # Constants used in helper methods
+ cls.PBRS_TERMINAL_PROB = PBRS.TERMINAL_PROBABILITY
+ cls.PBRS_SWEEP_ITER = SCENARIOS.PBRS_SWEEP_ITERATIONS
+ cls.JS_DISTANCE_UPPER_BOUND = math.sqrt(math.log(2.0))
def setUp(self):
"""Set up test fixtures with reproducible random seed."""
- self.seed_all(self.SEED)
+ self.seed_all(SEEDS.BASE)
self.temp_dir = tempfile.mkdtemp()
self.output_path = Path(self.temp_dir)
"""Clean up temporary files."""
shutil.rmtree(self.temp_dir, ignore_errors=True)
- # ===============================================
- # Constants imported from tests.constants module
- # ===============================================
-
- # Tolerance constants
- TOL_IDENTITY_STRICT = TOLERANCE.IDENTITY_STRICT
- TOL_IDENTITY_RELAXED = TOLERANCE.IDENTITY_RELAXED
- TOL_GENERIC_EQ = TOLERANCE.GENERIC_EQ
- TOL_NUMERIC_GUARD = TOLERANCE.NUMERIC_GUARD
- TOL_NEGLIGIBLE = TOLERANCE.NEGLIGIBLE
- TOL_RELATIVE = TOLERANCE.RELATIVE
- TOL_DISTRIB_SHAPE = TOLERANCE.DISTRIB_SHAPE
-
- # PBRS constants
- PBRS_TERMINAL_TOL = PBRS.TERMINAL_TOL
- PBRS_MAX_ABS_SHAPING = PBRS.MAX_ABS_SHAPING
-
- # Continuity constants
- CONTINUITY_EPS_SMALL = CONTINUITY.EPS_SMALL
- CONTINUITY_EPS_LARGE = CONTINUITY.EPS_LARGE
-
- # Exit factor constants
- MIN_EXIT_POWER_TAU = EXIT_FACTOR.MIN_POWER_TAU
-
- # Test-specific constants
- PBRS_TERMINAL_PROB = PBRS.TERMINAL_PROBABILITY
- PBRS_SWEEP_ITER = SCENARIOS.PBRS_SWEEP_ITERATIONS
- JS_DISTANCE_UPPER_BOUND = math.sqrt(math.log(2.0))
-
def make_ctx(
self,
*,
apply_potential_shaping(
base_reward=0.0,
current_pnl=current_pnl,
- pnl_target=self.TEST_PROFIT_AIM * self.TEST_RR,
+ pnl_target=PARAMS.PROFIT_AIM * PARAMS.RISK_REWARD_RATIO,
current_duration_ratio=current_dur,
next_pnl=next_pnl,
next_duration_ratio=next_dur,
"""
if seed is not None:
self.seed_all(seed)
- pnl_std_eff = self.TEST_PNL_STD if pnl_std is None else pnl_std
+ pnl_std_eff = PARAMS.PNL_STD if pnl_std is None else pnl_std
reward = np.random.normal(reward_mean, reward_std, n)
pnl = np.random.normal(pnl_mean, pnl_std_eff, n)
if trade_duration_dist == "exponential":
self.assertFinite(first, name="a")
self.assertFinite(second, name="b")
if tolerance is None:
- tolerance = self.TOL_GENERIC_EQ
+ tolerance = TOLERANCE.GENERIC_EQ
diff = abs(first - second)
if diff <= tolerance:
return
if rtol is not None:
- scale = max(abs(first), abs(second), self.TOL_NEGLIGIBLE)
+ scale = max(abs(first), abs(second), TOLERANCE.NEGLIGIBLE)
if diff <= rtol * scale:
return
self.fail(
Uses strict identity tolerance by default for PBRS invariance style checks.
"""
self.assertFinite(value, name="value")
- tol = atol if atol is not None else self.TOL_IDENTITY_RELAXED
+ tol = atol if atol is not None else TOLERANCE.IDENTITY_RELAXED
if abs(float(value)) > tol:
self.fail(msg or f"Value {value} not near zero (tol={tol})")
def _make_idle_variance_df(self, n: int = 100) -> pd.DataFrame:
"""Synthetic dataframe focusing on idle_duration ↔ reward_idle correlation."""
- self.seed_all(self.SEED)
+ self.seed_all(SEEDS.BASE)
idle_duration = np.random.exponential(10, n)
reward_idle = -0.01 * idle_duration + np.random.normal(0, 0.001, n)
return pd.DataFrame(
"reward_idle": reward_idle,
"position": np.random.choice([0.0, 0.5, 1.0], n),
"reward": np.random.normal(0, 1, n),
- "pnl": np.random.normal(0, self.TEST_PNL_STD, n),
+ "pnl": np.random.normal(0, PARAMS.PNL_STD, n),
"trade_duration": np.random.exponential(20, n),
}
)