Piment Noir Git Repositories - freqai-strategies.git/commit
fix: eliminate data leakage in extrema weighting normalization (#30)
author Jérôme Benoit <jerome.benoit@piment-noir.org>
Sat, 3 Jan 2026 20:48:51 +0000 (21:48 +0100)
committer GitHub <noreply@github.com>
Sat, 3 Jan 2026 20:48:51 +0000 (21:48 +0100)
commit 69841c859496055682d6692a7a1505c5d0db7727
tree deaba08f73c67453ac497cd639a187cff45cadc9
parent 79c6eae04a9c6eee53aab60025576412ae4b3b5d

* fix: eliminate data leakage in extrema weighting normalization

Move dataset-dependent scaling from strategy (pre-split) to model label
pipeline (post-split) to prevent train/test data leakage.
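
To illustrate the leakage described above, here is a minimal sketch, assuming a simple chronological split and min-max scaling. The label values and helper names are illustrative and are not taken from the repository.

```python
import numpy as np

# Hypothetical label series: the test-period values exceed anything seen in training.
labels = np.array([0.0, 1.0, 2.0, 3.0, 10.0, 12.0])
train, test = labels[:4], labels[4:]


def minmax_stats(values):
    """Return the (min, max) pair used for scaling."""
    return float(np.min(values)), float(np.max(values))


def minmax_scale(values, lo, hi):
    """Scale values to [0, 1] relative to the given statistics."""
    return (values - lo) / (hi - lo)


# Pre-split scaling: statistics come from the full series, so the
# training labels are scaled with the test-period maximum (leakage).
leaky_lo, leaky_hi = minmax_stats(labels)
leaky_train = minmax_scale(train, leaky_lo, leaky_hi)

# Post-split scaling: statistics come from the training slice only.
clean_lo, clean_hi = minmax_stats(train)
clean_train = minmax_scale(train, clean_lo, clean_hi)
clean_test = minmax_scale(test, clean_lo, clean_hi)  # may fall outside [0, 1]
```

In the pre-split variant the largest training label is scaled to 0.25 because the maximum of 12.0 comes from the test period. In the post-split variant it is scaled to 1.0 and the test labels fall outside the fitted range.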

Changes:
- Add ExtremaWeightingTransformer (datasieve BaseTransform) in Utils.py
  that fits standardization/normalization stats on training data only
- Add define_label_pipeline() in QuickAdapterRegressorV3 that replaces
  FreqAI's default MinMaxScaler with our configurable transformer
- Simplify strategy's set_freqai_targets() to pass raw weighted extrema
  without any normalization (normalization now happens post-split)
- Remove pre-split normalization functions from Utils.py (~107 lines)

The transformer supports:
- Standardization: zscore, robust, mmad, none
- Normalization: minmax, sigmoid, none (all mathematically invertible)
- Configurable minmax_range (default [-1, 1] per FreqAI convention)
- Correct inverse_transform for prediction recovery
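
The fit/transform/inverse round trip can be sketched as follows, assuming zscore standardization followed by minmax normalization. `LabelScalerSketch` is a hypothetical stand-in written for this illustration. It does not inherit from datasieve's BaseTransform and is not the repository's ExtremaWeightingTransformer.

```python
import numpy as np


class LabelScalerSketch:
    """Illustrative stand-in, NOT the repository's transformer.

    Applies z-score standardization, then min-max normalization, using
    statistics computed only from the data passed to fit().
    """

    def __init__(self, minmax_range=(-1.0, 1.0)):
        self.minmax_range = minmax_range
        self._mean = 0.0
        self._std = 1.0
        self._min = 0.0
        self._span = 1.0

    def fit(self, values):
        values = np.asarray(values, dtype=float)
        finite = values[np.isfinite(values)]  # ignore NaN/inf when fitting
        self._mean = finite.mean()
        std = finite.std(ddof=1)  # sample standard deviation
        self._std = std if std > 0 else 1.0
        z = (finite - self._mean) / self._std
        self._min = z.min()
        span = z.max() - z.min()
        self._span = span if span > 0 else 1.0
        return self  # returned for chaining in this sketch only

    def transform(self, values):
        values = np.asarray(values, dtype=float)
        low, high = self.minmax_range
        z = (values - self._mean) / self._std
        return low + (z - self._min) / self._span * (high - low)

    def inverse_transform(self, values):
        values = np.asarray(values, dtype=float)
        low, high = self.minmax_range
        z = (values - low) / (high - low) * self._span + self._min
        return z * self._std + self._mean


# Fit on training labels only, then apply to unseen values and invert.
train_labels = np.array([1.0, 2.0, 3.0, 4.0])
unseen = np.array([0.5, 5.0])

scaler = LabelScalerSketch().fit(train_labels)
scaled_train = scaler.transform(train_labels)
recovered = scaler.inverse_transform(scaler.transform(unseen))
```

Because every step is an affine map with stored statistics, `inverse_transform` recovers the original values even for inputs outside the fitted range.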

BREAKING CHANGES:
- softmax normalization removed
- l1, l2, rank normalization removed (not mathematically invertible)
- rank_method config option removed
- extrema_weighting config now processed in model pipeline instead of strategy
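
One way to read "not mathematically invertible" is that these normalizations are many-to-one: different inputs map to the same output, so the original values cannot be recovered from the output alone. This is a sketch with hypothetical helper names, not the removed repository code.

```python
import numpy as np


def ordinal_rank(values):
    """Return 0-based ranks (ties are not handled; illustration only)."""
    return np.argsort(np.argsort(values))


def l2_normalize(values):
    """Divide a vector by its Euclidean norm."""
    return values / np.linalg.norm(values)


# Two different inputs collapse to the same ranks.
ranks_a = ordinal_rank(np.array([1.0, 2.0, 3.0]))
ranks_b = ordinal_rank(np.array([10.0, 20.0, 300.0]))

# Rescaling the input leaves its L2-normalized form unchanged,
# so the original scale is lost unless it is stored separately.
unit_a = l2_normalize(np.array([3.0, 4.0]))
unit_b = l2_normalize(np.array([6.0, 8.0]))
```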

* chore: remove dead rank_method log line

* chore: cleanup unused imports and comments

* refactor(quickadapter): cleanup extrema weighting implementation

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix: use robust_quantiles config in transformer fit()

* style: align with codebase conventions (error messages, near-zero detection)

* refactor: remove redundant checks

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix: update config validation for transformer pipeline

- Remove obsolete aggregation+normalization warning (no longer applies post-refactor)
- Change standardization+normalization=none from error to warning

* refactor: cleanup ExtremaWeightingTransformer implementation

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: cleanup ExtremaWeightingTransformer implementation

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: remove redundant configuration extraction in ExtremaWeightingTransformer

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: align ExtremaWeightingTransformer with BaseTransform API

- Call super().__init__() with name parameter
- Match method signatures exactly (npt.ArrayLike, ArrayOrNone, ListOrNone)
- Return tuple from fit() instead of self
- Import types from same namespaces as BaseTransform

* refactor: cleanup type hints

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: remove unnecessary type casts and annotations

Let numpy types flow naturally without explicit float()/int() casts.

* refactor: avoid shadowing Python's built-in range

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: cleanup extrema weighting transformer implementation

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: cleanup extrema weighting and smoothing config handling

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: cleanup extrema weighting transformer

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: filter non-finite values in ExtremaWeightingTransformer

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: use scipy.special.logit for inverse sigmoid transformation

Replace manual inverse sigmoid calculation (-np.log(1.0 / values - 1.0))
with scipy.special.logit() for better code clarity and consistency.

- Uses official scipy function that is the documented inverse of expit
- Mathematically equivalent to the previous implementation
- Improves code readability and maintainability
- Maintains symmetry: sp.special.expit() <-> sp.special.logit()

Also improve comment clarity for standardization identity function.
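
The equivalence claimed above can be checked numerically. This is a small sketch with illustrative input values, using `scipy.special.expit` and `scipy.special.logit`.

```python
import numpy as np
from scipy.special import expit, logit

# Illustrative inputs; expit maps them into the open interval (0, 1).
x = np.array([-3.0, -0.5, 0.0, 0.5, 3.0])
p = expit(x)

# Previous manual inverse sigmoid, as quoted in the commit message.
manual_inverse = -np.log(1.0 / p - 1.0)

# Library inverse: logit(p) = log(p / (1 - p)).
library_inverse = logit(p)
```

Both expressions recover `x` up to floating-point error, since -log(1/p - 1) = log(p / (1 - p)).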

* docs: update README.md

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: remove unused _n_train attribute from ExtremaWeightingTransformer

The _n_train attribute was being set during fit() but never used
elsewhere in the class or by the BaseTransform interface. Removing
it to reduce code clutter and improve maintainability.

* fix: import paths correction

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix: add Bessel correction and ValueError consistency in ExtremaWeightingTransformer

- Use ddof=1 for std computation (sample std instead of population std)
- Add ValueError in _inverse_standardize for unknown methods
- Add ValueError in _inverse_normalize for unknown methods
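
For reference, the effect of the Bessel correction on a small sample is shown in this sketch. The sample values are illustrative.

```python
import numpy as np

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

# ddof=0 (NumPy default): divide the squared deviations by n.
population_std = np.std(sample)

# ddof=1: divide by n - 1, the sample (Bessel-corrected) estimate.
sample_std = np.std(sample, ddof=1)
```

The corrected estimate is always at least as large as the population formula. The gap shrinks as the sample grows.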

* chore: refine config-template.json for extrema weighting options

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* chore: refine extrema weighting configuration in config-template.json

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* chore: remove hybrid extrema weighting source weights

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix: remove unreachable dead code in compute_extrema_weights

* docs: refine README extrema weighting descriptions

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
---------

Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
README.md
quickadapter/user_data/config-template.json
quickadapter/user_data/freqaimodels/QuickAdapterRegressorV3.py
quickadapter/user_data/strategies/ExtremaWeightingTransformer.py [new file with mode: 0644]
quickadapter/user_data/strategies/QuickAdapterV3.py
quickadapter/user_data/strategies/Utils.py