Jérôme Benoit [Mon, 26 Jan 2026 20:27:52 +0000 (21:27 +0100)]
refactor(quickadapter): harmonize parameter naming in threshold computation
- Rename extrema_selection → selection_method to match tunable name
- Rename pred_extrema → pred_label for consistency across methods
- Rename n_extrema → n_values in _build_weights_array (generic function)
- Fix bug: use default_weight param instead of constant in early return
Jérôme Benoit [Mon, 26 Jan 2026 12:23:51 +0000 (13:23 +0100)]
refactor(quickadapter): add format_dict helper and improve numeric formatting
- Add format_dict() with singledispatch for type-safe dict/params formatting
- Refactor format_number() with unified significant digits formula
- Replace raw dict logging with format_dict() across strategy and model
- Remove redundant _format_label_method_config method
- Bump version to 3.11.1
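To illustrate the singledispatch pattern behind format_dict() (a sketch only; the actual Utils.py helper and its significant-digits rule may differ):

```python
from functools import singledispatch
from typing import Any

@singledispatch
def format_dict(value: Any) -> str:
    # Scalar fallback; the ".4g" significant-digits rule is an assumed
    # stand-in for the real format_number() formula.
    return f"{value:.4g}" if isinstance(value, float) else str(value)

@format_dict.register
def _(value: dict) -> str:
    # Recurse into nested dicts so params logging stays readable.
    return "{" + ", ".join(f"{k}: {format_dict(v)}" for k, v in value.items()) + "}"

# format_dict({"learning_rate": 0.030000000000000002, "depth": 6})
# -> "{learning_rate: 0.03, depth: 6}"
```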
- Refactor label processing into 4 orthogonal phases:
1. Weighting: apply weights to raw label values per column
2. Smoothing: smooth weighted values per column
3. Pipeline: LabelTransformer standardization per column
4. Prediction: threshold calculation per column
- Loop over LABEL_COLUMNS for weighting and smoothing in set_freqai_targets()
- Loop over dk.label_list for thresholds in fit_live_predictions()
- All config helpers return {default, columns} structure with glob pattern support
- Rename ExtremaWeightingTransformer to LabelTransformer
- Harmonize namespace: label_weighting, label_smoothing, label_prediction
- Backward compatible with flat configs and legacy column names
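The {default, columns} structure with glob support could look like the following sketch (key names and the resolve helper are illustrative assumptions; the shipped helper is get_column_config, later renamed get_label_column_config):

```python
import fnmatch
from typing import Any

label_smoothing = {
    "default": {"method": "gaussian", "window_candles": 5},
    "columns": {"*-maxima": {"window_candles": 9}},  # glob pattern override
}

def resolve_column_config(cfg: dict[str, Any], column: str) -> dict[str, Any]:
    resolved = dict(cfg["default"])
    for pattern, overrides in cfg.get("columns", {}).items():
        if fnmatch.fnmatch(column, pattern):
            resolved.update(overrides)
    return resolved

# resolve_column_config(label_smoothing, "btc-maxima")
# -> {"method": "gaussian", "window_candles": 9}
```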
* refactor: remove deprecated internal APIs
Remove unused deprecated functions that were replaced by the orthogonal
label processing architecture:
- get_label_transformer_config() from Utils.py
- get_label_transformer_config import from QuickAdapterV3.py
- extrema_smoothing property from QuickAdapterV3.py
* refactor: use DEFAULTS_LABEL_PREDICTION for outlier_quantile fallback
* refactor: make label weighting generic with metrics dict
- compute_label_weights() takes generic metrics dict instead of hardcoded params
- _compute_combined_weights() takes generic metrics dict
- apply_label_weighting() takes generic metrics dict
- Caller builds the metrics dict, making weighting generic across any label
* refactor: centralize deprecation handling with PARAM_DEPRECATIONS table
* refactor: call resolve_deprecated_params once at startup
- Change resolve_deprecated_params to modify dict in-place (returns None)
- Centralize all deprecation calls in bot_start() and __init__()
- Remove calls from properties and utility functions that run multiple times
- This ensures deprecation warnings are logged once, not repeatedly
* fix: resolve deprecations in __init__ before regressor loads
- Move deprecation resolution from bot_start() to Strategy.__init__()
so it runs before FreqaiModel.__init__() (which reads the same config)
- Remove label_transformer legacy support (never released)
- Simplify label_weighting/label_pipeline properties
- Keep regressor-specific deprecations in regressor __init__
* fix: address PR review comments
- Fix label_weighting['strategy'] KeyError by using ['default']['strategy']
- Respect label_prediction.method='none' in min_max_pred()
- Use float('inf') specificity for exact matches in get_column_config
- Reuse Utils.get_column_config in LabelTransformer
- Default label_smoothing method to 'gaussian'
* refactor: unify threshold column naming and soft_extremum_alpha
- Remove MINIMA_THRESHOLD_COLUMN/MAXIMA_THRESHOLD_COLUMN constants
- Use uniform {label}_minima_threshold/{label}_maxima_threshold for all labels
- Rename internal soft_alpha to soft_extremum_alpha for consistency with config
- Remove redundant docstrings from LabelTransformer (code is self-documenting)
* refactor: cleanup docstrings and rename internal functions
* refactor: make label_pipeline orthogonal from label_weighting
* refactor: rename get_column_config to get_label_column_config
* fix: add missing method field to label_prediction logging
* refactor: per-column logging and deprecate label_smoothing.window
- Update logging in QuickAdapterV3 and QuickAdapterRegressorV3 to show
resolved per-column configs instead of just defaults with override keys
- Move get_label_column_config() to LabelTransformer.py (re-export from Utils)
- Add deprecation mapping for label_smoothing.window -> window_candles
- Fix extrema_direction undefined variable bug in populate_any_indicators
* fix: correct deprecation mappings for label_prediction params
* refactor: move label_pipeline property and logging to regressor
- Move label_pipeline property from QuickAdapterV3 strategy to QuickAdapterRegressorV3
- Move Pipeline configuration logging from _log_strategy_configuration() to
_log_model_configuration()
- Simplify define_label_pipeline() to use self.label_pipeline property
- Remove unused get_label_pipeline_config import from strategy
- Rename local variable label_weighting to label_weighting_raw for consistency
* fix: import get_label_column_config from LabelTransformer
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor(quickadapter): replace string literals with constant references in LabelTransformer
* refactor(quickadapter): use per-column prediction config in regressor and strategy
* fix: reference correct config paths for label processing
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix(quickadapter): warn when column doesn't match any config pattern
* feat(deprecation): support cross-section parameter moves
Extend PARAM_DEPRECATIONS to handle parameters that moved between
config sections, not just renames within the same section.
- Add tuple[str, str] value type for (old_section, old_key) moves
- Add root_config parameter to resolve_deprecated_params()
- Add deprecation entries for 7 params moved from label_weighting
to label_pipeline: standardization, robust_quantiles,
mmad_scaling_factor, normalization, minmax_range, sigmoid_scale,
gamma
- Add call sites in QuickAdapterV3 and QuickAdapterRegressorV3
* refactor(quickadapter): replace imperative deprecation handling with declarative path-based migrations
- Replace PARAM_DEPRECATIONS dict and resolve_deprecated_params() with
CONFIG_MIGRATIONS tuple and migrate_config()
- Single migrate_config() call in __init__ replaces 6+ resolve_deprecated_params() calls
- Fix bug in set_freqai_targets: move maxima/minima column creation after weighting
- Fix DI_value_param assignment to only occur when Weibull fit succeeds
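A sketch of what such a declarative, path-based table can look like (entry format and key paths are assumptions inferred from the commits, not the shipped CONFIG_MIGRATIONS):

```python
import logging

logger = logging.getLogger(__name__)

CONFIG_MIGRATIONS: tuple[tuple[str, str], ...] = (
    # (deprecated dotted path, replacement dotted path); cross-section
    # moves are simply paths whose section component differs.
    ("label_weighting.standardization", "label_pipeline.standardization"),
    ("label_smoothing.window", "label_smoothing.window_candles"),
)

def migrate_config(config: dict) -> None:
    """Migrate deprecated keys in place; a single call from __init__ suffices."""
    for old_path, new_path in CONFIG_MIGRATIONS:
        old_section, _, old_key = old_path.rpartition(".")
        new_section, _, new_key = new_path.rpartition(".")
        section = config.get(old_section)
        if isinstance(section, dict) and old_key in section:
            logger.warning("'%s' is deprecated, use '%s' instead", old_path, new_path)
            config.setdefault(new_section, {}).setdefault(new_key, section.pop(old_key))
```

Because the migration runs once and mutates the dict in place, every later reader (strategy, model, regressor) sees only the new keys, and each deprecation warning is logged exactly once.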
* refactor(validation): replace imperative validation with declarative system
- Add dataclass-based validators (_EnumValidator, _NumericValidator, etc.)
- Replace ~240 lines of repetitive validation code with _validate_params()
- Consolidate type aliases in LabelTransformer.py (avoid duplicates)
- Fix pyright errors: float() casts, np.asarray() for pmean returns
- Use np.nan as default for optuna .get() (proper 'no value' sentinel)
- Add pyright to requirements-dev.txt
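A minimal sketch of the dataclass-validator idea (field names on _NumericValidator/_EnumValidator and the example parameter ranges are assumptions; the error format follows the codebase's 'Invalid {param} value {value}' convention):

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class _NumericValidator:
    min_value: float = float("-inf")
    max_value: float = float("inf")

    def validate(self, name: str, value: Any) -> None:
        if not isinstance(value, (int, float)) or not (
            self.min_value <= value <= self.max_value
        ):
            raise ValueError(f"Invalid {name} value {value}")

@dataclass(frozen=True)
class _EnumValidator:
    allowed: frozenset

    def validate(self, name: str, value: Any) -> None:
        if value not in self.allowed:
            raise ValueError(f"Invalid {name} value {value}")

_VALIDATORS = {
    "soft_extremum_alpha": _NumericValidator(0.0, 100.0),  # range assumed
    "method": _EnumValidator(frozenset({"gaussian", "none"})),
}

def _validate_params(params: dict) -> None:
    for name, value in params.items():
        validator = _VALIDATORS.get(name)
        if validator is not None:
            validator.validate(name, value)
```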
* chore(ReforceXY): add pyright to dev dependencies
- Move DI_value stats computation before label loop
- Unify warmed_up conditional to single if/else block
- Always set threshold values (defaults when not warmed up)
* refactor(quickadapter): add OPTUNA_*_DEFAULT constants and fix static member access
- Add OPTUNA_*_DEFAULT class constants for n_jobs, n_trials, timeout,
n_startup_trials, min_resource, label_candles_step, space_reduction,
space_fraction, and seed
- Update _optuna_config property to use constants instead of hardcoded values
- Update all .get() calls to use constants as defaults for type safety
- Fix static method/property access: use QuickAdapterRegressorV3.method()
instead of self.method() for static members
- Add assertions for narrowing Optional types (weights)
- Fix min_max_pred signature to accept Optional[int] for label_period_candles
Reduces pyright errors from 174 to 158 (-16)
* fix(quickadapter): default label_prediction method to 'thresholding' for backward compatibility
DEFAULTS_LABEL_PREDICTION['method'] was 'none', which broke backward
compatibility: legacy configs without an explicit method would skip
threshold computation. Changed to 'thresholding' to preserve the historical
behavior where thresholds were always computed by default.
Jérôme Benoit [Mon, 12 Jan 2026 15:58:52 +0000 (16:58 +0100)]
perf(quickadapter): eliminate ~15k np.log() recalculations via pure log space (#41)
* perf(zigzag): eliminate ~15k np.log() recalculations via pure log space
Comprehensive optimization of zigzag() function to operate entirely in
logarithmic space, eliminating redundant np.log() recalculations.
**Performance Impact:**
- ~11,000-15,000 fewer np.log() calls per zigzag() execution
- Pre-computation: ~10,000 calls eliminated
- Pure log space conversion: ~1,050-5,100 calls eliminated
**Implementation Changes:**
Utils.py (zigzag function):
- Pre-compute log arrays once: closes_log, highs_log, lows_log (L1195-1199)
- Convert update_candidate_pivot() to accept log values (L1245)
- Convert add_pivot() to accept log values (L1401)
- Convert initial phase to log space (L1531-1569)
- Convert main loop comparisons to log space (L1583-1615)
- Rename top_change_percent() → top_log_return() (L813)
- Rename bottom_change_percent() → bottom_log_return() (L834)
- Convert efficiency ratio calculations to log space (L1343, L1368)
**API Changes:**
- zigzag() now returns pivots_values_log instead of pivots_values
- calculate_pivot_metrics() accepts log values directly
**Callers Updated:**
- QuickAdapterV3.py: Use renamed functions, add TODO comments (L674, L676, L702)
- QuickAdapterRegressorV3.py: Use len(pivots_indices) instead of len(pivots_values) (L3350, L3396)
**Mathematical Correctness:**
- Maintains semantic equivalence via log monotonicity: a > b ⟺ log(a) > log(b)
- Provides symmetric treatment of returns in log space
- All comparisons and calculations mathematically equivalent
**Breaking Changes (Future):**
- Added TODO comments for feature renaming (requires model retraining)
- %-tcp-period → %-top_log_return-period
- %-bcp-period → %-bottom_log_return-period
- %-close_pct_change → %-close_log_return
* refactor(zigzag): harmonize log variable naming to _log suffix
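The whole optimization rests on log monotonicity and the additivity of log returns; a small numpy check of the equivalences claimed above:

```python
import numpy as np

closes = np.array([100.0, 105.0, 103.0])
closes_log = np.log(closes)  # computed once up front, reused everywhere

# Monotonicity: a > b <=> log(a) > log(b), so pivot comparisons are unchanged.
assert (closes[1] > closes[0]) == (closes_log[1] > closes_log[0])

# A percent change vs. its log-return counterpart
# (cf. top_change_percent() -> top_log_return()):
pct_change = closes[1] / closes[0] - 1.0    # 0.05
log_return = closes_log[1] - closes_log[0]  # log(1.05) ~= 0.0488

# Symmetry: an up-move and its exact retracement cancel in log space.
assert np.isclose(np.log(105.0 / 100.0) + np.log(100.0 / 105.0), 0.0)
```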
Jérôme Benoit [Mon, 12 Jan 2026 01:43:33 +0000 (02:43 +0100)]
fix: prevent division by zero in price_retracement_percent when prior range collapses
- Replace linear price change calculations with log-space formulas in top_change_percent, bottom_change_percent, and price_retracement_percent
- Add masked division with np.isclose() guard in price_retracement_percent to handle flat prior windows (returns 0.0 when denominator ≈ 0)
- Migrate zigzag amplitude and threshold calculations to log-space for numerical stability
- Remove normalization (x/(1+x)) from zigzag amplitude and speed metrics (now unbounded in log units)
- Update %-close_pct_change feature from pct_change() to log().diff() for consistency
- Bump version to 3.10.10
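A sketch of the masked-division guard described above (the helper name is hypothetical; only the np.isclose() pattern is from the commit):

```python
import numpy as np

def safe_ratio(numerator: np.ndarray, denominator: np.ndarray) -> np.ndarray:
    # Returns 0.0 wherever the denominator is ~0 (flat prior window),
    # instead of emitting inf/nan from a division by zero.
    out = np.zeros_like(numerator, dtype=float)
    valid = ~np.isclose(denominator, 0.0)
    np.divide(numerator, denominator, out=out, where=valid)
    return out
```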
Jérôme Benoit [Fri, 9 Jan 2026 12:48:49 +0000 (13:48 +0100)]
fix(quickadapter): preserve rsm parameter for CatBoost GPU pairwise modes (#37)
* fix(quickadapter): preserve rsm parameter for CatBoost GPU pairwise modes
The previous fix unconditionally removed the rsm parameter when using GPU,
but according to CatBoost documentation, rsm IS supported on GPU for
pairwise loss functions (PairLogit and PairLogitPairwise).
This commit refines the logic to only remove rsm for non-pairwise modes
on GPU, allowing users to benefit from rsm optimization when using
pairwise ranking loss functions.
* refactor(quickadapter): define _CATBOOST_GPU_RSM_LOSS_FUNCTIONS as global constant
- Define _CATBOOST_GPU_RSM_LOSS_FUNCTIONS as a reusable global constant
- Remove duplicate definitions in fit_regressor() and get_optuna_study_model_parameters()
- Improves maintainability: single source of truth for GPU rsm compatibility
- Ensures consistency between runtime logic and Optuna hyperparameter search
* chore: bump version to 3.10.8
Includes:
- CatBoost GPU rsm parameter fix for pairwise loss functions
- Optuna hyperparameter search optimization for rsm parameter
- Global constant for GPU rsm compatibility
Jérôme Benoit [Thu, 8 Jan 2026 21:10:38 +0000 (22:10 +0100)]
fix(quickadapter): restrict CatBoost grow_policy for Ordered boosting
CatBoost's Ordered boosting mode only supports SymmetricTree grow policy.
When Optuna suggested Ordered boosting with Depthwise or Lossguide grow
policies, CatBoost raised: "Ordered boosting is not supported for
nonsymmetric trees."
This fix conditionally restricts grow_policy options to SymmetricTree when
boosting_type is Ordered, preventing invalid parameter combinations during
hyperparameter optimization.
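The restriction maps naturally onto a conditional Optuna suggestion; a sketch (parameter lists are illustrative, not the full search space):

```python
import optuna

def suggest_catboost_tree_params(trial: optuna.Trial) -> dict:
    boosting_type = trial.suggest_categorical("boosting_type", ["Ordered", "Plain"])
    # Ordered boosting only supports symmetric trees, so narrow the choices
    # before suggesting grow_policy.
    grow_policies = (
        ["SymmetricTree"]
        if boosting_type == "Ordered"
        else ["SymmetricTree", "Depthwise", "Lossguide"]
    )
    return {
        "boosting_type": boosting_type,
        "grow_policy": trial.suggest_categorical("grow_policy", grow_policies),
    }
```

Dynamically narrowed categorical choices are legal in Optuna, though some samplers handle the conditional space more gracefully than others.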
Jérôme Benoit [Thu, 8 Jan 2026 15:37:22 +0000 (16:37 +0100)]
feat(quickadapter): add boosting_type with DART support to LightGBM HPO
Add boosting_type parameter allowing selection between gbdt and dart
boosting methods. When dart is selected, conditional parameters drop_rate
and skip_drop are tuned. Also widen num_leaves and min_child_samples
ranges for better exploration of the hyperparameter space.
Jérôme Benoit [Thu, 8 Jan 2026 11:49:39 +0000 (12:49 +0100)]
perf(quickadapter): optimize Optuna log scale for LightGBM and CatBoost hyperparameters
Apply logarithmic sampling scale to regularization and tree complexity parameters for improved hyperparameter search efficiency:
- LightGBM: Add num_leaves to log scale (exponential tree growth)
- CatBoost: Add l2_leaf_reg and random_strength to log scale (multiplicative effects)
- Revert bagging_temperature to linear scale (0 has special meaning: it disables the Bayesian bootstrap)
Log scale provides better exploration in low-value regions where these parameters have the most impact, consistent with Optuna best practices and industry standards (FLAML, XGBoost patterns).
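In Optuna terms this is the log=True flag on the suggestions; the ranges below echo ones quoted elsewhere in this log, except num_leaves, whose bounds are assumed:

```python
import optuna

def suggest_regularization(trial: optuna.Trial) -> dict:
    return {
        # log=True spends half the trials below the geometric midpoint,
        # densely exploring the low-value region where impact is largest.
        "l2_leaf_reg": trial.suggest_float("l2_leaf_reg", 1.0, 10.0, log=True),
        "random_strength": trial.suggest_float("random_strength", 1.0, 20.0, log=True),
        "num_leaves": trial.suggest_int("num_leaves", 8, 512, log=True),
        # 0.0 disables the Bayesian bootstrap, so keep linear scale:
        # a log scale can never reach 0.
        "bagging_temperature": trial.suggest_float("bagging_temperature", 0.0, 10.0),
    }
```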
Jérôme Benoit [Thu, 8 Jan 2026 11:28:55 +0000 (12:28 +0100)]
feat(quickadapter): add NGBoost regressor support with Optuna optimization (#33)
* feat: add NGBoost regressor support with Optuna optimization
- Add NGBoost to supported regressors (xgboost, lightgbm, histgradientboosting, ngboost)
- Install ngboost==0.5.8 in Docker image
- Implement fit_regressor branch for NGBoost with:
- Dynamic distribution selection via get_ngboost_dist() helper
- Support for 5 distributions: normal, lognormal, exponential, laplace, t
- Early stopping support with validation set (X_val/Y_val API)
- Sample weights support for training and validation
- Optuna trial handling with random_state adjustment
- Verbosity parameter conversion (verbosity -> verbose)
- Add Optuna hyperparameter optimization support:
- n_estimators: [100, 1000] (log-scaled)
- learning_rate: [0.001, 0.3] (log-scaled)
- minibatch_frac: [0.5, 1.0] (linear)
- col_sample: [0.3, 1.0] (linear)
- dist: categorical [normal, lognormal]
- Space reduction support for refined optimization
- Create get_ngboost_dist() helper function for distribution class mapping
- Default distribution: lognormal (optimal for crypto prices)
- Compatible with RMSE optimization objective (LogScore ≈ RMSE)
* docs: add ngboost to regressor enum in README
* fix: correct NGBoost parameter comment to reflect actual tuned parameters
Removed 'tree structure' from the parameter order comment since NGBoost
implementation doesn't tune tree structure parameters (only boosting,
sampling, and distribution parameters are optimized via Optuna).
* feat(ngboost): add tree structure parameter tuning
Add DecisionTreeRegressor base learner parameters for NGBoost:
- max_depth: (3, 8) based on literature and XGBoost patterns
- min_samples_split: (2, 20) following sklearn best practices
- min_samples_leaf: (1, 10) conservative range for crypto data
These parameters are passed via the Base argument to control
the underlying decision tree learners in the NGBoost ensemble.
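Putting the pieces together, the NGBoost branch could look roughly like this (hyperparameter values are placeholders, not the tuned defaults; the fit call mirrors NGBoost's X_val/Y_val early-stopping API mentioned above):

```python
from ngboost import NGBRegressor
from ngboost.distns import LogNormal
from sklearn.tree import DecisionTreeRegressor

model = NGBRegressor(
    Dist=LogNormal,  # default distribution per the commit (crypto prices)
    Base=DecisionTreeRegressor(  # tree structure params tuned via Optuna
        max_depth=5, min_samples_split=10, min_samples_leaf=4
    ),
    n_estimators=500,
    learning_rate=0.05,
    minibatch_frac=0.8,
    col_sample=0.7,
    verbose=False,
)
# model.fit(X_train, y_train, X_val=X_val, Y_val=y_val,
#           sample_weight=train_weights, early_stopping_rounds=50)
```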
* refine(ngboost): narrow sampling and leaf hyperparameter ranges
Refined Optuna search space based on gradient boosting research:
- min_samples_leaf: 1-8 (was 1-10)
- minibatch_frac: 0.6-1.0 (was 0.5-1.0)
- col_sample: 0.4-1.0 (was 0.3-1.0)
Ranges focused on empirically proven optimal zones for ensemble
gradient boosting methods on financial/crypto time series data.
* refactor(ngboost): move DecisionTreeRegressor import to branch start
Move sklearn.tree.DecisionTreeRegressor import to the beginning of
the NGBoost branch (after NGBRegressor import) for better code
organization and consistency with import conventions.
Jérôme Benoit [Thu, 8 Jan 2026 11:13:16 +0000 (12:13 +0100)]
feat(quickadapter): add CatBoost regressor with RMSE loss function (#34)
* feat: add CatBoost regressor with RMSE loss function
Add CatBoost as the 5th regressor option using standard RMSE loss function.
Changes:
- Add catboost==1.2.8 to Dockerfile dependencies
- Update Regressor type literal to include 'catboost'
- Implement fit_regressor branch for CatBoost with:
- RMSE loss function (default)
- Early stopping and validation set handling
- Verbosity parameter mapping
- Sample weights support
- Optuna CatBoostPruningCallback for trial pruning
- Add Optuna hyperparameter optimization with 6 parameters:
- iterations: [100, 2000] (log-scaled)
- learning_rate: [0.001, 0.3] (log-scaled)
- depth: [4, 10] (tree depth)
- l2_leaf_reg: [1, 10] (L2 regularization)
- bagging_temperature: [0, 10] (Bayesian bootstrap)
- random_strength: [1, 20] (split randomness)
- Update README.md regressor enum documentation
CatBoost advantages:
- Better accuracy than XGBoost/LightGBM (2024 benchmarks)
- GPU support for faster training
- Better categorical feature handling
- Strong overfitting resistance (ordered boosting)
- Production-ready at scale
- Optuna pruning callback for efficient hyperparameter search
* feat(catboost): add GPU/CPU differentiation for training and Optuna hyperparameters
- Add task_type-aware parameter handling in fit_regressor
- GPU mode: set devices, max_ctr_complexity=4, remove n_jobs
- CPU mode: propagate n_jobs to thread_count, max_ctr_complexity=2
- Trust CatBoost defaults for border_count (CPU=254, GPU=128)
- Differentiate Optuna hyperparameter search spaces by task_type
- GPU: depth=(4,12), border_count=(32,254), bootstrap=[Bayesian,Bernoulli]
- CPU: depth=(4,10), bootstrap=[Bayesian,Bernoulli,MVS]
- Add GPU-specific parameters: border_count, max_ctr_complexity
- Expand search space: min_data_in_leaf, grow_policy, model_size_reg, rsm, subsample
- Use CatBoost Pool for training with proper eval_set handling
* refactor(catboost): remove devices default to allow GPU auto-discovery
Trust CatBoost's automatic GPU device detection (default: all available GPUs).
Users can still explicitly set devices='0' or devices='0:1' in config if needed.
* fix(quickadapter): avoid forcing CatBoost thread_count when n_jobs unset
Remove the nonstandard thread_count=-1 default in the CPU CatBoost path.
Let CatBoost select threads automatically unless n_jobs is provided.
Improves consistency and avoids potential performance misinterpretation.
* fix(quickadapter): build CatBoost bootstrap parameters conditionally
CatBoost strictly validates bootstrap parameters and rejects:
- subsample with Bayesian bootstrap
- bagging_temperature with non-Bayesian bootstrap
Even passing 'neutral' values (0 or 1.0) causes runtime errors.
Changed from ternary expressions (which always pass params) to
conditional dict building (which omits incompatible params entirely).
Also fixed: border_count min_val 16→1 per CatBoost documentation.
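The conditional dict building reads roughly like this sketch (function name and ranges are illustrative):

```python
import optuna

def build_bootstrap_params(trial: optuna.Trial, bootstrap_type: str) -> dict:
    params = {"bootstrap_type": bootstrap_type}
    if bootstrap_type == "Bayesian":
        # Only the Bayesian bootstrap accepts bagging_temperature.
        params["bagging_temperature"] = trial.suggest_float(
            "bagging_temperature", 0.0, 10.0
        )
    else:
        # Bernoulli/MVS accept subsample, never bagging_temperature.
        params["subsample"] = trial.suggest_float("subsample", 0.5, 1.0)
    return params
```

Omitting the incompatible key entirely, rather than passing a "neutral" value, is what keeps CatBoost's strict validation happy.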
* refactor(quickadapter): replace plotting column names with constants
Replace string literals "minima", "maxima", and "smoothed-extrema" with MINIMA_COLUMN, MAXIMA_COLUMN, and SMOOTHED_EXTREMA_COLUMN constants following the existing *_COLUMN naming convention.
This improves maintainability and prevents typos when referencing these DataFrame column names throughout the codebase.
Jérôme Benoit [Wed, 7 Jan 2026 12:46:29 +0000 (13:46 +0100)]
refactor(quickadapter): consolidate pivot metrics and extrema ranking; bump version to 3.10.5
- Utils.py: unify amplitude/threshold/speed in calculate_pivot_metrics, remove calculate_pivot_speed, update add_pivot to consume normalized speed; preserves edge-case guards (NaN/inf, zero duration).
- QuickAdapterRegressorV3: add _calculate_n_kept_extrema and use in ranking; mark scaler fallback path; bump version to 3.10.5.
- QuickAdapterV3: bump version() to 3.10.5; adjust docstring for t-distribution helper.
Jérôme Benoit [Wed, 7 Jan 2026 01:08:17 +0000 (02:08 +0100)]
feat(quickadapter): add logging for invalid fit data in ExtremaWeightingTransformer
Add warning when fit() receives data with no finite values, improving
observability of data quality issues. Uses fallback [0.0, 1.0] to prevent
pipeline crashes while alerting users to upstream preprocessing problems.
Jérôme Benoit [Wed, 7 Jan 2026 00:14:46 +0000 (01:14 +0100)]
refactor(quickadapter): simplify early stopping condition checks
Remove redundant has_eval_set verification in early stopping callbacks for XGBoost and LightGBM. The check is unnecessary because early_stopping_rounds is only assigned a non-None value when has_eval_set is True, making the condition implicitly guaranteed.
Jérôme Benoit [Tue, 6 Jan 2026 23:30:18 +0000 (00:30 +0100)]
refactor(xgboost): migrate to callback-based early stopping for API 3.x compatibility
- Replace deprecated early_stopping_rounds parameter with EarlyStopping callback
- Extract early_stopping_rounds from model parameters using pop() before instantiation
- Configure callback with metric_name='rmse', data_name='validation_0', save_best=True
- Reorganize LightGBM callback initialization for improved code readability
- Maintains backward compatibility with eval_set validation approach
- Ensures compatibility with XGBoost 3.1.2+ API requirements
Jérôme Benoit [Tue, 6 Jan 2026 18:12:36 +0000 (19:12 +0100)]
refactor(quickadapter): return cached sets directly in optuna_samplers_by_namespace
- Add _optuna_hpo_samplers_set() and _optuna_label_samplers_set() cached methods
- Change return type from tuple[tuple[OptunaSampler, ...], OptunaSampler] to tuple[set[OptunaSampler], OptunaSampler]
- Remove redundant set() conversion in sampler validation
- Align with existing pattern used by other constant set methods (_scaler_types_set, _threshold_methods_set, etc.)
Jérôme Benoit [Mon, 5 Jan 2026 22:12:20 +0000 (23:12 +0100)]
feat(plot): add smoothed extrema line to min_max subplot with weighted smoothing
- Add 'smoothed-extrema' column displaying weighted extrema after smoothing
- Position smoothed extrema line below maxima/minima bars in plot z-order
- Use 'wheat' color for better visual distinction from red/green bars
- Store smoothed result in variable before assigning to both EXTREMA_COLUMN and smoothed-extrema
Jérôme Benoit [Mon, 5 Jan 2026 20:41:43 +0000 (21:41 +0100)]
feat(quickadapter): add multiple aggregation methods for combined extrema weighting (v3.10.4)
- Add 5 new aggregation methods: arithmetic_mean, harmonic_mean, quadratic_mean, weighted_median, softmax
- Replace weighted_average (deprecated) with arithmetic_mean as new default
- Add softmax_temperature parameter (default: 1.0) for softmax aggregation
- Implement all methods using scipy.stats.pmean for power means (p=1,-1,2) and numpy for weighted_median
- Add softmax aggregation with temperature scaling and coefficient weighting
- Add validation and logging for softmax_temperature parameter
- Update README with precise mathematical formulas for all aggregation methods
- Bump version to 3.10.4 in strategy and model
- Add conditional logging for softmax_temperature when aggregation is softmax
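The power means fall straight out of scipy.stats.pmean; only the softmax formulation below is an assumption (the commit does not spell out whether softmax applies to the metric values or the coefficients):

```python
import numpy as np
from scipy.special import softmax
from scipy.stats import pmean

metrics = np.array([0.8, 0.4, 0.6])       # per-metric values, already in [0, 1]
coefficients = np.array([2.0, 1.0, 1.0])  # per-metric coefficient weights

arithmetic = pmean(metrics, 1, weights=coefficients)   # p = 1
harmonic = pmean(metrics, -1, weights=coefficients)    # p = -1
quadratic = pmean(metrics, 2, weights=coefficients)    # p = 2

# Assumed softmax aggregation: temperature-scaled softmax over the metric
# values, blended with the coefficients before averaging.
temperature = 1.0
w = coefficients * softmax(metrics / temperature)
softmax_agg = float(np.dot(w, metrics) / np.sum(w))
```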
Jérôme Benoit [Mon, 5 Jan 2026 15:32:12 +0000 (16:32 +0100)]
feat(quickadapter): Add configurable feature normalization to QuickAdapterRegressorV3 (#31)
* feat(quickadapter): add configurable feature normalization to data pipeline
Add support for configurable feature scaling/normalization in QuickAdapterRegressorV3
via define_data_pipeline() override. Users can now select different sklearn scalers
through feature_parameters configuration.
Supported normalization methods:
- minmax: MinMaxScaler with configurable range (default: -1 to 1)
- maxabs: MaxAbsScaler (scales by max absolute value)
- standard: StandardScaler (zero mean, unit variance)
- robust: RobustScaler (uses median and IQR, robust to outliers)
Implementation details:
- Overrides define_data_pipeline() to replace scalers in pipeline
- Optimizes default case (minmax with -1,1 range) by using parent pipeline
- Replaces both 'scaler' and 'post-pca-scaler' steps with selected scaler
- normalization_range parameter only applies to minmax scaler
Note: Changing normalization config requires deleting existing models
(rm -rf user_data/models/*) due to pipeline serialization.
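The scaler selection behind define_data_pipeline() can be sketched as follows (the body is illustrative; get_scaler is the helper name mentioned later in this log):

```python
from sklearn.preprocessing import (
    MaxAbsScaler,
    MinMaxScaler,
    RobustScaler,
    StandardScaler,
)

def get_scaler(name: str, feature_range: tuple = (-1.0, 1.0)):
    if name == "minmax":
        return MinMaxScaler(feature_range=feature_range)  # range only applies here
    if name == "maxabs":
        return MaxAbsScaler()
    if name == "standard":
        return StandardScaler()
    if name == "robust":
        return RobustScaler()
    raise ValueError(f"Invalid scaler value {name}")
```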
* fix(quickadapter): address PR review comments for feature normalization
- Remove unused 'datasieve as ds' import
- Add validation for normalization parameter using _validate_enum_value
- Add comprehensive validation for normalization_range (type, length, values, min < max)
- Fix tuple/list comparison by using tuple() conversion
- Store normalization_range in variable to avoid fetching twice
- Optimize scaler creation by creating once instead of calling get_scaler() multiple times
* refactor(quickadapter): harmonize validation error messages with codebase style
- Use consistent 'Invalid {param} {type}:' format matching existing patterns
- Remove unnecessary try-except block around float conversion
- Simplify error messages to be more concise
- Let float() raise its own errors for non-numeric values
* refactor(quickadapter): rename data pipeline parameters for clarity
- Rename ft_params.normalization → ft_params.scaler
- Rename ft_params.normalization_range → ft_params.range
- Add ScalerType Literal and _SCALER_TYPES constant
- Document new parameters in README
More intuitive naming that better reflects sklearn terminology.
* docs(README.md): format
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor(quickadapter): rename data pipeline parameters for clarity
- Rename ft_params.normalization → ft_params.scaler
- Rename ft_params.normalization_range → ft_params.range
- Add ScalerType Literal and _SCALER_TYPES constant
- Document new parameters in README under feature_parameters section
More intuitive naming that better reflects sklearn terminology.
Users configure these via freqai.feature_parameters.* in config.json.
* fix(quickadapter): address PR review comments for feature normalization
- Extract hardcoded defaults to class constants (SCALER_DEFAULT, RANGE_DEFAULT)
- Remove redundant tuple() call in feature_range comparison
- Follow codebase pattern for default values similar to other constants
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* docs(README): add note about model retraining for scaler changes
* docs(README): clarify extrema weighting strategy requires model retraining
Only switching between 'none' and other strategies changes the label pipeline.
Other parameter changes within the same strategy do not require retraining.
* docs(README): format
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* chore: bump model and strategy version to 3.10.3
Jérôme Benoit [Mon, 5 Jan 2026 12:13:14 +0000 (13:13 +0100)]
refactor(quickadapter): harmonize log and error messages across codebase
Standardize error and log messages for consistency and clarity:
- Standardize 29 ValueError messages with 'Invalid {param} value {value}' format
- Harmonize 35 warning messages with fallback defaults ('using default'/'using uniform')
- Replace {trade.pair} with {pair} in 33 log messages for consistent context
- Ensure all 7 exception handlers use exc_info=True for complete stack traces
- Normalize punctuation and capitalization in validation messages
This improves debugging experience and maintains uniform message patterns
throughout the QuickAdapter, Utils, and ExtremaWeightingTransformer modules.
Jérôme Benoit [Sun, 4 Jan 2026 23:02:21 +0000 (00:02 +0100)]
feat(quickadapter): add combined extrema weighting strategy with multi-metric aggregation
Add new 'combined' strategy to extrema weighting that aggregates multiple
metrics (amplitude, amplitude_threshold_ratio, volume_rate, speed,
efficiency_ratio, volume_weighted_efficiency_ratio) using configurable
coefficients and aggregation methods.
Features:
- New strategy type 'combined' with per-metric coefficient weighting
- Support for weighted_average and geometric_mean aggregation methods
- Normalize all metrics to [0,1] range for consistent aggregation:
* amplitude: x/(1+x)
* amplitude_threshold_ratio: x/(x+median)
* volume_rate: x/(x+median)
* speed: x/(1+x)
- Deterministic metric iteration order via COMBINED_METRICS constant
- Centralized validation in get_extrema_weighting_config()
- Comprehensive logging of new parameters
Configuration:
- metric_coefficients: dict mapping metric names to positive weights
- aggregation: 'weighted_average' (default) or 'geometric_mean'
- Empty coefficients dict defaults to equal weights (1.0) for all metrics
Documentation:
- README updated with new strategy and parameters
- Mathematical formulas for aggregation methods
- Style aligned with existing documentation conventions
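The two bounded maps listed above, written out (assuming non-negative metric inputs):

```python
import numpy as np

def normalize_unbounded(x: np.ndarray) -> np.ndarray:
    # amplitude, speed: maps 0 -> 0 and 1 -> 0.5, saturating toward 1.
    return x / (1.0 + x)

def normalize_by_median(x: np.ndarray) -> np.ndarray:
    # amplitude_threshold_ratio, volume_rate: pins the metric's median to 0.5.
    med = np.nanmedian(x)
    return x / (x + med)
```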
Jérôme Benoit [Sun, 4 Jan 2026 17:06:10 +0000 (18:06 +0100)]
fix(quickadapter): handle missing scaler attributes in ExtremaWeightingTransformer
Use getattr with default None value to properly handle cases where scaler
attributes don't exist (e.g., when loading pre-existing models), allowing
the RuntimeError to be raised with a clear message instead of AttributeError.
Jérôme Benoit [Sun, 4 Jan 2026 15:54:26 +0000 (16:54 +0100)]
refactor: enhance extrema weighting with sklearn scalers and new methods
Replace manual standardization/normalization calculations with sklearn scalers
for better maintainability and correctness.
Standardization changes:
- Add power_yj (Yeo-Johnson) standardization method
- Replace manual zscore with StandardScaler
- Replace manual robust with RobustScaler
- Add mask size checks for all methods including MMAD
- Store fitted scaler objects instead of manual stats
Normalization changes:
- Add maxabs normalization (new default)
- Replace manual minmax with MinMaxScaler
- Fix sigmoid to output [-1, 1] range (was [0, 1])
- Replace manual calculations with MaxAbsScaler and MinMaxScaler
Other improvements:
- Remove zero-exclusion from mask (zeros are valid values)
- Fit normalization on standardized data (proper pipeline order)
- Add proper RuntimeError for unfitted scalers
Docs:
- Update README to reflect maxabs as new normalization default
- Document power_yj standardization type
- Harmonize mathematical formulas with code notation
Jérôme Benoit [Sat, 3 Jan 2026 22:36:37 +0000 (23:36 +0100)]
refactor: improve string literal replacements and use specific tuple constants
- Replace hardcoded method names in error messages with tuple constants
(lines 2003, 2167: use _DISTANCE_METHODS[0] and [1] instead of literal strings)
- Use _CLUSTER_METHODS instead of _SELECTION_METHODS indices for better
code maintainability (e.g., _CLUSTER_METHODS[0] vs _SELECTION_METHODS[2])
- Fix trial_selection_method comparison order to match tuple constant order
(compromise_programming [0] before topsis [1])
- Remove redundant power_mean None check (already validated by _validate_power_mean)
- Add clarifying comments to tuple constant usages (minkowski, rank_extrema)
Jérôme Benoit [Sat, 3 Jan 2026 20:48:51 +0000 (21:48 +0100)]
fix: eliminate data leakage in extrema weighting normalization (#30)
* fix: eliminate data leakage in extrema weighting normalization
Move dataset-dependent scaling from strategy (pre-split) to model label
pipeline (post-split) to prevent train/test data leakage.
Changes:
- Add ExtremaWeightingTransformer (datasieve BaseTransform) in Utils.py
that fits standardization/normalization stats on training data only
- Add define_label_pipeline() in QuickAdapterRegressorV3 that replaces
FreqAI's default MinMaxScaler with our configurable transformer
- Simplify strategy's set_freqai_targets() to pass raw weighted extrema
without any normalization (normalization now happens post-split)
- Remove pre-split normalization functions from Utils.py (~107 lines)
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
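A skeleton of the post-split transformer (signatures paraphrased from the commits: super().__init__(name=...) and a tuple returned from fit(); the import path and the StandardScaler stand-in are assumptions):

```python
import numpy as np
from datasieve.transforms.base_transform import BaseTransform
from sklearn.preprocessing import StandardScaler

class ExtremaWeightingTransformer(BaseTransform):
    def __init__(self, name: str = "extrema_weighting", **kwargs):
        super().__init__(name=name, **kwargs)
        self._scaler = None

    def fit(self, X, y=None, sample_weight=None, feature_list=None, **kwargs):
        # Statistics are fitted on the training split only: no leakage.
        self._scaler = StandardScaler().fit(np.asarray(X))
        return X, y, sample_weight, feature_list

    def transform(self, X, y=None, sample_weight=None, feature_list=None, **kwargs):
        if self._scaler is None:
            raise RuntimeError("ExtremaWeightingTransformer used before fit()")
        return self._scaler.transform(np.asarray(X)), y, sample_weight, feature_list
```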
* refactor: align ExtremaWeightingTransformer with BaseTransform API
- Call super().__init__() with name parameter
- Match method signatures exactly (npt.ArrayLike, ArrayOrNone, ListOrNone)
- Return tuple from fit() instead of self
- Import types from same namespaces as BaseTransform
* refactor: cleanup type hints
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: remove unnecessary type casts and annotations
Let numpy types flow naturally without explicit float()/int() casts.
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: use scipy.special.logit for inverse sigmoid transformation
Replace manual inverse sigmoid calculation (-np.log(1.0 / values - 1.0))
with scipy.special.logit() for better code clarity and consistency.
- Uses official scipy function that is the documented inverse of expit
- Mathematically equivalent to the previous implementation
- Improves code readability and maintainability
- Maintains symmetry: sp.special.expit() <-> sp.special.logit()
Also improve comment clarity for standardization identity function.
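The equivalence is easy to verify:

```python
import numpy as np
from scipy.special import expit, logit

values = np.array([0.1, 0.5, 0.9])
manual = -np.log(1.0 / values - 1.0)       # previous hand-rolled inverse sigmoid
assert np.allclose(manual, logit(values))  # logit(x) = log(x / (1 - x))
assert np.allclose(expit(logit(values)), values)  # expit <-> logit round-trip
```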
* refactor: remove unused _n_train attribute
The _n_train attribute was being set during fit() but never used
elsewhere in the class or by the BaseTransform interface. Removing
it to reduce code clutter and improve maintainability.
* fix: correct import paths
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix: add Bessel correction and ValueError consistency in ExtremaWeightingTransformer
- Use ddof=1 for std computation (sample std instead of population std)
- Add ValueError in _inverse_standardize for unknown methods
- Add ValueError in _inverse_normalize for unknown methods
* chore: refine config-template.json for extrema weighting options
Jérôme Benoit [Fri, 2 Jan 2026 23:10:44 +0000 (00:10 +0100)]
refactor: remove redundant 'Only' prefix from namespace comments
Simplify code comments by removing the word 'Only' from namespace
identifier comments. The context already makes it clear that these
are the supported namespaces.
Jérôme Benoit [Fri, 2 Jan 2026 23:05:31 +0000 (00:05 +0100)]
Remove Optuna "train" namespace as preliminary step to eliminate data leakage
Remove the "train" namespace from Optuna hyperparameter optimization to
address data leakage issues in extrema weighting normalization. This is a
preliminary step before implementing a proper data preparation pipeline
that prevents train/test contamination.
Problem:
Current architecture applies extrema weighting normalization (minmax, softmax,
zscore, etc.) on the full dataset BEFORE train/test split. This causes data
leakage: train set labels are normalized using statistics (min/max, mean/std,
median/IQR) computed from the entire dataset including test set. The "train"
namespace hyperopt optimization exacerbates this by optimizing dataset
truncation with contaminated statistics.
Solution approach:
1. Remove "train" namespace optimization (this commit)
2. Switch to binary extrema labels (strategy: "none") to avoid leakage
3. Future: implement proper data preparation that computes normalization
statistics on train set only and applies them to both train/test sets
This naive train/test-splitting hyperopt is incompatible with a correct
data preparation pipeline, where normalization must be fitted on the train
set and then applied to the test set.
Changes:
- Remove "train" namespace from OptunaNamespace (3→2 namespaces: hp, label)
- Remove train_objective function and all train optimization logic
- Remove dataset truncation based on optimized train/test periods
- Update namespace indices: label from [2] to [1] throughout codebase
- Remove train_candles_step config parameter and train_rmse metric tracking
- Set extrema_weighting.strategy to "none" (binary labels: -1/0/+1)
- Update documentation to reflect 2-namespace architecture
Jérôme Benoit [Fri, 2 Jan 2026 15:16:18 +0000 (16:16 +0100)]
refactor(quickadapter): consolidate custom distance metrics in Pareto front selection
Extract shared distance metric logic from _compromise_programming_scores and
_topsis_scores into reusable static methods:
- Add _POWER_MEAN_MAP class constant as single source of truth for power values
- Add _power_mean_metrics_set() cached method for metric name lookup
- Add _hellinger_distance() for Hellinger/Shellinger computation
- Add _power_mean_distance() for generalized mean computation with validation
- Add _weighted_sum_distance() for weighted sum computation
Harmonize with existing validation API using ValidationMode and proper contexts.
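Sketches of the two custom metrics (normalization details in the real _hellinger_distance()/_power_mean_distance() are assumptions):

```python
import numpy as np

def hellinger_distance(p: np.ndarray, q: np.ndarray) -> float:
    # H(p, q) = sqrt(0.5 * sum((sqrt(p) - sqrt(q))**2)) for distributions p, q.
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def power_mean_distance(deviations: np.ndarray, weights: np.ndarray, p: float) -> float:
    # Generalized (power) mean of positive deviations: p=1 arithmetic,
    # p=-1 harmonic, p=2 quadratic, p=3 cubic; assumes p != 0.
    w = weights / weights.sum()
    return float(np.sum(w * deviations**p) ** (1.0 / p))
```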
Jérôme Benoit [Fri, 2 Jan 2026 14:21:16 +0000 (15:21 +0100)]
refactor(quickadapter): unify validation helpers with ValidationMode support
- Add ValidationMode type for "warn", "raise", "none" behavior
- Rename _UNSUPPORTED_CLUSTER_METRICS to _UNSUPPORTED_WEIGHTS_METRICS
- Refactor _validate_minkowski_p, _validate_quantile_q with mode support
- Add _validate_power_mean_p, _validate_metric_weights_support, _validate_label_weights
- Add _validate_enum_value for generic enum validation
- Add _prepare_knn_kwargs for sklearn-specific weight handling
- Remove deprecated _normalize_weights function
- Update all call sites to use new unified API
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix: use unbounded cache for constant-returning helper methods
Replace @lru_cache(maxsize=1) with @lru_cache(maxsize=None) for all
static methods that return constant sets. Using maxsize=None is more
idiomatic and efficient for parameterless functions that always return
the same value.
* refactor: add _prepare_distance_kwargs to centralize distance kwargs preparation
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: cleanup extrema weighting API
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor: cleanup extrema smoothing API
- Remove ClusterSelectionMethod type and related constants
- Unify selection methods to use DistanceMethod for both cluster and trial selection
- Add separate trial_selection_method parameter for within-cluster selection
- Change power_mean default from 2.0 to 1.0 for internal consistency
- Add validation for selection_method and trial_selection_method parameters
* fix: add missing validations for label_distance_metric and label_density_aggregation_param
- Add validation for label_distance_metric parameter at configuration time
- Add early validation for label_density_aggregation_param (quantile and power_mean)
- Ensures invalid configuration values fail fast with clear error messages
- Harmonizes error messages with existing validation patterns in the codebase
* fix: add validation for label_cluster_metric and custom metrics support in topsis
- Add validation that label_cluster_metric is in _distance_metrics_set()
- Implement custom metrics support in _topsis_scores (hellinger, shellinger,
harmonic/geometric/arithmetic/quadratic/cubic/power_mean, weighted_sum)
matching _compromise_programming_scores implementation
* docs: update README.md with refactored label selection methods
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* docs: fix config parameter and bump to v3.9.0
- Fix config-template.json: label_metric -> label_method
- Bump version from 3.8.5 to 3.9.0 in model and strategy
Parameter names now match QuickAdapterRegressorV3.py implementation.
Jérôme Benoit [Sun, 28 Dec 2025 18:51:56 +0000 (19:51 +0100)]
refactor(quickadapter)!: normalize tunables namespace for semantic consistency (#26)
* refactor(quickadapter): normalize tunables namespace for semantic consistency
Rename config keys and internal variables to follow consistent naming conventions:
- `_candles` suffix for time periods in candle units
- `_fraction` suffix for values in [0,1] range
- `_multiplier` suffix for scaling factors
- `_method` suffix for algorithm selectors
Internal variable renames for code consistency:
- threshold_outlier → outlier_threshold_fraction
- thresholds_alpha → soft_extremum_alpha
- extrema_fraction → keep_extrema_fraction (local vars and function params)
- _reversal_lookback_period → _reversal_lookback_period_candles
- natr_ratio → natr_multiplier (zigzag function param)
All deprecated aliases emit warnings and remain functional for backward compatibility.
* chore(quickadapter): remove temporary audit file from codebase
* refactor(quickadapter): align constant names with normalized tunables
Rename class constants to match the normalized config key names:
- PREDICTIONS_EXTREMA_THRESHOLD_OUTLIER_DEFAULT → PREDICTIONS_EXTREMA_OUTLIER_THRESHOLD_FRACTION_DEFAULT
- PREDICTIONS_EXTREMA_THRESHOLDS_ALPHA_DEFAULT → PREDICTIONS_EXTREMA_SOFT_EXTREMUM_ALPHA_DEFAULT
- PREDICTIONS_EXTREMA_EXTREMA_FRACTION_DEFAULT → PREDICTIONS_EXTREMA_KEEP_EXTREMA_FRACTION_DEFAULT
* fix(quickadapter): rename outlier_threshold_fraction to outlier_threshold_quantile
The value (e.g., 0.999) represents the 99.9th percentile, which is
mathematically a quantile, not a fraction. This aligns the naming with
the semantic meaning of the parameter.
* fix(quickadapter): add missing deprecated alias support
* refactor(quickadapter): rename safe configuration value retrieval helper
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor(quickadapter): rename natr_ratio_fraction to natr_multiplier_fraction
- Align naming with label_natr_multiplier for consistency
- Rename get_config_value_with_deprecated_alias to get_config_value
* refactor(quickadapter): centralize label_natr_multiplier migration in get_label_defaults
- Move label_natr_ratio -> label_natr_multiplier migration to get_label_defaults()
- Update get_config_value to migrate in-place (pop old key, store new key)
- Remove redundant get_config_value calls in Strategy and Model __init__
- Simplify cached properties to use .get() since migration is done at init
- Rename _CUSTOM_STOPLOSS_NATR_RATIO_FRACTION to _CUSTOM_STOPLOSS_NATR_MULTIPLIER_FRACTION
* fix(quickadapter): check that df columns exist before using them