Jérôme Benoit [Fri, 1 May 2026 14:03:12 +0000 (16:03 +0200)]
fix: validate epoch-ms range before converting int64 date columns
Reject int64 values outside the [2010, 2035] epoch-ms range to fail fast
on corrupted data instead of silently producing wrong dates. This catches
nanosecond/microsecond values that would pass the int64 dtype check
but produce garbage timestamps if interpreted as milliseconds.
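A minimal sketch of such a range check (function name and exact bounds are assumptions, not the actual source): nanosecond timestamps (~1.7e18) and microsecond timestamps (~1.7e15) fall far outside the plausible millisecond window (~1.3e12 to 2.1e12), so they are rejected instead of being decoded into garbage dates.

```python
import pandas as pd

# Assumed bounds covering calendar years 2010 through 2035 inclusive.
_EPOCH_MS_MIN = pd.Timestamp("2010-01-01", tz="UTC").value // 10**6
_EPOCH_MS_MAX = pd.Timestamp("2036-01-01", tz="UTC").value // 10**6


def validate_epoch_ms(col: pd.Series) -> pd.Series:
    """Fail fast if int64 values cannot be epoch-milliseconds in range."""
    if not col.between(_EPOCH_MS_MIN, _EPOCH_MS_MAX).all():
        raise ValueError(
            f"date column values outside [{_EPOCH_MS_MIN}, {_EPOCH_MS_MAX}] "
            "epoch-ms range; data may be corrupted or in ns/us units"
        )
    return pd.to_datetime(col, unit="ms", utc=True)
```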
Jérôme Benoit [Fri, 1 May 2026 13:46:19 +0000 (15:46 +0200)]
fix: add 30min stop_grace_period to prevent data corruption on shutdown
FreqAI training can take minutes to hours. Docker's default 10s grace
period causes SIGKILL mid-write, corrupting feather/pickle files.
Give freqtrade up to 30 minutes to finish training and flush data
before Docker sends SIGKILL.
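A minimal compose sketch of this setting (service name and image are assumptions):

```yaml
services:
  freqtrade:
    image: freqtradeorg/freqtrade:stable_freqai
    # Allow long-running FreqAI training to finish and flush
    # feather/pickle files before Docker escalates to SIGKILL.
    stop_grace_period: 30m
```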
Jérôme Benoit [Fri, 1 May 2026 10:32:33 +0000 (12:32 +0200)]
fix: align ensure_datetime_series with freqtrade data handler pattern
Chain .dt.as_unit("ms") to guarantee datetime64[ms, UTC] output
resolution regardless of pandas version, matching the contract
established in freqtrade commit 2c5dc72.
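A minimal sketch of what the helper might look like after this change (the actual implementation may differ; `.dt.as_unit` requires pandas >= 2.0): int64 epoch-ms columns are decoded with `unit="ms"`, datetime columns pass through, and the chained `.dt.as_unit("ms")` pins the output resolution.

```python
import pandas as pd


def ensure_datetime_series(col: pd.Series) -> pd.Series:
    """Normalize a date column to datetime64[ms, UTC].

    Handles both on-disk formats: int64 epoch-milliseconds (decoded
    with unit='ms') and already-parsed datetime columns (passthrough).
    The trailing .dt.as_unit('ms') guarantees the resolution regardless
    of pandas version, matching the freqtrade data handler contract.
    """
    if pd.api.types.is_integer_dtype(col):
        col = pd.to_datetime(col, unit="ms", utc=True)
    else:
        col = pd.to_datetime(col, utc=True)
    return col.dt.as_unit("ms")
```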
refactor: extract ensure_datetime_series helper for date dtype workaround
Centralizes the int64 epoch-ms vs datetime detection logic into a shared
helper. Handles both formats correctly: unit='ms' for int64, passthrough
for existing datetime columns.
fix: workaround freqtrade 2026.4 date column dtype regression
Freqtrade 2026.4 (commit 2c5dc72) changed feather/parquet handlers to
use .dt.as_unit("ms") instead of to_datetime(col, unit="ms", utc=True).
This breaks when data files store dates as int64 epoch-ms, causing
AttributeError in feature_engineering_standard.
Use pd.to_datetime(col, utc=True) defensively to handle both int64 and
datetime inputs.
Jérôme Benoit [Tue, 31 Mar 2026 00:29:28 +0000 (02:29 +0200)]
docs: fix semantic accuracy of README configuration tunables
- polyorder: correct range from int >= 1 to int >= 0 (savgol accepts degree-0)
- robust standardization: replace 'IQR' with '(Q₃-Q₁)' (quantiles are configurable)
- label_weights: broaden scope from 'distance calculations to ideal point' to 'trial selection methods'
- label_p_order: replace 'p-order parameter for distance metrics' with 'Lp exponent for parameterized metrics'
- label_density_aggregation_param: replace 'p-order' with 'Lp exponent' for consistency
Jérôme Benoit [Thu, 12 Feb 2026 23:10:08 +0000 (00:10 +0100)]
fix(ReforceXY): add context-aware guard for efficiency coefficient division
Prevent division explosion in _compute_efficiency_coefficient() when
max_unrealized_profit ≈ min_unrealized_profit by requiring a minimum
meaningful range based on pnl_target. Also adds validation warnings
for potential_gamma=0 and pnl_target<=0 edge cases.
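A sketch of how such a context-aware guard might look (signature and the `min_range_fraction` knob are assumptions): instead of dividing by a raw `max - min` that can collapse to near zero, require the range to be a minimum fraction of `pnl_target` before dividing.

```python
import numpy as np


def compute_efficiency_coefficient(
    unrealized_profits: np.ndarray,
    pnl_target: float,
    min_range_fraction: float = 0.01,  # assumed knob, not from the source
) -> float:
    """Guarded efficiency coefficient: avoid division explosion when
    max_unrealized_profit ~ min_unrealized_profit by requiring a minimum
    meaningful range proportional to pnl_target."""
    max_up = float(np.max(unrealized_profits))
    min_up = float(np.min(unrealized_profits))
    span = max_up - min_up
    if span < min_range_fraction * abs(pnl_target):
        return 0.0  # range too small to be meaningful; neutral coefficient
    return (float(unrealized_profits[-1]) - min_up) / span
```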
Jérôme Benoit [Thu, 12 Feb 2026 14:17:10 +0000 (15:17 +0100)]
feat(ReforceXY): tune reward sensitivity and extend training period
- Increase pnl_amplification_sensitivity from 0.5 to 2.0 for stronger
reward signal differentiation
- Extend train_period_days from 60 to 120 for more training data
Jérôme Benoit [Mon, 9 Feb 2026 21:04:23 +0000 (22:04 +0100)]
fix(quickadapter): use Optuna params for TimeSeriesSplit gap calculation
Previously, the gap was calculated from ft_params with a hardcoded default, which could return incorrect values when the parameters had been Optuna-optimized. Also standardizes the log message format to use a [pair] prefix.
- Use test_size parameter in TimeSeriesSplit
- Remove unused dk parameter from _make_timeseries_split_datasets()
- Assign dk.data_dictionary = dd before logging
- Fix typo: train_test_test -> train_test_split in README
* docs: integrate data_split_parameters into tunables table
Remove standalone section and add parameters to existing table
with freqai. prefix for consistency.
* refactor: use FreqAI APIs for weight calculation and data dictionary
- Use dk.set_weights_higher_recent() instead of duplicating weight formula
- Use dk.build_data_dictionary() for consistent data structure
- Respects feature_parameters.weight_factor configuration
- Fix bug: was using data_kitchen_thread_count instead of weight_factor
* refactor: extract _apply_pipelines() to reduce code duplication
- Move pipeline definition and application logic to helper method
- Reduces train() override complexity while keeping same behavior
- Helper can be reused by future custom split implementations
* style: harmonize namespace and remove inline comments
- Rename DATA_SPLIT_METHODS to _DATA_SPLIT_METHODS (private tuple pattern)
- Reference DATA_SPLIT_METHOD_DEFAULT from _DATA_SPLIT_METHODS[0]
- Remove 22 inline comments to match self-documenting codebase style
* fix: align TimeSeriesSplit weight calculation with FreqAI semantics
Calculate weights on combined train+test set before splitting to maintain
temporal weight continuity, matching FreqAI's make_train_test_datasets behavior.
* feat: add gap=0 warning and improve TimeSeriesSplit validation
- Warn when gap=0 about look-ahead bias risk (reference label_period_candles)
- Add _compute_timeseries_min_samples() for accurate minimum sample calculation
- Account for gap and test_size in minimum sample validation
- Improve error message with all relevant parameters
* style: harmonize error messages with codebase conventions
- Use 'Invalid {param} value {value!r}: {constraint}' pattern
- Align with existing validation error format (lines 718, 1145)
* style: add cached set accessor for data split methods
- Add _data_split_methods_set() with @staticmethod @lru_cache
- Use QuickAdapterRegressorV3 prefix for class attribute access
- Use cached set for O(1) membership check in validation
* fix: address PR review comments for TimeSeriesSplit
- Use dd consistently in training logs instead of dk.data_dictionary
- Use self.data_split_parameters consistently in _apply_pipelines
- Add explicit type coercion for n_splits, gap, max_train_size
- Add validation for gap >= 0 and max_train_size >= 1
- Improve test_size validation: float in (0,1) as fraction, int >= 1 as count
- Fix _compute_timeseries_min_samples formula: (n_splits+1)*test_size + n_splits*gap
- Optimize tscv.split() iteration to avoid unnecessary list materialization
* fix: correct min_samples formula to match sklearn validation
sklearn validates: n_samples - gap - (test_size * n_splits) > 0
Correct formula: test_size * n_splits + gap + 1
* feat: auto-calculate TimeSeriesSplit gap from label_period_candles
When gap=0 is configured, automatically set gap to label_period_candles
to prevent look-ahead bias from overlapping label windows. This ensures
temporal separation between train and test sets without requiring manual
configuration.
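The gap auto-derivation and the minimum-sample validation described above can be sketched as follows (function name and error-message wording are assumptions; the formula `test_size * n_splits + gap + 1` matches sklearn's internal check):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit


def make_timeseries_split(
    n_samples: int, n_splits: int, test_size: int, gap: int, label_period_candles: int
) -> TimeSeriesSplit:
    """Configure TimeSeriesSplit with an auto-derived gap.

    When gap=0, use label_period_candles as the gap so overlapping label
    windows cannot leak future information into the test fold.
    """
    if gap == 0:
        gap = label_period_candles
    # sklearn validates n_samples - gap - test_size * n_splits > 0,
    # i.e. it needs at least test_size * n_splits + gap + 1 samples.
    min_samples = test_size * n_splits + gap + 1
    if n_samples < min_samples:
        raise ValueError(
            f"Invalid n_samples value {n_samples!r}: need >= {min_samples} "
            f"for n_splits={n_splits}, test_size={test_size}, gap={gap}"
        )
    return TimeSeriesSplit(n_splits=n_splits, test_size=test_size, gap=gap)
```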
* fix: remove redundant time import shadowing module
* fix: correct min_samples formula for dynamic test_size and document test_size param
* docs: clarify test_size default per split method
* refactor: move DependencyException import to file header
* style: use class name for class constant access
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* docs: use Python None instead of null in README
* docs: fix train_test_split description (sequential, not random)
* fix: use explicit None check for max_train_size validation
* docs: clarify timeseries_split as chronological split, not cross-validation
* refactor(quickadapter): shorten log prefixes and tailor empty test set error by split method
* refactor(quickadapter): use index pattern for timeseries_split method constant
Replace string literals with index access pattern following existing
codebase convention for _DATA_SPLIT_METHODS.
Also renames variables for semantic clarity:
- test_size_param -> test_size
- feat_dict -> feature_parameters
* refactor(quickadapter): use _TEST_SIZE constant instead of hardcoded 0.1
* chore(quickadapter): bump version to 3.11.2
* fix(quickadapter): restore test_size parameter in TimeSeriesSplit
The test_size variable from data_split_parameters was being
immediately overwritten by a type annotation line, making it
always None regardless of user configuration.
Jérôme Benoit [Mon, 26 Jan 2026 20:27:52 +0000 (21:27 +0100)]
refactor(quickadapter): harmonize parameter naming in threshold computation
- Rename extrema_selection → selection_method to match tunable name
- Rename pred_extrema → pred_label for consistency across methods
- Rename n_extrema → n_values in _build_weights_array (generic function)
- Fix bug: use default_weight param instead of constant in early return
Jérôme Benoit [Mon, 26 Jan 2026 12:23:51 +0000 (13:23 +0100)]
refactor(quickadapter): add format_dict helper and improve numeric formatting
- Add format_dict() with singledispatch for type-safe dict/params formatting
- Refactor format_number() with unified significant digits formula
- Replace raw dict logging with format_dict() across strategy and model
- Remove redundant _format_label_method_config method
- Bump version to 3.11.1
- Refactor label processing into 4 orthogonal phases:
1. Weighting: apply weights to raw label values per column
2. Smoothing: smooth weighted values per column
3. Pipeline: LabelTransformer standardization per column
4. Prediction: threshold calculation per column
- Loop over LABEL_COLUMNS for weighting and smoothing in set_freqai_targets()
- Loop over dk.label_list for thresholds in fit_live_predictions()
- All config helpers return {default, columns} structure with glob pattern support
- Rename ExtremaWeightingTransformer to LabelTransformer
- Harmonize namespace: label_weighting, label_smoothing, label_prediction
- Backward compatible with flat configs and legacy column names
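A sketch of how the `{default, columns}` config resolution with glob patterns might work (helper name is from the later commits; the specificity heuristic shown here is an assumption): overrides under `columns` are glob patterns, the most specific match wins, exact matches beat any glob, and unmatched columns fall back to `default`.

```python
import fnmatch
from typing import Any


def get_label_column_config(config: dict[str, Any], column: str) -> dict[str, Any]:
    """Resolve a per-column config from a {default, columns} structure."""
    resolved = dict(config.get("default", {}))
    best_specificity = -1.0
    best_override: dict[str, Any] = {}
    for pattern, override in config.get("columns", {}).items():
        if pattern == column:
            specificity = float("inf")  # exact match always wins
        elif fnmatch.fnmatch(column, pattern):
            # Heuristic: more literal characters means a more specific glob.
            specificity = float(len(pattern.replace("*", "").replace("?", "")))
        else:
            continue
        if specificity > best_specificity:
            best_specificity = specificity
            best_override = override
    resolved.update(best_override)
    return resolved
```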
* refactor: remove deprecated internal APIs
Remove unused deprecated functions that were replaced by the orthogonal
label processing architecture:
- get_label_transformer_config() from Utils.py
- get_label_transformer_config import from QuickAdapterV3.py
- extrema_smoothing property from QuickAdapterV3.py
* refactor: use DEFAULTS_LABEL_PREDICTION for outlier_quantile fallback
* refactor: make label weighting generic with metrics dict
- compute_label_weights() takes generic metrics dict instead of hardcoded params
- _compute_combined_weights() takes generic metrics dict
- apply_label_weighting() takes generic metrics dict
- Caller builds metrics dict, making weighting truly transverse to any label
* refactor: centralize deprecation handling with PARAM_DEPRECATIONS table
* refactor: call resolve_deprecated_params once at startup
- Change resolve_deprecated_params to modify dict in-place (returns None)
- Centralize all deprecation calls in bot_start() and __init__()
- Remove calls from properties and utility functions that run multiple times
- This ensures deprecation warnings are logged once, not repeatedly
* fix: resolve deprecations in __init__ before regressor loads
- Move deprecation resolution from bot_start() to Strategy.__init__()
so it runs before FreqaiModel.__init__() (which reads same config)
- Remove label_transformer legacy support (never released)
- Simplify label_weighting/label_pipeline properties
- Keep regressor-specific deprecations in regressor __init__
* fix: address PR review comments
- Fix label_weighting['strategy'] KeyError by using ['default']['strategy']
- Respect label_prediction.method='none' in min_max_pred()
- Use float('inf') specificity for exact matches in get_column_config
- Reuse Utils.get_column_config in LabelTransformer
- Default label_smoothing method to 'gaussian'
* refactor: unify threshold column naming and soft_extremum_alpha
- Remove MINIMA_THRESHOLD_COLUMN/MAXIMA_THRESHOLD_COLUMN constants
- Use uniform {label}_minima_threshold/{label}_maxima_threshold for all labels
- Rename internal soft_alpha to soft_extremum_alpha for consistency with config
- Remove redundant docstrings from LabelTransformer (code is self-documenting)
* refactor: cleanup docstrings and rename internal functions
* refactor: make label_pipeline orthogonal from label_weighting
* refactor: rename get_column_config to get_label_column_config
* fix: add missing method field to label_prediction logging
* refactor: per-column logging and deprecate label_smoothing.window
- Update logging in QuickAdapterV3 and QuickAdapterRegressorV3 to show
resolved per-column configs instead of just defaults with override keys
- Move get_label_column_config() to LabelTransformer.py (re-export from Utils)
- Add deprecation mapping for label_smoothing.window -> window_candles
- Fix extrema_direction undefined variable bug in populate_any_indicators
* fix: correct deprecation mappings for label_prediction params
* refactor: move label_pipeline property and logging to regressor
- Move label_pipeline property from QuickAdapterV3 strategy to QuickAdapterRegressorV3
- Move Pipeline configuration logging from _log_strategy_configuration() to
_log_model_configuration()
- Simplify define_label_pipeline() to use self.label_pipeline property
- Remove unused get_label_pipeline_config import from strategy
- Rename local variable label_weighting to label_weighting_raw for consistency
* fix: import get_label_column_config from LabelTransformer
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* refactor(quickadapter): replace string literals with constant references in LabelTransformer
* refactor(quickadapter): use per-column prediction config in regressor and strategy
* fix: reference correct config paths for label processing
Signed-off-by: Jérôme Benoit <jerome.benoit@piment-noir.org>
* fix(quickadapter): warn when column doesn't match any config pattern
* feat(deprecation): support cross-section parameter moves
Extend PARAM_DEPRECATIONS to handle parameters that moved between
config sections, not just renames within the same section.
- Add tuple[str, str] value type for (old_section, old_key) moves
- Add root_config parameter to resolve_deprecated_params()
- Add deprecation entries for 7 params moved from label_weighting
to label_pipeline: standardization, robust_quantiles,
mmad_scaling_factor, normalization, minmax_range, sigmoid_scale,
gamma
- Add call sites in QuickAdapterV3 and QuickAdapterRegressorV3
* refactor(quickadapter): replace imperative deprecation handling with declarative path-based migrations
- Replace PARAM_DEPRECATIONS dict and resolve_deprecated_params() with
CONFIG_MIGRATIONS tuple and migrate_config()
- Single migrate_config() call in __init__ replaces 6+ resolve_deprecated_params() calls
- Fix bug in set_freqai_targets: move maxima/minima column creation after weighting
- Fix DI_value_param assignment to only occur when Weibull fit succeeds
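A sketch of the declarative path-based migration described above (the table entries and dotted-path convention are illustrative assumptions, not the actual `CONFIG_MIGRATIONS` contents):

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)

# Declarative (old_path, new_path) migrations resolved against the root
# config; the entries below are illustrative, not the actual table.
CONFIG_MIGRATIONS: tuple[tuple[str, str], ...] = (
    ("label_smoothing.window", "label_smoothing.window_candles"),
    ("label_weighting.standardization", "label_pipeline.standardization"),
)


def migrate_config(config: dict[str, Any]) -> None:
    """Apply all declared migrations in-place, warning per migrated key."""
    for old_path, new_path in CONFIG_MIGRATIONS:
        *old_sections, old_key = old_path.split(".")
        old_node: Any = config
        for section in old_sections:
            old_node = old_node.get(section, {}) if isinstance(old_node, dict) else {}
        if not isinstance(old_node, dict) or old_key not in old_node:
            continue
        *new_sections, new_key = new_path.split(".")
        new_node = config
        for section in new_sections:
            new_node = new_node.setdefault(section, {})
        if new_key not in new_node:  # never clobber an explicit new value
            logger.warning("'%s' is deprecated; use '%s' instead", old_path, new_path)
            new_node[new_key] = old_node.pop(old_key)
```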
* refactor(validation): replace imperative validation with declarative system
- Add dataclass-based validators (_EnumValidator, _NumericValidator, etc.)
- Replace ~240 lines of repetitive validation code with _validate_params()
- Consolidate type aliases in LabelTransformer.py (avoid duplicates)
- Fix pyright errors: float() casts, np.asarray() for pmean returns
- Use np.nan as default for optuna .get() (proper 'no value' sentinel)
- Add pyright to requirements-dev.txt
* chore(ReforceXY): add pyright to dev dependencies
- Move DI_value stats computation before label loop
- Unify warmed_up conditional to single if/else block
- Always set threshold values (defaults when not warmed up)
* refactor(quickadapter): add OPTUNA_*_DEFAULT constants and fix static member access
- Add OPTUNA_*_DEFAULT class constants for n_jobs, n_trials, timeout,
n_startup_trials, min_resource, label_candles_step, space_reduction,
space_fraction, and seed
- Update _optuna_config property to use constants instead of hardcoded values
- Update all .get() calls to use constants as defaults for type safety
- Fix static method/property access: use QuickAdapterRegressorV3.method()
instead of self.method() for static members
- Add assertions for narrowing Optional types (weights)
- Fix min_max_pred signature to accept Optional[int] for label_period_candles
Reduces pyright errors from 174 to 158 (-16)
* fix(quickadapter): default label_prediction method to 'thresholding' for backward compatibility
DEFAULTS_LABEL_PREDICTION['method'] was 'none', which broke backward
compatibility: legacy configs without an explicit method would skip
threshold computation. Changed to 'thresholding' to preserve the
historical behavior where thresholds were always computed by default.
Jérôme Benoit [Mon, 12 Jan 2026 15:58:52 +0000 (16:58 +0100)]
perf(quickadapter): eliminate ~15k np.log() recalculations via pure log space (#41)
* perf(zigzag): eliminate ~15k np.log() recalculations via pure log space
Comprehensive optimization of zigzag() function to operate entirely in
logarithmic space, eliminating redundant np.log() recalculations.
**Performance Impact:**
- ~11,000-15,000 fewer np.log() calls per zigzag() execution
- Pre-computation: ~10,000 calls eliminated
- Pure log space conversion: ~1,050-5,100 calls eliminated
**Implementation Changes:**
Utils.py (zigzag function):
- Pre-compute log arrays once: closes_log, highs_log, lows_log (L1195-1199)
- Convert update_candidate_pivot() to accept log values (L1245)
- Convert add_pivot() to accept log values (L1401)
- Convert initial phase to log space (L1531-1569)
- Convert main loop comparisons to log space (L1583-1615)
- Rename top_change_percent() → top_log_return() (L813)
- Rename bottom_change_percent() → bottom_log_return() (L834)
- Convert efficiency ratio calculations to log space (L1343, L1368)
**API Changes:**
- zigzag() now returns pivots_values_log instead of pivots_values
- calculate_pivot_metrics() accepts log values directly
**Callers Updated:**
- QuickAdapterV3.py: Use renamed functions, add TODO comments (L674, L676, L702)
- QuickAdapterRegressorV3.py: Use len(pivots_indices) instead of len(pivots_values) (L3350, L3396)
**Mathematical Correctness:**
- Maintains semantic equivalence via log monotonicity: a > b ⟺ log(a) > log(b)
- Provides symmetric treatment of returns in log space
- All comparisons and calculations mathematically equivalent
**Breaking Changes (Future):**
- Added TODO comments for feature renaming (requires model retraining)
- %-tcp-period → %-top_log_return-period
- %-bcp-period → %-bottom_log_return-period
- %-close_pct_change → %-close_log_return
* refactor(zigzag): harmonize log variable naming to _log suffix
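The core of the optimization above can be illustrated in a few lines (the renamed helper's signature is an assumption): because log is strictly increasing, `a > b ⟺ log(a) > log(b)`, so every pivot comparison can run on log arrays precomputed once per `zigzag()` call, and a "log return" between two indices is just a difference of precomputed values.

```python
import numpy as np


def top_log_return(highs_log: np.ndarray, start: int, end: int) -> float:
    """Log return from a prior top to the current high, computed as a
    difference of precomputed logs: no np.log() call in the main loop."""
    return float(highs_log[end] - highs_log[start])


# Precompute once per zigzag() execution instead of per comparison.
highs = np.array([100.0, 101.5, 103.0, 102.0])
highs_log = np.log(highs)

# Comparisons are equivalent in linear and log space (log monotonicity).
assert (highs[2] > highs[1]) == (highs_log[2] > highs_log[1])

r = top_log_return(highs_log, 0, 2)  # log(103/100)
```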
Jérôme Benoit [Mon, 12 Jan 2026 01:43:33 +0000 (02:43 +0100)]
fix: prevent division by zero in price_retracement_percent when prior range collapses
- Replace linear price change calculations with log-space formulas in top_change_percent, bottom_change_percent, and price_retracement_percent
- Add masked division with np.isclose() guard in price_retracement_percent to handle flat prior windows (returns 0.0 when denominator ≈ 0)
- Migrate zigzag amplitude and threshold calculations to log-space for numerical stability
- Remove normalization (x/(1+x)) from zigzag amplitude and speed metrics (now unbounded in log units)
- Update %-close_pct_change feature from pct_change() to log().diff() for consistency
- Bump version to 3.10.10
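The masked-division guard described above can be sketched as follows (the parameter names are assumptions; only the `np.isclose()`-guarded division is taken from the commit message): where the prior window's range is ~0, the ratio is undefined, so 0.0 is returned instead of dividing by zero.

```python
import numpy as np


def price_retracement_percent(
    current_log_range: np.ndarray, prior_log_range: np.ndarray
) -> np.ndarray:
    """Elementwise retracement ratio with a flat-prior-window guard."""
    out = np.zeros_like(current_log_range, dtype=float)
    safe = ~np.isclose(prior_log_range, 0.0)
    # Divide only where the denominator is meaningfully non-zero;
    # guarded entries keep the 0.0 initialized in `out`.
    np.divide(current_log_range, prior_log_range, out=out, where=safe)
    return out
```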
Jérôme Benoit [Fri, 9 Jan 2026 12:48:49 +0000 (13:48 +0100)]
fix(quickadapter): preserve rsm parameter for CatBoost GPU pairwise modes (#37)
* fix(quickadapter): preserve rsm parameter for CatBoost GPU pairwise modes
The previous fix unconditionally removed the rsm parameter when using GPU,
but according to CatBoost documentation, rsm IS supported on GPU for
pairwise loss functions (PairLogit and PairLogitPairwise).
This commit refines the logic to only remove rsm for non-pairwise modes
on GPU, allowing users to benefit from rsm optimization when using
pairwise ranking loss functions.
* refactor(quickadapter): define _CATBOOST_GPU_RSM_LOSS_FUNCTIONS as global constant
- Define _CATBOOST_GPU_RSM_LOSS_FUNCTIONS as a reusable global constant
- Remove duplicate definitions in fit_regressor() and get_optuna_study_model_parameters()
- Improves maintainability: single source of truth for GPU rsm compatibility
- Ensures consistency between runtime logic and Optuna hyperparameter search
* chore: bump version to 3.10.8
Includes:
- CatBoost GPU rsm parameter fix for pairwise loss functions
- Optuna hyperparameter search optimization for rsm parameter
- Global constant for GPU rsm compatibility
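A sketch of the refined rsm handling (the constant's contents come from the commit message; the function name and params-dict shape are assumptions): rsm is dropped on GPU only for loss functions that do not support it there.

```python
# Pairwise losses for which CatBoost documents GPU support of rsm.
_CATBOOST_GPU_RSM_LOSS_FUNCTIONS = frozenset({"PairLogit", "PairLogitPairwise"})


def sanitize_catboost_params(params: dict) -> dict:
    """Remove rsm on GPU only for non-pairwise loss functions.

    Per CatBoost documentation, rsm is CPU-only except for pairwise
    losses (PairLogit, PairLogitPairwise), which support it on GPU too.
    """
    params = dict(params)  # avoid mutating the caller's dict
    if (
        params.get("task_type") == "GPU"
        and params.get("loss_function") not in _CATBOOST_GPU_RSM_LOSS_FUNCTIONS
    ):
        params.pop("rsm", None)
    return params
```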