Piment Noir Git Repositories - freqai-strategies.git/commitdiff
refactor(quickadapter): rename known_at_index to known_at_lookahead (#96)
author Jérôme Benoit <jerome.benoit@piment-noir.org>
Mon, 22 Jun 2026 01:20:11 +0000 (03:20 +0200)
committer GitHub <noreply@github.com>
Mon, 22 Jun 2026 01:20:11 +0000 (03:20 +0200)
PR #95 retained the historical column name `<label>_known_at_index` for what is now a per-row label lookahead in candles, to keep that hotfix strictly minimal. This PR converges the column suffix, the helper, the dataclass field, the static method, and the per-call-site locals onto `_known_at_lookahead`, with a retro-compat alias on the only externally-named public helper (`label_known_at_column_name = label_known_at_lookahead_column_name`).
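
The retro-compat alias can be pictured with a minimal sketch. The sigil regex below is an assumption chosen to reproduce the docstring examples in `Utils.py` (the real `_FREQAI_LABEL_SIGIL_PATTERN` is not shown in this diff), and the validation done by `_label_aux_column_name` is omitted:

```python
import re

# Hypothetical stand-in for _FREQAI_LABEL_SIGIL_PATTERN: strips a leading
# "&" or "&-". The real pattern is not part of this diff.
_SIGIL = re.compile(r"^&-?")
_LABEL_KNOWN_AT_LOOKAHEAD_SUFFIX = "_known_at_lookahead"


def label_known_at_lookahead_column_name(label_col: str) -> str:
    """Return the lookahead column name for ``label_col``."""
    return _SIGIL.sub("", label_col, count=1) + _LABEL_KNOWN_AT_LOOKAHEAD_SUFFIX


# Retro-compat alias: the old public name is bound to the same function
# object, so existing imports keep working but now yield the new suffix.
label_known_at_column_name = label_known_at_lookahead_column_name
```

Because the alias is a plain name binding, callers of the old name get the new column suffix as well; only the helper's name is kept for compatibility, not the old `_known_at_index` column name.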

The auxiliary `<label>_known_at_*` column is regenerated on every training run inside `set_freqai_targets`; FreqAI persists only the fitted model and `extra_returns_per_train`, never auxiliary dataframe columns, so the rename invalidates no on-disk artifact.

Reviewed by three parallel Oracle passes (math + claims-coherence; Python state-of-the-art + harmonization; documentation + terminology + PR-description coherence), each citing upstream evidence from `freqtrade/freqai/freqai_interface.py`, `data_kitchen.py`, and `data_drawer.py`. Consensus fixes were applied: README `causal_mode` formula symbol bound to the column token (`row-wise max(<label>_known_at_lookahead)`) to colocate definition with usage.
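
As an illustration of the `row-wise max(<label>_known_at_lookahead)` term, here is a simplified pure-Python sketch. It uses plain lists instead of pandas, and the function name and column values are hypothetical; it only mirrors the documented behaviour that a column containing any NaN is skipped and that `None` is returned when no column is usable:

```python
from __future__ import annotations

import math


def row_wise_max_lookahead(columns: dict[str, list[float]]) -> list[int] | None:
    """Row-wise max over lookahead columns, skipping columns with any NaN."""
    usable = [
        values
        for values in columns.values()
        if not any(math.isnan(v) for v in values)
    ]
    if not usable:
        # No usable label: callers fall back to the position-based purge.
        return None
    # zip(*usable) walks the columns row by row.
    return [int(max(row)) for row in zip(*usable)]


both_usable = row_wise_max_lookahead(
    {
        "s-extrema_known_at_lookahead": [12, 12, 12],
        "amplitude_known_at_lookahead": [5, 20, 7],
    }
)
one_skipped = row_wise_max_lookahead(
    {
        "s-extrema_known_at_lookahead": [12, 12, 12],
        "amplitude_known_at_lookahead": [5, 20, float("nan")],
    }
)
```

Taking the maximum is the conservative choice: a row is treated as known only once every usable label for that row is known.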

The two causal-guard local variable pairs were also harmonized to the local `train_<noun>` family (`train_known_at_lookahead`, `train_known_at_position`) used by the surrounding `_make_*_datasets` methods.
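
The arithmetic behind these two locals can be sketched as follows. This is a simplified list-based version under assumed inputs (the function name `causal_keep_mask` is hypothetical; the real code works on NumPy `int64` arrays inside the `_make_*_datasets` methods):

```python
def causal_keep_mask(
    train_positions: list[int],
    train_known_at_lookahead: list[int],
    first_test_position: int,
) -> list[bool]:
    """Keep a train row only if its label is known before the first test row."""
    # Local position + per-row lookahead = local position at which the
    # label becomes causally available.
    train_known_at_position = [
        position + lookahead
        for position, lookahead in zip(train_positions, train_known_at_lookahead)
    ]
    return [known_at < first_test_position for known_at in train_known_at_position]


# Eight train rows, a constant lookahead of 3 candles, test set starting at 8:
# rows 5, 6 and 7 are dropped because 5 + 3 >= 8.
keep_mask = causal_keep_mask(list(range(8)), [3] * 8, first_test_position=8)
```

Storing a lookahead rather than an absolute position is what keeps the column valid after `dk.slice_dataframe`: the local position is re-derived at guard time, and only the offset travels with the row.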

README.md
quickadapter/user_data/freqaimodels/QuickAdapterRegressorV3.py
quickadapter/user_data/strategies/QuickAdapterV3.py
quickadapter/user_data/strategies/Utils.py

index 2019e55d7d7e2cf9f42208ed9eb3a7e37389fb43..b453b6ee153aec5ef5f1085c95282f3de8fdf8d4 100644 (file)
--- a/README.md
+++ b/README.md
@@ -101,8 +101,8 @@ docker compose up -d --build
 | freqai.label_pipeline.gamma                                    | 1.0                           | float (0,10]                                                                                                                                           | Contrast exponent applied to labels after normalization: >1 emphasizes extrema, values between 0 and 1 soften.                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
 | _Feature parameters_                                           |                               |                                                                                                                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
 | freqai.feature_parameters.label_period_candles                 | min/max midpoint              | int >= 1                                                                                                                                               | Zigzag labeling NATR horizon.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
-| freqai.feature_parameters.label_horizon_candles                | `label_period_candles`        | int >= 1                                                                                                                                               | Number of candles after a label row before the label is considered known by causal split guards. Recommended: cover the zigzag pivot confirmation lag (the smoothing kernel half-width is added automatically by `set_freqai_targets`). Used by causal split guards and `<label>_known_at_index` metadata. When unset, falls back to `label_period_candles`.                                                                                                                                                                                                                                              |
-| freqai.feature_parameters.causal_mode                          | true                          | bool                                                                                                                                                   | Causal split guard toggle. When `true` (default): rejects `data_split_parameters.shuffle=true`, `shuffle_after_split=true`, `reverse_train_test_order=true`; for `timeseries_split` auto-sets `gap=label_horizon_candles` when unset/`0` (rejects explicit `gap<label_horizon_candles`); for `train_test_split` drops train rows where position `>=first_test_position-label_horizon_candles`; with `<label>_known_at_index` columns (per-row label lookahead in candles), additionally drops rows where `local_position + row-wise max(lookahead) >= first_test_position`. `false` is deprecated; acausal baselines only.                                                                                                                                                                                                                                       |
+| freqai.feature_parameters.label_horizon_candles                | `label_period_candles`        | int >= 1                                                                                                                                               | Number of candles after a label row before the label is considered known by causal split guards. Recommended: cover the zigzag pivot confirmation lag (the smoothing kernel half-width is added automatically by `set_freqai_targets`). Used by causal split guards and `<label>_known_at_lookahead` metadata. When unset, falls back to `label_period_candles`.                                                                                                                                                                                                                                              |
+| freqai.feature_parameters.causal_mode                          | true                          | bool                                                                                                                                                   | Causal split guard toggle. When `true` (default): rejects `data_split_parameters.shuffle=true`, `shuffle_after_split=true`, `reverse_train_test_order=true`; for `timeseries_split` auto-sets `gap=label_horizon_candles` when unset/`0` (rejects explicit `gap<label_horizon_candles`); for `train_test_split` drops train rows where position `>=first_test_position-label_horizon_candles`; with `<label>_known_at_lookahead` columns, additionally drops rows where `local_position + row-wise max(<label>_known_at_lookahead) >= first_test_position`. `false` is deprecated; acausal baselines only.                                                                                                                                                                                                                                       |
 | freqai.feature_parameters.min_label_period_candles             | 12                            | int >= 1                                                                                                                                               | Minimum labeling NATR horizon used for reversals labeling HPO.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
 | freqai.feature_parameters.max_label_period_candles             | 24                            | int >= 1                                                                                                                                               | Maximum labeling NATR horizon used for reversals labeling HPO.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
 | freqai.feature_parameters.label_natr_multiplier                | min/max midpoint              | float > 0                                                                                                                                              | Zigzag labeling NATR multiplier.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
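index e0606fb84794b7bd15a6f7eed583cacf54bd1cb7..3a4a92f37ef6358713a69cb9717616bdd52811f4 100644 (file)
--- a/quickadapter/user_data/freqaimodels/QuickAdapterRegressorV3.py
+++ b/quickadapter/user_data/freqaimodels/QuickAdapterRegressorV3.py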
@@ -85,7 +85,7 @@ from Utils import (
     get_label_weighting_config,
     get_min_max_label_period_candles,
     get_optuna_study_model_parameters,
-    label_known_at_column_name,
+    label_known_at_lookahead_column_name,
     label_weight_column_name,
     migrate_config,
     optuna_load_best_params,
@@ -121,7 +121,7 @@ def _log_known_at_none_once(pair: str, context: str) -> None:
         return
     _KNOWN_AT_NONE_LOGGED.add(key)
     logger.info(
-        f"[{pair}] {context}: no <label>_known_at_index column present; "
+        f"[{pair}] {context}: no <label>_known_at_lookahead column present; "
         "causal guards use position-based purge only (label-aware filtering disabled)"
     )
 
@@ -511,18 +511,18 @@ class QuickAdapterRegressorV3(BaseRegressionModel):
         return positions.loc[filtered_dataframe.index]
 
     @staticmethod
-    def _known_at_index(
+    def _known_at_lookahead(
         filtered_dataframe: pd.DataFrame,
         unfiltered_df: pd.DataFrame,
     ) -> pd.Series | None:
         """Per-row label lookahead (in candles) across all registered labels.
 
-        See ``LabelData.known_at_index`` for the lookahead-vs-position
+        See ``LabelData.known_at_lookahead`` for the lookahead-vs-position
         contract and the slice-invariance rationale; callers must add the
         row's LOCAL position in ``unfiltered_df`` to recover the local
         index at which the label becomes causally available.
 
-        Row-wise ``max`` of every present ``<label>_known_at_index``
+        Row-wise ``max`` of every present ``<label>_known_at_lookahead``
         column; labels with a missing column or any NaN are skipped
         silently (opt-in by emission). Returns ``None`` when no label is
         usable; callers then fall back to the position-based purge.
@@ -532,13 +532,15 @@ class QuickAdapterRegressorV3(BaseRegressionModel):
         )
         series_list: list[pd.Series] = []
         for label_col in LABEL_COLUMNS:
-            known_at_col = label_known_at_column_name(label_col)
-            if known_at_col not in unfiltered_df.columns:
+            known_at_lookahead_col = label_known_at_lookahead_column_name(label_col)
+            if known_at_lookahead_col not in unfiltered_df.columns:
                 continue
-            known_at = unfiltered_df.loc[filtered_dataframe.index, known_at_col]
-            if known_at.isna().any():
+            lookahead = unfiltered_df.loc[
+                filtered_dataframe.index, known_at_lookahead_col
+            ]
+            if lookahead.isna().any():
                 continue
-            series_list.append(pd.to_numeric(known_at, errors="raise"))
+            series_list.append(pd.to_numeric(lookahead, errors="raise"))
         if not series_list:
             return None
         if len(series_list) == 1:
@@ -1970,16 +1972,18 @@ class QuickAdapterRegressorV3(BaseRegressionModel):
                     train_positions.to_numpy(dtype=np.int64)
                     < first_test_position - label_horizon_candles
                 )
-                known_at_index = QuickAdapterRegressorV3._known_at_index(
+                known_at_lookahead = QuickAdapterRegressorV3._known_at_lookahead(
                     features, unfiltered_df
                 )
-                if known_at_index is not None:
-                    known_at_train_delta = known_at_index.loc[train_features.index]
-                    known_at_train_position = (
+                if known_at_lookahead is not None:
+                    train_known_at_lookahead = known_at_lookahead.loc[
+                        train_features.index
+                    ]
+                    train_known_at_position = (
                         train_positions.to_numpy(dtype=np.int64)
-                        + known_at_train_delta.to_numpy(dtype=np.int64)
+                        + train_known_at_lookahead.to_numpy(dtype=np.int64)
                     )
-                    keep_mask &= known_at_train_position < first_test_position
+                    keep_mask &= train_known_at_position < first_test_position
                 else:
                     _log_known_at_none_once(dk.pair, "train_test_split causal guard")
                 (
@@ -2397,16 +2401,16 @@ class QuickAdapterRegressorV3(BaseRegressionModel):
             )
             first_test_position = int(row_positions.iloc[test_idx].min())
             train_positions = row_positions.iloc[train_idx]
-            known_at_index = QuickAdapterRegressorV3._known_at_index(
+            known_at_lookahead = QuickAdapterRegressorV3._known_at_lookahead(
                 filtered_dataframe, unfiltered_df
             )
-            if known_at_index is not None:
-                known_at_train_delta = known_at_index.iloc[train_idx]
-                known_at_train_position = (
+            if known_at_lookahead is not None:
+                train_known_at_lookahead = known_at_lookahead.iloc[train_idx]
+                train_known_at_position = (
                     train_positions.to_numpy(dtype=np.int64)
-                    + known_at_train_delta.to_numpy(dtype=np.int64)
+                    + train_known_at_lookahead.to_numpy(dtype=np.int64)
                 )
-                keep_mask = known_at_train_position < first_test_position
+                keep_mask = train_known_at_position < first_test_position
                 (
                     train_features,
                     train_labels,
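index d2ed0aac2f697490f792745077c1b22bcba1c850..4145cafab2843737f15c6b48ff446c66fbb5ecf3 100644 (file)
--- a/quickadapter/user_data/strategies/QuickAdapterV3.py
+++ b/quickadapter/user_data/strategies/QuickAdapterV3.py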
@@ -59,7 +59,7 @@ from Utils import (
     get_label_smoothing_config,
     get_label_weighting_config,
     get_zl_ma_fn,
-    label_known_at_column_name,
+    label_known_at_lookahead_column_name,
     label_weight_column_name,
     migrate_config,
     nan_average,
@@ -961,9 +961,9 @@ class QuickAdapterV3(IStrategy):
 
             dataframe[label_col] = label_data.series
 
-            if label_data.known_at_index is not None:
-                dataframe[label_known_at_column_name(label_col)] = (
-                    label_data.known_at_index
+            if label_data.known_at_lookahead is not None:
+                dataframe[label_known_at_lookahead_column_name(label_col)] = (
+                    label_data.known_at_lookahead
                 )
 
             label_weight_col = label_weight_column_name(label_col)
@@ -998,14 +998,14 @@ class QuickAdapterV3(IStrategy):
             # Zero-phase smoothing reads future candles within the kernel
             # half-width; extend the per-row label lookahead so causal
             # split guards account for the smoothing lookahead.
-            known_at_column = label_known_at_column_name(label_col)
-            if known_at_column in dataframe.columns:
+            known_at_lookahead_column = label_known_at_lookahead_column_name(label_col)
+            if known_at_lookahead_column in dataframe.columns:
                 kernel_half_width = get_smoothing_kernel_half_width(
                     col_smoothing_config, series_length=series_length
                 )
                 if kernel_half_width > 0:
-                    dataframe[known_at_column] = (
-                        dataframe[known_at_column] + kernel_half_width
+                    dataframe[known_at_lookahead_column] = (
+                        dataframe[known_at_lookahead_column] + kernel_half_width
                     )
 
             if label_col == EXTREMA_COLUMN:
index 6ff7413536d62087266832f87623516cf7b13bd9..7182f2418d80260c4155ce8e6f745bf9da904380 100644 (file)
--- a/quickadapter/user_data/strategies/Utils.py
+++ b/quickadapter/user_data/strategies/Utils.py
@@ -531,9 +531,7 @@ EXTREMA_DIRECTION_COLUMN: Final[str] = "extrema_direction"
 EXTREMA_DIRECTION_SMOOTHED_COLUMN: Final[str] = "extrema_direction_smoothed"
 EXTREMA_WEIGHT_COLUMN: Final[str] = "extrema_weight"
 EXTREMA_WEIGHT_SMOOTHED_COLUMN: Final[str] = "extrema_weight_smoothed"
-# Suffix is historical; stored values are per-row label lookaheads
-# (in candles), not absolute indexes. See ``LabelData.known_at_index``.
-_LABEL_KNOWN_AT_SUFFIX: Final[str] = "_known_at_index"
+_LABEL_KNOWN_AT_LOOKAHEAD_SUFFIX: Final[str] = "_known_at_lookahead"
 
 LABEL_WEIGHT_SUFFIX: Final[str] = "_weight"
 
@@ -557,7 +555,7 @@ def _label_aux_column_name(label_col: str, suffix: str) -> str:
     Examples:
         ``("&s-extrema", "_weight")``  -> ``"s-extrema_weight"``
         ``("&-amplitude", "_weight")`` -> ``"amplitude_weight"``
-        ``("&s-extrema", "_known_at_index")`` -> ``"s-extrema_known_at_index"``
+        ``("&s-extrema", "_known_at_lookahead")`` -> ``"s-extrema_known_at_lookahead"``
     """
     stripped = _FREQAI_LABEL_SIGIL_PATTERN.sub("", label_col, count=1)
     if not stripped or not any(c.isalpha() for c in stripped):
@@ -580,13 +578,12 @@ def label_weight_column_name(label_col: str) -> str:
     return _label_aux_column_name(label_col, LABEL_WEIGHT_SUFFIX)
 
 
-def label_known_at_column_name(label_col: str) -> str:
-    """Return the per-row label-lookahead column name for a label column.
+def label_known_at_lookahead_column_name(label_col: str) -> str:
+    """Return the lookahead column name for ``label_col`` (see ``LabelData.known_at_lookahead``)."""
+    return _label_aux_column_name(label_col, _LABEL_KNOWN_AT_LOOKAHEAD_SUFFIX)
 
-    Column values are lookaheads in candles, not absolute positions; see
-    ``LabelData.known_at_index``.
-    """
-    return _label_aux_column_name(label_col, _LABEL_KNOWN_AT_SUFFIX)
+
+label_known_at_column_name = label_known_at_lookahead_column_name
 
 
 @dataclass
@@ -597,18 +594,18 @@ class LabelData:
         series: per-row label values aligned to ``dataframe.index``.
         indices: positions of detected pivots in ``series``.
         metrics: per-pivot metric lists (parallel to ``indices``).
-        known_at_index: optional per-row label lookahead in candles
+        known_at_lookahead: optional per-row label lookahead in candles
             (NOT an absolute position). Invariant under
             ``dk.slice_dataframe``. Causal split guards recover the
             local availability position as ``row_local_position +
-            known_at_index[row]``. ``None`` opts the label out of
+            known_at_lookahead[row]``. ``None`` opts the label out of
             label-aware causal filtering.
     """
 
     series: pd.Series
     indices: list[int]
     metrics: dict[str, list[float]]
-    known_at_index: pd.Series | None = None
+    known_at_lookahead: pd.Series | None = None
 
 
 LabelGenerator = Callable[[pd.DataFrame, dict[str, Any], Logger | None], LabelData]
@@ -746,7 +743,7 @@ def _generate_extrema_label(
     # freqtrade's ``dk.slice_dataframe`` runs AFTER ``set_freqai_targets``,
     # so any pre-slice absolute position would no longer match the causal
     # guard's local ``np.arange(len(unfiltered_df))`` coordinate system.
-    known_at_index = pd.Series(
+    known_at_lookahead = pd.Series(
         int(label_horizon_candles),
         index=dataframe.index,
         dtype=np.int64,
@@ -756,7 +753,7 @@ def _generate_extrema_label(
         series=series,
         indices=pivots_indices,
         metrics=metrics,
-        known_at_index=known_at_index,
+        known_at_lookahead=known_at_lookahead,
     )
 
 
@@ -798,7 +795,7 @@ def get_smoothing_kernel_half_width(
 ) -> int:
     """Half-width (in candles) of the smoothing kernel's lookahead.
 
-    Equals the lookahead applied to ``known_at_index`` after smoothing.
+    Equals the lookahead applied to ``known_at_lookahead`` after smoothing.
     Mirrors ``smooth()`` window normalization and short-series gating
     via shared primitives (``get_odd_window``, ``get_even_window``,
     ``get_savgol_params``).