fix(quickadapter): use slice-invariant lookahead for causal guard (#95)
* fix(quickadapter): use slice-invariant lookahead for causal guard
PR #78 stored '<label>_known_at_index' as 'arange(len) + horizon +
kernel_half_width' -- absolute positions in the dataframe passed to
'set_freqai_targets'. freqtrade's 'dk.slice_dataframe' (a '.loc' filter)
runs AFTER 'set_freqai_targets' and drops warmup rows but preserves
column values, so those pre-slice positions survived into the post-slice
'unfiltered_df'. The causal guard then compared them against
'first_test_position' derived from 'np.arange(len(unfiltered_df))' --
local post-slice positions in a different coordinate system. The
coordinate mismatch wiped out most or all training rows on every pair.
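A minimal sketch of the mismatch, using NumPy only. The sizes, the
boolean mask standing in for 'dk.slice_dataframe', and every variable
name except 'first_test_position' are illustrative assumptions, not the
strategy's actual code:

```python
import numpy as np

# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
n_rows = 10            # rows in the dataframe passed to set_freqai_targets
warmup = 4             # leading rows later dropped by the slice
horizon = 1            # label horizon, in candles
kernel_half_width = 1  # smoothing kernel half-width, in candles

# PR #78 behaviour: absolute (pre-slice) known-at positions.
known_at_abs = np.arange(n_rows) + horizon + kernel_half_width

# Stand-in for dk.slice_dataframe: rows are dropped, column values are kept.
keep = np.arange(n_rows) >= warmup
known_at_after_slice = known_at_abs[keep]  # still pre-slice coordinates

# The causal guard works in local (post-slice) positions.
local_positions = np.arange(known_at_after_slice.size)
first_test_position = 4  # hypothetical split: last two rows are test
train_positions = local_positions[:first_test_position]

# Buggy comparison: pre-slice values against a post-slice threshold.
safe_buggy = known_at_after_slice[train_positions] < first_test_position
print(int(safe_buggy.sum()), "train rows survive")
```

With these numbers the stored values for the train rows are 6..9, none
of which is below 4, so the guard removes every train row.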
Production crash on 2026-06-22 (XRP/USD): "removed 2621
causal-unsafe train rows" followed by "causal guard removed all
train rows, skipping".
Fix: the column now stores a per-row label lookahead (in candles),
invariant under 'dk.slice_dataframe'. Consumers combine the row's
local position with the lookahead to recover the local known-at
position before comparing to 'first_test_position'. Column name
'<label>_known_at_index' is retained for this hotfix; a rename to
'<label>_known_at_lookahead' (with a backward-compatible alias) is left
to a follow-up PR per AGENTS.md 'small, verifiable changes'.
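A minimal sketch of the fixed comparison, using NumPy only. The helper
name 'causal_safe_mask', the sizes, and the array slicing standing in
for 'dk.slice_dataframe' are illustrative assumptions; only the
'train_positions + delta < first_test_position' shape comes from this
change:

```python
import numpy as np


def causal_safe_mask(train_positions, lookahead, first_test_position):
    """Hypothetical helper: True where a train row's label is fully known
    before the first test row, with everything in local positions."""
    delta = lookahead[train_positions]
    known_at_local = train_positions + delta
    return known_at_local < first_test_position


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
n_rows, warmup = 10, 4
horizon, kernel_half_width = 1, 1

# The column now stores a per-row lookahead (in candles), not a position.
lookahead = np.full(n_rows, horizon + kernel_half_width)

# Dropping warmup rows keeps the values, and the values stay meaningful.
lookahead_after_slice = lookahead[warmup:]

train_positions = np.arange(4)  # local post-slice positions
first_test_position = 4         # hypothetical split point

mask = causal_safe_mask(train_positions, lookahead_after_slice, first_test_position)
print(mask.tolist())
```

Here the local known-at positions are 2, 3, 4, 5, so only the two train
rows whose labels depend on test-side candles are dropped.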
Touches:
- Utils.py: producer rewritten to store a constant per-row lookahead;
'LabelData' and 'label_known_at_column_name' docstrings document the
new contract; '_LABEL_KNOWN_AT_SUFFIX' carries an inline disambiguation.
- QuickAdapterV3.py: smoothing-lookahead advance comment harmonized to
the canonical 'per-row label lookahead (in candles)' phrasing.
- QuickAdapterRegressorV3.py: '_known_at_index' docstring rewritten;
'train_test_split' and 'timeseries_split' causal-mode branches add
'train_positions + delta' before the '< first_test_position' check;
'timeseries_split' hoists 'train_positions' for symmetry with
'train_test_split'.
- README.md: 'causal_mode' tunable description reflects the new
comparison semantics.
Reviewed by three parallel Oracle passes (math/algo/scope,
Python state-of-the-art / harmonization, documentation /
terminology / concision) with cross-validation; one false alarm
on a missing position-only fallback in 'timeseries_split' was
resolved by confirming 'TimeSeriesSplit.gap' enforces the
chronological purge at the sklearn layer.
* docs(quickadapter): shrink _known_at_index docstring to LabelData pointer
Per multi-oracle PR #95 review (Oracle 3 §8.1): paragraph 1 of
_known_at_index duplicated the slice-invariance rationale already
canonical on LabelData.known_at_index. Replace with a thin pointer per
AGENTS.md *No duplication: maintain single authoritative documentation
source; reference other sources rather than copying.*