feat(label_weighting): add epsilon_gaussian fill_method, fix sub-floor pivot-row dip (#84)
Add a fourth off-pivot weighting mode that superposes the epsilon
floor and the gaussian bumps additively, and fix a related defect in
_scatter_weights that allowed pivot rows to sit *below* the off-pivot
field whenever that field at the pivot's index could legitimately
exceed the pivot's raw weight.
Math
----
Define the off-pivot field
f(i) = phi + max_{p in P} w_p * exp(-(i - p)^2 / (2 * sigma_p^2))
with phi = eps * B(W) (B mean or median, eps in [0, 1]; phi = 0 on
empty pivots or non-finite baseline), and per-pivot sigma_p from
_compute_pivot_sigmas (fixed or k-NN). The combined formulation reuses
both existing closed forms verbatim:
fill_method = 'zero' -> f(i) = 0
fill_method = 'epsilon' -> f(i) = phi (constant in i)
fill_method = 'gaussian' -> f(i) = max_p w_p * exp(...)
fill_method = 'epsilon_gaussian' -> f(i) = phi + max_p w_p * exp(...)
Bound: phi <= f(i) <= phi + max_p w_p. The new mode reduces to pure
gaussian when eps = 0 (bit-identical). The reduction is a per-row max
over per-pivot Gaussian bumps; phi is the epsilon floor.
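The four closed forms above can be sketched in pure Python as follows. This is an illustration only: it assumes a single shared sigma instead of the per-pivot sigma_p, skips the non-finite-baseline fallback, and the name off_pivot_field and its parameters are hypothetical rather than the module's API.

```python
import math
from statistics import mean, median

def off_pivot_field(n, pivots, weights, method, eps=0.0, baseline="median", sigma=1.0):
    """Return f(i) for i in range(n) under the given fill_method (illustrative only)."""
    if not pivots:
        return [0.0] * n  # empty pivots: every mode yields zeros
    # phi is the epsilon floor: eps * B(W), with B = mean or median.
    phi = 0.0
    if method in ("epsilon", "epsilon_gaussian"):
        phi = eps * (mean(weights) if baseline == "mean" else median(weights))
    out = []
    for i in range(n):
        bump = 0.0
        if method in ("gaussian", "epsilon_gaussian"):
            # Per-row max over per-pivot Gaussian bumps.
            bump = max(w * math.exp(-((i - p) ** 2) / (2.0 * sigma**2))
                       for p, w in zip(pivots, weights))
        out.append(phi + bump)
    return out

pivots, weights = [0, 100, 200], [0.001, 1.0, 1.0]
g = off_pivot_field(201, pivots, weights, "gaussian", sigma=2.0)
eg = off_pivot_field(201, pivots, weights, "epsilon_gaussian", eps=0.5, sigma=2.0)
eg0 = off_pivot_field(201, pivots, weights, "epsilon_gaussian", eps=0.0, sigma=2.0)
```

Here eg differs from g by phi = 0.5 on every row, and eg0 equals g exactly, which matches the additive decomposition and the eps = 0 reduction stated above.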
Sub-floor / sub-bump pivot-row dip (bug fix)
--------------------------------------------
Before this change _scatter_weights wrote out[p] = w_p unconditionally,
so a pivot whose raw weight was below the off-pivot field at its
index appeared as a sharp dip relative to its neighbors. Two
manifestations of the same defect class:
- 'epsilon' / 'epsilon_gaussian' (sub-floor): a pivot with
w_p < phi (e.g. W = (0.001, 1.0, 1.0) with eps = 0.5 and
B = median, phi = 0.5) sat at 0.001 while neighbor rows sat at phi.
- 'gaussian' / 'epsilon_gaussian' (sub-bump): a weak pivot with a
strong neighbor (e.g. W = (0.001, 1.0) at indices (0, 1), sigma = 1)
sat at 0.001 while the off-pivot field at the pivot's own index was
1.0 * exp(-0.5) ~= 0.6065 (the neighbor's gaussian bump).
Both cases are corrected by a single uniform change: _scatter_weights
now writes out[p] = max(w_p, fill[p]) so pivot rows are never written
below the off-pivot field. 'zero' is bit-identical (fill is always 0,
so max(w_p, 0) = w_p when w_p >= 0). 'gaussian' in the sparse-pivot
regime (the typical configuration, especially with k-NN bandwidth) is
also bit-identical because fill[p] equals w_p when no neighbor's bump
at p exceeds w_p.
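A minimal sketch of the pivot-row lift, reproducing both counterexamples above. It is pure Python with a single shared sigma; gaussian_field and scatter are hypothetical stand-ins for the NumPy helpers, and the lift flag exists only to contrast the legacy write with the fixed one.

```python
import math

def gaussian_field(n, pivots, weights, sigma):
    """f(i) = max_p w_p * exp(-(i - p)^2 / (2 sigma^2)); single sigma for brevity."""
    return [max(w * math.exp(-((i - p) ** 2) / (2.0 * sigma**2))
                for p, w in zip(pivots, weights))
            for i in range(n)]

def scatter(fill, pivots, weights, lift=True):
    """Write pivot rows over the fill; lift=False reproduces the legacy write."""
    out = list(fill)
    for p, w in zip(pivots, weights):
        out[p] = max(w, fill[p]) if lift else w
    return out

# Sub-bump case: weak pivot at index 0 next to a strong pivot at index 1.
pivots, weights = [0, 1], [0.001, 1.0]
fill = gaussian_field(2, pivots, weights, sigma=1.0)
legacy = scatter(fill, pivots, weights, lift=False)
fixed = scatter(fill, pivots, weights, lift=True)

# Sub-floor case: constant epsilon floor phi = 0.5 with a pivot below it.
floor_fixed = scatter([0.5, 0.5, 0.5], [0, 1, 2], [0.001, 1.0, 1.0])
```

The legacy write leaves out[0] at 0.001, while the lifted write raises it to the neighbor's bump exp(-0.5) in the sub-bump case and to phi in the sub-floor case.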
Implementation
--------------
- _scatter_weights: pivot rows take np.maximum(weights, fill_weights)
unconditionally. Off-pivot rows unchanged.
- _compute_epsilon_floor (renamed from _epsilon_floor): extracted
helper that returns phi (mean / median / fallback). Reused by
'epsilon' and 'epsilon_gaussian'. Parameter baseline narrowed to
the FillEpsilonBaseline Literal type.
- _compute_gaussian_bumps (renamed from _gaussian_bumps): extracted
adapter over _gaussian_fill_weights. Reused by 'gaussian' and
'epsilon_gaussian'. logger is kwarg-only.
- compute_label_weights: dispatcher gains the FILL_METHODS[3] branch.
The combined branch computes bumps once and adds phi in-place via
np.add(out=fill_weights), keeping peak memory at the existing
(chunk, M) buffer; phi is constant in p so the post-reduction add
is algebraically identical to adding inside the chunk loop while
saving O(chunk * M) writes. ValueError messages tightened to
include 'supported values are ...' for parity with
_compute_pivot_sigmas and _aggregate_metrics.
- LabelTransformer.py: extends FillMethod Literal and FILL_METHODS
tuple with 'epsilon_gaussian' at index 3. No new tunables, no new
validators (the existing _EnumValidator(FILL_METHODS) picks up the
new value automatically; existing range / type validators on
fill_epsilon / fill_sigma_* / fill_bandwidth_* apply unchanged).
- QuickAdapterV3.py: logging block refactored from if/elif chain to
parallel if blocks keyed on tuple membership so epsilon and sigma
parameter groups emit independently for each mode that uses them.
Documentation
-------------
README cells updated with set-membership 'Ignored when ...' clauses
matching the new index sets (epsilon | epsilon_gaussian for the
floor parameters, gaussian | epsilon_gaussian for the kernel
parameters). The fill_method description names the additive
composition explicitly and the pivot-row lift invariant
(out[p] = max(w_p, f(p))).
Verified manually on the host via AST extraction harness (no automated
test infrastructure exists in quickadapter/):
- zero mode: bit-exact with prior code (fill is 0, max(w_p, 0) = w_p).
- gaussian mode, sparse pivots: bit-identical to prior code (no
neighbor's bump at p exceeds w_p, so the lift is a no-op).
- gaussian mode, neighbor-dominated regime: pivot rows lifted to the
local field max, fixing the sub-bump dip. Verified with the
counterexample W = (0.001, 1.0) at indices (0, 1), sigma = 1:
legacy out[0] = 0.001, fixed out[0] = 1.0 * exp(-0.5) ~= 0.6065.
- epsilon back-compat (above-floor pivots): phi = eps * mean(W)
reproduced; pivots above phi unchanged.
- epsilon pivot-dip fix: W = (0.001, 1.0, 1.0), eps = 0.5,
baseline = median; legacy out[0] = 0.001, fixed out[0] = phi = 0.5.
- epsilon_gaussian with eps = 0: bit-identical to pure gaussian.
- epsilon_gaussian additive decomposition: out_eg - out_g = phi at
every off-pivot row.
- epsilon_gaussian pivot-row lifted: W = (0.001, 1.0, 1.0) at
well-separated indices (e.g. (0, 100, 200)), eps = 0.5,
baseline = median, sigma = 2.0; out[0] = phi + 0.001 ~= 0.501
(was 0.001 before the scatter fix).
- empty pivots: all four modes return all-zero.
- negative pivot weights still rejected by _gaussian_fill_weights.
- knn bandwidth + epsilon_gaussian: finite, bounded below by phi.
- ValueError messages on invalid fill_method / fill_epsilon_baseline
include 'supported values are ...'.
feat(label_weighting): adaptive k-NN bandwidth for gaussian off-pivot fill (#77)
* feat(label_weighting): adaptive k-NN bandwidth for gaussian off-pivot fill
Address the crushing of weaker pivots by stronger neighbors when pivots
fall within ~sigma_candles of each other in fill_method='gaussian'. The
per-row max aggregator preserves the upper bound Out[i] <= max_p w_p
but a wide constant sigma lets a strong neighbor's Gaussian dominate a
weak pivot's tail.
Add a k-nearest-neighbor bandwidth selector (Loftsgaarden &
Quesenberry 1965; Silverman 1986, §5.2) that adapts each
pivot's sigma to local pivot density:
sigma_p = clip(alpha * d_k(p), sigma_min, sigma_max)
where d_k(p) is the index distance to the k-th pivot neighbor. The
upper bound on Out[i] is preserved (no over-amplification) and dense
clusters automatically contract their Gaussians to stop overlapping.
Implementation:
- Pivots are emitted chronologically by zigzag, so the 1D k-NN reduces
to a sliding k-window over sorted indices, O(M) without a spatial
index.
- _gaussian_fill_weights accepts a per-pivot sigma vector via NumPy
broadcasting; the existing chunked exp/multiply/max kernel is
unchanged.
- Default fill_bandwidth='fixed' preserves byte-for-byte the previous
algorithm.
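The bandwidth rule can be sketched as below, assuming the pivot indices are already sorted ascending. The function name knn_sigmas, the sigma_max default, and the handling of pivots with fewer than k neighbors are assumptions made for this illustration, not behavior taken from the module.

```python
def knn_sigmas(pivots, k=1, alpha=1.0, sigma_min=0.5, sigma_max=10.0):
    """sigma_p = clip(alpha * d_k(p), sigma_min, sigma_max) for sorted 1D pivot indices."""
    m = len(pivots)
    sigmas = []
    for j, p in enumerate(pivots):
        # In sorted 1D data the k nearest neighbors lie within k positions of j,
        # so only a window of at most 2k candidates has to be inspected.
        lo, hi = max(0, j - k), min(m, j + k + 1)
        dists = sorted(abs(pivots[q] - p) for q in range(lo, hi) if q != j)
        if not dists:
            sigmas.append(sigma_max)  # assumption: a lone pivot gets the widest kernel
            continue
        # Assumption: with fewer than k neighbors, use the farthest available one.
        d_k = dists[min(k, len(dists)) - 1]
        sigmas.append(min(max(alpha * d_k, sigma_min), sigma_max))
    return sigmas

sigmas = knn_sigmas([10, 12, 13, 40, 80], k=1, alpha=1.0, sigma_min=0.5, sigma_max=10.0)
```

In this example the clustered pivots at 12 and 13 contract to sigma = 1.0, while the isolated pivots at 40 and 80 are clipped to sigma_max.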
Tunables (added to DEFAULTS_LABEL_WEIGHTING, validated via _WEIGHTING_SPECS):
- fill_bandwidth: 'fixed' | 'knn' (default 'fixed')
- fill_bandwidth_neighbors: int >= 1 (default 1)
- fill_bandwidth_alpha: float > 0 (default 1.0)
- fill_sigma_min_candles: float >= 0.5 (default 0.5)
README updated.
* fix(label_weighting): correct gaussian kNN bandwidth
* chore(quickadapter): bump strategy and regressor version 3.11.12 -> 3.11.13
feat(weights): add uniform pivot weighting strategy (#75)
* feat(weights): add uniform pivot weighting strategy
Adds "uniform" to WEIGHT_STRATEGIES (between "none" and the metric
names) which assigns weight=1.0 to every detected pivot. Off-pivot rows
remain governed by the existing fill_method (zero / epsilon / gaussian),
so uniform + gaussian collapses cleanly to a pure proximity kernel
around each pivot. Naming follows sklearn convention
(KNeighborsClassifier(weights="uniform"), DummyClassifier(strategy="uniform")).
* chore(quickadapter): bump strategy and regressor version 3.11.11 -> 3.11.12
* refactor(weights): hoist indices_array and valid_mask in compute_label_weights
Compute indices_array and valid_mask once at the top of the function
instead of after the strategy dispatch. The uniform branch can now use
indices_array.size instead of len(indices), and the duplicate
np.asarray / valid_mask construction lower in the function is removed.
Saves one np.asarray and one mask computation per call.
* refactor(weights): consolidate _scatter_weights signature
Drop the redundant indices: list[int] parameter now that the only
caller (compute_label_weights) hoists indices_array and valid_mask.
The function takes them positionally and uses indices_array.size for
size checks, removing three len(indices) calls and the optional-kwarg
fallback paths.
* refactor(weights): pipeline API consolidation pass
- compute_label_weights: drop Optional placeholder, accept
Sequence[int] | NDArray[np.integer] for indices
- standardize Optional[X] -> X | None across module (PEP 604)
- _impute_weights: positional call instead of keyword on single arg
- _pivot_equivalent_count: remove unreachable threshold <= 0 branch
(survivors.size > 0 implies survivors.max() > 0 because the input
has been sanitized to non-negative values upstream)
- _scatter_weights: drop dead 'if not np.any(valid_mask)' early
return; vectorized assignment is a no-op when the mask is all-False
- sanitize_and_renormalize: clarify empty-input semantics in docstring
* fix(weights): zero leading and trailing non-finite runs in _impute_weights
The boundary mask only covered the strict tip positions (index 0 and -1),
so multi-element non-finite runs at the boundary were median-imputed
instead of zeroed. With input [NaN, NaN, 1.0, 2.0, NaN, NaN] the function
returned [0.0, 1.5, 1.0, 2.0, 1.5, 0.0] instead of [0, 0, 1.0, 2.0, 0, 0],
silently extending pivot weight to the unconfirmed boundary candles.
Use np.argmax on the finite mask to detect the leading and trailing
non-finite runs and zero the entire run, matching the docstring contract.
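A pure-Python sketch of the boundary rule, using list.index on the finite mask as the analogue of np.argmax. The name impute_weights and the all-non-finite fallback to zeros are assumptions for this illustration.

```python
import math
from statistics import median

def impute_weights(weights):
    """Zero leading/trailing non-finite runs; median-impute interior gaps (sketch)."""
    finite = [math.isfinite(w) for w in weights]
    if not any(finite):
        return [0.0] * len(weights)  # assumption: nothing finite means no weight at all
    first = finite.index(True)                          # start of the finite core
    last = len(finite) - 1 - finite[::-1].index(True)   # end of the finite core
    fill = median(w for w, ok in zip(weights, finite) if ok)
    out = []
    for i, (w, ok) in enumerate(zip(weights, finite)):
        if i < first or i > last:
            out.append(0.0)  # boundary run: unconfirmed candles get no weight
        else:
            out.append(w if ok else fill)
    return out

nan = float("nan")
result = impute_weights([nan, nan, 1.0, 2.0, nan, nan])
interior = impute_weights([nan, 1.0, nan, 3.0, nan])
```

With the input from the message above, the whole leading and trailing runs come back as zeros, and only the gap between finite values is median-imputed.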
* fix(weights): floor stacked metrics in geometric and harmonic aggregation
Power means with p<=0 collapse to 0 on a single zero in the stack:
pmean([1, 0, 3], p=-1) = 0.0 and pmean([1, 0, 3], p=0) = 0.0. Combined
with compose_sample_weights' (arr <= 0) drop_mask predicate, a single
metric returning 0 on a pivot silently drops that row entirely.
Floor stacked_metrics at np.finfo(float).tiny only inside the
geometric_mean and harmonic_mean branches so all-positive pivots
survive aggregation. arithmetic_mean, quadratic_mean, weighted_median
and softmax branches are untouched.
* fix(weights): log when out-of-range pivot indices are dropped
compute_label_weights silently filters out pivot indices outside
[0, n_values) via valid_mask. This made upstream contract violations
invisible: a stale or off-by-one index list would simply produce zero
training weight on those rows with no diagnostic.
Emit logger.warning with the count and dropped fraction whenever
n_dropped > 0 so the upstream caller can spot the issue.
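A minimal sketch of the diagnostic, assuming a standard logging.Logger. The helper name filter_pivot_indices and the exact message wording are hypothetical; only the count-plus-fraction content comes from the description above.

```python
import logging

def filter_pivot_indices(indices, n_values, logger):
    """Keep indices inside [0, n_values) and warn about the rest (illustrative)."""
    valid = [i for i in indices if 0 <= i < n_values]
    n_dropped = len(indices) - len(valid)
    if n_dropped > 0:
        logger.warning(
            "dropped %d/%d out-of-range pivot indices (%.1f%%)",
            n_dropped, len(indices), 100.0 * n_dropped / len(indices),
        )
    return valid

class _ListHandler(logging.Handler):
    """Collect formatted records so the warning can be inspected."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

logger = logging.getLogger("label_weighting_sketch")
handler = _ListHandler()
logger.addHandler(handler)
logger.propagate = False
kept = filter_pivot_indices([-1, 0, 5, 9, 10], n_values=10, logger=logger)
```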
* refactor(weights): collapse 4x label-config validators into a registry
Replace the four near-identical _validate_*_params + get_label_*_config
pairs with a single _LABEL_KIND_REGISTRY mapping each kind name to
(specs, defaults). _label_kind_validator builds the validator on the
fly and get_label_kind_config dispatches to _get_label_config with the
appropriate spec/default pair. The four public get_label_*_config
helpers remain as thin wrappers so existing callers in QuickAdapterV3
and QuickAdapterRegressorV3 are unaffected.
_LabelTransformerConfig.from_dict (LabelTransformer.py) is intentionally
out of scope: it would require propagating a logger through
BaseTransform's freqtrade-side interface, which is upstream-controlled.
* docs(weights): drop verbose empty-input note from sanitize_and_renormalize
The added sentence paraphrased the existing collapse line and restated
obvious facts about zero-length vectors without contractual information.
Revert to the concise pre-W1 docstring.
* docs(weights): drop get_label_kind_config docstring for consistency
The 4 sibling get_label_*_config wrappers have no docstrings; their
parameter names and the registry name self-document the contract. Drop
the redundant docstring on get_label_kind_config to match the family
style.
* fix(weights): drop tiny floor in geometric/harmonic aggregation
The floor at np.finfo(float).tiny added in b1f86a0 preserved pivots
whose metrics included an exact zero, but a zero metric is itself a
'signal absent' marker that downstream compose_sample_weights drops
via the (arr <= 0) mask. The floor was masking the intended drop.
Restore the upstream pmean behavior so a zero in any geometric or
harmonic input produces an exact 0.0 combined weight, allowing
drop_mask to drop the pivot as designed.
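The collapse that the restored behavior relies on can be shown with a hand-rolled power mean. This pmean is a stand-in written for illustration, not the upstream function; it assumes non-negative inputs.

```python
import math

def pmean(values, p):
    """Power mean of non-negative values; p=0 is the geometric mean (its limit)."""
    if p <= 0 and any(v == 0 for v in values):
        return 0.0  # a single zero collapses geometric (p=0) and harmonic (p=-1) means
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v**p for v in values) / n) ** (1.0 / p)

harmonic = pmean([1.0, 0.0, 3.0], p=-1)
geometric = pmean([1.0, 0.0, 3.0], p=0)
arithmetic = pmean([1.0, 0.0, 3.0], p=1)
```

Only the p <= 0 branches return an exact 0.0 here; the arithmetic mean of the same stack stays positive, so the (arr <= 0) mask drops the pivot only under geometric or harmonic aggregation.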
* docs(readme): document uniform label weighting strategy
* style(readme): re-align tunables table columns
feat(quickadapter): add soft off-pivot weighting (epsilon, gaussian) to label_weighting (#74)
* feat(quickadapter): add soft off-pivot weighting (epsilon, gaussian) to label_weighting
Adds three off-pivot weighting modes behind a new fill_method tunable in
freqai.label_weighting:
- zero (default): current hard-zero behavior, retained for backward
compatibility.
- epsilon: off-pivot rows receive a flat baseline
fill_epsilon * <baseline>(pivot_weights), where <baseline> is mean or
median, controlled by fill_epsilon_baseline.
- gaussian: off-pivot rows receive a per-row weight from a heatmap-style
decay max_p w_p * exp(-(i-p)^2 / (2 sigma^2)), controlled by
fill_sigma_candles (>= 0.5).
The default is zero so existing configs without the new keys behave
identically. Switching fill_method materially changes per-leaf weight
mass and may require GBM hyperparameter retuning; flagged in the README
description column.
Implementation:
- Adds FillMethod/FILL_METHODS and FillEpsilonBaseline/FILL_EPSILON_BASELINES
Literal types and tuples in LabelTransformer.py.
- Extends DEFAULTS_LABEL_WEIGHTING with the four new keys and their
defaults.
- Extends _WEIGHTING_SPECS in Utils.py with corresponding _EnumValidator
and _NumericValidator entries (epsilon in [0, 1], sigma_candles >= 0.5).
- Refactors _scatter_weights to accept fill_weights as a precomputed
array plus optional indices_array/valid_mask kwargs; preserves
pre-existing length-mismatch ValueError and empty-input early-return
semantics.
- Adds _gaussian_fill_weights helper with in-place pipeline keeping peak
memory at one (chunk, M) buffer; chunk-by-N keyed on
_GAUSSIAN_FILL_CHUNK_BUDGET = 50_000_000 cells (~400 MB peak); emits a
density warning when M / N > 0.1; rejects negative pivot weights.
- Adds *, logger: Logger keyword-only parameter to compute_label_weights
and updates the single call site in QuickAdapterV3.py.
- Replaces the raw nonzero count in compose_sample_weights with a
pivot-equivalent count helper (_pivot_equivalent_count) so the sparse
training mass warning stays meaningful under epsilon / gaussian.
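The chunk-by-N arithmetic can be sketched as follows. In pure Python there is no (chunk, M) buffer, so the chunk loop only illustrates the iteration structure; chunk_rows and gaussian_fill_chunked are hypothetical names, a single shared sigma is assumed, and the density warning is omitted.

```python
import math

def chunk_rows(n_pivots, budget_cells=50_000_000):
    """Rows per chunk so that one (chunk, M) float64 buffer stays within the budget."""
    return max(1, budget_cells // max(1, n_pivots))

def gaussian_fill_chunked(n, pivots, weights, sigma, budget_cells=50_000_000):
    """Per-row max over pivot bumps, evaluated one row-chunk at a time."""
    if any(w < 0 for w in weights):
        raise ValueError("negative pivot weights are not supported")
    out = [0.0] * n
    if not pivots:
        return out
    step = chunk_rows(len(pivots), budget_cells)
    for start in range(0, n, step):
        for i in range(start, min(n, start + step)):
            out[i] = max(w * math.exp(-((i - p) ** 2) / (2.0 * sigma**2))
                         for p, w in zip(pivots, weights))
    return out

rows = chunk_rows(n_pivots=1_000)   # rows per chunk for M = 1000
peak_bytes = rows * 1_000 * 8       # float64 cells in one (chunk, M) buffer
small = gaussian_fill_chunked(10, [2, 7], [1.0, 0.5], sigma=1.0, budget_cells=6)
whole = gaussian_fill_chunked(10, [2, 7], [1.0, 0.5], sigma=1.0)
```

For M = 1000 the budget yields 50,000 rows per chunk and a 400 MB buffer, and the result does not depend on the chunk size.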
Documentation:
- Four new rows added to the README configuration tunables table under
Label weighting; fill_method flagged as requiring trained-model
deletion when changed.
- Four new keys added to config-template.json under label_weighting.
Verified manually on host via AST extraction harness (no automated test
infrastructure exists in quickadapter/):
- STATIC_OK: defaults + tuples assertions pass.
- SPOTCHECK_4..9: cluster amplification (out[50] ~= 8.0), sigma < 0.5
rejected, negative pivot weights rejected, density warning emitted at
M/N=0.2, empty pivots return zeros, mean/median epsilon ratio = 20.8x.
- SPARSE_4A..C: sparse-mass warning fires under zero mode + sparse
pivots and gaussian sigma=0.5 underflow regime; silent under broad
gaussian fills.
* fix(quickadapter): harden sanitize_and_renormalize against rescale overflow and drop_mask contract violations
Two production-quality safeguards on the load-bearing primitive used by
the new fill_method dispatch in PR #74, plus one cosmetic comment
cleanup.
1. Subnormal-total rescale overflow guard:
When the sum of sanitized weights falls into a subnormal range
(e.g. a single 1e-310 survivor among zeros, n=1000), n/total
overflows to +Inf and safe * Inf propagates Inf to every nonzero
entry, producing mean(out) = NaN and silently violating the
documented mean=1 invariant. The fix computes the rescale factor
into a local c, checks np.isfinite(c), and falls through to the
existing uniform-fallback path with a distinct warning message
('rescale factor non-finite') so operators can distinguish this
regime from the existing 'weights collapsed' case. Bit-identical
on all common paths; c -> 0 underflow is unreachable
(min c = 1/DBL_MAX > 0).
2. drop_mask shape and dtype assertions:
sanitize_and_renormalize is now load-bearing for compose_sample_weights
under all three fill_method modes (zero/epsilon/gaussian). Numpy
broadcasts a (k, n)-shaped mask silently, breaking the (n,) output
contract. Shape and dtype precondition checks raise ValueError
early with prefixed messages matching the function's existing
logger style. Dtype check uses np.issubdtype(..., np.bool_) so
any boolean alias (bool, np.bool_, 'bool') is accepted; integer
masks are rejected.
3. LabelTransformer.py: replace 'current behavior' comment with
'default' on FILL_METHODS[0] since the comparison no longer makes
sense once the PR is merged.
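The overflow guard from item 1 can be sketched in pure Python. The function name rescale_to_mean_one, the returned reason string, and the all-ones uniform fallback are assumptions for this illustration; the drop_mask checks from item 2 are not covered.

```python
import math

def rescale_to_mean_one(safe):
    """Rescale non-negative weights to mean 1, guarding the n / total overflow."""
    n = len(safe)
    total = sum(safe)
    if n == 0:
        return [], "empty"
    if total <= 0.0:
        return [1.0] * n, "weights collapsed"          # uniform fallback
    c = n / total
    if not math.isfinite(c):
        return [1.0] * n, "rescale factor non-finite"  # subnormal total: c overflowed
    return [w * c for w in safe], "ok"

subnormal = [0.0] * 999 + [1e-310]
out, reason = rescale_to_mean_one(subnormal)
normal, normal_reason = rescale_to_mean_one([1.0, 3.0])
```

Without the isfinite check, the subnormal case would multiply zeros by +Inf and yield NaN entries; with it, the two fallback regimes stay distinguishable by their reason.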
Verified manually:
- REVIEW_FIX_1A..C: bool, np.bool_, 'bool' all accepted; int rejected.
- REVIEW_FIX_2A..B: subnormal-overflow path emits the new distinct
warning; real collapse path emits the original warning.
- REVIEW_FIX_3_OK: docstring contradiction removed.
- REGRESSION_OK: bit-identical common path.
- All original PR #74 verifications still pass (SPARSE_4A..C,
SPOTCHECK_4..9).
* fix(quickadapter): short-circuit compute_label_weights on empty pivot weights
When metrics[strategy] is empty but indices is non-empty, the new
fill_method dispatch in epsilon/gaussian arms slices weights[valid_mask]
before _scatter_weights can short-circuit, raising IndexError on a
size-0 / N-mask shape mismatch. Pre-PR _scatter_weights returned the
default-filled array silently in this case (preserved invariant noted
inline at the empty-input early return).
Add a short-circuit before the dispatch so the contract is consistent
across all three fill methods.
Also trim _gaussian_fill_weights docstring to match the codebase style
(neighboring private helpers carry no docstring or a single short
paragraph) and drop a redundant in-line comment that the in-place
np.multiply(out=buf) pattern already conveys.
Verified on the AST-extraction harness (pre-fix reproduction → fix
verification): 12 contract assertions across 4 edge cases x 3 fill
methods, plus crossmode + non-empty differentiation, all pass; PR #74
SPOTCHECK_4..9, SPARSE_4A..C, REVIEW_FIX_*, REGRESSION_OK still pass.
* chore(quickadapter): bump strategy and regressor version 3.11.10 -> 3.11.11
* refactor(quickadapter): polish label_weighting docs, comments, and sparse-mass diagnostic
Three coordinated polish edits following final-review feedback:
1. _pivot_equivalent_count: replace the 0.5 * median threshold with
_PIVOT_EQUIVALENT_MAX_FRACTION * surviving max (default 0.1). The
median-based heuristic saturated at N under epsilon mode (off-pivot
floor dominates the median once N >> M), silencing the warning the
docstring claimed to provide. The max-relative threshold separates
pivot-class rows from off-pivot fill across the bimodal regimes
fill_method introduces. Constant is module-level and named so the
choice is auditable; warning text now self-describes the threshold
('rows above 10% of surviving max').
2. _scatter_weights: trim the 'Order matters...' comment from 3 lines
to 1 line. The shorter form pins the intentional ordering without
paraphrasing git history; future 'validate inputs first' refactors
are still flagged.
3. README: extend the fill_method row with a concise retuning hint
(per-leaf regularization + Optuna study reset) so the operator
guidance surfaces in user docs, not only in the planning artefact.
Tighten fill_sigma_candles description to match neighboring-row
density.
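A sketch contrasting the two thresholds from item 1 on an epsilon-mode weight vector. Both helper names are hypothetical, and the example floor phi = 0.05 is chosen only to make the bimodal shape visible.

```python
from statistics import median

PIVOT_EQUIVALENT_MAX_FRACTION = 0.1  # mirrors the 10% default named above

def pivot_equivalent_count(weights, fraction=PIVOT_EQUIVALENT_MAX_FRACTION):
    """Count rows whose weight exceeds fraction * max over surviving (> 0) rows."""
    survivors = [w for w in weights if w > 0]
    if not survivors:
        return 0
    threshold = fraction * max(survivors)
    return sum(1 for w in survivors if w > threshold)

def median_based_count(weights):
    """The replaced heuristic: rows above 0.5 * median of the survivors."""
    survivors = [w for w in weights if w > 0]
    if not survivors:
        return 0
    threshold = 0.5 * median(survivors)
    return sum(1 for w in survivors if w > threshold)

# Epsilon-mode shape: M = 20 pivots at weight 1.0, N = 1000 rows, floor phi = 0.05.
n_rows, n_pivots, phi = 1000, 20, 0.05
weights = [1.0 if i % 50 == 0 else phi for i in range(n_rows)]
new_count = pivot_equivalent_count(weights)
old_count = median_based_count(weights)
```

The median-based count saturates at N = 1000 because the floor itself sets the median, whereas the max-relative count recovers the 20 pivot-class rows.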
Verified manually:
- SPARSE_4A..C: original PR cases still pass.
- SPARSE_4D: epsilon+sparse pivots (M=20, N=1000) now correctly fires
the sparse-mass warning (was silenced with median-based threshold).
- SPARSE_4E: zero+skewed pivots ([1,1,...,10]) still fire under the
new threshold (no regression on the skew case).
- SPOTCHECK_4..9, BUG_74_FIX_*, REVIEW_FIX_*, REGRESSION_OK: all
unchanged.
chore(reforcexy): rename reverse_test_train_order to reverse_train_test_order in config template
Align ReforceXY config template with the canonical tunable name already used in quickadapter config and QuickAdapterRegressorV3.
fix(weights): guard DI_values None and read label_frequency_candles from freqai_info
- fit_live_predictions: pred_df.get('DI_values') returns None when
feature_parameters.DI_threshold is 0 or absent (the default), causing
AttributeError on the subsequent .mean()/.std() calls. Fall back to
zeros instead.
- _label_frequency_candles: read from self.freqai_info['feature_parameters']
(matching every other access in the file) instead of self.config, which
is the top-level config dict and never contains feature_parameters
directly. The previous code silently ignored user-provided values and
always fell back to the default 'max(2, 2 * len(self.pairs))'.
feat(weights): per-label sample weights propagated to model.fit(sample_weight=...) (#72)
* chore(quickadapter): bump strategy and regressor version 3.11.8 → 3.11.9
* feat(weights): add compose_sample_weights helper with mean=1 multiplicative composition
AFML §4.10 / mlfinpy canonical: per-label mean=1 normalization, multiplicative composition with temporal decay, geometric-mean aggregation for multi-label, NaN/inf handling, all-zero degenerate fallback. Validated locally with pytest (evidence: .omo/evidence/task-5-{red,green}.txt).
* fix(weights): persist label weights into <label>_weight column instead of rescaling target
Removes statistically incorrect target rescaling (label = direction × weight). Persists raw direction labels and a separate <label>_weight column for downstream sample_weight composition. Validated locally with pytest (evidence: .omo/evidence/task-6-{red,green}.txt).
* feat(weights): add _strip_label_weight_columns helper for find_labels collision avoidance
* feat(weights): compose per-label weights with temporal decay before model.fit
* feat(weights): integrate sample_weight composition into both train() data split paths
Add _train_default() mirroring BaseRegressionModel.train() with _compose_train_weights inserted between make_train_test_datasets and _apply_pipelines. Routes train_test_split path through _train_default instead of super().train(). Inserts _compose_train_weights before _apply_pipelines in timeseries_split path. Calls _strip_label_weight_columns(dk) at top of train() for both branches. Validated locally with pytest + structural AST checks (evidence: .omo/evidence/task-9-{pytest,structural}.txt).
* refactor(weights): align _train with BaseRegressionModel.train
Rename _train_default to _train to match the upstream method name (with
underscore prefix to mark it as the internal mirror, since the public
train() method routes between data split paths).
Mirror BaseRegressionModel.train line-for-line with _compose_train_weights
as the single intentional insertion between make_train_test_datasets and
the pipeline application:
- Drop ensure_datetime_series wrapper around unfiltered_df['date']:
upstream calls .iloc[].strftime() directly.
- Drop **kwargs from self.fit(dd, dk) to match upstream signature.
- Use dk.data_dictionary['train_features'].columns for feature count log,
matching upstream source of truth.
- Apply the same cosmetic alignment to the timeseries_split path for
consistency between both train code paths.
- Add docstring documenting the mirror relationship and the single
functional difference.
* refactor(weights): drop dead apply_label_weighting wrapper
Now that QuickAdapterV3.set_freqai_targets persists raw label direction
into the label column and weights into <label>_weight (consumed by
sample_weight downstream), the apply_label_weighting wrapper that
multiplied label values by their weights is no longer used.
- Drop Utils.apply_label_weighting (returned (weighted_label, weights)).
- Drop Utils._apply_label_weights (the values × weights helper).
- Switch QuickAdapterV3.set_freqai_targets to call compute_label_weights
directly (already used internally by the removed wrapper).
* refactor(weights): simplify MAXIMA/MINIMA plot columns for binary direction
Now that the label column holds raw direction in {-1, 0, +1}, the MAXIMA
and MINIMA plot columns reduce to a where(direction>0, 0.0) /
where(direction<0, 0.0) projection.
The previous magnitude-aware logic (plot_eps padding, mask of zero values
on positive direction, etc.) was tailored to the weighted label
amplitudes and is now dead code:
- extrema.abs().where(extrema.ne(0.0)).min() always evaluates to 1.0,
so plot_eps is always max(0.5, _PLOT_EXTREMA_MIN_EPS) = 0.5.
- direction.gt(0) & extrema.eq(0.0) is always False because direction>0
implies extrema = +1, never 0. The .mask() branches never trigger.
Drop the now-unused _PLOT_EXTREMA_MIN_EPS class constant.
* refactor(weights): rename smooth_label to smooth for genericity
The function applies generic smoothing kernels (gaussian, kaiser, triang,
smm, sma, savgol, gaussian_filter1d) to any pd.Series. The 'label'
suffix narrowed it to label-specific use, but it now also smooths the
<label>_weight column (next commit). Drop the suffix; the smoothing
config dict is still named label_smoothing because the per-column
config map remains label-keyed.
* feat(weights): smooth <label>_weight column with the same kernel as label
Per-label weights are pointwise: only pivot indices carry the
metric-derived weight, while non-pivot indices are filled with the
median weight (see compute_label_weights / _build_weights_array).
The label column is smoothed with smooth() to spread pivot signals
over neighbouring candles. Without smoothing the weight column, the
smoothed label values around a pivot keep the constant median weight,
so the model treats high-amplitude pivot neighbours and the pivot
itself as equally important during training.
Apply smooth() to the <label>_weight column with the exact same
per-column smoothing config as the label, so the weight profile
follows the label profile candle-for-candle.
The smooth() positional argument list was redundant with the
col_smoothing_config dict keys; collapse both calls to **kwargs
unpacking. _SMOOTHING_SPECS keys exactly match smooth() parameter
names, so the unpack is type-safe.
compose_sample_weights already replaces non-finite or non-positive
values with 1.0, which absorbs any sign overshoot from kernels like
savgol at series edges.
* feat(weights): add direction and weight subplots showing raw + smoothed signals
Replace the MAXIMA/MINIMA bar visualization with two new subplots that
show both the raw and the smoothed direction/weight curves:
- Subplot 'direction' overlays raw direction (extrema_direction) and
smoothed direction (smoothed_extrema).
- Subplot 'weight' overlays raw weight (extrema_weight) and smoothed
weight (smoothed_extrema_weight).
Each visualization column is captured at the right point in
set_freqai_targets:
- Raw columns (EXTREMA_DIRECTION_COLUMN, EXTREMA_WEIGHT_COLUMN) are
written before the smooth() call.
- Smoothed columns (SMOOTHED_EXTREMA_COLUMN, SMOOTHED_EXTREMA_WEIGHT_COLUMN)
are written after the smooth() call.
Drop MAXIMA_COLUMN and MINIMA_COLUMN constants — they were only used by
the old min_max bar subplot. The new subplots convey the same direction
information plus the per-pivot weight magnitude that the legacy
weighted_label visualization showed (before the sample-weight refactor).
All four visualization column names lack the '&' prefix, so FreqAI's
find_labels auto-detection ignores them; they cannot leak into model
targets.
* refactor(weights): align visualization column names on extrema_<axis>[_smoothed]
Three-agent audit (explore + librarian + oracle) found the previous viz
column names suffered from three inconsistencies:
- 'extrema_direction' (raw) carries the axis word, but 'smoothed_extrema'
drops it, breaking the 2x2 grid (axis × stage).
- Stage qualifier appears as PREFIX ('smoothed_extrema') for one column
and as SUFFIX ('smoothed_extrema_weight') for another.
- 'smoothed_extrema_weight' mixes prefix stage word with suffix axis word.
Production codebases (statsmodels Kalman, bukosabino/ta MACD, mlflow
NPMI, FreqAI's own _mean/_std) overwhelmingly use suffix-decorated
processed forms with the raw form as the plain base. FreqAI's internal
pattern is suffix (&s-extrema_weight, &s-extrema_mean, &s-extrema_std);
align with it.
Rename:
- SMOOTHED_EXTREMA_COLUMN ('smoothed_extrema')
-> EXTREMA_DIRECTION_SMOOTHED_COLUMN ('extrema_direction_smoothed')
- SMOOTHED_EXTREMA_WEIGHT_COLUMN ('smoothed_extrema_weight')
-> EXTREMA_WEIGHT_SMOOTHED_COLUMN ('extrema_weight_smoothed')
Result: every viz column follows extrema_<axis>[_smoothed]. The 2x2
grid is uniform, sort-order groups raw and smoothed pairs together,
and the pattern is internally consistent with FreqAI's existing
suffix-based derivations.
* fix(weights): address PR #72 review comments
Three-agent cross-validation (explore + librarian + oracle) of Copilot
review comments produced these verdicts:
C1 (Utils.py:compose_sample_weights) — REAL BUG. Replacing 0-valued
weights with 1.0 silently undoes sklearn / AFML §4.10's canonical 'drop
this sample' semantic. sklearn's _check_sample_weight_equivalence,
DecisionTree _splitter.pyx, HistGBM docs, LightGBM #5553/#905, XGBoost
#3787 and mlfinlab time-decay all converge on the same contract:
sample_weight=0 means 'this sample contributes nothing'. Preserve zeros
via a drop_mask that is OR'd across labels (any label saying 'drop'
wins), then re-applied after the geometric-mean composition. Non-zero
non-finite or negative values still collapse to 1.0 (geometric mean's
neutral element) since they represent undefined weights, not exclusions.
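The C1 contract can be sketched as below. This covers only the drop mask, the neutral-element substitution, and the geometric mean; mean=1 normalization and temporal decay are left out, and combine_label_weights is a hypothetical name rather than the real helper.

```python
import math

def combine_label_weights(per_label):
    """Geometric mean across labels; zeros are exclusions, bad values are neutral."""
    columns = list(per_label.values())
    n = len(columns[0])
    out = []
    for i in range(n):
        row = [col[i] for col in columns]
        if any(w == 0 for w in row):
            out.append(0.0)  # drop_mask: OR across labels, re-applied after the mean
            continue
        # Non-finite or negative weights are undefined, not exclusions: use 1.0.
        cleaned = [w if math.isfinite(w) and w > 0 else 1.0 for w in row]
        out.append(math.exp(sum(math.log(w) for w in cleaned) / len(cleaned)))
    return out

combined = combine_label_weights({
    "label_a": [4.0, 0.0, float("nan"), 2.0],
    "label_b": [1.0, 3.0, 4.0, -1.0],
})
```

Row 1 stays at exactly 0.0 because one label asked for a drop, while the NaN and negative entries in rows 2 and 3 are treated as 1.0 and do not exclude the sample.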
C3/C4 (QuickAdapterRegressorV3.py:_train, timeseries_split) — REAL
REGRESSION. Commit 9953f0c removed ensure_datetime_series with the
rationale 'mirror BaseRegressionModel.train exactly'. But
ensure_datetime_series was introduced in commit ce843f9 specifically
as a workaround for freqtrade issue #13107 (int64 epoch-ms date
columns from feather/parquet handlers). Mirror the algorithm, retain
project-specific safety patches. Restore ensure_datetime_series in
both train paths.
C2/C7 (_label_weight_column_name unused) — DRY violation. Both call
sites in _compose_train_weights now use the helper instead of inline
f-strings.
C8 (train() docstring) — Inaccurate. The default path was claimed to
'Delegate to BaseRegressionModel.train()' but actually routes to
self._train() (a mirror with weight composition). Fix docstring to
reflect actual control flow.
C5/C6 (**kwargs forwarding) — FALSE POSITIVE. freqai_interface never
passes kwargs to model.train(); upstream BaseRegressionModel.train
also calls self.fit(dd, dk) without kwargs. The current code matches
upstream and the call chain is dead in practice.
* refactor(weights): factor train paths and relocate methods for structural coherence
Three-agent structural audit (explore + librarian + oracle) identified
five issues; fixes that don't fight existing conventions:
1. _train and the timeseries_split inline branch in train() shared
~30 lines of identical scaffolding (filter_features, dates logging,
fit_labels guard, weight composition, pipeline application, fit,
timing logs). Extract _train_common(unfiltered_df, pair, dk, split_fn)
that owns the full mirror; _train_default and _train_timeseries_split
become 4-line dispatchers passing the split callback. train() routing
collapses to a clean two-line if/elif.
2. _label_weight_column_name, _strip_label_weight_columns and
_compose_train_weights were inserted into the middle of the class
constants block (between _TEST_SIZE and _SQRT_2), interrupting the
constant-block coherence. Move them to the private instance method
zone, immediately after _apply_pipelines (their natural neighbour).
3. _compose_train_weights duplicated the train/test weight extraction
loop verbatim. Factor into a static _extract_split_weights helper
that takes a split index and returns the per-label weight map; both
train and test call sites become single expressions.
4. The four visualization column constants (EXTREMA_DIRECTION_COLUMN,
EXTREMA_DIRECTION_SMOOTHED_COLUMN, EXTREMA_WEIGHT_COLUMN,
EXTREMA_WEIGHT_SMOOTHED_COLUMN) were 75 lines below EXTREMA_COLUMN /
LABEL_COLUMNS, separated by the LabelData dataclass and label
generator registry. Move them next to EXTREMA_COLUMN where they
logically belong.
5. Revert the train() docstring bullet to its upstream form. The
modification introduced RST cross-reference syntax inconsistent with
the surrounding plain-text docstring style.
* refactor(weights): centralize LABEL_WEIGHT_SUFFIX in Utils
The "_weight" suffix was duplicated as a class constant in
QuickAdapterRegressorV3 and as 5 hardcoded f-strings in QuickAdapterV3.
Three-agent audit (explore + librarian + oracle) converged on moving
this column-naming convention to Utils.py:
- PEP 8 default for constants is module-level; class-level is the
exception for class-private semantics.
- Both consumers already import column-naming constants from Utils.py
(LABEL_COLUMNS, EXTREMA_COLUMN, EXTREMA_WEIGHT_COLUMN, etc.). The
suffix belongs with them.
- Production precedents (sklearn UNUSED/WARN/UNCHANGED, mlflow
_SAMPLE_WEIGHT/_TRAINING_PREFIX, lightgbm _DatasetNames, pandas
LOCAL_TAG) all place cross-module string tokens at module level.
- The constant describes a dataframe schema contract (column names),
not model behaviour. Schema concerns belong in the schema module.
Add LABEL_WEIGHT_SUFFIX to Utils.py next to LABEL_COLUMNS. Remove the
class-level _LABEL_WEIGHT_SUFFIX in QuickAdapterRegressorV3 and import
the module-level constant. Replace 5 hardcoded f-strings in
QuickAdapterV3 with a local label_weight_col binding using the
imported constant.
The private _label_weight_column_name helper is kept in the regressor
since it is used twice (in _strip_label_weight_columns and
_compose_train_weights) and still adds a thin DRY layer over the
suffix synthesis.
* style: apply ruff formatting
* feat(weights): add label_weight_column helper with regex prefix strip and collision assertion
* fix(weights): preserve drop_mask and prevent NaN in compose_sample_weights fallback
* refactor(weights): adopt label_weight_column helper for canonical training column
* feat(weights): add _build_per_row_weights helper for pre-split weight composition
* feat(weights): add _make_default_split_datasets mirror with sklearn-key whitelist
* refactor(weights): make timeseries split helper accept external weights parameter
* refactor(weights): refactor _train_common chain and delete obsolete weight helpers
* refactor(weights): replace train() if/elif with dispatch dict and add weight-column uniqueness check
* fix(weights): harden compose_sample_weights for degenerate inputs
Address audit findings A0-1, A0-2, A0-3, A0-4, P2 #6, P2 #9, P2 #11
on branch feat/per-label-sample-weights.
- Extract _sanitize_and_renormalize private helper with four-guard chain
(positive sum, finite sum, finite ratio, finite scaled) and uniform
fallback. Used at empty-map fast path and at final fallback site.
- Empty label_weights_map now sanitizes raw temporal (A0-3).
- Subnormal temporal no longer overflows: ratio + scaled both checked
for finiteness before returning (A0-2).
- Drop predicate unified as 'arr <= 0 or non-finite' instead of exact
zero, eliminating the discontinuity at zero from smoothing artifacts
(A0-4); negatives now drop, no longer rescued to 1.0 (subsumes A1-10).
- Surviving positive values floored at np.finfo(float).tiny to prevent
subnormal arithmetic in the geo-mean log step.
- drop_mask covering all rows now raises ValueError instead of silently
returning all-zero weights that crash XGBoost / sklearn HGBR (A0-1).
- Up-front per-label shape validation raises a precise error instead of
letting numpy broadcasting fail mid-computation (P2 #11).
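
The unified drop predicate, the subnormal floor and the all-dropped guard can be sketched as follows. This is an illustration under assumptions: the helper name and return shape are hypothetical and the real compose_sample_weights does more than this:

```python
import numpy as np


def drop_mask_and_floor(label_values):
    """Hypothetical helper: unified drop predicate plus subnormal floor."""
    arr = np.asarray(label_values, dtype=float)

    # Drop anything non-finite or <= 0 (not only exact zeros).
    drop = ~np.isfinite(arr) | (arr <= 0.0)
    if drop.all():
        # All-zero weights would crash the downstream regressor; fail early.
        raise ValueError("drop_mask covers every row; no sample survives")

    # Neutralize dropped entries first so the floor never sees NaN.
    clean = np.where(drop, 1.0, arr)
    # Floor survivors at the smallest normal float to keep log() well-behaved.
    safe = np.maximum(clean, np.finfo(float).tiny)
    return drop, safe


drop, safe = drop_mask_and_floor([1.0, 0.0, -2.0, np.nan, 5e-324])
```

The last input is a positive subnormal: it survives the predicate and is lifted to np.finfo(float).tiny.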
* docs(weights): document compose_sample_weights contract
Address audit findings A1-2, P2 #12, A1-12 on branch
feat/per-label-sample-weights.
Add 11-line docstring covering: output invariant (mean=1), per-label
sanitization predicate, aggregation operator, drop semantics, error
conditions, and the bounded full-series-median leakage in
compute_label_weights.
* refactor(weights): inline data-split dispatch with match/case
Address audit findings A1-1, A1-7, A1-6 (obsolete), P2 #14 on branch
feat/per-label-sample-weights. Conflict C1: A1-7 wins over A1-3
(YAGNI on subclass extension; LSP traceability preserved).
- Delete _DATA_SPLIT_DISPATCH class attribute (4 LOC).
- Delete _data_split_methods_set lru_cached helper (4 LOC dead code).
- Delete _train_default and _train_timeseries_split wrappers (~30 LOC
pure boilerplate dispatching to _train_common).
- Inline match/case on method name in train(); pick the right
_make_*_split_datasets via a local split_builder; nested split_fn
closes over dk. Net -37 LOC; full LSP traceability.
- Add SplitFn module-level type alias used in _train_common signature.
* fix(weights): reject bool config values in numeric validators
Address audit findings A1-4, P2 #5, P2 #7 on branch
feat/per-label-sample-weights.
Python's `bool` is an `int` subclass, so `isinstance(True, int)` is
true and config values of `true`/`false` silently passed through as
`1`/`0` in the validators for n_splits, gap, max_train_size,
test_size and weight_factor. This commit closes that footgun:
- Add static helpers _coerce_int (always returns int, raises on bool
or non-int) and _coerce_optional_int (returns Optional[int]) to
centralize the validation; both echo the offending raw value via
`{value!r}` so the diagnostic shows True/False rather than 1/0.
- Apply _coerce_int to n_splits and gap, _coerce_optional_int to
max_train_size in _make_timeseries_split_datasets.
- Add explicit bool guard for test_size in both default-split and
timeseries-split paths; previously test_size=true would slip past
isinstance(_, int) and silently train on 1 sample.
- Add explicit bool guard for weight_factor before the >0 comparison.
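
The bool guard can be sketched as below. The helper names mirror the ones above without the leading underscore; the name argument and the error wording are assumptions, not the actual implementation:

```python
from typing import Any, Optional


def coerce_int(name: str, value: Any) -> int:
    """Hypothetical sketch: accept real ints only, never bools."""
    # bool must be tested first because isinstance(True, int) is True.
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError(f"{name} must be an int, got {value!r}")
    return value


def coerce_optional_int(name: str, value: Any) -> Optional[int]:
    """Same check, but None passes through unchanged."""
    if value is None:
        return None
    return coerce_int(name, value)


def error_message_for(value: Any) -> str:
    """Small demo helper returning the diagnostic text for a bad value."""
    try:
        coerce_int("n_splits", value)
    except ValueError as exc:
        return str(exc)
    return ""
```

With this shape, error_message_for(True) mentions True rather than 1, which is what makes the diagnostic traceable to the config value.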
* fix(weights): preserve mean=1 invariant across pipeline stages
Address audit finding A1-5 on branch feat/per-label-sample-weights.
compose_sample_weights guarantees sum(w)==N (mean(w)==1) over the full
training series, but this invariant breaks twice downstream: (1) the
train/test split partitions weights into disjoint subsets whose means
no longer equal 1, and (2) feature_pipeline.fit_transform may drop
rows via SVM/DBSCAN, drifting the means further. XGBoost
min_child_weight, LightGBM min_sum_hessian_in_leaf and L2
regularization are all sensitive to absolute weight scale.
- Add static helper _renormalize_to_unit_mean with the same four-guard
chain as _sanitize_and_renormalize (positive sum, finite sum, finite
ratio, finite scaled, uniform fallback).
- Apply at four sites: before dk.build_data_dictionary in both
_make_default_split_datasets and _make_timeseries_split_datasets,
and after feature_pipeline.fit_transform / .transform in
_apply_pipelines (train and test sides).
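
Why the invariant breaks on a split, and what the renormalization step restores, shown on a toy vector. The helper below is a simplified stand-in for _renormalize_to_unit_mean and its guards are an approximation of the chain described above:

```python
import numpy as np


def renormalize_to_unit_mean(weights):
    """Simplified stand-in: rescale so that mean(w) == 1, else fall back."""
    w = np.asarray(weights, dtype=float)
    if w.size == 0:
        return w
    total = w.sum()
    if not np.isfinite(total) or total <= 0.0:
        return np.ones_like(w)  # uniform fallback
    scaled = w * (w.size / total)
    if not np.all(np.isfinite(scaled)):
        return np.ones_like(w)  # uniform fallback
    return scaled


full = np.array([0.5, 0.5, 1.0, 2.0])   # mean == 1 over the full series
train, test = full[:3], full[3:]        # subset means are 2/3 and 2
train_fixed = renormalize_to_unit_mean(train)
test_fixed = renormalize_to_unit_mean(test)
```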
* feat(weights): make label-weights aggregation configurable
Address audit finding A1-11 on branch feat/per-label-sample-weights.
Conflict C3: switch default to arithmetic_mean (matching
_compute_combined_label_weights), expose all 6 aggregations via
existing _aggregate_metrics infrastructure.
The hardcoded geometric mean over per-label normalized arrays was
mathematically conservative (one weak label dominates) and inconsistent
with the project's _compute_combined_label_weights default
(arithmetic_mean). For PR #44's correlated multi-target labels
(amplitude, time_to_pivot, efficiency, natr all derived from zigzag),
geomean over-counts redundant evidence and silently degrades to ~0
when any single factor is small. AFML §4.4 recommends arithmetic-mean
equivalents for correlated meta-labels.
- Add aggregation parameter to compose_sample_weights with default
COMBINED_AGGREGATIONS[0] ("arithmetic_mean"); also expose
softmax_temperature.
- Delegate the row-wise aggregation step to _aggregate_metrics, reusing
the existing 6-operator infrastructure with uniform unit coefficients.
- Read both knobs from feature_parameters in _build_per_row_weights:
label_weights_aggregation and label_weights_softmax_temperature.
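
The degradation argument can be checked on a two-label toy example. This uses plain numpy rather than _aggregate_metrics, and the numbers are made up for illustration:

```python
import numpy as np

# Two labels, three rows; the second label is tiny on the middle row only.
per_label = np.array(
    [
        [1.0, 1.0, 1.0],
        [1.0, 1e-6, 1.0],
    ]
)

arithmetic = per_label.mean(axis=0)
geometric = np.exp(np.log(per_label).mean(axis=0))
```

On the middle row the arithmetic mean stays near 0.5 while the geometric mean falls to sqrt(1e-6) = 1e-3, so a single weak label nearly removes the sample.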
* refactor(weights): align naming on compose/sigil/base_weights
Address audit findings A1-9, A1-14, A1-15, A1-16 on branch
feat/per-label-sample-weights.
- Rename _LABEL_WEIGHT_PREFIX_PATTERN to _FREQAI_LABEL_SIGIL_PATTERN:
the regex strips the freqtrade-native '&' sigil, not a label-weight
prefix; the new name describes what is matched (Utils.py).
- Rename compose_sample_weights parameter `temporal` to `base_weights`:
the parameter accepts any base vector (recency weights or uniform
ones), not exclusively temporal data (Utils.py).
- Rename _build_per_row_weights to _compose_per_row_weights:
standardize on the 'compose' verb to mirror compose_sample_weights;
this helper is the orchestrator that calls the kernel
(QuickAdapterRegressorV3.py).
- Rename _build_weights_array to _scatter_weights: the function
scatters sparse pivot weights into a dense default-filled array,
not a generic 'build' (Utils.py).
- Rename eval_set_and_weights to make_test_set_and_weights: aligns
with FreqAI's 'test_*' data_dictionary vocabulary while avoiding
the 'test_' prefix that pytest auto-discovers (the verb 'make_'
also clarifies it as a constructor, not a test) (Utils.py + caller).
* refactor(weights): extract _shuffle_in_unison helper
Address audit findings A1-8 and A1-13 on branch
feat/per-label-sample-weights.
- Extract the train/test shuffle pattern into a static
_shuffle_in_unison helper. Each call shuffles features, labels and
weights with the same random seed in lockstep. The shuffle block in
_make_default_split_datasets shrinks from ~28 lines (two duplicated
5-line idioms x train+test) to two helper invocations.
- Fix the dk.data_dictionary vs dd inconsistency at the feature-count
log line: read from local dd (the pipeline's return value) rather
than dk.data_dictionary (a side-effect set by _apply_pipelines).
* chore(weights): polish naming, validators, caching
Address audit P2 polish items #2, #4, #8, #13, #15 on branch
feat/per-label-sample-weights. P2 #21 and #22 (logger telemetry)
deliberately skipped per the no-new-comments/no-new-infrastructure
constraint; the existing fail-fast ValueError on degenerate inputs
already surfaces the most critical failure modes loudly.
- Counter-based duplicate-label diagnostic now names the offending
weight columns instead of merely raising on a length mismatch
(P2 #2).
- Widen shuffle seed space from random.randint(0, 100) (101 distinct
seeds, birthday collisions at sqrt(101) ~ 10) to randint(0, 2**31-1)
at both _shuffle_in_unison call sites (P2 #4).
- dsp = dict(self.config['freqai']['data_split_parameters']) replaced
with self.data_split_parameters (the safe pre-populated FreqAI
attribute used everywhere else in the file) (P2 #8).
- Cache label_weight_column with @lru_cache(maxsize=16): the helper
is pure on its single string argument and called in tight loops at
training; matches the file's existing convention for similar helpers
(P2 #13).
- Rename loop variable w to label_values in compose_sample_weights;
the outer scope spans ~20 lines and prior single-letter w obscured
the role (P2 #15).
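
A sketch of the cached column-name helper referenced above. The regex, the example label and the resulting column name are assumptions; only the '&' sigil, the "_weight" suffix and the lru_cache(maxsize=16) decoration come from this log:

```python
import re
from functools import lru_cache

LABEL_WEIGHT_SUFFIX = "_weight"
# Assumed pattern: strip leading '&' sigil characters only.
_SIGIL_PATTERN = re.compile(r"^&+")


@lru_cache(maxsize=16)
def label_weight_column_sketch(label: str) -> str:
    """Hypothetical: map a label column name to its weight column name."""
    return f"{_SIGIL_PATTERN.sub('', label)}{LABEL_WEIGHT_SUFFIX}"


first = label_weight_column_sketch("&s-extrema")
second = label_weight_column_sketch("&s-extrema")  # served from the cache
```

Caching is safe here because the function is pure on a single hashable string argument.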
* docs(weights): align train() and helper docstrings with current behavior
- Rewrite train() docstring to describe match-based dispatch and the per-row
weight composition flow through _train_common; remove stale delegation claim.
- Sync _compose_per_row_weights docstring: aggregation default is
arithmetic_mean, not geometric_mean.
- Fix AFML citation in compose_sample_weights from section 7.4 to chapter 4.
- Document _aggregate_metrics softmax branch as a per-column convex
combination with explicit T->0 and T->+inf limits.
* refactor(weights): consolidate sample weight renormalization helper
- Promote Utils._sanitize_and_renormalize to public sanitize_and_renormalize.
- Drop QuickAdapterRegressorV3._renormalize_to_unit_mean (cross-file
duplication of the mean=1 invariant); replace 6 call sites with the
unified helper. Sites that previously skipped per-element sanitization
now also reject non-finite or non-positive entries.
- Collapse triple-guard ladder in sanitize_and_renormalize to a single
finite-positive total check; surviving 'safe' is provably finite-nonneg
so intermediate isfinite checks were dead code.
- Tighten _shuffle_in_unison signature from Any to concrete pd.DataFrame
and NDArray types.
- Drop empty-fold sentinel in _make_timeseries_split_datasets and raise
ValueError on degenerate generator output instead of silently producing
empty index arrays.
* feat(weights): validate label_weights tunables via label_weighting block
- _compose_per_row_weights now consumes get_label_weighting_config (which
validates aggregation against COMBINED_AGGREGATIONS and enforces
softmax_temperature > 0 via _WEIGHTING_SPECS) instead of reading raw
feature_parameters.label_weights_*.
- Add CONFIG_MIGRATIONS entries auto-migrating
freqai.feature_parameters.label_weights_aggregation and
freqai.feature_parameters.label_weights_softmax_temperature to the
freqai.label_weighting block; users get one warning per key.
- Add module-level _logger to Utils.py and warn on
compose_sample_weights silent fallback so collapsed-aggregation paths
are observable.
- _make_timeseries_split_datasets honors reverse_train_test_order for
parity with _make_default_split_datasets and raises ValueError on
shuffle_after_split=True (chronological + shuffle is incoherent and
would leak future data into training).
* fix(weights): seed shuffle deterministically from data_split_parameters.random_state
Replace global random.randint() with a random.Random instance derived from
data_split_parameters.random_state. When the user provides a random_state
(whitelisted in _SKLEARN_TRAIN_TEST_SPLIT_KEYS), train and test shuffles
become reproducible end-to-end; when absent, behavior remains
non-deterministic. The single parent RNG draws two independent sub-seeds
so train and test shuffles stay decorrelated.
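
A minimal sketch of the seeding scheme, assuming a permutation-based shuffle. The helper names and the use of numpy's default_rng for the permutation are illustrative choices, not the actual _shuffle_in_unison body:

```python
import random

import numpy as np
import pandas as pd


def draw_sub_seeds(random_state=None):
    """One parent RNG yields two sub-seeds (train, test)."""
    # random.Random(None) seeds from the OS, so behavior stays
    # non-deterministic when no random_state is configured.
    parent = random.Random(random_state)
    return parent.randint(0, 2**31 - 1), parent.randint(0, 2**31 - 1)


def shuffle_in_unison_sketch(features, labels, weights, seed):
    """Apply one permutation to features, labels and weights in lockstep."""
    perm = np.random.default_rng(seed).permutation(len(features))
    return (
        features.iloc[perm].reset_index(drop=True),
        labels.iloc[perm].reset_index(drop=True),
        weights[perm],
    )


features = pd.DataFrame({"x": np.arange(8)})
labels = pd.DataFrame({"y": np.arange(8) * 10})
weights = np.arange(8) / 10.0

train_seed, test_seed = draw_sub_seeds(42)
f, l, w = shuffle_in_unison_sketch(features, labels, weights, train_seed)
```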
* refactor(weights): rename for naming coherence
- Rename _make_default_split_datasets to _make_train_test_split_datasets
so that each case key in train()'s match dispatch greps directly to
its method name.
- Rename label_weight_column to label_weight_column_name across Utils.py,
QuickAdapterV3.py and QuickAdapterRegressorV3.py: the helper returns
a column-name string, not a column accessor; the new name matches the
*_COLUMN constant convention used elsewhere in Utils.py.
- Drop redundant 'dk.data_dictionary = dd' in _apply_pipelines:
build_data_dictionary already self-assigns the dict on dk and dd is
the same object reference.
* style(weights): group EXTREMA_* constants and separate LABEL_* declarations
* docs(weights): rewrite _make_train_test_split_datasets docstring without history narration
Replace the deviation list and PR-history reference with a concise
description of what the function IS (sklearn-key whitelist, honored
tunables, weight propagation contract).
* fix(weights): resolve KeyError in label_weighting config consumption
_compose_per_row_weights passed self.config (root) to
get_label_weighting_config and accessed weighting_config['aggregation']
directly; the helper expects freqai.label_weighting and returns
{default, columns}. Fix consumes self.freqai_info['label_weighting']
and reads ['default']['aggregation'] / ['default']['softmax_temperature'].
* fix(weights): honor drop_mask in sanitize_and_renormalize fallback
When total <= 0 or non-finite, the helper returned np.ones_like(arr)
ignoring drop_mask, resurrecting dropped rows with weight=1. Fallback
now zeros drop_mask rows before returning.
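
The fallback fix, sketched on a simplified version of the helper. The signature and the normalization over the full array length are assumptions; only the zeroing of drop_mask rows in the fallback is taken from the description above:

```python
import numpy as np


def sanitize_and_renormalize_sketch(arr, drop_mask=None):
    """Simplified stand-in for sanitize_and_renormalize."""
    arr = np.asarray(arr, dtype=float)
    mask = None if drop_mask is None else np.asarray(drop_mask, dtype=bool)

    # Non-finite or non-positive entries carry no weight.
    safe = np.where(np.isfinite(arr) & (arr > 0.0), arr, 0.0)
    if mask is not None:
        safe = np.where(mask, 0.0, safe)

    total = safe.sum()
    if not np.isfinite(total) or total <= 0.0:
        # Uniform fallback, but dropped rows must stay dropped.
        fallback = np.ones_like(arr)
        if mask is not None:
            fallback[mask] = 0.0
        return fallback
    return safe * (arr.size / total)


degenerate = sanitize_and_renormalize_sketch(
    [np.nan, -1.0, 0.0], drop_mask=[True, False, False]
)
normal = sanitize_and_renormalize_sketch([1.0, 3.0])
```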
* fix(config): rename reverse_test_train_order to reverse_train_test_order
Match the canonical key name from upstream freqtrade and from the code
in QuickAdapterRegressorV3._make_train_test_split_datasets and
_make_timeseries_split_datasets. The previous template key was silently
ignored.
* fix(strategy): clip smoothed weight column to non-negative finite
Some smoothing methods (savgol, filtfilt) can ring negative on positive
input. Clip the smoothed weight series to >= 0 and replace non-finite
with 0 before assigning to the dataframe so compose_sample_weights does
not silently drop rows that were positive before smoothing.
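
The clipping step amounts to something like the following. The function name is hypothetical and the strategy operates on a dataframe column rather than a bare array:

```python
import numpy as np


def clip_smoothed_weights(smoothed):
    """Hypothetical helper: non-finite -> 0, then clip negatives to 0."""
    arr = np.asarray(smoothed, dtype=float)
    arr = np.where(np.isfinite(arr), arr, 0.0)
    return np.clip(arr, 0.0, None)


clipped = clip_smoothed_weights([-0.1, 0.5, np.nan, np.inf])
```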
* fix(weights): apply project _TEST_SIZE default when data_split_parameters omits test_size
The whitelist comprehension that builds sklearn_kwargs only preserved keys
present in data_split_parameters; the local test_size variable computed via
dsp.get(..., _TEST_SIZE) was never injected back. Configs without an explicit
test_size silently fell through to sklearn's stock 0.25 default instead of
the project's 0.1.
Replace bare 'shuffle' insertion with setdefault for both shuffle and
test_size so sklearn_kwargs always carries the project defaults.
Update _make_train_test_split_datasets docstring to reflect the actual
default behavior.
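
The whitelist-then-setdefault pattern can be sketched as below. The key set is sklearn's train_test_split keyword list; the shuffle=False default and the function name are assumptions, while the 0.1 test_size default is the project value named above:

```python
_TEST_SIZE = 0.1
# Keyword arguments accepted by sklearn.model_selection.train_test_split.
_SKLEARN_TRAIN_TEST_SPLIT_KEYS = frozenset(
    {"test_size", "train_size", "random_state", "shuffle", "stratify"}
)


def build_sklearn_kwargs(data_split_parameters):
    """Hypothetical helper: whitelist keys, then inject project defaults."""
    kwargs = {
        key: value
        for key, value in data_split_parameters.items()
        if key in _SKLEARN_TRAIN_TEST_SPLIT_KEYS
    }
    # setdefault keeps explicit user values and fills only missing keys.
    kwargs.setdefault("shuffle", False)  # assumed default
    kwargs.setdefault("test_size", _TEST_SIZE)
    return kwargs


without_test_size = build_sklearn_kwargs({"n_splits": 5})
with_test_size = build_sklearn_kwargs({"test_size": 0.3, "shuffle": True})
```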
* refactor(strategy): rename smoothed_weights to smoothed_label_weights
Aligns naming with surrounding label_weights variable, EXTREMA_WEIGHT_SMOOTHED_COLUMN
constant and compute_label_weights helper.
* feat(weights): split per-row aggregation into freqai.sample_weighting block
Cross-metric (per-pivot) and cross-label (per-row) compositions are
distinct distributions. Decouple their tunables:
- freqai.label_weighting.{aggregation,softmax_temperature} stay for
cross-metric aggregation in compute_label_weights (combined strategy).
- New freqai.sample_weighting.{aggregation,softmax_temperature} for
cross-label composition in
QuickAdapterRegressorV3._compose_per_row_weights.
Add _SAMPLE_WEIGHTING_SPECS, DEFAULTS_SAMPLE_WEIGHTING and
get_sample_weighting_config helper routed through _get_label_config so
the returned shape ({default, columns}) matches the get_label_*_config
family.
* refactor(weights): pass logger as parameter and log per-label weight column status
Drop the module-level _logger introduced in Utils.py; compose_sample_weights
now takes a keyword-only logger argument, matching the caller-passes-logger
pattern used by every other helper in the file.
In _compose_per_row_weights, log per-label weight column resolution at
debug when columns are present (static across retrains) and at warning
when none are found (unexpected configuration; falls back to temporal
weights only).
* fix(weights): use shape-consistent empty containers for test_size=0 sentinels
Replace np.zeros(2)/pd.DataFrame() sentinels with iloc[:0] / weights[:0]
slices so test_features, test_labels and test_weights all have 0 rows
with preserved column names and dtype. Behavior unchanged because
_apply_pipelines skips test-side processing when test_size == 0, but
shape-consistent containers respect the declared types and avoid
surprising downstream consumers.
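
The difference between the old and new sentinels is easy to see on toy data. The column names and values here are made up:

```python
import numpy as np
import pandas as pd

features = pd.DataFrame({"f1": [1.0, 2.0], "f2": [3.0, 4.0]})
weights = np.array([0.5, 1.5])

# Old-style sentinels: wrong shape and no schema.
old_features = pd.DataFrame()
old_weights = np.zeros(2)

# New-style sentinels: zero rows, columns and dtype preserved.
empty_features = features.iloc[:0]
empty_weights = weights[:0]
```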
fix: ensure_datetime_series raises ValueError on None instead of silent corruption