How to Win at SignalNet: A Technical Guide
A practical, no-fluff guide to maximizing your score and payout on SignalNet. These strategies are ranked by impact — start from the top and work your way down.
Understanding the Game
Before optimizing, understand what you're optimizing for:
Final Score = 0.5 × IC + 0.3 × TC + 0.2 × MMC
Payout = Score × Stake × 0.25
- IC rewards raw accuracy
- TC rewards uniqueness
- MMC rewards ensemble contribution
Most beginners focus exclusively on IC. The edge is in TC and MMC.
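To make the arithmetic concrete, here is a quick sketch of the scoring formulas above, with illustrative numbers (the component values are made up for the example):

```python
def final_score(ic: float, tc: float, mmc: float) -> float:
    """Weighted combination from the scoring formula above."""
    return 0.5 * ic + 0.3 * tc + 0.2 * mmc

def payout(score: float, stake: float) -> float:
    """Payout = Score x Stake x 0.25."""
    return score * stake * 0.25

# Decent accuracy with zero uniqueness vs. modest accuracy with unique signal
accurate_only = final_score(ic=0.04, tc=0.00, mmc=0.00)  # 0.020
unique_signal = final_score(ic=0.02, tc=0.05, mmc=0.04)  # 0.033
```

The second contributor has half the IC but a higher final score — exactly the point of the sections below.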
Tier 1: High Impact Strategies
1. Feature Neutralization (Expected score boost: +50-100%)
This is the single most impactful thing you can do. Without it, your predictions are correlated with common factors (momentum, value, size) that every other contributor also captures. The meta-model already knows these signals. You add no unique value.
```python
from signalnet import Client

client = Client(api_key="sk-...")

# `model` is your already-trained model
predictions = model.predict(features)

# One line. Massive impact.
predictions = client.neutralize(predictions)
```
Feature neutralization orthogonalizes your predictions against the most common features. Your IC might dip slightly, but your TC and MMC will spike — and those are worth 50% of your final score.
When to skip it: If you're in a very early round with few contributors, the meta-model is weak and neutralization matters less.
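The platform's `neutralize` call is a black box, but the standard technique behind this kind of API — linear orthogonalization against the feature matrix — can be sketched by hand. This is an approximation of the idea, not SignalNet's actual implementation:

```python
import numpy as np

def neutralize(predictions: np.ndarray, features: np.ndarray,
               proportion: float = 1.0) -> np.ndarray:
    """Remove the component of predictions explainable by the features
    via least squares, keeping only the orthogonal residual."""
    # Least-squares fit of predictions on the feature matrix
    beta, *_ = np.linalg.lstsq(features, predictions, rcond=None)
    exposure = features @ beta
    residual = predictions - proportion * exposure
    # Re-standardize so downstream ranking is unaffected by scale
    return (residual - residual.mean()) / residual.std()
```

The residual is, by construction, uncorrelated with every feature column — which is why IC can dip while TC and MMC rise.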
2. Noise Feature Identification (Expected IC boost: +10-20%)
SignalNet includes ~13 pure noise features in every dataset. They're random numbers, rank-normalized to look like real features. They hurt your model by adding spurious correlations.
How to find them:
```python
import numpy as np
import pandas as pd

# Method 1: Cross-round correlation stability
# Noise features will have near-zero correlation with the target
# consistently across all training rounds
train = client.get_training_data()

correlations = []
for round_data in train:
    corr = round_data.features.corrwith(round_data.target)
    correlations.append(corr)

corr_df = pd.DataFrame(correlations)
avg_corr = corr_df.mean()
std_corr = corr_df.std()

# Noise features: low average correlation AND high variance
noise_mask = (avg_corr.abs() < 0.01) & (std_corr > avg_corr.abs())
noise_features = avg_corr[noise_mask].index.tolist()

print(f"Likely noise features: {len(noise_features)}")
features_clean = features.drop(columns=noise_features)
```
Method 2: Permutation importance. Shuffle each feature column and measure the drop in validation score. Noise features will show near-zero importance across all training rounds.
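A hand-rolled sketch of Method 2 (the helper name is ours; it assumes `model` exposes a `predict` method and uses plain Pearson correlation as the score):

```python
import numpy as np

def permutation_importance(model, X: np.ndarray, y: np.ndarray,
                           n_repeats: int = 5, seed: int = 0) -> np.ndarray:
    """Score drop when each feature column is shuffled.
    Noise features should come out near zero."""
    rng = np.random.default_rng(seed)
    baseline = np.corrcoef(model.predict(X), y)[0, 1]
    importances = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        scores = []
        for _ in range(n_repeats):
            X_perm = X.copy()
            rng.shuffle(X_perm[:, j])  # break this feature's link to y
            scores.append(np.corrcoef(model.predict(X_perm), y)[0, 1])
        importances[j] = baseline - np.mean(scores)
    return importances
```

No retraining required, which makes it cheap enough to run across every training round.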
3. Ensembling (Expected IC boost: +20-40%)
Combine 3-5 diverse models. Diversity is key — three gradient boosters are barely better than one. Mix model families:
```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.neural_network import MLPRegressor

models = {
    'gb': GradientBoostingRegressor(n_estimators=200, max_depth=4),
    'rf': RandomForestRegressor(n_estimators=300, max_depth=6),
    'ridge': Ridge(alpha=1.0),
    'nn': MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500),
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(features)

# Equal-weight ensemble
final = np.mean(list(predictions.values()), axis=0)
```
Pro tip: Don't equal-weight blindly. Use cross-validated IC to weight your models:
```python
weights = {'gb': 0.35, 'rf': 0.25, 'ridge': 0.25, 'nn': 0.15}
final = sum(w * predictions[name] for name, w in weights.items())
```
Tier 2: Medium Impact Strategies
4. Feature Engineering
The raw 98 features are a starting point. Engineer more:
- Interactions: `feature_5 × feature_12` might capture a signal neither feature has alone
- Rolling z-scores: How unusual is today's feature value relative to its history?
- PCA components: The first few principal components often capture latent factors
```python
import numpy as np
from sklearn.decomposition import PCA

# Add top 5 PCA components as new features
pca = PCA(n_components=5)
pca_features = pca.fit_transform(features)
features_augmented = np.hstack([features, pca_features])
```
Warning: More features = more overfitting risk. Always validate on out-of-sample training rounds.
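The first two bullets can be sketched like this, assuming your features live in a time-ordered pandas DataFrame; the column names and the helper are hypothetical:

```python
import numpy as np
import pandas as pd

def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Interaction and rolling z-score examples on hypothetical columns."""
    out = df.copy()
    # Interaction: a product term can capture signal neither feature has alone
    out["feature_5_x_12"] = out["feature_5"] * out["feature_12"]
    # Rolling z-score over the last 20 rows (assumes rows are time-ordered;
    # the window length is an arbitrary choice)
    roll = out["feature_5"].rolling(20, min_periods=5)
    out["feature_5_z"] = (out["feature_5"] - roll.mean()) / roll.std()
    return out
```

If your data has one row per stock per round, compute the rolling statistics per stock (e.g. via `groupby`) rather than over the raw row order.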
5. Target Engineering
The target is rank-normalized 20-day forward returns. But you don't have to predict the raw target.
- Clipped target: Cap extreme target values to reduce outlier influence
- Residualized target: Predict the target after removing the effect of known factors
- Binary target: Predict top-half vs bottom-half (simpler, sometimes more robust)
```python
# Clipped target (winsorize at 1st/99th percentile)
target_clipped = target.clip(target.quantile(0.01), target.quantile(0.99))

# Binary target
target_binary = (target > target.median()).astype(float)
```
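The residualized variant isn't shown above. A minimal sketch, assuming `factors` is a matrix of the known factor columns you want to strip out (the helper name is ours):

```python
import numpy as np

def residualize_target(target: np.ndarray, factors: np.ndarray) -> np.ndarray:
    """Remove the linear effect of known factor columns from the target,
    leaving the part your model can add unique value on."""
    beta, *_ = np.linalg.lstsq(factors, target, rcond=None)
    return target - factors @ beta
```

A model trained on the residual is pushed, by construction, toward signal the common factors don't explain — the same logic as feature neutralization, applied at training time instead of prediction time.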
6. Cross-Validation with Era Splitting
Never randomly split your training data. Stock returns are autocorrelated — random splits leak future information.
Always split by round (era):
```python
from scipy.stats import spearmanr

# Walk-forward validation: train on all past rounds, validate on the next.
# `features` here is the list of feature column names.
rounds = train_data['round_id'].unique()

for i in range(5, len(rounds)):
    train_rounds = rounds[:i]
    val_round = rounds[i]
    X_tr = train_data[train_data['round_id'].isin(train_rounds)]
    X_val = train_data[train_data['round_id'] == val_round]
    # Train on past, validate on future
    model.fit(X_tr[features], X_tr['target'])
    ic, _ = spearmanr(model.predict(X_val[features]), X_val['target'])
    print(f"Era {val_round}: IC = {ic:.4f}")
```
Tier 3: Fine-Tuning
7. Staking Strategy
You can stake between 100 and 10,000 SIGNAL per round, up to 3 models.
Conservative: Stake 100-500 SIGNAL while you're learning. The 25% max loss cap protects you, but losing 2,500 SIGNAL on a 10,000 stake still hurts.
Confident: Once your model has 5+ rounds of positive IC, increase your stake. The payout formula (Score × Stake × 0.25) means higher stakes amplify both wins and losses linearly.
Multi-model: Submit 2-3 diverse models with different stakes. Put more stake on your most confident model. This gives you diversification AND higher expected return.
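A quick sketch of the economics, assuming the 25% loss cap works by flooring the effective score at -1.0 (our reading of the cap described above, not a documented formula):

```python
def round_payout(score: float, stake: float) -> float:
    """Payout = Score x Stake x 0.25, with losses capped at 25% of stake
    (modeled here as flooring the score at -1.0 -- an assumption)."""
    return max(score, -1.0) * stake * 0.25

# Multi-model staking: more stake on the confident model, less on the experiment
stakes = {"main": 2000, "experimental": 500}
scores = {"main": 0.03, "experimental": -0.10}  # illustrative round outcomes
total = sum(round_payout(scores[m], stakes[m]) for m in stakes)  # 2.5 SIGNAL
```

A bad round on the small experimental stake barely dents the gain from the main model — that is the diversification argument in miniature.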
8. Regularization
Overfitting is the silent killer. Signs your model is overfit:
- IC on training data: +0.08. IC on live rounds: +0.005.
- Feature importance is dominated by 2-3 features
- Performance drops dramatically on the most recent training rounds
Fixes:
- Reduce model complexity (fewer trees, shallower depth)
- Increase regularization (higher alpha, more dropout)
- Use fewer features (top 30-50 instead of all 85 non-noise features)
- Bag your predictions (train 10 models on bootstrap samples, average)
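The last fix, bagging, can be sketched like this (Ridge stands in for whatever base model you use; the helper name is ours):

```python
import numpy as np
from sklearn.linear_model import Ridge

def bagged_predictions(X_train, y_train, X_live,
                       n_bags: int = 10, seed: int = 0) -> np.ndarray:
    """Train one model per bootstrap sample and average the predictions."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_bags):
        # Sample training rows with replacement
        idx = rng.integers(0, len(X_train), size=len(X_train))
        model = Ridge(alpha=1.0).fit(X_train[idx], y_train[idx])
        preds.append(model.predict(X_live))
    return np.mean(preds, axis=0)
```

Averaging over bootstrap samples smooths out the quirks any single training set would bake into one model.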
9. Sector Neutralization
Beyond feature neutralization, make your predictions sector-neutral:
```python
import numpy as np
from scipy.stats import rankdata

# After identifying approximate sector clusters via PCA + k-means
# (`kmeans` is a fitted clustering model)
sector_cluster = kmeans.predict(features)

# Within each sector, rank-normalize your predictions
for sector in np.unique(sector_cluster):
    mask = sector_cluster == sector
    predictions[mask] = rankdata(predictions[mask]) / mask.sum()
```
This prevents your model from having a "tech is always best" bias.
Common Mistakes
❌ Mistake 1: Optimizing only IC
IC is 50% of your score, but it's the metric where you have the least edge over other contributors. Everyone can achieve decent IC with basic ML. TC and MMC are where you differentiate.
❌ Mistake 2: Changing models mid-round
You see provisional IC is negative on day 3 and panic-submit a new model. Don't. Provisional scores are noisy. Stick with your submission.
❌ Mistake 3: Looking at the features as "just numbers"
The features are encrypted, but they still have structure. Feature 47 might always be high for the best-performing stocks. Feature 12 might have a nonlinear relationship with returns. Explore your data visually.
❌ Mistake 4: Maximum stake on Round 1
You haven't validated your model against real live data yet. Start small. Scale up when you have evidence.
❌ Mistake 5: Ignoring the training data
You get 30+ rounds of historical features + targets. That's enough to backtest your model thoroughly. Contributors who skip backtesting and go straight to submission are guessing, not investing.
The Meta-Game
SignalNet isn't just a prediction competition. It's a signal marketplace. The meta-model aggregates all contributor predictions into a combined signal that's better than any individual.
Your goal isn't to have the highest IC. It's to provide signal that the meta-model doesn't already have. That's TC. That's MMC. That's where the money is.
The contributors who consistently earn the most on SignalNet aren't the ones with the fanciest models. They're the ones who understand that unique signal is worth more than accurate signal.
Build something the crowd can't replicate.