# Encrypted Features

SignalNet provides encrypted, obfuscated feature datasets for each tournament round.
## What Are Encrypted Features?
We source raw financial data (price, volume, fundamentals, sentiment, etc.) and transform it into an anonymized feature set:
- ~98 features per stock
- ~503 stocks (S&P 500 universe)
- Features are normalized to [0, 1] range
- Feature names are obfuscated (feature_001, feature_002, ...)
- Random monotonic transformations applied (preserves rank order)
- A few noise features included (for baseline testing)
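The key property of the obfuscation is that monotonic transforms preserve rank order. A minimal sketch illustrating this (the sigmoid transform here is an illustrative assumption, not SignalNet's actual pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw feature values for a handful of stocks.
raw = rng.normal(size=8)

# One example of a random monotonic transform: any strictly increasing
# function (here, a sigmoid) squashes values into [0, 1] while keeping
# their relative ordering intact.
def monotonic_obfuscate(x):
    return 1.0 / (1.0 + np.exp(-2.5 * x))

obfuscated = monotonic_obfuscate(raw)

# Rank order is unchanged, so rank-based models see identical inputs.
assert (np.argsort(raw) == np.argsort(obfuscated)).all()
```

Because only the ordering survives the transform, methods that depend on absolute magnitudes (e.g. raw linear regression on untransformed inputs) may behave differently from round to round, while rank-based methods are unaffected.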
## Why Obfuscate?
- Level playing field — success depends on modeling skill, not data access
- IP protection — prevents reverse-engineering of our data pipeline
- Anti-gaming — can't exploit known feature semantics
- Legal safety — no licensing issues with redistributing transformed data
## Downloading Features

### Python SDK
```python
from signalnet import Tournament

t = Tournament(api_key="your_key")

# Current round features
df = t.get_features()
# DataFrame: index=ticker, columns=feature_001...feature_098

# Historical features (for backtesting)
df_hist = t.get_features(round_id=42)

# Feature metadata
meta = t.features.metadata()
print(f"Features: {meta.num_features}")
print(f"Stocks: {meta.num_stocks}")
print(f"Updated: {meta.last_updated}")
```
### REST API
```bash
# Download as CSV
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.signalnet.io/v1/features/current" \
  -o features.csv

# Download as Parquet (smaller, faster)
curl -H "Authorization: Bearer YOUR_API_KEY" \
  "https://api.signalnet.io/v1/features/current?format=parquet" \
  -o features.parquet
```
## Feature Statistics
Each round's feature dataset includes basic statistics:
```python
features.describe()
#        feature_001  feature_002  ...  feature_098
# count        503.0        503.0  ...        503.0
# mean          0.50         0.49  ...         0.51
# std           0.29         0.28  ...         0.30
# min           0.00         0.00  ...         0.00
# max           1.00         1.00  ...         1.00
```
## Tips for Modeling
- Don't assume feature meaning — feature_042 might be momentum this round and volatility next round
- Use rank-based methods — since features are monotonically transformed, rank information is preserved
- Watch for noise features — some features are pure random noise, so good feature selection helps
- Ensemble your own models — combine multiple approaches for robustness
- Don't overfit — practice round performance ≠ live performance
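The rank-based tip above can be sketched as a cross-sectional rank transform. This is a minimal illustration using a synthetic DataFrame as a stand-in for `t.get_features()` (the 503×3 toy data and column names are assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Toy stand-in for t.get_features(): 503 stocks x 3 features in [0, 1].
df = pd.DataFrame(
    rng.uniform(size=(503, 3)),
    columns=["feature_001", "feature_002", "feature_003"],
)

# Cross-sectional rank transform: each feature becomes its percentile
# rank across stocks, which is invariant to any monotonic obfuscation.
ranked = df.rank(pct=True)

# Spearman correlation between a raw column and its ranked version is 1
# (up to floating point), confirming no rank information was lost.
rho = ranked["feature_001"].corr(df["feature_001"], method="spearman")
print(rho)
```

Because the obfuscation re-randomizes each round, a model fit on percentile ranks behaves identically whether it sees the raw or transformed features, which makes rank preprocessing a natural default here.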