Submission history
Suspected a priori that this should be a very useful feature. Found exceptionally high separation on the iso-plots. Expected ICA to outperform CSP, and both to outperform RAW; instead, CSP outperformed ICA (though very narrowly)!
Expected (cross-validation) values per subject, with the public AUROC for each submission:

| Submission | Public AUROC | Dog_1 | Dog_2 | Dog_3 | Dog_4 | Dog_5 | Patient_1 | Patient_2 | Overall |
|---|---|---|---|---|---|---|---|---|---|
| mvar_raw | 0.68743 | 0.4500 | 0.7778 | 0.7690 | 0.7535 | 0.8814 | 0.8149 | 0.3606 | 0.7696 |
| mvar_ica | 0.74684 | 0.4768 | 0.9533 | 0.8095 | 0.7793 | 0.9290 | 0.8693 | 0.4251 | 0.8302 |
| mvar_csp | 0.74837 | 0.4263 | 0.9197 | 0.8251 | 0.7764 | 0.9305 | 0.8535 | 0.4322 | 0.8197 |
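For reference, a minimal sketch of how per-subject and overall AUROC figures like these could be computed, assuming scikit-learn. Whether "Overall" is the AUROC of the pooled predictions or an average of the per-subject scores isn't stated here, so pooling is an assumption; `preds` and `labels` are illustrative names, not the repo's variables.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

SUBJECTS = ["Dog_1", "Dog_2", "Dog_3", "Dog_4", "Dog_5",
            "Patient_1", "Patient_2"]

def subject_aurocs(preds, labels):
    # preds/labels: dicts mapping subject name -> 1-D arrays (assumed layout)
    scores = {s: roc_auc_score(labels[s], preds[s]) for s in SUBJECTS}
    # "Overall" computed on the pooled predictions (an assumption, see above)
    y_all = np.concatenate([labels[s] for s in SUBJECTS])
    p_all = np.concatenate([preds[s] for s in SUBJECTS])
    scores["Overall"] = roc_auc_score(y_all, p_all)
    return scores
```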
Thought that combining the features used in our previous best classifier with those used for our current best submission would probably work well. The following features were used (a sketch of how such feature sets could be stacked follows the list):
"FEATURES": ["cln,csp,dwn_feat_lmom-3_",
"cln,ica,dwn_feat_xcorr-ypeak_",
"cln,csp,dwn_feat_pib_ratioBB_",
"cln,ica,dwn_feat_mvar-GPDC_",
"cln,ica,dwn_feat_PSDlogfcorrcoef_",
"cln,ica,dwn_feat_pwling1_",
"raw_feat_corrcoef_",
"raw_feat_cov_",
"raw_feat_pib_",
"raw_feat_var_",
"raw_feat_xcorr_"],
And the predicted AUC scores were:

| Dog_1 | Dog_2 | Dog_3 | Dog_4 | Dog_5 | Patient_1 | Patient_2 | Overall |
|---|---|---|---|---|---|---|---|
| 0.53 | 0.95 | 0.82 | 0.77 | 0.94 | 0.77 | 0.45 | 0.83 |
Then submitted and got 0.76012. Only slightly worse than the current best; the score probably relies on the features from the current best submission, and these raw features aren't adding anything useful.
Tried feature selection using a simple variance threshold and then also filtering by F-scores. With both applied, the predicted AUC was 0.86, but I had been fiddling with the cross-validation code, so that won't map onto other results. Full AUC results can be found here.
Submitted and got 0.77171, moving up the leaderboard 5 places.
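A minimal sketch of the two-stage selection described above, assuming scikit-learn (the repo's actual selection code may differ): a variance threshold to drop near-constant features, then keeping the top features by ANOVA F-score. The threshold, `k`, and the SVC stand-in classifier are all illustrative choices, not the repo's settings.

```python
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

model = Pipeline([
    ("var", VarianceThreshold(threshold=1e-8)),  # drop near-constant features
    ("fscore", SelectKBest(f_classif, k=500)),   # k is a guess
    ("clf", SVC(probability=True)),              # stand-in classifier
])
# model.fit(X_train, y_train)
# p = model.predict_proba(X_test)[:, 1]
```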
Now running with just the variance threshold to see what its contribution is on its own.
Also got a predicted AUC of 0.86. Submitted and got 0.77171, exactly the same, which doesn't make a lot of sense. Going to check that the run actually took out the F-score selector.
Will run the F-score selection on its own tomorrow.
Investigating which modtyp is best for the single-channel time-domain statistics. Intuitively, CSP should be best. We need to check this against the leaderboard because CSP takes knowledge of all features into account.
| Submission | Public AUROC | Predicted AUROC | Dog_1 | Dog_2 | Dog_3 | Dog_4 | Dog_5 | Patient_1 | Patient_2 | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| singlech_timestats_raw | 0.69832 | 0.73461 | 0.4792 | 0.7795 | 0.7349 | 0.6879 | 0.7818 | 0.7469 | 0.5510 | 0.7346 |
| singlech_timestats_ica | 0.68967 | 0.71428 | 0.4948 | 0.7709 | 0.6779 | 0.6788 | 0.7721 | 0.6665 | 0.5559 | 0.7143 |
| singlech_timestats_csp | 0.70466 | 0.72564 | 0.5065 | 0.7815 | 0.7286 | 0.6650 | 0.8035 | 0.5789 | 0.5782 | 0.7256 |
We predicted raw > csp > ica; on the leaderboard we found csp > raw > ica. Only narrow margins between them, though.
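One thing worth keeping in mind here: unlike RAW or ICA, CSP is a supervised transform (it is fit with the class labels), which is one reason its CV estimates deserve a leaderboard check. Below is an illustrative sketch using MNE's CSP implementation; the repo's own modtyp pipeline almost certainly differs, and the array shapes are dummies.

```python
import numpy as np
from mne.decoding import CSP

# Dummy shapes: (n_segments, n_channels, n_times) windows, binary labels
X = np.random.randn(40, 16, 400)
y = np.repeat([0, 1], 20)

csp = CSP(n_components=4, log=True)
X_csp = csp.fit_transform(X, y)  # class labels are used here, unlike RAW/ICA
```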
Ran the batch train-and-predict script on all single features and sorted the list by overall predicted ROC. A lot of the top features are MVAR flavours, so I just kept the best one overall and dropped the rest.
NB: these are 20-times-CV predictions, not 10.
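The sweep itself could look something like the sketch below (scikit-learn assumed; `feature_names` and `load_feature_matrix` are placeholders, not repo code, and `cv=20` is a stand-in for the 20-times CV mentioned above).

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

results = {}
for name in feature_names:                 # placeholder list of feature prefixes
    X, y = load_feature_matrix(name)       # placeholder loader
    aucs = cross_val_score(SVC(), X, y, cv=20, scoring="roc_auc")
    results[name] = aucs.mean()

# Sort best-first by predicted overall ROC
for name, auc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}\t{auc:.4f}")
```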
| Submission | Public AUROC | Predicted AUROC | Dog_1 | Dog_2 | Dog_3 | Dog_4 | Dog_5 | Patient_1 | Patient_2 | Overall |
|---|---|---|---|---|---|---|---|---|---|---|
| SVC_ica_mvar-arf | 0.68460 | 0.8427 | 0.4651 | 0.9571 | 0.8439 | 0.7830 | 0.9539 | 0.7216 | 0.3756 | 0.8427 |
| SVC_csp_coher_logf | 0.75269 | 0.8301 | 0.5468 | 0.9579 | 0.8264 | 0.7656 | 0.9296 | 0.8225 | 0.5712 | 0.8301 |
| SVC_ica_phase-high_gamma-sync | 0.68427 | 0.8247 | 0.6218 | 0.9789 | 0.7523 | 0.7990 | 0.9082 | 0.8884 | 0.5332 | 0.8247 |
| SVC_ica_pib_ratioBB | 0.77110 | 0.8012 | 0.6928 | 0.8832 | 0.7556 | 0.7968 | 0.8065 | 0.8332 | 0.5598 | 0.8012 |
MVAR-ARF is supposed to be better than GPDC according to CV, but it is not on the public leaderboard (see the top of this page). Not sure how we should pick the best of the MVARs.
ica_phase-high_gamma-sync does reasonably well, with CV/public = 0.8247/0.68427.
csp_coher_logf does surprisingly well, with CV/public = 0.8301/0.75269.
ica_pib_ratioBB does incredibly well on the public leaderboard, with CV/public = 0.8012/0.77110. This is basically as good as the current best submission, which is Gavin's probablygood with automatic dropping of the worst elements.
At the moment, it seems the public score is most strongly correlated with the Patient_2 score. This might be because Patient_2 is the worst-performing subject: we probably overestimate the number of Patient_2 preictals. We might be able to improve the overall score by improving the Patient_2 prior. Open to suggestions on why the worst subject would be linked to overall performance, and to other explanations for the relationship between CV predictions and the public leaderboard.
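If the Patient_2 preictal prior really is overestimated, one standard fix would be a prior-shift correction on the predicted probabilities. Within a subject this is a monotone transform, so the per-subject AUROC is unchanged, but it reorders Patient_2 segments against the other subjects and so can move the pooled overall score. A sketch; the `train_prior` and `target_prior` values are guesses, not measured quantities, and this is not something the repo is confirmed to do.

```python
import numpy as np

def adjust_prior(p, train_prior, target_prior):
    # Rescale posterior odds by the ratio of target to training class priors
    ratio = (target_prior / train_prior) / ((1 - target_prior) / (1 - train_prior))
    odds = ratio * p / (1 - p)
    return odds / (1 + odds)

# e.g. shrink Patient_2 scores toward a lower preictal prior (values are guesses):
# p2_adjusted = adjust_prior(p2_scores, train_prior=0.5, target_prior=0.2)
```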