
Features (Finalised)

scottclowe edited this page Nov 10, 2014 · 3 revisions

Preprocessing models

  • raw: Raw. No preprocessing. The data as provided from the competition website.
  • cln: Cleaned. Patient_1 and Patient_2 have 60Hz line noise (and its harmonics) removed with a notch reject filter, and are high-pass filtered above 0.1Hz. Very high frequency noise (>300Hz) and artefacts remain. Other sessions are unchanged.
  • dwn: Downsampled. Patient_1 and Patient_2 are downsampled from 5kHz to 400Hz, to match the Dog data (low-pass filtered to Nyquist frequency of 200Hz first).
  • ica: Independent Component Analysis. Data is transformed onto a new basis separating out sources so they are maximally independent. Subject-specific weight matrices are computed from all datapoints with blind source separation using the FastICA algorithm.
  • icadr: Dimension-reduced Independent Component Analysis. Data is transformed with the same basis as ica, but then all but the first 8 components are discarded. I am not sure, but since I used the greedy algorithm to produce the ICA, I think the first components start from the larger PCA eigenvectors.
  • csp: Common Spatial Patterns. Data is transformed onto a new basis giving maximal separability between the preictal and interictal data sets. Basis-transformed channels are ordered from most separable to least separable. Separability is measured by variance.
  • cspdr: Dimension-reduced Common Spatial Patterns. Data is transformed with the same basis as csp and then all but the first 8 components are discarded.

The preprocessing models can be stacked, for example 'cln,cspdr,ica,dwn' will first clean the data (cln), then do a CSP transformation and drop all but the first 8 most discriminable components (cspdr), and then do an ICA basis transformation (ica), then downsample (dwn).
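As a minimal sketch of how such a stacked string could be interpreted (this is not the project's actual code; `parse_stack`, `apply_stack`, and the `registry` of step functions are hypothetical names introduced for illustration), the steps are split on commas and applied left to right:

```python
from typing import Callable, Dict, List

# Recognised preprocessing codes, taken from the list above.
KNOWN_STEPS = ("raw", "cln", "dwn", "ica", "icadr", "csp", "cspdr")


def parse_stack(spec: str) -> List[str]:
    """Split a stack such as 'cln,cspdr,ica,dwn' into its ordered steps."""
    steps = [s.strip() for s in spec.split(",") if s.strip()]
    unknown = [s for s in steps if s not in KNOWN_STEPS]
    if unknown:
        raise ValueError(f"unknown preprocessing step(s): {unknown}")
    return steps


def apply_stack(data, spec: str, registry: Dict[str, Callable]):
    """Apply each step's function in order, from left to right."""
    for step in parse_stack(spec):
        data = registry[step](data)
    return data


# Placeholder transforms (hypothetical): each one only records that it ran.
def make_tracer(name):
    return lambda data: data + [name]


registry = {name: make_tracer(name) for name in KNOWN_STEPS}
print(apply_stack([], "cln,cspdr,ica,dwn", registry))
```

With the placeholder transforms this prints the steps in the order they were applied: `['cln', 'cspdr', 'ica', 'dwn']`. Order matters here, since ica is computed on whatever the earlier steps produced.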

Bands

When bands are used, these are the bands we consider:

  • delta: (1-4 Hz)
  • theta: (4-8 Hz)
  • alpha: (8-12 Hz)
  • beta: (12-30 Hz)
  • low_gamma: (30-70 Hz)
  • high_gamma: (70-180 Hz)
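To illustrate how these bands might be turned into band-power features (in the spirit of feat_pib and feat_pib_ratioBB), here is a rough sketch. It assumes an unnormalised periodogram from a naive DFT and half-open band edges [low, high); the project's actual PSD estimator and edge handling are not stated on this page, and the function names are hypothetical.

```python
import cmath
import math

# Band edges in Hz, copied from the list above.
BANDS = {
    "delta": (1, 4),
    "theta": (4, 8),
    "alpha": (8, 12),
    "beta": (12, 30),
    "low_gamma": (30, 70),
    "high_gamma": (70, 180),
}


def periodogram(x, fs):
    """One-sided, unnormalised periodogram via a naive O(n^2) DFT."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centred))
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def power_in_bands(x, fs, bands=BANDS):
    """Sum periodogram power over each band (assumed half-open: low <= f < high)."""
    freqs, power = periodogram(x, fs)
    return {name: sum(p for f, p in zip(freqs, power) if lo <= f < hi)
            for name, (lo, hi) in bands.items()}


def ratio_to_broadband(pib):
    """Divide each band's power by the total over all bands (1-180 Hz)."""
    total = sum(pib.values())
    return {name: (p / total if total > 0 else float("nan"))
            for name, p in pib.items()}


# One second of a 10 Hz sine sampled at 400 Hz (the Dog sampling rate above).
fs = 400
signal = [math.sin(2 * math.pi * 10 * t / fs) for t in range(fs)]
pib = power_in_bands(signal, fs)
ratio_bb = ratio_to_broadband(pib)
print(max(pib, key=pib.get))
```

For the 10 Hz test signal, the band with the most power is alpha, as expected.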

Features

  • feat_var: Variance of each channel (single-channel timedomain statistic)
  • feat_lmom-1: 1st order L-moment (mean) (single-channel timedomain statistic)
  • feat_lmom-2: 2nd order L-moment (L-scale, a measure of spread analogous to variance) (single-channel timedomain statistic)
  • feat_lmom-3: 3rd order L-moment (analogous to skewness) (single-channel timedomain statistic)
  • feat_lmom-4: 4th order L-moment (analogous to kurtosis) (single-channel timedomain statistic)
  • feat_lmom-5: 5th order L-moment (Higher Order Statistic) (single-channel timedomain statistic)
  • feat_lmom-6: 6th order L-moment (Higher Order Statistic) (single-channel timedomain statistic)
  • feat_cov: Covariance of each pair of channels (cross-channel timedomain statistic) [skew-symmetric]
  • feat_corrcoef: Pearson's correlation coefficient (cov normalised by var) (cross-channel timedomain statistic) [skew-symmetric]
  • feat_corrcoefeig: Eigenvalues of corrcoef (cross-channel timedomain summary-statistic)
  • feat_spearman: Spearman's rank correlation (cross-channel timedomain statistic)
  • feat_pib: Power in bands, formed from summing PSD power within each band (single-channel frequency-domain)
  • feat_pib_ratioBB: Ratio of power in each band to the broadband power [1-180Hz] (single-channel frequency-domain)
  • feat_pib_ratio: Ratio of power in bands with each other (single-channel cross-frequency) [skew-symmetric]
  • feat_psd: Power spectral density, linearly sampled in frequency domain (single-channel frequency-domain) [do not use! too many datapoints]
  • feat_coher: Power spectral coherence, linearly sampled in frequency domain (cross-channel frequency-domain) [skew-symmetric] [do not use! too many datapoints]
  • feat_psd_logf: Power spectral density, power within bands logarithmically spaced in frequency domain (single-channel frequency-domain) [use with caution! many datapoints]
  • feat_coher_logf: Power spectral coherence, normalised cross-channel ratio of power within bands logarithmically spaced in frequency domain (cross-channel frequency-domain) [skew-symmetric] [use with caution! many datapoints]
  • feat_act: Auto-correlation width (single-channel timedomain)
  • feat_xcorr-ypeak: Cross-correlation peak normalised against variance, within -5<t<5sec window (cross-channel timedomain) [skew-symmetric]
  • feat_xcorr-tpeak: Cross-correlation peak lag (cross-channel timedomain causal-model) [skew-symmetric]
  • feat_xcorr-twidth: Cross-correlation peak width (cross-channel timedomain) [skew-symmetric]
  • feat_FFT: First 250 coefficients of Fast Fourier Transform of data (single-channel frequencydomain) [do not use! too many datapoints]
  • feat_FFTcorrcoef: Pearson's correlation coefficient between first 250 datapoints of FFT of data (cross-channel frequencydomain statistic) [skew-symmetric]
  • feat_FFTcorrcoefeig: Eigenvalues of FFTcorrcoef (cross-channel frequencydomain summary-statistic)
  • feat_PSDcorrcoef: Pearson's correlation coefficient of PSD of data, sampled every 1Hz (cross-channel frequencydomain statistic) [skew-symmetric]
  • feat_PSDcorrcoefeig: Eigenvalues of PSDcorrcoef (cross-channel frequencydomain summary-statistic)
  • feat_PSDlogfcorrcoef: Pearson's correlation coefficient of PSD of data, sampled logarithmically in frequency domain (cross-channel frequencydomain statistic) [skew-symmetric]
  • feat_PSDlogfcorrcoefeig: Eigenvalues of PSDlogfcorrcoef (cross-channel frequencydomain summary-statistic)
  • feat_phase-#-sync: [#=band] Phase synchrony (cross-channel timedomain) [skew-symmetric]
  • feat_phase-#-dif: [#=band] Phase offset between channels (cross-channel timedomain) [skew-symmetric]
  • feat_ampcorrcoef-#: [#=band] Pearson's correlation coefficient for envelope amplitude of band (cross-channel timedomain statistic) [skew-symmetric]
  • feat_ampcorrcoef-#-eig: [#=band] Eigenvalues of ampcorrcoef (cross-channel timedomain summary statistic) [skew-symmetric]
  • feat_pwling1: Entropy based PairWise Linear-Non-Gaussian model of connection strengths (cross-channel timedomain causal-model) [skew-symmetric]
  • feat_pwling2: First-order approximation, good for sparse variables, of PairWise Linear-Non-Gaussian model of connection strengths (cross-channel timedomain causal-model) [skew-symmetric]
  • feat_pwling4: Skewness measure of PairWise Linear-Non-Gaussian model of connection strengths (cross-channel timedomain causal-model) [skew-symmetric]
  • feat_pwling5: Dodge-Rousson measure, for skewed variables, of PairWise Linear-Non-Gaussian model of connection strengths (cross-channel timedomain causal-model) [skew-symmetric]
  • feat_ilingam-connweights: Independent Component Analysis derived Linear-Non-Gaussian model of connection strengths (cross-channel timedomain causal-model) [skew-symmetric]
  • feat_ilingam-causalorder: Independent Component Analysis derived Linear-Non-Gaussian model of causal ordering of channels [ch4, ch2, ch3, ch1,...] (cross-channel timedomain causal-model)
  • feat_ilingam-causalindex: Independent Component Analysis derived Linear-Non-Gaussian model of causal ordering of each channel [4th, 2nd, 3rd, 1st,...] (cross-channel timedomain causal-model)
  • feat_mvar-ARF: Coefficients for a fitted 12-point MultiVariate-AutoRegressive model (cross-channel timedomain)
  • feat_mvar-COH: Coherence of the coefficients for the MVAR (cross-channel frequencydomain)
  • feat_mvar-COHphs: Phase of Coherence for MVAR (cross-channel frequencydomain)
  • feat_mvar-DC: Directed Coherence of the coefficients for the MVAR (cross-channel frequencydomain causal-model)
  • feat_mvar-DCphs: Phase of Directed Coherence for MVAR (cross-channel frequencydomain)
  • feat_mvar-DTF: Directed Transfer Function of the coefficients for the MVAR (cross-channel frequencydomain causal-model)
  • feat_mvar-DTFphs: Phase of Directed Transfer Function for MVAR (cross-channel frequencydomain)
  • feat_mvar-PCOH: Partial Coherence of the coefficients for the MVAR (cross-channel frequencydomain)
  • feat_mvar-PCOHphs: Phase of Partial Coherence for MVAR (cross-channel frequencydomain)
  • feat_mvar-PDC: Partial Directed Coherence of the coefficients for the MVAR (cross-channel frequencydomain causal-model)
  • feat_mvar-PDCphs: Phase of Partial Directed Coherence for MVAR (cross-channel frequencydomain)
  • feat_mvar-GPDC: Generalized Partial Directed Coherence of the coefficients for the MVAR (cross-channel frequencydomain causal-model)
  • feat_mvar-GPDCphs: Phase of Generalized Partial Directed Coherence for MVAR (cross-channel frequencydomain)
  • feat_mvar-H: Transfer Function Matrix of the coefficients for the MVAR (cross-channel frequencydomain)
  • feat_mvar-Hphs: Phase of Transfer Function Matrix for MVAR (cross-channel frequencydomain)
  • feat_mvar-S: Spectral Matrix of the coefficients for the MVAR (cross-channel frequencydomain)
  • feat_mvar-Sphs: Phase of Spectral Matrix for MVAR (cross-channel frequencydomain)
  • feat_mvar-P: Inverse Spectral Matrix of the coefficients for the MVAR (cross-channel frequencydomain)
  • feat_mvar-Pphs: Phase of Inverse Spectral Matrix for MVAR (cross-channel frequencydomain)
  • feat_emvar-#: Extended Multivariate Analysis, taking into account instantaneous effects. Contains all the MVAR features, plus extended versions of them.
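As an illustration of the L-moment features (feat_lmom-1 to feat_lmom-4), the sketch below computes the first four sample L-moments of one channel from unbiased probability-weighted moments. This is a standard textbook estimator, not necessarily the one used in this project, and it is an assumption that the features are the raw L-moments rather than L-moment ratios. The function name is hypothetical.

```python
def l_moments(x):
    """Return the first four sample L-moments (l1, l2, l3, l4) of a sequence.

    Uses the unbiased probability-weighted moments b0..b3 of the sorted data.
    """
    xs = sorted(x)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 samples")
    b0 = b1 = b2 = b3 = 0.0
    for i, v in enumerate(xs):  # i is the zero-based rank of v
        b0 += v
        b1 += v * i / (n - 1)
        b2 += v * i * (i - 1) / ((n - 1) * (n - 2))
        b3 += v * i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3))
    b0, b1, b2, b3 = b0 / n, b1 / n, b2 / n, b3 / n
    l1 = b0                                  # location (the mean)
    l2 = 2 * b1 - b0                         # scale
    l3 = 6 * b2 - 6 * b1 + b0                # asymmetry
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0    # tail weight
    return l1, l2, l3, l4


print(l_moments([1, 2, 3, 4, 5]))
```

For the evenly spaced sample `[1, 2, 3, 4, 5]` the first L-moment is the mean (3), the second is half the mean absolute difference between pairs (1), and the third is zero because the sample is symmetric.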

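[skew-symmetric] = The feature matrix is symmetric or skew-symmetric, so only the upper triangle of the combination of pairs is used, to prevent redundancy in the data. (The lower triangle is fully determined by the upper triangle: it is equal to it for symmetric features such as cov and corrcoef, and its negative for truly skew-symmetric ones.) Also, the diagonal is removed, since it carries no pairwise information (for corrcoef, for example, it is always 1).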
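The pair selection described above can be sketched as follows, using Pearson's correlation coefficient as the pairwise measure. This is only an illustration under the assumption that each channel is a plain list of samples; the function names are hypothetical.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    denom = math.sqrt(var_a * var_b)
    return cov / denom if denom > 0 else float("nan")


def upper_triangle_corrcoef(channels):
    """Return (i, j, r) for every channel pair with i < j.

    The diagonal (i == j) and the lower triangle (i > j) are skipped,
    since they add no information beyond the upper triangle.
    """
    return [(i, j, pearson(channels[i], channels[j]))
            for i, j in combinations(range(len(channels)), 2)]


channels = [
    [1, 2, 3, 4],
    [2, 4, 6, 8],   # perfectly correlated with channel 0
    [4, 3, 2, 1],   # perfectly anti-correlated with channel 0
]
for i, j, r in upper_triangle_corrcoef(channels):
    print(i, j, round(r, 3))
```

For N channels this yields N(N-1)/2 values per feature instead of N², so three channels give three pairs.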

Assumed feature ordering for each feature class

  • Single-channel timedomain statistics: (csp > ica > raw), lmom-3 > lmom-2 > var > lmom-1 > lmom-4 > act > lmom-5 > lmom-6
  • Cross-channel timedomain statistics: (ica > csp > raw), xcorr-ypeak > ampcorrcoef-# > spearman > corrcoef > xcorr-twidth > cov
  • Single-channel frequencydomain: (csp > ica > raw), pib_ratioBB > pib > psd_logf > FFT
  • Cross-channel frequencydomain: (csp > ica > raw), mvar-PCOH > mvar-COH > pib_ratio > coher_logf
  • Cross-channel frequencydomain statistics: (ica > csp > raw), PSDlogfcorrcoef > PSDlogfcorrcoefeig > PSDcorrcoef > PSDcorrcoefeig > FFTcorrcoef > FFTcorrcoefeig
  • Causal-model timedomain: (ica > raw > csp), pwling1 > ilingam-connweights > ilingam-causalindex > pwling4 = pwling5 > phase-#-sync > phase-#-diff > mvar-ARF > xcorr-tpeak > pwling2 > ilingam-causalorder
  • Causal-model frequencydomain: (ica > raw > csp), mvar-GPDC > mvar-PDC > mvar-PDC
  • Wildcards: mvar-#phs
  • Of little use: FFT, coher_logf, psd_logf, mvar-H, mvar-S, mvar-P, ilingam-causalorder

You should probably not use more than one feature from each class because they will be correlated with each other.

Also, only use one MVAR feature at a time, as they are all different representations of the same time series model.
