
Potentially get worse fits when fitting knee. #35

Closed

TomDonoghue opened this issue Sep 26, 2017 · 13 comments

TomDonoghue (Member) commented Sep 26, 2017

In theory, asking to fit a knee should reduce to a linear fit, if that is indeed the better fit. In practice, that is not the case - there are PSDs for which setting the model to not fit a knee (a linear fit) leads to a better fit (in the R^2 sense) than also fitting a knee parameter.

It's weird that with an extra parameter to play with, it can perform worse. It might have to do with the interaction between fitting slope & oscillations. It may also simply be due to the fact that fitting a knee effectively adds a new constraint - that the slope of the background go to zero below the knee - and this, in at least some cases, may be unhelpful.

Something to perhaps look into, play around with a bit. If nothing else, supports a strong suggestion to only try to fit a knee if there really is one (although that's potentially hard to evaluate).
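To make the comparison concrete, here is a minimal sketch of the issue, using scipy's curve_fit with FOOOF-style backgrounds in log10 power space. The function names and parameter values here are illustrative, not FOOOF's internals; the knee form offset - log10(knee + f^exp) reduces to the linear form as knee goes to 0, so in principle the knee fit should never do worse:

```python
import numpy as np
from scipy.optimize import curve_fit

def bg_linear(f, offset, exp):
    # Linear (no-knee) background in log-log space
    return offset - exp * np.log10(f)

def bg_knee(f, offset, knee, exp):
    # Lorentzian-style background; reduces to bg_linear as knee -> 0
    return offset - np.log10(knee + f**exp)

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

rng = np.random.default_rng(0)
freqs = np.linspace(1, 50, 200)
# Synthetic log-power with NO knee (pure 1/f) plus noise
psd = bg_linear(freqs, 1.0, 2.0) + rng.normal(0, 0.05, freqs.size)

lin_params, _ = curve_fit(bg_linear, freqs, psd, p0=[1, 1])
knee_params, _ = curve_fit(bg_knee, freqs, psd, p0=[1, 1, 1],
                           bounds=([-np.inf, 0, 0], np.inf))

print(r_squared(psd, bg_linear(freqs, *lin_params)))
print(r_squared(psd, bg_knee(freqs, *knee_params)))
```

On clean single-step fits like this, the two R^2 values come out essentially equal; the reported problem is that inside the full multi-step procedure, on real PSDs, they do not.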

TomDonoghue (Member, Author) commented:

@rdgao Any thoughts on this?

rdgao (Contributor) commented Sep 30, 2017

hmm, that's weird. Was this before or after the b=1 fix in lorentzian_nk? It's possible that the "linear" fit was better but R^2 was computed on the b=1 fit.

Although, theoretically, the Lorentzian case should still converge to the same optimal fit because it's a superset, in which case it might be a problem with seeding? A few things to potentially explore:

  • do multiple fits with random seeds and see if they converge
  • visually examine test cases where the linear fit returns a greater R^2
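The first check could be sketched like this (a scipy-based sketch with illustrative values, not FOOOF code): refit the knee model from several random initial parameters and look at the spread of the solutions.

```python
import numpy as np
from scipy.optimize import curve_fit

def bg_knee(f, offset, knee, exp):
    # Lorentzian-style background in log10 power space
    return offset - np.log10(knee + f**exp)

rng = np.random.default_rng(1)
freqs = np.linspace(1, 50, 200)
# Synthetic log-power WITH a knee, plus noise
psd = bg_knee(freqs, 1.0, 10.0, 2.0) + rng.normal(0, 0.05, freqs.size)

fits = []
for _ in range(10):
    seed = rng.uniform([0.0, 0.0, 0.5], [2.0, 30.0, 3.0])  # random initial params
    try:
        params, _ = curve_fit(bg_knee, freqs, psd, p0=seed,
                              bounds=([-np.inf, 0, 0], np.inf))
        fits.append(params)
    except RuntimeError:
        pass  # a seed that failed to converge

fits = np.array(fits)
print(fits.std(axis=0))  # small spread across seeds -> stable convergence
```

If the spread is large, the optimization is seed-sensitive and the worse-than-linear fits may just be bad local minima.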

From a fitting perspective (as opposed to interpretability), it should use whatever gives the best R^2. Theoretically, that should be the general Lorentzian, but if that's not always the case, then yeah, having a note would help. Or we can make modality-specific suggestions, i.e. EEG/MEG use linear, especially if fitting only up to 30-40 Hz, whereas LFP/ECoG should benefit from the knee fit.

I'll take a look in the monkey data to see whether any of the linear fits give a better R^2 and check back.

rdgao (Contributor) commented Sep 30, 2017

Left panel: linear vs. Lorentzian fit R^2 for awake eyes closed (blue) and anesthetized (red).
Middle and right panels: example PSDs where linear R^2 > Lorentzian R^2.
[image: fit comparison plots]

It seems like Lorentzian has the most trouble when the low freq region is not flat, which is unsurprising given it tries to fit a flat top, but unexpected because it should converge to a linear fit anyway, so the optimization procedure perhaps warrants some more exploration.

This brings up another issue: what do we do about PSDs with oscillations in the low frequency region? In the right panel, the non-flat tops are due to a slow oscillation (~1 Hz) during anesthesia. This is apparent in the time domain, as well as when you increase the frequency resolution of the PSDs (below). It may be that people expect to fit a delta oscillation that has its left half cut off?

[image: higher frequency resolution PSDs]

rdgao (Contributor) commented Nov 5, 2017

@TomDonoghue was this the only issue with the knee fitting, that it doesn't find the knee=0 solution even when default linear fits better?

This seems more to be a scipy optimize problem. In any case, I'm trying to collect all the problems about it to see if I can fix them in one go.

TomDonoghue (Member, Author) commented:

As to why: I thought perhaps b=0 is some kind of weird (discontinuous) special case that optimize wouldn't land on, but from just playing around with the function, it seems that's not really it. It's perhaps/probably some weird interaction between the many steps of FOOOF. A sort of brute-force fix might be to explicitly test and choose b=0 if it's better...

I'm currently working on the synthetic datasets, which will give us a better way to compare versions. I'm also going to do a sweep of the other issues first. I consider this relatively low priority - the FOOOF paper as is doesn't really use knee fits, so it's tractable to tag a v0.1, with a note that knee fits are still experimental. Long story short - if you have a suggested fix, great, but if it's a matter of poking around, I'd wait a bit until we settle the other things over the next couple of days, and have synthetic data to properly test with.

rdgao (Contributor) commented Nov 6, 2017

You're not using knee fits at all? LFP & ECoG are (typically) much better with a knee fit, so I'd recommend at least showing an example of that, to capture a greater audience.

In terms of the error, yeah, that was my guess too: since the background and Gaussians are fit in separate steps, it's possible that error is minimized for the initial background fit, leading to a worse oscillation fit because some of it was already "captured". I was going to try fitting the linear model first and using those parameters to seed the background fit, like you were doing with the quadratic fit originally, but that feels like just piling on more hacks. I might play around with some LFP data in the meantime, but won't push anything until you're done with the synthetic test data.

TomDonoghue (Member, Author) commented:

So, @rdgao, we should revisit this and check in on it. Now that I have a proper synthetic data testing setup, we can explore this more formally - and perhaps come to some fixes and/or guidelines.

At a first pass, synthetic fitting is neither wildly bad nor good - it can have problems reconstructing generated knees/slopes (so sometimes it's not all that good at recovering the generating parameters).

I think it's partly a degeneracy of the solution space - multiple slope/knee combinations can capture the bend. In some tests, once you add a little noise, it ends up with a solution that very reasonably captures the background, but with a different parameter combination than the one that actually generated the data.

So: knee-fitting, as currently implemented, is a great way to capture the background (and thus extract oscillations well), but the actual background fits may be degenerate, and difficult to interpret. It might be a curve_fit thing that we can tune with bounds, etc., but overall I'm not too sure what to do here.

So far I've only run a small number of tests. It might be, for example, that for more extreme cases (a larger frequency range, knees & slopes that come apart more) it does much better - but even if so, it's not clear how to relate that back to guidelines and interpretations for real data, plus there is the issue from above that the procedure doesn't necessarily converge on a background fit that leads to the best overall fit.
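The degeneracy is easy to demonstrate numerically. In this sketch (illustrative parameter values, not from FOOOF), a curve generated with knee=10 and slope=2 is refit with the exponent deliberately pinned to 1.9; a different knee and offset absorb most of the difference, leaving residuals on the order of typical noise levels:

```python
import numpy as np
from scipy.optimize import curve_fit

freqs = np.linspace(3, 40, 200)
# Noiseless background generated with offset=1, knee=10, slope=2
true_bg = 1.0 - np.log10(10.0 + freqs ** 2.0)

def bg_wrong_exp(f, offset, knee):
    # Same functional form, but with the exponent pinned to the wrong value
    return offset - np.log10(knee + f ** 1.9)

params, _ = curve_fit(bg_wrong_exp, freqs, true_bg, p0=[1, 1],
                      bounds=([-np.inf, 0], np.inf))
max_resid = np.abs(true_bg - bg_wrong_exp(freqs, *params)).max()
print(params, max_resid)  # residual of a few hundredths of a log10 unit
```

With any realistic noise added, the two parameter combinations become effectively indistinguishable over this frequency range, which is exactly the interpretability problem described above.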

rdgao (Contributor) commented Nov 27, 2017

Interesting... so, say you generate fake data with a knee: how often does the with-knee fit give you a slope that's worse than just the linear fit, over the full range?

I think knee fitting is most useful for getting good oscillations, especially when the knee gets confused as an oscillation, but for slope fitting it's perhaps not as good. This gets at two potentially different use cases:
#1. Fit PSDs as well as possible, in terms of minimizing MSE: capture oscillations when they are there, and don't fit oscillations when there are none.
#2. Get the best estimates of slope for regions that are actually 1/f.

IMO, fitting the knee gets you #1, by virtue of having one more parameter, and especially in cases where there is obviously a knee, although this is not a given (see monkey plots above).
If the experiment shows that, even with inaccurate parameter retrieval, the slope fit with exp is still better than with linear, then it gets you #2 as well, and I think we're good in that case.
If it doesn't, then we should make some decisions: basically, either explicitly inform the user of this distinction (i.e. trust slope values less when you fit with exp, even if you get a better model overall), or bake in some mechanism to do the fits separately, although that might be confusing, since the "best slope fit" slope is a bit different from the "best overall fit" slope.

THAT BEING SAID, I advocate for fitting slope (for the explicit slope value) separately anyway, over a region where it is definitely linear, because otherwise it's less meaningful.

I have a few ideas, starting from easiest to hardest to implement:

  • just run a linear fit anyway prior to the exp fit, when asked to fit exp, and either use that slope as a seed, or compare the MSEs and pick the better model (i.e. manually override the knee fit even when asked)

  • provide both exp and linear fit models regardless, and remove the user option to specify.

  • run exp a few times and pick some convergent value (potentially trading off accuracy in favor of more converged fits), or just pick the best fit. The evaluation should probably come after the oscillation fit, since the background likely captures some oscillations, but this might not be worth the complexity for the little gain (compared to just re-running prior to the oscillation fit).

  • fit slope linearly after exp fit, over the region PAST the knee parameter (where it's more likely to be actually linear)

  • iteratively exclude oscillatory regions found in an initial run and redo the background fit, up to some number of iterations or until convergence.
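The first idea could look something like this - a hypothetical fit_background helper using scipy's curve_fit with FOOOF-style log-space background forms, not FOOOF's actual implementation:

```python
import numpy as np
from scipy.optimize import curve_fit

def bg_linear(f, offset, exp):
    # Linear (no-knee) background in log-log space
    return offset - exp * np.log10(f)

def bg_knee(f, offset, knee, exp):
    # Lorentzian-style background; reduces to bg_linear as knee -> 0
    return offset - np.log10(knee + f**exp)

def fit_background(freqs, psd):
    """Fit linear first, seed the knee fit with it, return the lower-MSE model."""
    lin_params, _ = curve_fit(bg_linear, freqs, psd, p0=[1, 1])
    # Seed the knee fit from the linear solution, with the knee near zero
    knee_params, _ = curve_fit(bg_knee, freqs, psd,
                               p0=[lin_params[0], 1e-3, lin_params[1]],
                               bounds=([-np.inf, 0, 0], np.inf))
    mse_lin = np.mean((psd - bg_linear(freqs, *lin_params)) ** 2)
    mse_knee = np.mean((psd - bg_knee(freqs, *knee_params)) ** 2)
    if mse_lin <= mse_knee:
        # Manual override: report the linear model, expressed with knee = 0
        return np.array([lin_params[0], 0.0, lin_params[1]]), mse_lin
    return knee_params, mse_knee
```

This keeps the user-facing output in one (offset, knee, exponent) format, reporting knee = 0 whenever the linear model wins the comparison.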

rdgao (Contributor) commented Dec 8, 2017

Did you want the knee fitting to be relatively not-stupid for the release? If so, I'll try to push something by tomorrow, at the very least implementing the linear vs. knee comparison for with-knee fits.

TomDonoghue (Member, Author) commented:

I'm not sure what you mean by 'implementing linear vs knee comparison' but at this point, I think no algorithmic / API changes for a v0.1.0 tag / soft release (we can note knee fitting is still experimental, with caveats as mentioned above).

Figuring out what's best to do for the knee fitting is probably best served by properly running through simulations and then exploring some options, which is all development over and above tagging a first version; after tagging, I'll focus more on the simulations. From there, if knee-related updates are relatively minor, we can add them to v0.1.1, along with any other small updates, that being the tagged version I foresee publicizing. (If the updates are bigger, maybe supporting proper knee-fitting becomes a v0.2.0 thing, and then we figure out what goes into the paper, etc., trying to be careful about scope creep and eternal beta.)

rdgao (Contributor) commented Dec 8, 2017

I mean that when the user requests a knee fit, we internally run a linear fit beforehand and return the better model (with knee = 0 if linear is better).

TomDonoghue (Member, Author) commented:

For v0.1.0, we'll note that 'knee' fitting is still somewhat experimental, and in particular, that you should only fit a knee when you have high confidence there is one. A fuller investigation / update of this is pushed to v0.2.

TomDonoghue (Member, Author) commented:

Okay, so I'm going to say this is more of a development question (concept / algorithm related more than code related), and also that some aspects of this thread are quite outdated.

Moving this over to the development board here:
fooof-tools/Development#7
