cross-validation with folds of size < 5 #10
Comments
It has been a while since I implemented the 'KernelKnnCV' R function, and I normally add exceptions to the code whenever I receive errors when testing the functions. I'm willing to make changes, but for which kind of dataset would you use fewer than 5 observations per fold with these custom internal functions? If you intend to use leave-one-out cross-validation, my suggestion would be to use a resampling R package and write a few lines of code that call the 'KernelKnn' function. That way I think you will be in a better position to debug potential errors. For new functionality, a pull request is welcome.
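As a rough illustration of the "few lines of code" approach suggested above, here is a minimal base-R sketch of a manual K-fold loop. The names `manual_cv`, `fit_predict` and `mean_model` are hypothetical and not part of the KernelKnn package; the placeholder model simply predicts the training mean so that the sketch runs without any extra packages.

```r
# Manual K-fold loop in base R. `fit_predict` is a hypothetical callback:
# function(train_x, train_y, test_x) returning one prediction per test row.
manual_cv <- function(x, y, folds, fit_predict, seed = 1) {
  n <- nrow(x)
  stopifnot(length(y) == n, folds >= 2, folds <= n)
  set.seed(seed)
  # Random fold labels; folds == n gives leave-one-out.
  fold_id <- sample(rep(seq_len(folds), length.out = n))
  preds <- numeric(n)
  for (f in seq_len(folds)) {
    test_idx <- which(fold_id == f)
    preds[test_idx] <- fit_predict(
      x[-test_idx, , drop = FALSE],
      y[-test_idx],
      x[test_idx, , drop = FALSE]
    )
  }
  preds
}

# Placeholder model (assumption for illustration): predict the training mean.
mean_model <- function(train_x, train_y, test_x) {
  rep(mean(train_y), nrow(test_x))
}

x <- as.matrix(mtcars[, c("wt", "hp")])
y <- mtcars$mpg

# Leave-one-out: one fold per observation.
loo_preds <- manual_cv(x, y, folds = nrow(x), fit_predict = mean_model)
```

To plug in the package, the callback could wrap a call such as `KernelKnn::KernelKnn(train_x, TEST_data = test_x, y = train_y, k = 3, regression = TRUE)`, assuming those arguments behave as described in the package documentation.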
Thanks for your reply. I am using a bunch of different data sets. I am not specifically interested in leave-one-out per se (I avoid it if I can) but with large
My suggestion would be to only trigger an error if any of the folds is empty. In fact, the easiest would be to allow any value of
As for your suggestion to use a resampling package, I am not sure I understand. I am not specifically interested in leave-one-out per se, but I want to be able to use large
This is what I am actually doing:
In case of a PR, it would be nice to add a test case that uses one of the existing datasets of the KernelKnn R package, both to test the modified code and to serve as a reference for future users of the package.
Perfect. I'll try to do it and add a test. I would leave a stop condition if folds < nrow(X). (Not immediately, though, since I am currently swamped with ... the analysis of some data 😅.)
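For reference, the relaxed check discussed in this thread (fail only when a fold would be empty) might look roughly like the sketch below. `check_folds` is a hypothetical helper, not code from the KernelKnn package, and it reads the proposed stop condition as rejecting fold counts larger than the number of rows, which is an interpretation rather than something the thread confirms.

```r
# Hypothetical helper (not part of KernelKnn): accept any fold count that
# leaves no fold empty, instead of requiring at least 5 observations per fold.
check_folds <- function(n_obs, folds) {
  if (folds < 2) {
    stop("'folds' must be at least 2")
  }
  if (folds > n_obs) {
    stop("'folds' exceeds the number of observations, so some folds would be empty")
  }
  # Return the smallest fold size under an even split, invisibly.
  invisible(n_obs %/% folds)
}

check_folds(20, 10)  # folds of size 2: accepted
check_folds(20, 20)  # leave-one-out: accepted

# More folds than observations is still rejected.
res <- tryCatch(check_folds(20, 21), error = function(e) "rejected")
```

With a check of this shape, leave-one-out and any other small-fold split pass validation, while a fold count larger than the data set still stops early.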
I'll close the issue for now. Feel free to open a pull request to adjust the code.
This is not a bug per se, but I think an unnecessary limitation.
If we try to run KernelKnnCV with a data set and a number of folds such that the folds have fewer than 5 observations, we get an error. This is in current line 64 of kernelknnCV.R:
KernelKnn/R/kernelknnCV.R
Line 64 in b892865
What is the rationale? This precludes leave-one-out crossvalidation, but also any cross-validation with, well, folds of size less than 5.