fix type coercion in bmerge #6603
Generated via commit f5fb825 Download link for the artifact containing the test results: ↓ atime-results.zip
|
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6603   +/-   ##
=======================================
  Coverage   98.60%   98.61%
=======================================
  Files          79       79
  Lines       14518    14536    +18
=======================================
+ Hits        14316    14334    +18
  Misses        202      202
|
@MichaelChirico is it worth making the (length(unique(icols))!=length(icols) || length(unique(xcols))!=length(xcols)) ensuring that either |
Sorry, I just saw that this is a CRAN requirement, checking now. |
Definitely thanks for triaging a fix here. I think we both agree it's pretty hack-y & ideally not needed. I'm not sure I fully understand the bug yet, but as stated in OP we're doing IMO |
Unfortunately this "contract" does not hold. I have a Windows dev version installed here and get the following:
> typeof(.Date(0L))
[1] "integer"
> typeof(as.Date(.Date(0L)))
[1] "integer" |
Thanks, also confirmed that on 4.4.1 |
This is strange as Kurt mentioned there wasn't an intention to do this but I figure it would have been fixed by now if that were the case. |
Update: this fix can now convert in one direction, from integer to double.
# this works
x = data.table(a=1L)
y = data.table(c=1L, d=2)
y[x, on=.(c==a, d==a)]
y[x, on=.(d==a, c==a)]
# this still needs to be fixed
x = data.table(a=1)
y = data.table(c=1, d=2L)
y[x, on=.(c==a, d==a)]
y[x, on=.(d==a, c==a)] |
Should I consider this PR in-progress for now? The diff has grown enough that it would help to add a brief overview of the changes to the PR description to orient reading |
Just received the "deadline" on this from Kurt:
|
Awesome!
* fix type coercion in bmerge
* fix bracket
* add test cases
* fix lint
* fix old test case
* rename x/i class
* add minimal test
* indent loop
* add fix in one direction
* remove indent to cater for diff
* Revert "remove indent to cater for diff"
  This reverts commit 562a9fd.
* remove indent
* add 2nd case
* remove trailing ws
* update all cases
* fix typo
* fix test cases
* update testcases
* update copying attributes from int to dbl
* start modularize
* fix cases
* ensure same types for test
* add test for codecov
* simplify
* fix test on windows
* simplify
* add coerce function
* modularize more
* Use gettext() on character strings directly
* rename getClass helper: mergeType
* rename: {i,x}c --> {i,x}col
  I found myself wondering `ic`... "`i` character? `i` class?". Simpler to encode more info in the name
* comment ref. issue
* exchange subset with .shallow
* undo test
* Revert "undo test"
  This reverts commit c9d3d74.
* update tests
* add comment
* add non right join testcase
* move helper outside bmerge
* update comment
* add NEWS
* update numbering
* tweak NEWS
---------
Co-authored-by: Michael Chirico <[email protected]>
Closes #6602
Base R does not encounter this problem, since one join column cannot appear in multiple join conditions.
What we do:
In bmerge we check whether the types of the columns used to merge x and i in x[i] are compatible. This step also performs some type coercion, for example when the columns involved are integer and double. If a single column of i appears in multiple conditions with columns of x, all of the columns to be joined need to be coerced. This is why we introduce the extra check length(ic_idx)>1L and then iterate over the corresponding columns of x.
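For orientation, here is a rough base R sketch of the idea described above. It is not the code in bmerge: the helper name coerce_join_cols and its arguments are made up for illustration, it only handles the integer/double case, and it always promotes to double, which is a simplification of what this PR does.

```r
# Hypothetical helper for illustration only; not data.table's internal code.
# x, i: data.frames; xcols, icols: parallel character vectors of join columns,
# so that condition k compares x[[xcols[k]]] with i[[icols[k]]].
coerce_join_cols = function(x, i, xcols, icols) {
  for (ic in unique(icols)) {
    # positions of every join condition that uses this column of i
    ic_idx = which(icols == ic)
    xc = xcols[ic_idx]
    types = c(typeof(i[[ic]]), vapply(x[xc], typeof, character(1L)))
    # this sketch only deals with integer/double mismatches
    if (!all(types %in% c("integer", "double"))) next
    if (length(unique(types)) > 1L) {
      # mixed types: promote the i column and all matching x columns to double
      # (as.double drops attributes; a real implementation has to keep them)
      i[[ic]] = as.double(i[[ic]])
      for (col in xc) x[[col]] = as.double(x[[col]])
    }
  }
  list(x = x, i = i)
}

# Same shape as the y[x, on=.(c==a, d==a)] example from the discussion:
# one column of i (a) is compared against an integer and a double column of x.
x = data.frame(c = 1L, d = 2)
i = data.frame(a = 1L)
res = coerce_join_cols(x, i, xcols = c("c", "d"), icols = c("a", "a"))
print(vapply(res$x, typeof, character(1L)))
print(typeof(res$i$a))
```

The point of looping over unique(icols) rather than over the conditions one by one is that the decision for a column of i has to take all of its partner columns in x into account at once; otherwise the result depends on the order of the conditions in on=.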