aggregation error with SUM() #953

Open
edward-burn opened this issue Jan 6, 2025 · 1 comment

Comments

@edward-burn
Contributor

It seems the dbplyr code below doesn't work with duckdb (unlike with other DBMSs) because duckdb doesn't automatically cast the boolean result of the comparison before summing it. I'm wondering whether duckdb-r could handle this internally so that this typical dplyr code works out of the box.

library(DBI)
#> Warning: package 'DBI' was built under R version 4.4.1
library(dplyr, warn.conflicts = FALSE)
db <- dbConnect(duckdb::duckdb())
DBI::dbWriteTable(db, "test", data.frame(x = c(1L, 2L, NA)), overwrite = TRUE)
# does not work
tbl(db, "test") |> 
  summarise(sum_is_1 = sum(x == 1L, na.rm = TRUE)) |> 
  dplyr::show_query()
#> <SQL>
#> SELECT SUM(x = 1) AS sum_is_1
#> FROM test
tbl(db, "test") |> 
  summarise(sum_is_1 = sum(x == 1L, na.rm = TRUE))
#> Error in `collect()`:
#> ! Failed to collect lazy table.
#> Caused by error in `dbSendQuery()`:
#> ! rapi_prepare: Failed to prepare query SELECT SUM(x = 1) AS sum_is_1
#> FROM test
#> LIMIT 11
#> Error: Binder Error: No function matches the given name and argument types 'sum(BOOLEAN)'. You might need to add explicit type casts.
#>  Candidate functions:
#>  sum(DECIMAL) -> DECIMAL
#>  sum(SMALLINT) -> HUGEINT
#>  sum(INTEGER) -> HUGEINT
#>  sum(BIGINT) -> HUGEINT
#>  sum(HUGEINT) -> HUGEINT
#>  sum(DOUBLE) -> DOUBLE
#> 
#> LINE 1: SELECT SUM(x = 1) AS sum_is_1
#>                ^
# works
tbl(db, "test") |> 
  summarise(sum_is_1 = sum(as.integer(x == 1L), na.rm = TRUE)) |> 
  dplyr::show_query()
#> <SQL>
#> SELECT SUM(CAST(x = 1 AS INTEGER)) AS sum_is_1
#> FROM test
tbl(db, "test") |> 
  summarise(sum_is_1 = sum(as.integer(x == 1L), na.rm = TRUE))
#> # Source:   SQL [?? x 1]
#> # Database: DuckDB v1.1.3 [eburn@Windows 10 x64:R 4.4.0/:memory:]
#>   sum_is_1
#>      <dbl>
#> 1        1

Created on 2025-01-06 with reprex v2.1.0

@toppyy
Contributor

toppyy commented Jan 8, 2025

SUM(BOOL) was added to DuckDB with this PR, and the commit seems to be included in duckdb-r: #763. Your example works with the latest version built from source, so I suppose the issue will be resolved with the next release.
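
Until that release is out, another possible workaround besides `as.integer()` is to map the comparison to explicit 0/1 values, which dbplyr should translate to a `CASE WHEN` expression so that `SUM()` receives an integer rather than a BOOLEAN. This is only a minimal sketch reusing the `test` table from the report; the `if_else()` variant is an assumption and was not proposed or tested in this thread.

```r
library(DBI)
library(dplyr, warn.conflicts = FALSE)

db <- dbConnect(duckdb::duckdb())
dbWriteTable(db, "test", data.frame(x = c(1L, 2L, NA)), overwrite = TRUE)

# Assumption: if_else() is translated to CASE WHEN, so SUM() sees integers
# instead of a BOOLEAN. Rows where x is NA yield NULL and are ignored by SUM().
res <- tbl(db, "test") |>
  summarise(sum_is_1 = sum(if_else(x == 1L, 1L, 0L), na.rm = TRUE)) |>
  collect()

res$sum_is_1

dbDisconnect(db, shutdown = TRUE)
```

The `as.integer()` cast from the original report is shorter; the `if_else()` form may be preferable if the same code also has to run on backends that do not allow casting a boolean to an integer.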
