Add support for lambda/higher order functions #14205

Open
gstvg opened this issue Jan 20, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@gstvg
Contributor

gstvg commented Jan 20, 2025

Is your feature request related to a problem or challenge?

Some engines, such as DuckDB and ClickHouse, support lambda functions, for example:

SELECT list_filter(numbers, x -> x > 40) as greater_than_40s FROM relation

The syntax is already supported in sqlparser-rs.

From #12206 (comment) by @jayzhan211
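To make the intended semantics of the `list_filter` query above concrete, here is a minimal sketch in plain Rust, with a closure standing in for the SQL lambda. The function below is a hypothetical stand-in written for illustration, not an existing DataFusion function, and it assumes lists of `i64`.

```rust
/// Hypothetical stand-in for `list_filter`: for every row, keep only the list
/// elements for which the lambda (here a Rust closure) returns true.
fn list_filter<F>(rows: &[Vec<i64>], predicate: F) -> Vec<Vec<i64>>
where
    F: Fn(i64) -> bool,
{
    rows.iter()
        .map(|list| list.iter().copied().filter(|&x| predicate(x)).collect())
        .collect()
}
```

Calling `list_filter(&rows, |x| x > 40)` corresponds to `list_filter(numbers, x -> x > 40)` evaluated over a `numbers` column.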

Describe the solution you'd like

One of:

  1. Add Expr::Lambda{arg_names: Vec<String>, expr: Expr} variant
  2. Change ScalarFunction.args :
struct ScalarFunction {
    // each argument is either a regular expression or a lambda
    args: Vec<ScalarFunctionArgument>,
    // ... omitted
}

enum ScalarFunctionArgument {
    Expr(Expr),
    Lambda { arg_names: Vec<String>, body: Expr },
}
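As a rough illustration of the first variant, the sketch below models a lambda as its own expression variant and shows how a function could bind the lambda parameter while evaluating the body. The `Expr` enum here is a toy written for this example, not DataFusion's `Expr`, and `eval`, `list_filter`, and `greater_than_40` are hypothetical helpers. Note that the body is boxed, since a recursive enum needs indirection.

```rust
use std::collections::HashMap;

/// Toy logical expression (NOT DataFusion's `Expr`), just large enough to show the idea.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
    /// The lambda is an expression variant of its own.
    Lambda { arg_names: Vec<String>, body: Box<Expr> },
}

/// Evaluate a toy expression against name bindings; comparisons yield 1 or 0.
fn eval(expr: &Expr, bindings: &HashMap<String, i64>) -> Result<i64, String> {
    match expr {
        Expr::Column(name) => bindings
            .get(name)
            .copied()
            .ok_or_else(|| format!("unbound name: {name}")),
        Expr::Literal(value) => Ok(*value),
        Expr::Gt(left, right) => Ok((eval(left, bindings)? > eval(right, bindings)?) as i64),
        // A lambda has no value by itself; only the enclosing function can apply it.
        Expr::Lambda { .. } => Err("a lambda cannot be evaluated on its own".to_string()),
    }
}

/// Apply a one-parameter lambda to each element and keep those where the body is non-zero.
fn list_filter(list: &[i64], lambda: &Expr) -> Result<Vec<i64>, String> {
    let Expr::Lambda { arg_names, body } = lambda else {
        return Err("expected a lambda argument".to_string());
    };
    let [arg] = arg_names.as_slice() else {
        return Err("expected exactly one lambda parameter".to_string());
    };
    let mut kept = Vec::new();
    for &item in list {
        // Bind the lambda parameter to the current element, then evaluate the body.
        let bindings = HashMap::from([(arg.clone(), item)]);
        if eval(body, &bindings)? != 0 {
            kept.push(item);
        }
    }
    Ok(kept)
}

/// Builds the toy equivalent of `x -> x > 40`.
fn greater_than_40() -> Expr {
    Expr::Lambda {
        arg_names: vec!["x".to_string()],
        body: Box::new(Expr::Gt(
            Box::new(Expr::Column("x".to_string())),
            Box::new(Expr::Literal(40)),
        )),
    }
}
```

With the second variant, the `Lambda` case would live in `ScalarFunctionArgument` instead of `Expr`, so a lambda could only appear as a function argument and the "cannot be evaluated on its own" branch would be unnecessary.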

Create a LambdaPhysicalExpr that holds the lambda's physical expression and, from PhysicalExpr::evaluate, returns either ScalarValue::Null or an error.
Open question: define a way for a ScalarUDFImpl to declare the argument types of every lambda it receives, so that an input schema can be built and used to generate the LambdaPhysicalExpr.
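One possible shape for the open question, sketched under heavy assumptions: a function declares, from the types of its regular arguments, the parameter types of each lambda it accepts, and those types are paired with the lambda's parameter names to form the schema the body is planned against. `LambdaSignature`, `lambda_parameter_types`, `lambda_input_schema`, and the toy `DataType` and `Field` types are all hypothetical names invented for this example; none of them exist in DataFusion.

```rust
/// Toy type system (NOT Arrow's `DataType`).
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int64,
    List(Box<DataType>),
}

/// Toy schema field (NOT Arrow's `Field`).
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
}

/// Hypothetical hook: given the types of the non-lambda arguments, return the
/// parameter types of each lambda argument, in order.
trait LambdaSignature {
    fn lambda_parameter_types(
        &self,
        value_arg_types: &[DataType],
    ) -> Result<Vec<Vec<DataType>>, String>;
}

struct ListFilter;

impl LambdaSignature for ListFilter {
    fn lambda_parameter_types(
        &self,
        value_arg_types: &[DataType],
    ) -> Result<Vec<Vec<DataType>>, String> {
        match value_arg_types {
            // One lambda with one parameter whose type is the list's element type.
            [DataType::List(element)] => Ok(vec![vec![(**element).clone()]]),
            other => Err(format!("list_filter expects one list argument, got {other:?}")),
        }
    }
}

/// Pair the lambda's parameter names with the declared types to build the
/// input schema for the lambda body.
fn lambda_input_schema(
    arg_names: &[String],
    param_types: &[DataType],
) -> Result<Vec<Field>, String> {
    if arg_names.len() != param_types.len() {
        return Err(format!(
            "lambda declares {} parameter(s) but the function expects {}",
            arg_names.len(),
            param_types.len()
        ));
    }
    Ok(arg_names
        .iter()
        .zip(param_types)
        .map(|(name, data_type)| Field {
            name: name.clone(),
            data_type: data_type.clone(),
        })
        .collect())
}
```

For `list_filter(numbers, x -> x > 40)` with `numbers` typed as a list of `Int64`, this would yield a one-field schema `x: Int64`.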

And then, one of:

  1. Make ScalarFunctionArgs generic over the argument type, with ColumnarValue as the default, and add a new ScalarUDF method, invoke_with_lambda_args or invoke_higher_order_with_args, that receives ScalarFunctionArgs<ColumnarValueOrLambda> instead of ScalarFunctionArgs. Its default implementation calls invoke_with_args and returns an error if any argument is a lambda. A lambda argument is created whenever a child PhysicalExpr is a LambdaPhysicalExpr:
enum ColumnarValueOrLambda<'a> {
    Value(ColumnarValue),
    // a borrowed trait object needs an explicit lifetime
    Lambda(&'a dyn PhysicalExpr),
}

struct ScalarFunctionArgs<T = ColumnarValue> {
    args: Vec<T>,
    // ... omitted
}
  2. Add physical_exprs: Vec<Arc<dyn PhysicalExpr>> to ScalarFunctionArgs, so that lambda expressions can be extracted from it
  3. Add a LambdaScalarUDF trait (duplicates ScalarUDF, more work and more code to maintain, less flexible than 1, and one more trait to document and for users to reason about)
  4. Change ScalarFunctionArgs.args to Vec<ColumnarValueOrLambda> instead of Vec<ColumnarValue> (a lot of breakage, including in the public API)
  5. Add a Lambda variant to ColumnarValue (even more breakage than the previous option, and a lambda doesn't quite fit the concept of a columnar value)
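A minimal sketch of how the first option in this list might fit together, using toy stand-ins rather than DataFusion's real types: `ScalarFunctionArgs` gets a defaulted type parameter, and the new method has a default implementation that rejects lambdas and otherwise forwards to `invoke_with_args`. The simplified `ColumnarValue`, `PhysicalExpr`, and `ScalarUdf` definitions, as well as `GreaterThan`, `Sum`, and `ListFilter`, are assumptions made for this example only.

```rust
/// Toy stand-in for DataFusion's `ColumnarValue`.
#[derive(Debug, Clone, PartialEq)]
enum ColumnarValue {
    Array(Vec<i64>),
    Scalar(i64),
}

/// Toy stand-in for `PhysicalExpr`: evaluates over a slice of values.
trait PhysicalExpr {
    fn evaluate(&self, input: &[i64]) -> Result<ColumnarValue, String>;
}

enum ColumnarValueOrLambda<'a> {
    Value(ColumnarValue),
    Lambda(&'a dyn PhysicalExpr),
}

/// Generic over the argument type; existing code that writes
/// `ScalarFunctionArgs` keeps meaning `ScalarFunctionArgs<ColumnarValue>`.
struct ScalarFunctionArgs<T = ColumnarValue> {
    args: Vec<T>,
}

trait ScalarUdf {
    fn invoke_with_args(&self, args: ScalarFunctionArgs) -> Result<ColumnarValue, String>;

    /// Default: reject lambdas, otherwise forward to `invoke_with_args`.
    fn invoke_with_lambda_args(
        &self,
        args: ScalarFunctionArgs<ColumnarValueOrLambda<'_>>,
    ) -> Result<ColumnarValue, String> {
        let mut values = Vec::with_capacity(args.args.len());
        for arg in args.args {
            match arg {
                ColumnarValueOrLambda::Value(value) => values.push(value),
                ColumnarValueOrLambda::Lambda(_) => {
                    return Err("this function does not accept lambda arguments".to_string())
                }
            }
        }
        self.invoke_with_args(ScalarFunctionArgs { args: values })
    }
}

/// Toy lambda body `x -> x > threshold`, evaluated to a 0/1 mask.
struct GreaterThan {
    threshold: i64,
}

impl PhysicalExpr for GreaterThan {
    fn evaluate(&self, input: &[i64]) -> Result<ColumnarValue, String> {
        let mask = input.iter().map(|&x| (x > self.threshold) as i64).collect();
        Ok(ColumnarValue::Array(mask))
    }
}

/// An ordinary function: implements only `invoke_with_args`.
struct Sum;

impl ScalarUdf for Sum {
    fn invoke_with_args(&self, args: ScalarFunctionArgs) -> Result<ColumnarValue, String> {
        match args.args.as_slice() {
            [ColumnarValue::Array(values)] => Ok(ColumnarValue::Scalar(values.iter().sum())),
            _ => Err("sum expects one array argument".to_string()),
        }
    }
}

/// A higher-order function: overrides the lambda-aware method.
struct ListFilter;

impl ScalarUdf for ListFilter {
    fn invoke_with_args(&self, _args: ScalarFunctionArgs) -> Result<ColumnarValue, String> {
        Err("list_filter requires a lambda argument".to_string())
    }

    fn invoke_with_lambda_args(
        &self,
        args: ScalarFunctionArgs<ColumnarValueOrLambda<'_>>,
    ) -> Result<ColumnarValue, String> {
        match args.args.as_slice() {
            [ColumnarValueOrLambda::Value(ColumnarValue::Array(values)), ColumnarValueOrLambda::Lambda(lambda)] => {
                let ColumnarValue::Array(mask) = lambda.evaluate(values)? else {
                    return Err("lambda must return an array".to_string());
                };
                let kept = values
                    .iter()
                    .zip(mask.iter())
                    .filter(|(_, keep)| **keep != 0)
                    .map(|(value, _)| *value)
                    .collect();
                Ok(ColumnarValue::Array(kept))
            }
            _ => Err("list_filter expects (array, lambda)".to_string()),
        }
    }
}
```

In this sketch existing functions such as `Sum` need no changes, which is the main appeal of the option: only functions that actually take lambdas override the new method.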

Describe alternatives you've considered

No response

Additional context

No response
