Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add expression array_size #1122

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

Groennbeck
Copy link

Which issue does this PR close?

Closes #.

Rationale for this change

To add support for expression: array_size

What changes are included in this PR?

How are these changes tested?

Tested by adding spark plan that calculated array_size

}
}
let sizes_array = Int32Array::from(builder.finish());
Ok(ColumnarValue::Array(Arc::new(sizes_array)))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a more efficient way to do this?

@viirya viirya changed the title Fix: add expression array_size feat: add expression array_size Nov 26, 2024
@SemyonSinchenko
Copy link
Member

SemyonSinchenko commented Nov 27, 2024

Why not to use an array_length from the Datafusion instead?

array_length(array, dimension)

Returns the length of the array dimension. array_length([1, 2, 3, 4, 5]) -> 5

It looks like it does the same.

It may be done in a similar way, like for the array_append:
#1072

@Groennbeck
Copy link
Author

Why not to use an array_length from the Datafusion instead?

array_length(array, dimension)

Returns the length of the array dimension. array_length([1, 2, 3, 4, 5]) -> 5

It looks like it does the same.

It may be done in a similar way, like for the array_append: #1072

Good input! will look into it!
Thank you

@viirya
Copy link
Member

viirya commented Nov 27, 2024

Note that DataFusion array_length's return type is UInt64 but Spark's array_size is Int32.

@@ -2220,6 +2220,16 @@ object QueryPlanSerde extends Logging with ShimQueryPlanSerde with CometExprShim
expr.children(2))
None
}
case Size(child, _) if child.dataType.isInstanceOf[ArrayType] =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This either needs to handle when legacySizeOfNull is true or pattern match to only false

@Groennbeck
Copy link
Author

apache/datafusion#13600

Have to wait for this to get into the next version

@andygrove
Copy link
Member

apache/datafusion#13600

Have to wait for this to get into the next version

Thanks @Groennbeck. The next DataFusion release will likely be released in the next couple of weeks.

@andygrove
Copy link
Member

@Groennbeck DataFusion 44.0.0 is now released

@Groennbeck
Copy link
Author

@Groennbeck DataFusion 44.0.0 is now released

Nice! will update this later today.

@andygrove
Copy link
Member

Hi @Groennbeck are you still planning on update this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants