-
Notifications
You must be signed in to change notification settings - Fork 83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Registering User Defined Functions in SparkSQL #125
Comments
I spotted my mistake. In the demo project in the
however it should be
In the last line I should be creating an anonymous Is this something that I could contribute to the Flambo project? |
@gareth625 hello! great work, thank you! do you have any examples with udaf (defined aggregate function)? I want to implement group by function over |
@NonaryR unfortunately I haven't. I've just had a look and I think it's a slightly different problem as the UDAF isn't an interface it's an abstract class. My Java and Clojure/Java interop isn't strong enough to know how much of a difference this makes. I'd be interesting to hear what you find out. I have some ideas but can't test them at the moment and don't want to suggested deadends as I don't really have enough experience to know if they're pointless. |
@gareth625 Good catch--yes, to send the function, it needs to be serializable. And yes, your udf macro looks helpful, and it'd be nice to have it in Note that you've duplicated UDF16 a couple times, but I imagine you've caught that by now. :) You might even use a macro to generate numbers 1-16. Thanks for the proposal, and we look forward to a PR if you're willing. |
#131 add udf1-udf3, sql test |
Hi @jiyouyou125 Thanks for that. I'd done a bit of work last week but was having trouble in my repl. I've created a PR against your repo with the helper macros to generate all the possible UDF helpers. I think there's a bit more tidying up that can be done with regard to registering the UDFs but I'm away now until next week. Sorry for the slow response. Thanks |
Hi
I asked this question over on the Flambo google group a few days ago however I'm not sure how active it is. Sorry if I'm just being impatient, I have made some progress since then and have a proper example project to go with my question this time.
I've been attempting to register a User Defined Function in SparkSQL and have made some progress by analogy with how Flambo has implemented the map, filter, etc. interface in
flambo.api
using theflambo.function
namespace. I've produced an analoguefunction
namespace to implement theorg.apache.spark.sql.api.java.UDF1
and.UDF2
,.UDF3
, etc. classes which can then be called in Clojure. This has worked locally with no problem but now I'm submitting to EMR I've getting ajava.lang.IllegalStateException: Attempting to call unbound fn: #'my-spark-sql.core/the-work-horse
.I've attached a sample project, my-spark-sql.zip, which you can call with
lein run local
and that should work. It does for me :) However I haven't managed to get it to run on EMR. I haven't tried usingspark-submit
locally yet.Does anyone have any thoughts on what might be wrong here?
The text was updated successfully, but these errors were encountered: