[NodeJS]: readIPC from buffer fails with 'Arrow file does not contain correct header', while it works in ArrowJS #109
Comments
The IPC readers are implemented upstream. Could you open this issue here? https://github.com/jorgecarleitao/arrow2
I am a bit surprised about
Ah, Polars doesn't have that distinction, no. So the Then we must add this.
Hi! I'm keen to get this into polars: Snowflake uses this for its response format, and it would be awesome to read data straight from Snowflake into Polars. Here is a quick primer on the streaming files from Arrow: https://arrow.apache.org/docs/python/ipc.html IMHO, supporting files initially is fine; other streaming support can come later. I've started looking into this, and the major blocker I can see is projections. In arrow2, projections are not supported here: https://github.com/jorgecarleitao/arrow2/blob/main/src/io/ipc/read/stream.rs#L185 So we will need to build the projection from the chunks. Thoughts?
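To make "build the projection from the chunks" concrete, here is a minimal sketch of the idea: read every chunk in full, then keep only the requested column indices. The `Column` and `Chunk` shapes and both function names are hypothetical stand-ins for illustration; they are not arrow2 or polars types, and the real implementation would operate on Arrow arrays in Rust.

```typescript
// Hypothetical shapes for illustration only; not arrow2 or polars types.
type Column = unknown[];

interface Chunk {
  columns: Column[];
}

// Keep only the columns named by `projection`, in the order requested.
function projectChunk(chunk: Chunk, projection: number[]): Chunk {
  return {
    columns: projection.map((idx) => {
      const col = chunk.columns[idx];
      if (col === undefined) {
        throw new RangeError(`column index ${idx} is out of bounds`);
      }
      return col;
    }),
  };
}

// Apply the same projection to every chunk that was read from the stream.
function projectChunks(chunks: Chunk[], projection: number[]): Chunk[] {
  return chunks.map((chunk) => projectChunk(chunk, projection));
}
```

The trade-off of this approach is that unselected columns are still decoded before being dropped, so it gives the right result but not the I/O savings of a projection pushed into the reader.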
Transferring this to the NodeJS repo as I have no way to reproduce this using Python/Rust. Not sure if this is still relevant.
TL;DR: Solves #109. More or less, the IPC Stream methods are straight copies of the IPC File (Feather) ones, swapping out the IpcReader and IpcWriter for their streaming equivalents; the API should be identical to py-polars (with the exception of file-like objects as input for read_ipc and read_ipc_stream; there is not much point adding that until streaming IO is exposed upstream). I've left the docstrings basically untouched; let me know if you want those tweaked (the `@param`s appear to have drifted over time).
Using Node.JS
What version of polars are you using?
"nodejs-polars": "^0.2.0"
What operating system are you using polars on?
MacOS Big Sur 11.1
Describe your bug.
Reading in a buffer from an `.ipc` (ArrowStream) file using `readIPC` fails with `Error: Arrow file does not contain correct header`. At the same time, the file is not corrupt, since it can be loaded using apache-arrow's `Table.from` method.
What are the steps to reproduce the behavior?
See code example below. I'll post both the .arrow file (works) and .ipc file (doesn't work) as attachment
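For context on why the two attachments can behave differently: the Arrow IPC file format begins (and ends) with the ASCII magic `ARROW1`, while the IPC streaming format has no magic and, since Arrow 0.15, begins each message with the continuation marker `0xFFFFFFFF`. A file-format reader that checks for the magic would presumably reject a stream buffer with an error like the one above. The sketch below is not part of the original report; `sniffArrowIpc` and `ArrowIpcKind` are hypothetical names, and it only inspects the leading bytes rather than validating the whole buffer.

```typescript
// Hypothetical helper for illustration; not part of nodejs-polars or apache-arrow.
type ArrowIpcKind = "file" | "stream" | "unknown";

// ASCII bytes of "ARROW1", the magic that opens an Arrow IPC file.
const ARROW_MAGIC = [0x41, 0x52, 0x52, 0x4f, 0x57, 0x31];

function sniffArrowIpc(buf: Uint8Array): ArrowIpcKind {
  const startsWithMagic =
    buf.length >= ARROW_MAGIC.length &&
    ARROW_MAGIC.every((byte, i) => buf[i] === byte);
  if (startsWithMagic) return "file";

  // Streams written by Arrow >= 0.15 start with the 4-byte continuation marker.
  // Older streams start directly with a length prefix and are reported as "unknown".
  const startsWithContinuation =
    buf.length >= 4 &&
    buf[0] === 0xff &&
    buf[1] === 0xff &&
    buf[2] === 0xff &&
    buf[3] === 0xff;
  if (startsWithContinuation) return "stream";

  return "unknown";
}
```

Running this on the buffer before calling `readIPC` would show whether the `.ipc` attachment is stream-framed, which is the case a file-only reader cannot handle.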