Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Rewrite of native API using rust and N-API #507

Closed
daniel-brenot opened this issue Oct 31, 2021 · 60 comments
Closed

Proposal: Rewrite of native API using rust and N-API #507

daniel-brenot opened this issue Oct 31, 2021 · 60 comments

Comments

@daniel-brenot
Copy link
Contributor

Issue description

I am looking into rewriting the existing implementation of this native library in rust with N-API. I am working on a branch currently where I will publish the changes I am proposing, and I would love feedback on it once I publish the branch as WIP.

Reasons for the rewrite

There are multiple issues regarding cross platform support and Node version that would all be resolved in a rewrite in rust. It would also make solving issues such as the blocking semantics mentioned in #85 much easier.

How does the community feel about this?

If I do write this and make sure there are no errors, will this be used?

@jerch
Copy link
Collaborator

jerch commented Oct 31, 2021

I am a fan of a full rewrite of this lib, esp. to fix those nasty low level blocking bugs, and to expose somewhat more control to the outside (termios interface). Idk if Rust is a good choice for this, is that easy enough to integrate for CI stuff and such?

On a sidenote: I already fixed #85 for POSIX systems way back (in C++), but it was not merged, because it contained too many internal chances.

@daniel-brenot
Copy link
Contributor Author

Rust will definitely still integrate well with CI from my understanding. I will take a look at #85 as well to make sure that those changes would be integrated as a part of this.

From the rewrite that i've done thus far with no changes to how the API functions, the code is already quite a bit smaller and easier to understand.

I am hoping that this change will be welcomed by the community and help to advance this library moving forward.

@Tyriar
Copy link
Member

Tyriar commented Nov 1, 2021

Can't say for sure without seeing it, but Rust would probably be way better from my perspective and would force me to learn it more which I've wanted to do for a while anyway.

@daniel-brenot
Copy link
Contributor Author

After looking at #85 my first response would be that I am going to aim to do this work with the current working version on master, and then i'll add enhancements from #85 after if it is still decided that we would like that API added in. For the moment it is just easier since there is already a decent bit of code I am rewriting.

@Tyriar
Copy link
Member

Tyriar commented Nov 1, 2021

@daniel-brenot 👍 the smaller the change the easier it will be to merge

@jerch
Copy link
Collaborator

jerch commented Nov 1, 2021

@daniel-brenot I hope you can find a better way around the POLLHUP issue on linux, than my noisy poll loop. Should be possible with a dedicated polling thread injecting the data into the JS runtime. If you wonder why I re-routed the data packages via OS pipes - that was the only way I found working with theSocket interface on JS side. Maybe thats easier now with N-API primitives, idk.

@daniel-brenot
Copy link
Contributor Author

@jerch I think once the initial migration is done we can definitely look at what's available and make an issue to do this.

@jerch
Copy link
Collaborator

jerch commented Nov 1, 2021

Sure thing. Another issue I want to point out before you get your hands dirty into the wrong direction - #487 and its issues behind. It deals with a recent fork slowdown under BigSur, but got not yet merged. So be prepared for some forth and back changes on your side as well.

@daniel-brenot
Copy link
Contributor Author

Created a branch with the new changes. So far it's mostly just the one file and it's still incomplete, but it is taking shape.
https://github.com/daniel-brenot/node-pty/blob/rust-port/src/lib.rs

@Brooooooklyn
Copy link

With rust, you can implement Node.js library based on existed Rust crates, like https://crates.io/crates/portable-pty

@daniel-brenot
Copy link
Contributor Author

@Brooooooklyn Great suggestion!
I've looked at that library and it doesn't seem to provide enough functionality vs what we currently have to justify it in this use case.

Most of the code in the rewrite is due to the fork() function. The rest is relatively trivial.

Thank you for the suggestion though :)

@daniel-brenot
Copy link
Contributor Author

@Tyriar How would you feel if the process method threw an error instead of returning undefined? It would align with the rest of the methods that throw errors when they fail. It's currently implemented as returning undefined in my version, but i'm wondering if that should be changed to allow for an error message when the process call fails.

@jerch
Copy link
Collaborator

jerch commented Nov 3, 2021

If you mean with "process method" the pty & process creation step - thats actually another open issue (see #265). Ideally the lib exposes enough error state to the caller to know why things went wrong.

@daniel-brenot
Copy link
Contributor Author

#46 Can also factor into this. Precompiling for multiple platforms will be a part of this as well

@corwin-of-amber
Copy link
Contributor

Hi 👋 I've taken the code from @daniel-brenot and got it to work using napi-rs 2.0.0.alpha4. Currently open and fork work and some of the tests even pass. There are issues with encoding and signals.

Only Unix for now... I have very little experience with Windows.

@corwin-of-amber
Copy link
Contributor

Here are the (preliminary) results of my effort.
https://github.com/corwin-of-amber/node-pty/tree/rust-port

Are ppl interested in this?

@daniel-brenot
Copy link
Contributor Author

That's great! I'm happy to see that you've gotten it working on unix. I definitely still intend to work on this, but work has been demanding as of late.

@corwin-of-amber
Copy link
Contributor

corwin-of-amber commented Nov 25, 2021

Oh good! Should I make a PR? Is this repo still active?

@daniel-brenot
Copy link
Contributor Author

If it's complete, absolutely. That being said, definitely ping me once you're done, i'd love to help on the work you're doing as well moving forward where i can

@corwin-of-amber
Copy link
Contributor

The Unix port is complete. Why don't I make a PR for your fork, then you can look at it and add Windows support when you have the time.

@daniel-brenot
Copy link
Contributor Author

Sounds good to me!

@Tyriar
Copy link
Member

Tyriar commented Nov 29, 2021

Because of the amount of changes a PR to move to rust would include, I'm thinking we can set up a branch with CI/CD so people can start selfhosting ASAP. That will also make merging easier as the it can be stabilized on that branch.

@Tyriar
Copy link
Member

Tyriar commented Dec 10, 2021

FYI a big refactor happened in #487

Also, it might be best to maintain a fork for now and consider merging into node-pty when it's more stable (or continue on as a hard fork). I'd hate to see progress stalled because of me being the bottleneck to merge things.

@daniel-brenot
Copy link
Contributor Author

Bumping this as a heads up, I am still working on this. I was on a bit of a hiatus, but i am currently working on adapting the program to use the modified build process for NAPI and doing some more touches on ensuring that everything is safe. Thanks a ton to @corwin-of-amber for his contributions to my branch, they helped a ton.

@daniel-brenot
Copy link
Contributor Author

As an update, i think i'm going to update the CI for my branch to run and validate the code against the tests so i can see if my branch is passing. Additionally, i would like to look at bundling in the .node files for multiple platforms as a part of the release process. It would mean that users wouldn't need to compile for their platform if a .node file is found. It's quite simple to implement so i'm trying is out on my branch to see how well it works.

@corwin-of-amber
Copy link
Contributor

Sounds wonderful, thanks @daniel-brenot!

@daniel-brenot
Copy link
Contributor Author

daniel-brenot commented Apr 6, 2022

Does anyone know the purpose of exposing the native module? I noticed this is done only on linux platforms and i'm curious as to why. Is this used by VSCode or other libraries? I need to know because some of the changes I made will break the exposed native module if I don't account for it, so i need to know wether I strictly need to maintain the current behaviour, or if a small change to it is acceptable.

Edit: Heres the file i'm refering to:
https://github.com/microsoft/node-pty/blob/main/src/index.ts

@daniel-brenot
Copy link
Contributor Author

@corwin-of-amber Should I add your name to the copyright notice as well for the files you assisted with? If so, should I add the name that is on your github profile? (Shachar Itzhaky)

@corwin-of-amber
Copy link
Contributor

corwin-of-amber commented Apr 10, 2022

Oh right! That is actually good progress, interfacing the C++ code in order to forward the calls to WinAPI was sooooo clunky. But perhaps we can still learn something from the old behavior.

@daniel-brenot
Copy link
Contributor Author

As a random aside, it seems like this project may now support thread safety. The way the pty handles are stored now works across threads using a mutex, so that might be one of the many issues this fixes.

@daniel-brenot
Copy link
Contributor Author

@corwin-of-amber The issue may be due to windows antivirus on my system. I was reading the main README file for this project and realized that is one of the things this project warns against. At the very least i'll see about adding error codes to the error messages so it's easier to debug what is going wrong.

@chrmarti
Copy link

@daniel-brenot Is this close to be ready for testing? Anything we can help with? (Working on a CLI where a prebuilt node-pty would make installation easier. I believe we need N-API to keep the number of prebuilds low.)

@daniel-brenot
Copy link
Contributor Author

@corwin-of-amber Are you willing to help out again? It compiles on windows currently and the tests all pass except 6. I've fixed a couple issues but i'm run a bit thin here. @chrmarti If you want to look as well feel free, any help is appreciated. I'm going to turn my effort towards the linux side for a bit and see how the tests are faring there.

@daniel-brenot
Copy link
Contributor Author

@jerch Feel free to look at this as well.

@daniel-brenot
Copy link
Contributor Author

daniel-brenot commented Nov 12, 2022

I've looked through the list of issues and it looks like completing this would solve at least these issues:

#17
#46
#112
#405
#564
#558
#557
#551
#518
#438

Just for reference of why i'm very entheuseastic about completing this issue.

@corwin-of-amber
Copy link
Contributor

@daniel-brenot thanks! I will look at it, when I have access to my Windows machine.

@lydell
Copy link

lydell commented Nov 12, 2022

Do you think there’s value in doing N-API first, and then Rust?

I tried out switching to N-API. The Unix version compiles. Running bash works. Running a command with arguments crashes – I have never written a line of C++ before so I don’t know why. But it still felt promising!

main...lydell:node-pty:napi

If we manage to do N-API, maybe we can then compile to WebAssembly and get a cross-platform prebuilt package! (In case winpty causes issues for wasm, I looked at removing it here: main...lydell:node-pty:remove-winpty)

A Rust rewrite could come later to fix other issues.

Not sure if any of this makes sense, but just in case! I’m excited about getting rid of node-gyp in one of my projects using node-pty.

@daniel-brenot
Copy link
Contributor Author

Doing them both at the same time makes sense to me. The work to convert to N-API is relatively small in the work of the rust port. It would be a large amount of effort to duplicate it in c++, only to have to do it again in rust. I think moving forward with finishing this would be a better solution.

As far as using webassembly, I don't believe that is possible, and even if it is, it wouldn't provide any benefit. The application needs to be built for each platform individually because it's specifically a bridge to native os-specific functions. A wasm build can't have the code for windows, mac and linux in it because they need to be linked on each system separately.

@daniel-brenot
Copy link
Contributor Author

Linux tests seem to all be passing. I see 16 tests passing, so as long as there aren't tests that should run but don't, then linux support should be good to go. I still have to look into macos, so i'll see about setting up a mac vm for that. Mac support should be straight forward since it's mostly the same code but with a couple code paths different.

@daniel-brenot
Copy link
Contributor Author

Just tested including this into eclipse theia. Looks like it works great now for being included into other projects and we just need help debugging the windows issue before this is ready to be squashed and made into a pr

@corwin-of-amber
Copy link
Contributor

sry, it took me some time to figure out Rust on Windows and how to install llvm (using choco). Now I am getting errors about using an experimental feature (insert_entry). Are you using a nightly build? This should perhaps be encoded in the Cargo.toml somehow so that users can know which version of Rust to use.

@daniel-brenot
Copy link
Contributor Author

sry, it took me some time to figure out Rust on Windows and how to install llvm (using choco). Now I am getting errors about using an experimental feature (insert_entry). Are you using a nightly build? This should perhaps be encoded in the Cargo.toml somehow so that users can know which version of Rust to use.

Yes, i do believe i use nightly. Sorry for not setting that in the cargo.toml, feel free to do that. Thanks for helping.

@corwin-of-amber
Copy link
Contributor

corwin-of-amber commented Dec 5, 2022

@daniel-brenot I think perhaps I'm just too clueless in Windows, and/or I'm doing something terribly wrong. First, the branch (daniel-brenot/node-pty) did not even build for me and I had to change a couple things (add #![feature(entry_insert)] for that unstable feature and also make some variable mut), but then conpty does not run at all and just says Failed to connect named pipes and winpty fails with Error launching WinPTY agent. I am pretty sure I was able to do at least something previously, but now I just get this error that's not very informative.

@corwin-of-amber
Copy link
Contributor

@daniel-brenot Am I even running the right branch? I am a bit confused because here:
https://github.com/daniel-brenot/node-pty/blob/cde234931684e5e09274a7d8541655f38f9cc180/native/src/win/conpty.rs#L203

This pointer is going to be NULL so ofc InitializeProcThreadAttributeList will fail. Compare

siEx.lpAttributeList = reinterpret_cast<PPROC_THREAD_ATTRIBUTE_LIST>(attrList);

😖

@daniel-brenot
Copy link
Contributor Author

You must not be on the correct branch then.
https://github.com/daniel-brenot/node-pty/blob/rust-port/native/src/win/conpty.rs

I don't know why that was the branch you were looking at.

@daniel-brenot
Copy link
Contributor Author

As for proc thread attribute list being null, if you check the docs that is how you get the size of the attributes list. You call the function twice. Once with null to get the size, and a second time with a list initialised with the size returned from the first call.

In the code you linked i was doing it wrong. In the correct branch I fixed that issue.

@corwin-of-amber
Copy link
Contributor

corwin-of-amber commented Dec 7, 2022

https://github.com/daniel-brenot/node-pty/blob/rust-port/native/src/win/conpty.rs

Well yeah that was the one I was looking at (the permalink might look weird) and the file in this link you just sent also has the same problem, is seems?

@daniel-brenot
Copy link
Contributor Author

It seems i have this fixed on my local windows but never pushed the change 🤦 . I'll push the update shortly, I have a meeting to get to but i may have time to push the update

@corwin-of-amber
Copy link
Contributor

Np! I was suspecting smt like that might have happened. I will check back :)

@daniel-brenot
Copy link
Contributor Author

Just pushed an update i think should help.

@dshook
Copy link

dshook commented Apr 15, 2024

Curious if this is still being worked on

@daniel-brenot
Copy link
Contributor Author

Curious if this is still being worked on

Yep. I've reached pretty much the end of the road on this sadly as the rewrite doesn't work quite correctly on windows, but i've been informed it works great on linux.

If anyone is interested in helping on the windows version it is appreciated. A lot of work went into my fork and it would make upstream development much easier i'm sure

@dshook
Copy link

dshook commented Jun 17, 2024

What are the known issues with the windows version? Is there test coverage for all of them?

I'm also curious if you noticed any performance differences spawning the pty's between the original and your branch. It would def be nice to speed up the spawn speed on windows

@daniel-brenot
Copy link
Contributor Author

What are the known issues with the windows version? Is there test coverage for all of them?

I'm also curious if you noticed any performance differences spawning the pty's between the original and your branch. It would def be nice to speed up the spawn speed on windows

I can't imagine what i've done has sped up the spawn speed. It uses more or less the same mechanism as the c++ version and at the moment it doesn't even work!

If I remember correctly, it just hangs when trying to create a terminal session at one point. That or it fails to create a pipe needed for the terminal, i'm unsure. It's been a while since i've worked on this, but i've recently gained a need to finish this so I will most likely be working on it again. Since last time i worked on it chatgpt has been invented, so it may be able to provide insight to the issue.

Yes, there are unit tests for the windows version. This is how I verify that it is not working correctly.

@daniel-brenot
Copy link
Contributor Author

I have come back to the project on windows and it seems that while it does work to create a process, it doesn't work at being connected and calling the necessary callbacks. I'll work to debug why this week as i'd like to have this finally done

@Tyriar
Copy link
Member

Tyriar commented Dec 17, 2024

I believe @deepak1556 migrated to N-API some time ago:

#include <napi.h>

#include <node_api.h>

Thanks for all the discussion/help/exploration, but we don't have plans to invest in a rust fork due to the cost/impact we would get out of it.

Link to @daniel-brenot's rust port wip: https://github.com/daniel-brenot/node-pty/tree/rust-port

@Tyriar Tyriar closed this as completed Dec 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

8 participants