Change fifo IO to pipe in shim, just like the Go shim. Resolved the raw fd case problem. #276

Open
jokemanfire opened this issue May 29, 2024 · 15 comments

Comments

@jokemanfire
Contributor

jokemanfire commented May 29, 2024

Related
I raised this question with containerd, but it looks like containerd will not change. So I will open a PR to change the FIFO to a pipe.
I have completed the code and will submit the PR after some CI testing.

@jokemanfire
Contributor Author

I found two more problems when the FIFO is used directly.

  1. With ctr run -d busybox:latest test, the container status goes directly to stopping, which does not happen with the Go shim.
  2. When the containerd service is stopped, all Rust shim (rshim) IO breaks, which does not happen with the Go shim.

Here is a way to reproduce this error.
1. Get an image
The Dockerfile looks like this:

FROM busybox:latest

COPY test.sh /

ENTRYPOINT ["sh","/test.sh"]

test.sh is as follows:

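# Print "hello" every 3 seconds. If echo fails (for example because
# stdout is broken), record the failure in /log.txt.
while true; do
    sleep 3
    echo "hello"
    result=$?
    if [ $result -ne 0 ]; then
        date >> /log.txt
        echo "echo failed. Result: $result" >> /log.txt
    fi
done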

Build this image with docker build.
Import the image with ctr.
2. Run a container
Use the Rust shim (rshim) to run a container from this image.
3. Observe the error
Stop the containerd service. The error message appears inside the container (test.sh writes it to /log.txt), whereas a container run by the Go shim is not affected.
So I think using a pipe in the shim is really necessary. I have tested that PR #278 resolves this problem.

Friendly ping @fuweid @mxpv @Burning1020. Looking forward to your reply.

@jokemanfire
Contributor Author

jokemanfire commented Sep 29, 2024

The pipe support in tokio 1.40 resolves the pipe problem perfectly. Friendly ping @fuweid @mxpv @Burning1020.

@jokemanfire changed the title from "Change fifo IO to pipe in shim, just like the Go shim" to "Change fifo IO to pipe in shim, just like the Go shim. Resolved the much io case problem." on Sep 29, 2024
@jokemanfire changed the title from "Change fifo IO to pipe in shim, just like the Go shim. Resolved the much io case problem." to "Change fifo IO to pipe in shim, just like the Go shim. Resolved the raw fd case problem." on Oct 10, 2024
@fuweid
Member

fuweid commented Oct 11, 2024

Hi @jokemanfire, would you please file a pull request to fix this? Thanks.

@jokemanfire
Contributor Author

@fuweid Please take a look at #278.

@zhaodiaoer
Contributor

tokio 1.40 pipe can resolve pipe problem perfect. friendly ping , @fuweid @mxpv @Burning1020

Hi @jokemanfire, can you give more detail about why "tokio 1.40 pipe can resolve pipe problem perfect"?

I have also encountered a problem similar to the one you found ("when containerd service is stop , all rshim io will broken, but not go shim."), and I found another problem: the stdout stream of a container process started by the Rust shim is not flushed in real time. Instead of arriving line by line, the output is flushed about one page at a time, with a long delay between flushes. I don't know whether this is related to using the FIFO directly as the process stdout.

I am following this issue; please post any updates. Thanks!
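
The thread does not establish what causes the page-at-a-time output. Purely to illustrate the symptom, here is a minimal Rust sketch using only the standard library. It is not code from the shim: CountingSink, block_buffered_writes, and line_buffered_writes are hypothetical names. It shows how a block-buffered writer hands data downstream in one large chunk, while a line-buffered writer hands it over one line at a time.

```rust
use std::io::{self, BufWriter, LineWriter, Write};

/// Hypothetical sink that records the size of every write it receives.
struct CountingSink {
    writes: Vec<usize>,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes.push(buf.len());
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Block buffering: lines are held back until the 4096-byte buffer
/// fills up or is flushed explicitly.
fn block_buffered_writes(lines: usize) -> io::Result<Vec<usize>> {
    let mut w = BufWriter::with_capacity(4096, CountingSink { writes: Vec::new() });
    for _ in 0..lines {
        w.write_all(b"hello\n")?;
    }
    w.flush()?;
    Ok(w.get_ref().writes.clone())
}

/// Line buffering: every completed line is passed downstream immediately.
fn line_buffered_writes(lines: usize) -> io::Result<Vec<usize>> {
    let mut w = LineWriter::new(CountingSink { writes: Vec::new() });
    for _ in 0..lines {
        w.write_all(b"hello\n")?;
    }
    w.flush()?;
    Ok(w.get_ref().writes.clone())
}
```

Calling both helpers with the same number of lines and comparing the recorded write sizes makes the difference visible; whether anything like this is involved in the delay reported above is an open question in this thread.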

@jokemanfire
Contributor Author

jokemanfire commented Nov 13, 2024

the stdout stream of container process which comes from rust-shim is not flush at real time, flush one page in one time then delay a long time, not line-by-line, I don't know if this related to that use FIFO as process stdout directly

I have not run into this problem. Is there a way to reproduce it? Using the FIFO directly causes some problems, which are explained in https://fuweid.com/post/2022-embedshim-kernel-is-my-sidecar/ (thanks @fuweid). It contains this description:
"
embedshim also handles standard input by relaying, but it hands a named pipe opened in read-write mode directly to the container's standard output, which reduces copying of standard output. The embedshim plugin is part of the containerd process, so once containerd restarts, the input side of the container process receives a SIGPIPE error. Personally, I think this is acceptable. In interactive mode, the user will notice that the container engine is out of service. Most production scenarios use headless, non-interactive mode, where the input side of the container process is /dev/null and the state of standard output is persisted by the named pipe, so the output side of the container does not get a SIGPIPE error when containerd is stopped.
"
I want to change the FIFO to a pipe because I think some of these problems are unacceptable in the Rust shim. I also want to change 'pipe_os' to 'tokio_pipe', because under high-concurrency IO the spawned tokio_copy tasks are left behind (I think this is caused by the raw_fd, and there is a problem with how the async trait is implemented), and then the Rust shim cannot be deleted successfully. If you have steps to reproduce your problem, I would be happy to check whether it is caused by FIFO IO.
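
The point about a named pipe opened in read-write mode can be sketched with the Rust standard library. This is a minimal illustration, not code from the shim or from embedshim. It assumes Linux (where opening a FIFO read-write does not block) and an available mkfifo command; make_fifo and fifo_write_results are hypothetical names.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::process::Command;

/// Create a FIFO by calling the `mkfifo` command (assumed to be installed).
fn make_fifo(path: &Path) -> io::Result<()> {
    let _ = fs::remove_file(path);
    let status = Command::new("mkfifo").arg(path).status()?;
    if status.success() {
        Ok(())
    } else {
        Err(io::Error::new(io::ErrorKind::Other, "mkfifo failed"))
    }
}

/// Returns two write results: one taken after the reader went away while a
/// read-write handle was still open, and one taken after that handle was
/// dropped as well.
fn fifo_write_results(path: &Path) -> io::Result<(io::Result<()>, io::Result<()>)> {
    make_fifo(path)?;

    // A read-write handle counts as a reader, so the FIFO always has one.
    let holder = OpenOptions::new().read(true).write(true).open(path)?;
    // Stand-in for the container's stdout (write side).
    let mut writer = OpenOptions::new().write(true).open(path)?;
    // Stand-in for the consumer of the FIFO, for example containerd.
    let reader = File::open(path)?;

    // The consumer goes away, but the holder keeps the FIFO readable.
    drop(reader);
    let with_holder = writer.write_all(b"hello\n");

    // Without any reader left, the next write fails with a broken pipe.
    drop(holder);
    let without_reader = writer.write_all(b"hello\n");

    fs::remove_file(path)?;
    Ok((with_holder, without_reader))
}
```

Rust programs ignore SIGPIPE by default, so the second write surfaces as an io::ErrorKind::BrokenPipe error instead of killing the process; a shell script such as test.sh above would see the failure differently.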

@zhaodiaoer
Contributor

zhaodiaoer commented Nov 13, 2024

the stdout stream of container process which comes from rust-shim is not flush at real time, flush one page in one time then delay a long time, not line-by-line, I don't know if this related to that use FIFO as process stdout directly

This problem ,I didn't meet. Is there some method to get this problem?

I didn't do anything special before I encountered this problem. I have a program that logs at high frequency, and when I follow its logs via crictl logs -f xxx there are very long delays between intermittent bursts of output. After some investigation I found that the log file produced by containerd-cri is also written intermittently. I guess something abnormal comes from the new way of using the FIFO or from the Rust tokio runtime.

Simple diagram:

Go shim: |fifo reader| <-- fifo --> |io copier| <-- pipe --> |container process|
Rust shim: |fifo reader| <-- fifo --> |container process|

The fifo and the fifo reader come from containerd-cri and are the same in both cases, so I guess the problem comes from the second half.
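
The Go shim half of the diagram (an IO copier between a pipe and the FIFO) can be sketched as follows. This is a minimal illustration using only the Rust standard library, not the code of either shim: relay_child_stdout is a hypothetical name, sh stands in for the container process, and any Write sink stands in for the FIFO that the fifo reader consumes.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Run `script` with its stdout connected to an anonymous pipe and copy
/// everything the child writes into `sink`. Returns the number of bytes
/// relayed.
fn relay_child_stdout<W: Write>(script: &str, sink: &mut W) -> io::Result<u64> {
    let mut child = Command::new("sh")
        .arg("-c")
        .arg(script)
        // The child only ever sees the pipe, never the final destination.
        .stdout(Stdio::piped())
        .spawn()?;

    let mut child_stdout = child
        .stdout
        .take()
        .expect("stdout was requested as piped");

    // The copier: read from the pipe until the child closes it, write to the sink.
    let copied = io::copy(&mut child_stdout, sink)?;
    child.wait()?;
    Ok(copied)
}
```

For example, relay_child_stdout("echo hello; echo world", &mut Vec::new()) relays the two lines into the vector. In this layout a failure on the sink side is seen by the copier rather than by the child process, which is the difference between the two rows of the diagram.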

@zhaodiaoer
Contributor

the stdout stream of container process which comes from rust-shim is not flush at real time, flush one page in one time then delay a long time, not line-by-line, I don't know if this related to that use FIFO as process stdout directly

I think maybe I've found the cause. I'll try to file a PR about it later.

@analytically

Seeing level=error msg="copy io failed Input/output error (os error 5)" when running this, could this be related?

@jokemanfire
Contributor Author

jokemanfire commented Nov 16, 2024

copy io failed Input/output

Have you applied the patch from #278? If so, could you provide a more detailed description or some logs, so I can check whether my patch is causing the problem?
PS: binary IO is not implemented, so nerdctl -t -d will fail.

@analytically

Not patched. I will patch and try again.

@analytically

Patched, same error, so not fixed with #278

@jokemanfire
Contributor Author

jokemanfire commented Nov 16, 2024

Patched, same error, so not fixed with #278

Could you provide the debug log? It may be caused by copy_console (tty), but without more information it cannot be determined.

@analytically

Image

This is what I could see already, any idea? I'll look at it more closely on Monday

@jokemanfire
Contributor Author

jokemanfire commented Nov 17, 2024

Image

This is what I could see already, any idea? I'll look at it more closely on Monday

I think this may be printed when the read or write side is closed suddenly during spawn_copy. You can check; it should occur in tokio_copy.
