[🚀 Feature]: Transparency on files written inside the docker containers #2382

Open
santech1983 opened this issue Sep 3, 2024 · 12 comments

Comments

@santech1983

Feature and motivation

My organization does not allow us to spin up containers without understanding where files are written inside them. Could you please document all the files written inside the containers so that we can mount the folders where those files are written? I would recommend writing files only to the temp directory and nowhere else, so that the Selenium HQ containers run securely.

Usage example

example: https://github.com/SeleniumHQ/docker-selenium/blob/trunk/NodeChromium/Dockerfile
When a container is spun up, if any files need to be written by the jar or services, they should be written only to the /tmp folder, and nothing should be written after startup inside the root folders, including opt/bin/ etc.


github-actions bot commented Sep 3, 2024

@santech1983, thank you for creating this issue. We will troubleshoot it as soon as we can.


Info for maintainers

Triage this issue by using labels.

If information is missing, add a helpful comment and then I-issue-template label.

If the issue is a question, add the I-question label.

If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted label.

If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C), add the applicable G-* label, and it will provide the correct link and auto-close the issue.

After troubleshooting the issue, please add the R-awaiting answer label.

Thank you!

@diemol
Member

diemol commented Sep 3, 2024

This is mainly a request to document where the whole operating system writes files. Browsers write files in given directories, and the same goes for other binaries.

Why is this needed?

@VietND96
Member

VietND96 commented Sep 3, 2024

Do you have any public articles that mention this practice/exercise?

@santech1983
Author

santech1983 commented Sep 3, 2024

So the best practice is to enable a read-only file system on the root directory and write files only to the /tmp directory to prevent security breaches. Currently we are not enforcing ReadOnlyFileSystem for the root directories, and this could allow files to be written to the root directories through an unknown source of injection.

Most organizations prevent writing files into the root directory, as they have a ReadOnlyFileSystem policy on root files/folders.
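
For reference, a minimal sketch of what such a setup might look like with Docker's `--read-only` and `--tmpfs` flags. The helper name `read_only_run_args` and the image tag are assumptions for illustration only, and whether the image actually starts under these flags is not verified here.

```python
import shlex


def read_only_run_args(image, writable_dirs=("/tmp",)):
    """Build a `docker run` command with a read-only root filesystem.

    Each entry in writable_dirs becomes an in-memory tmpfs mount, so writes
    are possible there while the rest of the root filesystem stays read-only.
    """
    args = ["docker", "run", "--rm", "--read-only"]
    for path in writable_dirs:
        args += ["--tmpfs", path]
    args.append(image)
    return args


# Hypothetical image tag, used only for illustration.
cmd = read_only_run_args("selenium/node-chromium:latest")
print(shlex.join(cmd))
```

Any process in the container that writes outside the listed directories would then fail with a read-only filesystem error, which is one way to find out what writes where.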

@diemol
Member

diemol commented Sep 4, 2024

You can probably configure some of the binaries inside the container to write files only to a given path. However, there are other binaries where we do not know where files are written, for example the browser binary.

I am also aware that these Docker images are used by several banks, health institutions, and even governmental orgs in different countries. This is the first time I have seen this type of request. I doubt we can actually document all that (see my first comment).

@santech1983
Author

Thanks for the response, Diemol. All I'm looking for is to redirect the output of the binaries into the /tmp folder in all the containers used in the Selenium Grid and make the root folders and files read-only. To achieve this, the first step is a list of "What is writing where?", and it seems we do not keep track of that. I agree that the Selenium HQ Docker containers have historically been safe; however, to prevent a breach when vulnerabilities arise, we definitely need a read-only setup on the root files and folders. I would appreciate it if this group could guide me on how to redirect all the outputs to the /tmp folder.

@diemol
Member

diemol commented Sep 4, 2024

What I am saying is that you can't do that. Or at least I do not know how to tell the browser to only write files to /tmp, so I believe this cannot be guaranteed.

Research is probably needed, and if you really need this, I would say you can help by starting that research and documenting it.

@amardeep2006
Contributor

On a lighter note, the security team may be shocked when they realize almost everything is a file in Linux.

@VietND96
Member

VietND96 commented Sep 6, 2024

I tried using Podman for a quick evaluation, since it has a feature to make the root filesystem read-only.


The goal is for the container to start properly with this feature enabled. However, it looks like it failed because supervisord tries to write logs to /var/log/supervisor/supervisord.log, which is within the scope of the read-only filesystem.

  File "/usr/lib/python3/dist-packages/supervisor/loggers.py", line 213, in __init__
    FileHandler.__init__(self, filename, mode)
  File "/usr/lib/python3/dist-packages/supervisor/loggers.py", line 160, in __init__
    self.stream = open(filename, mode)
                  ^^^^^^^^^^^^^^^^^^^^
OSError: [Errno 30] Read-only file system: '/var/log/supervisor/supervisord.log'

As part of this, I will probably add a capability to configure supervisord logs to be written to another path.
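
As a rough illustration of the idea (not the project's actual configuration), the sketch below generates a `[supervisord]` section whose log and pid files live under a writable directory. `logfile`, `pidfile`, and `childlogdir` are standard supervisord settings; the function name and the `/tmp` default are assumptions.

```python
import configparser
import io


def supervisord_section(base_dir="/tmp"):
    """Render a [supervisord] section that keeps its files under base_dir."""
    parser = configparser.ConfigParser(interpolation=None)
    parser["supervisord"] = {
        "logfile": f"{base_dir}/supervisord.log",  # main supervisord log
        "pidfile": f"{base_dir}/supervisord.pid",  # pid file location
        "childlogdir": base_dir,                   # auto-created child process logs
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


print(supervisord_section())
```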

@VietND96
Member

VietND96 commented Sep 6, 2024

In addition, I went through a few docs to understand read-only filesystems, e.g. https://www.thorsten-hans.com/read-only-filesystems-in-docker-and-kubernetes/. The article also mentions that in a real-world scenario, chances are pretty good that applications have to write to the filesystem at several locations. And, of course, that holds for docker-selenium containers since, besides selenium-grid, there are browsers and other dependencies like vnc, xvfb, novnc, and so on.
Assuming that we can make all the written file paths transparent, you will also need to put in the effort to configure tmpfs for them when starting the container.
I think you can try docker-selenium in read-only filesystem mode and evaluate how many dependencies need to write data/logs to different locations (each will raise an error as in the sample above). Then come back to us with requests along the lines of: this dependency needs to be updated so that its output goes to a central location, e.g. /tmp or /opt/selenium/ only. Once all dependencies are found and resolved, we can reach a container that supports read-only filesystems seamlessly.
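
The evaluation described here could be partly automated. The sketch below, under the assumption that failures show up in the container logs in the same form as the Python traceback above, collects the blocked paths and suggests one `--tmpfs` flag per parent directory. The function names are hypothetical, and other binaries may report the error in a different format that this pattern will not match.

```python
import posixpath
import re

# Matches the message Python raises for errno 30 (EROFS), as in the traceback above.
EROFS_PATTERN = re.compile(r"\[Errno 30\] Read-only file system: '([^']+)'")


def blocked_paths(log_text):
    """Return the unique paths that failed with a read-only filesystem error."""
    return sorted(set(EROFS_PATTERN.findall(log_text)))


def tmpfs_flags(paths):
    """Suggest one --tmpfs flag per parent directory of the blocked paths."""
    dirs = sorted({posixpath.dirname(p) for p in paths})
    flags = []
    for directory in dirs:
        flags += ["--tmpfs", directory]
    return flags


log = "OSError: [Errno 30] Read-only file system: '/var/log/supervisor/supervisord.log'"
print(tmpfs_flags(blocked_paths(log)))
```

Note that a tmpfs mount hides whatever the image already ships in that directory, so reconfiguring the dependency to write elsewhere may still be the better fix in some cases.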

@santech1983
Author

Thank you VietND96, you got the exact experience I'm facing! Thank you for the article you posted. In the AWS Kubernetes world it's even easier to apply a policy that enforces a read-only root file system on all containers.
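
For context, a minimal sketch of what such a setting looks like on the Kubernetes side: the container's `securityContext.readOnlyRootFilesystem` is set to true and an `emptyDir` volume is mounted at `/tmp` as the only writable location. The pod name, image tag, and helper name are assumptions for illustration; this is a plain manifest, not an organization-wide policy.

```python
import json


def read_only_pod(name, image, writable_dir="/tmp"):
    """Build a Pod manifest with a read-only root filesystem and one writable mount."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Root filesystem of the container becomes read-only.
                    "securityContext": {"readOnlyRootFilesystem": True},
                    # Scratch volume mounted where writes are allowed.
                    "volumeMounts": [{"name": "scratch", "mountPath": writable_dir}],
                }
            ],
            "volumes": [{"name": "scratch", "emptyDir": {}}],
        },
    }


# Hypothetical pod name and image tag, used only for illustration.
manifest = read_only_pod("selenium-node", "selenium/node-chromium:latest")
print(json.dumps(manifest, indent=2))
```

The JSON output can be saved to a file and applied like a YAML manifest.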

@VietND96
Member

VietND96 commented Sep 8, 2024

You can check out the latest image tag, which includes updates that make the supervisord config adjustable via ENV vars (b53dc3f).

4 participants