We were notified by IVV5 about a potential security problem.
They noticed wrong permissions on their directories: the group u0mitarb had rwxs permissions.
We quickly found that many directories were affected, including /usershare and /usershare/projects, which contain the symlinks to the individual shares.
To analyze the issue further, we gathered a list of affected directories and looked at their modification dates. The problem seemed to have started about a week earlier.
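Such a list can be gathered with a short script. The following is only a minimal sketch, not the tooling we actually used: the function name find_affected_dirs is hypothetical, and it assumes "affected" means a directory owned by the given group that carries both group rwx and the setgid bit. Because a chmod or chown updates a file's ctime rather than its mtime, the sketch reports the ctime.

```python
import os
import stat
from datetime import datetime, timezone


def find_affected_dirs(root, gid):
    """Yield (path, ctime) for directories below root that belong to group
    `gid` and have group rwx plus the setgid bit set (shown as "rws")."""
    wanted = stat.S_ISGID | stat.S_IRWXG
    # followlinks=False: do not descend through symlinks to other shares.
    for dirpath, _dirnames, _filenames in os.walk(root, followlinks=False):
        st = os.lstat(dirpath)
        if st.st_gid == gid and (st.st_mode & wanted) == wanted:
            # ctime changes on chmod/chown, so it approximates the time
            # at which the permissions were last altered.
            yield dirpath, datetime.fromtimestamp(st.st_ctime, tz=timezone.utc)
```

Sorting the results by the reported timestamp gives a rough timeline of when the changes began.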
After some time searching through the logs, we found that many file operations had been performed from IP addresses in the Uni Cloud Kubernetes.
A few days earlier, we had been asked about non-terminating pods in one of our JupyterHub test deployments, after we had switched from the Kubernetes in-tree driver to external CSI drivers.
Also, a few weeks earlier, we had updated our CSI drivers for cinder and manila.
We decided to look into what the kubelet does during pod termination, and indeed it was executing a lot of chmod operations.
The only reason the kubelet would do this was an fsGroup setting in the pod.
Looking through the CSI configuration, we found that csi-cephfs, which is used by manila-csi, has a default value of fsGroupPolicy: File (source code).
Reading a few discussions on GitHub, we found that this is widely considered a bad choice and that ReadWriteOnceWithFSType is generally recommended instead.
In JupyterHub, we had configured fsGroup for a long time, but because we were using the in-tree driver, this had no effect.
With the CSI driver now set to fsGroupPolicy: File, however, the kubelet was actually enforcing the setting.
As we had mounted all /usershare directories into the users' pods, the kubelet was acting on all user data.
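The interaction between fsGroup and fsGroupPolicy can be summarized in a small model. This is a simplified sketch based on our reading of the Kubernetes documentation and the kubelet's behavior, not kubelet code: the function names are hypothetical, the group ID 100 in the usage note is a placeholder, and the permission masks (0660 for files, 0770 plus setgid for directories) are our understanding of what the kubelet adds on a read-write volume.

```python
import stat

# Bits added when fsGroup is applied (simplified, read-write volumes only).
FILE_MASK = 0o660
DIR_MASK = 0o770 | stat.S_ISGID  # the setgid bit is what shows up as "s"


def kubelet_applies_fsgroup(policy, fs_group, fs_type, access_modes):
    """Return True if fsGroup ownership changes would be applied."""
    if fs_group is None or policy == "None":
        return False
    if policy == "File":
        # Always applied, regardless of fstype or access mode.
        return True
    if policy == "ReadWriteOnceWithFSType":
        # Only for volumes with an fstype and the ReadWriteOnce access mode.
        return bool(fs_type) and "ReadWriteOnce" in access_modes
    raise ValueError(f"unknown fsGroupPolicy: {policy!r}")


def mode_after_fsgroup(mode, is_dir):
    """Return the permission bits after the fsGroup mask has been OR-ed in."""
    return mode | (DIR_MASK if is_dir else FILE_MASK)
```

For a shared CephFS volume with access mode ReadWriteMany, kubelet_applies_fsgroup("File", 100, "", ["ReadWriteMany"]) returns True, while the same call with "ReadWriteOnceWithFSType" returns False. A directory with mode 0750 ends up as 2770 in this model, which ls displays as rwxrws---.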
We quickly stopped the kubelet and the JupyterHub deployment and compiled a list of affected directories.
We removed access for the group u0mitarb to all these directories.
The directory list, including contact data, was also sent to the CERT so that all users could be informed about the situation.
In the background, we started a task to recursively remove the s permission from all files and directories, since no user should have set it themselves.
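A cleanup of this kind could look roughly like the following. It is a minimal sketch under assumptions, not the task we ran: the function name strip_setgid is hypothetical, it treats the s permission as the setgid bit only, and it defaults to a dry run so the list of paths can be reviewed first.

```python
import os
import stat


def strip_setgid(root, dry_run=True):
    """Clear the setgid bit on everything below root.

    Returns the list of paths that had the bit set. With dry_run=True
    nothing is modified.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root, followlinks=False):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            st = os.lstat(path)
            # Skip symlinks: os.chmod would follow them to another target.
            if stat.S_ISLNK(st.st_mode) or not st.st_mode & stat.S_ISGID:
                continue
            if not dry_run:
                os.chmod(path, stat.S_IMODE(st.st_mode) & ~stat.S_ISGID)
            changed.append(path)
    return changed
```

Calling strip_setgid(path) first and strip_setgid(path, dry_run=False) only after checking the result keeps the operation reviewable.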
We changed the CSI configuration to the safe value fsGroupPolicy: ReadWriteOnceWithFSType, so that this can never happen again for CephFS shares.
For our Kubernetes users, this means that fsGroup no longer works on manila shares, but this is easily worked around with an initialization Job.
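What such an initialization Job runs depends on the workload. As one possible sketch, assuming the Job mounts only the single share it is responsible for, it could set group ownership and group access explicitly. The function name init_share and its parameters are hypothetical, and the sketch deliberately does not set the setgid bit.

```python
import os
import stat


def init_share(mount_path, gid):
    """Give group `gid` access to one explicitly mounted share.

    Directories get group rwx and files get group rw. Symlinks are
    skipped so that nothing outside mount_path is touched.
    """
    for dirpath, _dirnames, filenames in os.walk(mount_path, followlinks=False):
        candidates = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for path in candidates:
            st = os.lstat(path)
            if stat.S_ISLNK(st.st_mode):
                continue
            os.chown(path, -1, gid)  # -1 leaves the owning user unchanged
            extra = 0o770 if stat.S_ISDIR(st.st_mode) else 0o660
            os.chmod(path, stat.S_IMODE(st.st_mode) | extra)
```

Unlike fsGroup with fsGroupPolicy: File, this acts only on the path it is given, so the scope of the change stays visible in the Job definition.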