[PATCH] fs: store MS_BIND as MNT_BIND and show it in mountinfo

Thu Feb 2 17:48:11 UTC 2017

On Thu, Feb 2, 2017 at 4:45 PM, Seth Forshee <seth.forshee at canonical.com> wrote:
> On Thu, Feb 02, 2017 at 02:21:04PM +0100, Zygmunt Krynicki wrote:
>> This patch adds a new MNT_ flag that is set for bind mounts (it mirrors
>> MS_BIND) and surfaces it via mountinfo. This allows for easier
>> identification of mount entries that are bind mounted from somewhere
>> else.
>>
>> Signed-off-by: Zygmunt Krynicki <zygmunt.krynicki at canonical.com>
>
> This is a change in userspace ABI, so this is not something that we're
> likely to take into Ubuntu without being sure that it will also be
> accepted upstream.
>
> However my expectation is that this patch would meet resistance or be
> rejected outright upstream (explanation follows). Can you explain why
> you need this?

Yes. I will explain at the end of this message.

> Fundamentally "bind" refers to the mount operation and not to some
> property of the mount itself. Once you've performed a bind mount the new
> mount is more or less equivalent to the original - they have a peer
> relationship and not a parent/child sort of relationship. The kernel
> knows about the relationships between mounts, but whether or not one is
> the original and the other was created via a bind mount operation is
> irrelevant.
>
> The relationship between two mounts can be seen from userspace using
> /proc/<pid>/mountinfo. For example:
>
> # mount -o loop fs.img a
> # mount --bind a b
> # mount --bind a/foo c
> # cat /proc/self/mountinfo
> ...
> 158 26 7:0 / /home/ubuntu/bind-test/a rw,relatime shared:135 - ext4 /dev/loop0 rw,data=ordered
> 162 26 7:0 / /home/ubuntu/bind-test/b rw,relatime shared:135 - ext4 /dev/loop0 rw,data=ordered
> 166 26 7:0 /foo /home/ubuntu/bind-test/c rw,relatime shared:135 - ext4 /dev/loop0 rw,data=ordered
>
> The "shared:135" indicates that these mounts are all part of the same
> peer group (i.e. peer group 135). The mounts at .../a and .../b are
> completely equivalent to the kernel.

Until you change the sharing among them and...

# mount -o loop fs.img a
# mount --bind a b
# mount --bind a/foo c
# cat /proc/self/mountinfo | tail -n 3
223 199 7:19 / /home/zyga/experiment/a rw,relatime shared:262 - ext4 /dev/loop19 rw,data=ordered
236 199 7:19 / /home/zyga/experiment/b rw,relatime shared:262 - ext4 /dev/loop19 rw,data=ordered
257 199 7:19 /foo /home/zyga/experiment/c rw,relatime,bind shared:262 - ext4 /dev/loop19 rw,data=ordered
# mount --make-private b
# cat /proc/self/mountinfo | tail -n 3
223 199 7:19 / /home/zyga/experiment/a rw,relatime shared:262 - ext4 /dev/loop19 rw,data=ordered
236 199 7:19 / /home/zyga/experiment/b rw,relatime - ext4 /dev/loop19 rw,data=ordered
257 199 7:19 /foo /home/zyga/experiment/c rw,relatime,bind shared:262 - ext4 /dev/loop19 rw,data=ordered

EDIT: writing this I realized what my problem really is... let me try
to explain below (ignore that paste above)

My original problem is: given a declaration that "$source should be
bind-mounted in $destination" and the state of mountinfo, should this
operation be performed or is it already done?

# mount --bind /snap/snapd-hacker-toolbelt/14/src/ /snap/snapd-hacker-toolbelt/14/dst/
# cat /proc/self/mountinfo | tail -n 1
678 737 7:8 /src /snap/snapd-hacker-toolbelt/14/dst rw,relatime,bind master:30 - squashfs /dev/loop8 ro

The problem with the way mountinfo presents the facts is that I was
implicitly looking for "$source" somewhere to cross-reference. I think
I now realize what I want to check for is different but already
present in the data that I have. I just need to come up with a set of
things that are equivalent to the $source mount and see if my
$destination mount is present there.
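
To reason about these entries it helps to split each mountinfo line into
named fields. The following is only a minimal sketch: the names MountEntry
and parse_mountinfo_line are made up for illustration, and it assumes paths
contain no whitespace (the kernel escapes spaces as \040, which is not
decoded here). The optional fields (shared:N, master:N) end at the lone "-"
separator.

```python
from typing import List, NamedTuple


class MountEntry(NamedTuple):
    """One line of /proc/<pid>/mountinfo (hypothetical helper type)."""
    mount_id: int
    parent_id: int
    dev: str             # "major:minor" of the backing device
    root: str            # path inside the filesystem that forms this mount's root
    mount_point: str     # where the mount is attached in this namespace
    options: str         # per-mount options, e.g. "rw,relatime"
    optional: List[str]  # propagation tags such as "shared:262" or "master:30"
    fs_type: str
    source: str          # e.g. "/dev/loop8"
    super_options: str


def parse_mountinfo_line(line: str) -> MountEntry:
    fields = line.split()
    # Zero or more optional fields follow the six fixed ones and are
    # terminated by a single "-" separator.
    sep = fields.index("-", 6)
    return MountEntry(
        mount_id=int(fields[0]),
        parent_id=int(fields[1]),
        dev=fields[2],
        root=fields[3],
        mount_point=fields[4],
        options=fields[5],
        optional=fields[6:sep],
        fs_type=fields[sep + 1],
        source=fields[sep + 2],
        super_options=fields[sep + 3],
    )


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as f:
        for raw in f:
            entry = parse_mountinfo_line(raw)
            print(entry.dev, entry.root, entry.mount_point)
```

The pair (dev, root) is what identifies the object a mount exposes; the
mount point only says where it is attached.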

First, I want to find what $source really is. In the example above,
the source is "/snap/snapd-hacker-toolbelt/14/src/". Scanning the
mount table, I can find:

737 730 7:8 / /snap/snapd-hacker-toolbelt/14 rw,relatime master:30 - squashfs /dev/loop8 ro

Now this is not exactly $source but it is the longest prefix of
$source that appears as a mount point. A simple approach is to look for
an exact match and, if none is found, discard the final path component
and look again. Once I know what the source really is (the device
/dev/loop8 plus the discarded components joined back together, here
/src) I can ask a different question:

What is the set of mount entries using /dev/loop8:

/home/zyga # cat /proc/self/mountinfo | grep loop8
461 224 7:8 / /var/lib/snapd/hostfs/snap/snapd-hacker-toolbelt/14 rw,relatime master:30 - squashfs /dev/loop8 ro
737 730 7:8 / /snap/snapd-hacker-toolbelt/14 rw,relatime master:30 - squashfs /dev/loop8 ro
678 737 7:8 /src /snap/snapd-hacker-toolbelt/14/dst rw,relatime,bind master:30 - squashfs /dev/loop8 ro

I can now infer that both /snap/snapd-hacker-toolbelt/14/dst and
/snap/snapd-hacker-toolbelt/14/src represent the same object: /src
from /dev/loop8.
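
The lookup described above (exact match on the mount point, otherwise drop
the final component and retry) might be sketched roughly like this. The
function name resolve_source and the (mount_point, dev, root) tuple layout
are assumptions made for the example. It identifies the device by
major:minor rather than by the /dev/loop8 name, and it ignores mounts
stacked on the same mount point and symlinks in $source.

```python
import posixpath


def resolve_source(source, mounts):
    """Return (dev, path inside that filesystem) for a source path.

    mounts is a list of (mount_point, dev, root) tuples taken from
    mountinfo (hypothetical layout chosen for this sketch).
    """
    path = posixpath.normpath(source)  # also drops a trailing "/"
    discarded = []                     # components removed so far, in path order
    while True:
        for mount_point, dev, root in mounts:
            if mount_point == path:
                # The mount's own root may not be "/", so join it with
                # the components that were discarded on the way up.
                inner = posixpath.normpath(posixpath.join(root, *discarded))
                return dev, inner
        if path == "/":
            raise LookupError("no mount entry covers %r" % source)
        path, last = posixpath.split(path)
        discarded.insert(0, last)


mounts = [
    ("/var/lib/snapd/hostfs/snap/snapd-hacker-toolbelt/14", "7:8", "/"),
    ("/snap/snapd-hacker-toolbelt/14", "7:8", "/"),
    ("/snap/snapd-hacker-toolbelt/14/dst", "7:8", "/src"),
]
print(resolve_source("/snap/snapd-hacker-toolbelt/14/src/", mounts))
```

For the table above this yields device 7:8 with inner path /src, the same
pair that the dst entry carries in its third and fourth columns.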

My original question, "should I do that bind mount or is it already
done?", can be answered by checking whether $destination is present in
the set above.
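
As a rough sketch of that final check, assuming the source has already been
reduced to a (device, inner path) pair: the bind mount counts as done when
some entry is attached at $destination and exposes exactly that pair. The
name bind_mount_already_done and the tuple layout are hypothetical, and the
sketch does not consider mounts stacked on top of $destination.

```python
import posixpath


def bind_mount_already_done(destination, wanted_dev, wanted_root, mounts):
    """True if an entry at destination exposes (wanted_dev, wanted_root).

    mounts is a list of (mount_point, dev, root) tuples from mountinfo.
    """
    destination = posixpath.normpath(destination)  # drop a trailing "/"
    return any(
        mount_point == destination
        and dev == wanted_dev
        and root == wanted_root
        for mount_point, dev, root in mounts
    )


table = [
    ("/var/lib/snapd/hostfs/snap/snapd-hacker-toolbelt/14", "7:8", "/"),
    ("/snap/snapd-hacker-toolbelt/14", "7:8", "/"),
    ("/snap/snapd-hacker-toolbelt/14/dst", "7:8", "/src"),
]
print(bind_mount_already_done(
    "/snap/snapd-hacker-toolbelt/14/dst/", "7:8", "/src", table))
```

Comparing the root column as well as the device matters here: all three
entries share device 7:8, but only the dst entry has /src as its root.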

Do you think I am on the right track?
ZK