user hook for waiting for asynchronous BTRFS activities

David Nicol davidnicol at gmail.com
Sat Nov 20 05:44:05 UTC 2010


On Fri, Nov 19, 2010 at 5:34 PM, John Johansen
<john.johansen at canonical.com> wrote:
> On 11/16/2010 02:28 PM, David Nicol wrote:
>> I have written and tested a patch, and gotten approval from Chris
>> Mason for an ioctl number, that introduces a facility for waiting for
>> certain asynchronous kernel-side activities: initially, subvolume
>> deletion, but it seems like a general interface is in order.
>>
> Right, if this could conceivably be applied to other filesystems then an
> ioctl is not the right interface; a generic solution is needed.  We
> already have a mess with ioctls and snapshots, and we really don't need
> another ioctl mess with something that waits for asynchronous events.
> This is something I would raise on fs-devel.

Ha! You see it as even more generic than I do. I was only being
forward-looking about waiting for the various operations that get
discussed as deferrable on the btrfs-dev mailing list. Generic
solutions to file system issues become extensions to the VFS, which
abstracts the calls to the specific file system. I've got a working
btrfs ioctl; seeing the feature (there really isn't much to it) in
action would be a prerequisite for extending the VFS. And kernel
support for waiting for asynchronous events, in general, is already
excellent.

>> Ideally, this new feature and a forward-looking extension convention
>> for the flags field (the ioctl takes two data, a timeout in
>> milliseconds and flag bits) could get discussed here before getting
>> included in ubuntu kernels and then going upstream.
>>
> I'm not saying no, but generally the order is reversed: going upstream,
> or at least being presented to upstream, before going into the ubuntu
> kernels.  If a generic solution was presented to upstream right now
> it's unlikely it would hit .38, as it would need at least a few
> revisions.

It's been presented on BTRFS-DEV; nobody really engaged in a
discussion of designing the flags field for forward-compatibility.
Most discussion concerned the semantics of the modifications to the
btrfs user tools.

>> My ideas concerning the flags field are as follows:
>>
>> all zeroes: wait for pending subvolume deletions and nothing else
>>
>> |1:  do not wait for subvolume deletions
>> |2:  fail when there is an unexpected flag bit
>> |4 and higher: set to enable waiting for other conditions. Currently
>> the kcleaner_thread has two responsibilities, deleting subvolumes and
>> writing out deferred inode data, so |4 will enable waiting for all
>> deferred inode data to get written out, and |8 and up are reserved
>> until other things become asynchronous and can be assigned to them.
>>
>> Does that seem sound?
>>
> Hrrmm, maybe.  What other uses beyond asynchronous subvolume deletion do
> you see this being used for? Having a larger picture would help in
> understanding the interface.

A lot of proposals get floated on BTRFS-DEV to make this or that
operation asynchronous. All the currently asynchronous operations were
synchronous once. The mess of making something asynchronous is that it
hasn't been possible to register for notification of when the task is
complete. One rarely cares, but disk space recovery is a situation
where it matters for throughput on machines used for extensive
testing: each test gets its own subvolume, and the subvolumes are
deleted afterward to free space. The space comes back only after a
delay, and registering for a notification would be better than polling
`df`.
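
For comparison, here is roughly what the polling workaround looks like
when done from C instead of by running `df` in a loop. This is only a
sketch: the function name, its return convention and the fixed polling
interval are made up for illustration, and it relies on nothing beyond
POSIX statvfs(3) and nanosleep(2).

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/statvfs.h>
#include <time.h>

/*
 * Hypothetical helper: poll until at least need_bytes are available to
 * unprivileged users on the filesystem holding path.
 * Returns 0 once the space is there, 1 on timeout, -1 if statvfs() fails.
 */
static int poll_free_space(const char *path, uint64_t need_bytes,
                           unsigned timeout_ms, unsigned interval_ms)
{
    unsigned waited = 0;

    if (interval_ms == 0)
        interval_ms = 1;    /* avoid spinning forever without progress */

    for (;;) {
        struct statvfs st;
        struct timespec ts;

        if (statvfs(path, &st) != 0)
            return -1;
        /* f_bavail counts blocks of f_frsize bytes */
        if ((uint64_t)st.f_bavail * st.f_frsize >= need_bytes)
            return 0;
        if (waited >= timeout_ms)
            return 1;

        ts.tv_sec = interval_ms / 1000;
        ts.tv_nsec = (long)(interval_ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
        waited += interval_ms;
    }
}
```

Every pass through the loop is a wasted wakeup, and the caller still
learns about the freed space up to one interval late. A blocking wait
with a timeout avoids both.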

>> Also, is it worth the effort to make ioctl tools work on both 32 and
>> 64 bit systems, or is the standard practice that tools will work on
>> one or the other? There are a lot of kernel data defined as "unsigned
>> long."
>>
> Yes, and this is one of the reasons to avoid ioctls.  For your user
> space parameter vector the kernel won't do the conversion for you.  You
> have to handle it yourself.
>
> Generically there is no guarantee that the kernel and user space are even
> running in the same mode.  ie. 32 bit userspace on 64 bit kernels and
> 64 bit user space on 32 bit kernels.  It may be odd, but it can happen.

The answer is just to avoid undetermined-size elements in interface
structures. Two u32 quantities are going to take 64 bits either way;
two longs might be 64 or 128 bits. The ioctl interface is fine.  I'll
revise it to use a structure of u32 fields instead of longs, and if
someone needs a timeout of more than four billion milliseconds (about
49 days) or needs to wait for more than 31 different asynchronous
things to complete they can take a new ioctl number or add another
field to the bottom of it.
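
To make that concrete, the revised argument could look something like
the sketch below. All of the names are hypothetical and this is not the
patch itself; only the bit meanings (|1, |2, |4) come from the proposal
earlier in this thread. The static_assert pins the structure at 8 bytes
regardless of whether user space is 32 or 64 bit.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

/* Hypothetical names; bit meanings as proposed in this thread. */
#define WAIT_NO_SUBVOL_DELETE (1u << 0) /* |1: do not wait for subvolume deletions */
#define WAIT_STRICT_FLAGS     (1u << 1) /* |2: fail on an unexpected flag bit */
#define WAIT_DEFERRED_INODES  (1u << 2) /* |4: wait for deferred inode data */
#define WAIT_KNOWN_FLAGS \
    (WAIT_NO_SUBVOL_DELETE | WAIT_STRICT_FLAGS | WAIT_DEFERRED_INODES)

/* Two fixed-width fields: same layout for 32 and 64 bit callers. */
struct wait_args {
    uint32_t timeout_ms;
    uint32_t flags;
};

static_assert(sizeof(struct wait_args) == 8,
              "layout must not depend on word size");

/* Returns -1 only when strict mode is requested and unknown bits are set. */
static inline int wait_flags_check(uint32_t flags)
{
    if ((flags & WAIT_STRICT_FLAGS) && (flags & ~WAIT_KNOWN_FLAGS))
        return -1;
    return 0;
}

/* All zeroes means "wait for pending subvolume deletions". */
static inline int waits_for_subvol_delete(uint32_t flags)
{
    return !(flags & WAIT_NO_SUBVOL_DELETE);
}
```

With this convention an old kernel quietly ignores bits it does not
know about unless the caller sets |2, in which case it rejects them.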

A generic ioctl-in-an-ioctl interface with a variable-sized interface
argument, for instance, is a thought to abandon, in my opinion.

Thanks!



