user hook for waiting for asynchronous BTRFS activities

Sat Nov 20 08:49:00 UTC 2010

On 11/19/2010 09:44 PM, David Nicol wrote:
> On Fri, Nov 19, 2010 at 5:34 PM, John Johansen
> <john.johansen at canonical.com> wrote:
>> On 11/16/2010 02:28 PM, David Nicol wrote:
>>> I have written and tested a patch, and gotten approval from Chris
>>> Mason for an ioctl number, that introduces a facility for waiting for
>>> certain asynchronous kernel-side activities: initially, subvolume
>>> deletion, but it seems like a general interface is in order.
>>>
>> right if this could conceivably be applied to other filesystems then an
>> ioctl is not the right interface, a generic solution is needed.  We
>> already have a mess with ioctls and snapshots, we really don't need another
>> ioctl mess with something that waits for asynchronous events.  This
>> is something I would raise on fs-devel
>
> ha! you see it as even more generic than being forward-looking to wait
> for the various this and that that get discussed as being deferrable
> on the btrfs-dev mailing list. Generic solutions to file system issues
> become extensions to VFS and abstract the calls to the specific. I've
> got a working btrfs ioctl; seeing the feature ( there really isn't
> much to it) in action would be a prerequisite for extending the VFS.
yes, indeed I am not saying just talk, a working feature is an important
step.  I was saying that if a general interface is in order, the next step is
to start a discussion on fs-devel.

> And kernel support for waiting for asynchronous events, in general, is
> already excellent.
> 
yes, but that doesn't mean it can't be extended or improved

>>> Ideally, this new feature and a forward-looking extension convention
>>> for the flags field (the ioctl takes two data, a timeout in
>>> milliseconds and flag bits) could get discussed here before getting
>>> included in ubuntu kernels and then going upstream.
>>>
>> I'm not saying no, but generally the order is reversed, going upstream
>> or at least being presented to upstream before going into the ubuntu
>> kernels.  If a generic solution was presented to upstream right now
>> it's unlikely it would hit .38, as it would need at least a few
>> revisions.
> 
> It's been presented on BTRFS-DEV; nobody really engaged in a
> discussion of designing the flags field for forward-compatibility.
> Most discussion concerned the semantics of the modifications to the
> btrfs user tools.
> 
understandable, interface details like flag fields often don't
pique people's interest.

>>> My ideas concerning the flags field are as follows:
>>>
>>> all zeroes: wait for pending subvolume deletions and nothing else
>>>
>>> |1:  do not wait for subvolume deletions
>>> |2:  fail when there is an unexpected flag bit
>>> |4 and higher: set to enable waiting for other conditions. Currently
>>> the kcleaner_thread has two responsibilities, deleting subvolumes and
>>> writing out deferred inode data, so |4 will enable waiting for all
>>> deferred inode data to get written out, and |8 and up wait for other
>>> things to become asynchronous so they can be assigned to them.
>>>
>>> Does that seem sound?
>>>
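A minimal sketch of how the flag bits described above could be decoded, in C. The macro names, the `wait_plan` structure, and the choice of `-EINVAL` as the failure code are assumptions made for illustration; the thread only gives the bit values and their meanings.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the bit values come from the proposal. */
#define WAIT_SKIP_SUBVOL_DELETION 0x1u /* |1: do not wait for subvolume deletions */
#define WAIT_STRICT               0x2u /* |2: fail when there is an unexpected flag bit */
#define WAIT_DEFERRED_INODES      0x4u /* |4: wait for deferred inode data to be written */
#define WAIT_KNOWN_FLAGS \
    (WAIT_SKIP_SUBVOL_DELETION | WAIT_STRICT | WAIT_DEFERRED_INODES)

/* Hypothetical result type: which conditions the caller asked to wait for. */
struct wait_plan {
    bool subvol_deletion;
    bool deferred_inodes;
};

/*
 * Decode a flags word into a wait plan.
 * Returns 0 on success, or -EINVAL (an assumed error code) when the
 * strict bit is set and a bit outside the known set is present.
 */
int wait_flags_decode(uint32_t flags, struct wait_plan *plan)
{
    assert(plan != NULL);

    if ((flags & WAIT_STRICT) && (flags & ~WAIT_KNOWN_FLAGS))
        return -EINVAL;

    /* All zeroes means: wait for subvolume deletions and nothing else. */
    plan->subvol_deletion = !(flags & WAIT_SKIP_SUBVOL_DELETION);
    plan->deferred_inodes = (flags & WAIT_DEFERRED_INODES) != 0;
    return 0;
}
```

Without the strict bit, unknown bits are silently ignored, so a newer tool passing |8 to an older kernel would still succeed. Whether that is the right default is exactly the forward-compatibility question being asked here.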
>> Hrrmm, maybe.  What other uses beyond asynchronous subvolume deletion do
>> you see this being used for? Having a larger picture would help in
>> understanding the interface.
> 
> A lot of proposals get floated in BTRFS-DEV to make this or that
> operation asynchronous. All the currently asynchronous operations were
> synchronous once. The mess of making something asynchronous is, it
> hasn't been possible to register for notification of when the task is
> complete. One rarely cares, but disk space recovery is a situation
> where it matters, in terms of throughput on machines used for
> extensive testing, where each test gets its own subvolume and the
> subvolumes are cleared to free space -- but there is a delay, and
> registering for a notification would be better than polling `df`.
> 
Indeed, though I wouldn't say one rarely cares; being able to wait on or be
notified when an asynchronous operation completes is a very nice feature
to have.
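
For comparison, the polling workaround mentioned above (watching `df` until the space comes back) amounts to something like the following C sketch. The function name, its parameters, and its return convention are hypothetical; it uses only the POSIX `statvfs()` and `nanosleep()` calls, and is not part of the proposed patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <sys/statvfs.h>
#include <time.h>

/*
 * Poll the filesystem containing `path` until at least `want_bytes`
 * are available to unprivileged users.
 * Returns 0 when the space is available, 1 on timeout, -1 on error.
 * (Hypothetical helper; names and return values are assumptions.)
 */
int poll_for_free_space(const char *path, uint64_t want_bytes,
                        unsigned timeout_ms, unsigned interval_ms)
{
    unsigned waited = 0;

    if (interval_ms == 0)
        interval_ms = 1; /* avoid a busy loop that never times out */

    for (;;) {
        struct statvfs st;
        if (statvfs(path, &st) != 0)
            return -1;

        uint64_t avail = (uint64_t)st.f_bavail * st.f_frsize;
        if (avail >= want_bytes)
            return 0;
        if (waited >= timeout_ms)
            return 1;

        struct timespec ts = {
            .tv_sec = interval_ms / 1000,
            .tv_nsec = (long)(interval_ms % 1000) * 1000000L,
        };
        nanosleep(&ts, NULL);
        waited += interval_ms;
    }
}
```

The caller has to guess both a threshold and a polling interval, and free space is only an indirect signal that the deletion has finished, which is the argument for a proper wait or notification interface.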

>>> Also, is it worth the effort to make ioctl tools work on both 32 and
>>> 64 bit systems, or is the standard practice that tools will work on
>>> one or the other? There are a lot of kernel data defined as "unsigned
>>> long."
>>>
>> Yes and this is one of the reasons to avoid ioctls.  For your user
>> space parameter vector the kernel won't do the conversion for you.  You
>> have to handle it yourself.
>>
>> Generically there is no guarantee that the kernel and user space are even
>> running in the same mode, i.e. 32 bit userspace on 64 bit kernels and
>> 64 bit user space on 32 bit kernels.  It may be odd, but it can happen.
> 
> Just to avoid undetermined-size elements in interface structures. Two
> u32 quantities are going to take 64 bits either way; two longs might
> be 64 or 128. The ioctl interface is fine.  I'll revise it to use a
> structure of u32 fields instead of longs, and if someone needs a
> timeout more than four billion milliseconds or needs to wait for more
> than 31 different asynchronous things to complete they can take a new
> ioctl number or add another field to the bottom of it.
> 
ioctls certainly have their place, they are just often abused and it's
easy to get them wrong.
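
A sketch of the fixed-size argument layout David describes, written as userspace C with `<stdint.h>` types. The struct name, field names, and field order are assumptions; kernel code would normally spell the type `__u32` from `<linux/types.h>`, and the ioctl number itself is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument block: two u32 fields instead of two longs. */
struct wait_args {
    uint32_t timeout_ms; /* timeout in milliseconds */
    uint32_t flags;      /* flag bits selecting what to wait for */
};

/* The layout is the same for 32 bit and 64 bit callers: 8 bytes, no padding. */
static_assert(sizeof(struct wait_args) == 8, "wait_args must be 8 bytes");
static_assert(offsetof(struct wait_args, flags) == 4, "flags must sit at offset 4");

/* Convenience constructor so callers never leave a field uninitialized. */
static inline struct wait_args wait_args_make(uint32_t timeout_ms, uint32_t flags)
{
    struct wait_args args = { .timeout_ms = timeout_ms, .flags = flags };
    return args;
}
```

With `unsigned long` fields the same structure would be 8 bytes for a 32 bit caller and 16 bytes for a 64 bit one, so a 32 bit tool on a 64 bit kernel would need a compat translation layer. Fixed-width fields sidestep that.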
 
> A generic ioctl-in-an-ioctl interface with a variable-sized interface
> argument, for instance, is a thought to abandon, in my opinion.
> 
yes that is just ugly.