Best filesystem to use for a specific type of application
Rashkae
ubuntu at tigershaunt.com
Mon Feb 13 16:20:19 UTC 2012
On 02/13/2012 11:08 AM, Steve Flynn wrote:
> Afternoon all,
>
> I'm looking at a situation where I need to read a lot of small files.
> Roughly 40,000,000 files averaging around 35KB each. Some will be
> larger and some will be smaller as they are TIFF scans. Not sure of
> overall size of the dataset yet but 3 TB feels about right (not got
> the data yet so can't tell exactly)
>
> As you're probably aware, very small files are a nightmare for
> throughput. Currently, we've been using encrypted external USB
> drives to move this data between ourselves and my clients but now that
> the size of the dataset is increasing, it's time to move to something
> a bit more robust. I've been looking at a couple of NAS drives to
> press into action, some of which give us the option of changing the
> filesystem to something more suitable.
>
> Can anyone point me to some stats for how differing filesystems
> (Reiser, XFS, JFS, Ext3, Ext4, BTRfs, etc) stack up against each other
> when dealing with a lot of very small files. I have a little bell
> tinkling away at the back of my mind that Reiser was particularly good
> for small files, but I could well be making that up... plus I don't
> know how well that stacks up these days against the advances made in
> other filesystems.
>
ReiserFS is deceptive. It was advertised as being very fast with small
files, but that claim rested only on benchmarks that favor it. In my
experience with this kind of workload, one very important 'benchmark' is
the speed at which you can read all the files in the order they are
returned by readdir() (the function that lists the files in a
directory). For small files to be read efficiently, they have to be read
in the order they are laid out on the hard drive. Otherwise the head
thrashes through lots of random seeks, and that will drop your read
performance to only a few MB/s.
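A rough sketch of that readdir-order test, in Python, might look like the
following. The function names are made up for illustration, it handles a
single directory only, and the numbers mean something only if the page
cache is cold (freshly mounted filesystem or dropped caches); treat it as
a starting point rather than a proper benchmark.

```python
import os
import time


def read_in_readdir_order(directory):
    """Read every regular file in `directory` in the order the OS lists them.

    Returns a tuple (file_count, total_bytes, elapsed_seconds).
    Hypothetical helper for illustration; not taken from any tool.
    """
    file_count = 0
    total_bytes = 0
    start = time.perf_counter()
    # os.scandir() yields entries in directory order without sorting,
    # i.e. the order the underlying readdir() call returns them.
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories, symlinks and other non-regular files.
            if not entry.is_file(follow_symlinks=False):
                continue
            with open(entry.path, "rb") as f:
                total_bytes += len(f.read())
            file_count += 1
    elapsed = time.perf_counter() - start
    return file_count, total_bytes, elapsed


def throughput_mb_per_s(total_bytes, elapsed_seconds):
    """Convert a byte count and a duration into MB/s (1 MB = 1024 * 1024 bytes)."""
    if elapsed_seconds <= 0:
        return float("inf")
    return total_bytes / (1024 * 1024) / elapsed_seconds
```

Run it against the same directory of scans on each candidate filesystem
and compare the MB/s figures; a result of only a few MB/s suggests the
readdir order does not match the on-disk layout.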
I haven't tested it much for this, but I think EXT4 has overcome this
limitation. XFS was by far the fastest filesystem I tested for this
workload (before EXT4 was released). Reiser and EXT3 were practically
unusable.