Performance characteristic of various filesystem

Rashkae ubuntu at tigershaunt.com
Sun Feb 15 14:57:28 UTC 2009


Lie Ryan wrote:
> Does anyone know the performance characteristics of various file systems 
> (ext2, ext3, ntfs, etc.)?
> 
> I'm not interested in how fast read/write is in the file system, rather I 
> want to know how the filesystem copes with large numbers of files.
> 
> Some time ago, I started zipping up folders with a large number of 
> small files to reduce the number of files. Recently, I've been wondering 
> whether this practice is actually useful at all, especially since zipping 
> up files makes it difficult to find duplicates with fdupes.
> 
> So, I want to know whether my zipping things up would actually give any 
> performance improvements or not.
> 
> TIA.

You aren't really going to see performance benefits from zipping up your
small files.

Ext3, by default, does not use a B-tree index for directories, and
becomes very cumbersome once you put more than about 10,000 files in a
single directory.  I'm sorry that I don't remember all the steps, but
basically, you use tune2fs to add the dir_index option to your ext3 file
system, then run fsck with an option that makes it re-index your
existing directories.
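
As far as I recall from the e2fsprogs man pages, those two steps look
roughly like the sketch below.  Treat it as an assumption to check
against your own man pages, not a tested recipe: /dev/sdXN is a
placeholder device, the helper name dir_index_commands is made up for
illustration, and the file system should be unmounted (and backed up)
before e2fsck touches it.  The script only prints the commands; it does
not run them.

```python
def dir_index_commands(device):
    """Return the two commands (as argv lists) for enabling dir_index on `device`."""
    return [
        # Step 1: turn on the dir_index feature flag on the file system.
        ["tune2fs", "-O", "dir_index", device],
        # Step 2: force a full check (-f) and optimize/re-index the
        # existing directories (-D).
        ["e2fsck", "-f", "-D", device],
    ]


if __name__ == "__main__":
    # /dev/sdXN is a placeholder, not a real device name.
    for argv in dir_index_commands("/dev/sdXN"):
        print(" ".join(argv))
```

Directories created after the first step are indexed automatically; the
second step is only needed for the directories that already exist.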

This should make ext3 nearly as efficient as Reiser for file retrieval.
(Ext3 is still very slow at file deletion, however.)  As a caveat, once
you do this, programs that read every file in a directory (zip or tar,
for example) will no longer process the files in inode order.  That is
to say, if you zip a directory with 5000 files on a plain ext3 file
system, chances are the order in which zip reads the files is the same
order in which they were written to the drive, so the hard drive head
doesn't have to move much.  Once you add dir_index, however, the 5000
files are instead processed in a seemingly random hash order, which
causes the hard drive head to thrash, decimating your I/O throughput.
(Note that ReiserFS has this same problem as well.)
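
If you control the program doing the reading, one possible workaround
is to list the directory first and sort the entries by inode number
before opening anything.  The sketch below shows the idea; the function
names are made up for illustration, and it assumes that inode order is
a reasonable stand-in for on-disk order, which is often but not always
the case.  I have not measured how much it helps.

```python
import os


def files_in_inode_order(directory):
    """Return paths of regular files in `directory`, sorted by inode number."""
    with os.scandir(directory) as entries:
        found = [
            (entry.inode(), entry.path)
            for entry in entries
            if entry.is_file(follow_symlinks=False)
        ]
    # Sorting the (inode, path) pairs orders them by inode number.
    found.sort()
    return [path for _inode, path in found]


def read_all(directory):
    """Read every regular file in inode order; return the total bytes read."""
    total = 0
    for path in files_in_inode_order(directory):
        with open(path, "rb") as handle:
            total += len(handle.read())
    return total
```

The same trick applies to anything that walks a large directory: collect
the names first, sort by inode, then do the actual reads.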

XFS and JFS don't suffer from this problem to nearly the same degree,
but they come with their own caveats.  I would suggest not using them
unless you really need their performance characteristics.





