Saturday, 14 January 2012

Dear Diary,

today I've written a small benchmark utility to try to emulate NNTP server performance.  A one-file-per-article spool has somewhat unusual performance characteristics, totally dominated by stat-ing and stuff.

So my little utility is a C program that recursively reads a real news spool, and then just discards the result.  It's extremely single-threaded, which isn't typical of NNTP usage patterns, but otherwise it should be kinda ok.  It's on GitHub.
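For illustration, here is a minimal sketch of what such a reader could look like. It is not the actual program from GitHub: the names `walk_spool`, `read_and_discard` and `struct spool_stats` are made up for this example, and it assumes a POSIX system. It walks a directory tree in readdir() order, stats every entry, reads each regular file to the end, and throws the data away.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <fcntl.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical counters for this sketch. */
struct spool_stats {
    unsigned long files;       /* regular files read to the end */
    unsigned long long bytes;  /* bytes read and discarded */
};

/* Read one file to EOF and throw the data away. Returns 0 on success. */
static int read_and_discard(const char *path, struct spool_stats *st)
{
    char buf[65536];
    ssize_t n;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        st->bytes += (unsigned long long)n;
    close(fd);
    if (n < 0)
        return -1;
    st->files++;
    return 0;
}

/* Walk dir recursively, visiting entries in readdir() order. */
static int walk_spool(const char *dir, struct spool_stats *st)
{
    struct dirent *e;
    DIR *d = opendir(dir);

    if (d == NULL)
        return -1;
    while ((e = readdir(d)) != NULL) {
        char path[PATH_MAX];
        struct stat sb;

        if (strcmp(e->d_name, ".") == 0 || strcmp(e->d_name, "..") == 0)
            continue;
        if (snprintf(path, sizeof path, "%s/%s", dir, e->d_name)
            >= (int)sizeof path)
            continue;                  /* path too long, skip it */
        if (lstat(path, &sb) != 0)
            continue;                  /* entry vanished or unreadable */
        if (S_ISDIR(sb.st_mode))
            walk_spool(path, st);      /* descend into subdirectory */
        else if (S_ISREG(sb.st_mode))
            read_and_discard(path, st);
    }
    closedir(d);
    return 0;
}
```

Calling `walk_spool("/path/to/spool", &st)` with a zeroed `struct spool_stats` and timing the call gives files per second and bytes per second. Like the real utility, it is strictly single-threaded.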

To test, I copied a 26GB portion of the real Gmane news spool (3.3M files) over to three different partitions: one btrfs on the MegaRAID, one ext4 on the MegaRAID, and one ext4/btrfs on the spinning system disk, just to get a baseline.

(And always do echo 3 > /proc/sys/vm/drop_caches before testing anything.)
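If the benchmark should do this itself, the same step can be sketched in C. This is an assumption about how one might wire it in, not something the utility is known to do. The function name `drop_caches` and its `ctl` parameter are hypothetical; the parameter exists only so the function can be pointed at a scratch file when trying it out. It calls sync() first because dirty pages cannot be dropped, and writing to the real control file needs root.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <unistd.h>

/*
 * Flush dirty pages to disk, then write "3" to the given control file.
 * On Linux the real control file is /proc/sys/vm/drop_caches, and "3"
 * asks the kernel to drop the page cache plus dentries and inodes.
 * Returns 0 on success, -1 on failure (for example, when not root).
 */
static int drop_caches(const char *ctl)
{
    FILE *f;

    sync();                       /* dirty pages are not freeable */
    f = fopen(ctl, "w");
    if (f == NULL)
        return -1;
    if (fputs("3\n", f) == EOF) {
        fclose(f);
        return -1;
    }
    return fclose(f) == 0 ? 0 : -1;
}
```

A benchmark run would call `drop_caches("/proc/sys/vm/drop_caches")` once before starting the clock, and bail out if it returns -1 rather than report numbers from a warm cache.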

btrfs wastes a lot of room, though.  What takes 32GB on ext4 takes 42GB on btrfs.  But with max_inline=0 that shrinks to 36GB.  Still kinda sucky.

Anyway, the results are, when reading files in readdir() order:

btrfs on ssd: 10600 files per second, 84MB/s

ext4 on ssd: 4460 files per second, 35MB/s

btrfs on spinning disk: 5030 files per second, 40MB/s

ext4 on spinning disk: 238 files per second (yes, I know.  With noatime.  Yes.  Yes.  Try it yourself.)

And when sorting the files in alphabetical order:

btrfs on ssd: 7800 files per second, 62MB/s

ext2 on ssd: 19200 files per second, 152MB/s

ext4 on ssd: 19100 files per second, 152MB/s

ext4 on spinning disk: 6100 files per second, 48MB/s
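Reading in alphabetical order rather than readdir() order can be sketched with POSIX scandir() and alphasort(). This is one way to do it, not necessarily how the benchmark does; `sorted_names`, `free_names` and `read_sorted` are names invented for the example, and it handles a single directory without recursing.

```c
#define _XOPEN_SOURCE 700
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* scandir() filter: skip the "." and ".." entries. */
static int not_dot(const struct dirent *e)
{
    return strcmp(e->d_name, ".") != 0 && strcmp(e->d_name, "..") != 0;
}

/* List dir sorted by alphasort(). Returns the entry count, or -1. */
static int sorted_names(const char *dir, struct dirent ***names)
{
    return scandir(dir, names, not_dot, alphasort);
}

/* Free a list returned by sorted_names(). */
static void free_names(struct dirent **names, int n)
{
    for (int i = 0; i < n; i++)
        free(names[i]);
    free(names);
}

/*
 * Read and discard the files directly inside dir, in sorted order.
 * Subdirectories are not descended into. Returns the number of bytes
 * read, or -1 if the directory cannot be listed.
 */
static long long read_sorted(const char *dir)
{
    struct dirent **names;
    long long total = 0;
    int n = sorted_names(dir, &names);

    if (n < 0)
        return -1;
    for (int i = 0; i < n; i++) {
        char path[4096], buf[65536];
        size_t got;
        FILE *f;

        snprintf(path, sizeof path, "%s/%s", dir, names[i]->d_name);
        f = fopen(path, "rb");
        if (f == NULL)
            continue;              /* unreadable entry, skip it */
        while ((got = fread(buf, 1, sizeof buf, f)) > 0)
            total += (long long)got;
        fclose(f);
    }
    free_names(names, n);
    return total;
}
```

Note that alphasort() compares names as strings, so an article named "10" sorts before one named "9". That is alphabetical order, not numeric order.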

So two things stand out here:

1) ext4 is really sensitive to the order you read files in
2) the LSI MegaRAID SAS 9265-8i is quite slow on small files

I mean, when reading large files, I get 1.2GB/s!  This is bullshit!  Where are my IOPSes!  I want more IOPS!

Perhaps I should set the stripe size on the RAID to something smaller than the default, which is 128KB.  I mean, the mean file size in the spool is 8K, which means that it's probably reading a lot more than it has to.

It has to!

1 comment:

  1. Your workload with lots of random reads of small files probably doesn't really need the bandwidth of SATA3 anyway since you will only hit it with large sequential reads, so I suggest trying out the performance without the RAID card as well. Maybe soft RAID or BtrFS's built-in RAID might even be faster with just SATA2.

    And even though I'm a Debian guy myself, I would actually suggest benchmarking the brand new FreeBSD 9 with the latest ZFS version in RAID-Z mode (again without the RAID card) as well. ZFS simply provides enough awesome to even bear using FreeBSD.
