January 20th, 2012

The magical key!

Dear Diary,

today the IOPS-unlocking FastPath key for the LSI MegaRAID 9265-8i card arrived!  I was so excited, until I opened the box and found what’s basically something that short-circuits two headers on the RAID card.

What a rip-off.  It probably disables some sleep(1) calls in the firmware on the card.

Anyway, I plugged it in, and did the benchmarks, and I got…  absolutely no (zero, zip) difference.  No difference whatsoever.  Zilch.  Nada.

So I googled around a bit, and found someone who claimed that the RAID card should be configured as write-through, no read-ahead, direct I/O to make the FastPath stuff be all awesome and stuff.

I did that.  Annoyingly enough, there seems to be no way to just flip those settings without rebuilding the RAID, so testing takes a while.

And benchmarking afterwards shows that…  throughput drops to one fifth of what I got with the standard settings, and random reads of files are absolutely no faster.

I have no more time for benchmarking, since I have to go to Rome.  Diary, I’ll continue fiddling with the RAID card when I get back.

But the strangest thing has happened.  I was looking through some of the previous pages in my secret diary, and I found that somebody else has been making comments!  How is that possible!?  Are other people able to read my diary?  Is Google betraying my secrets to the world?

Anyway, the comment said that I should perhaps just use the SATA 2.0 connectors on the motherboard to see what IOPS I get without using the MegaRAID card.  That makes tons of sense, because I don’t know whether the lack-lustre random read performance I’m getting is because of the disks or the RAID card.
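A rough way to compare the two setups would be to time random small reads against a big file on each one. The following is only a sketch of that idea, not anything from the diary: the function name, the 4K block size and the read count are all assumptions, and the page cache has to be cold for the number to mean anything.

```python
import os
import random
import time


def random_read_iops(path, block_size=4096, reads=1000, seed=0):
    """Time `reads` random block-sized reads from a regular file.

    Hypothetical helper: block_size and reads are illustrative defaults,
    and the result is only meaningful if the file is not already cached.
    """
    blocks = os.path.getsize(path) // block_size
    if blocks == 0:
        raise ValueError("file is smaller than one block")
    rng = random.Random(seed)
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(reads):
            # Read one block at a random block-aligned offset.
            os.pread(fd, block_size, rng.randrange(blocks) * block_size)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return reads / elapsed
```

Running it once against a file on the MegaRAID volume and once against a file on a disk hanging off the motherboard would show whether the card or the disks are the limiting factor. It is single-threaded, so it measures latency-bound IOPS rather than what the hardware can do with a deep queue.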

I’ll do that when I get back in about four weeks. 

And I’ll see whether I can secure my diary further somehow.

January 15th, 2012

Dear Diary,

after the rather disappointing random read benchmarks (I mean, 4K files a second? It’s pitiful) I almost went into a severe depression and started thinking about using viAgain.

But then I went back to the hardware pusher’s web site and noticed something strange.  They’re selling something that’s supposed to make their LSI MegaRAID 9265 card suck less!  It’s apparently a firmware upgrade that removes all the sleep(1) calls in their code!  Or something!  At least that’s what I got from reading what they’re saying.  It’s gonna make the IOPS-es on SSD be three times better!

On the one hand, one could be annoyed that they didn’t just leave out the apparent sleep() calls in the standard firmware.  On the other hand, perhaps this is the answer to everything!

Dear Diary, I forked over more money, and I’m now waiting all aflutter for the magical firmware stick to arrive.

Meanwhile, here’s a picture of Oslo.  Winter has sort of arrived, at last:

January 14th, 2012

Dear Diary,

today I’ve written a small benchmark utility to try to emulate NNTP server performance.  A one-file-per-article spool has somewhat unusual performance characteristics, totally dominated by stat-ing and stuff.

So my little utility is a C program that recursively reads a real news spool, and then just discards the result.  It’s extremely single-threaded, which isn’t typical of NNTP usage patterns, but otherwise it should be kinda ok.  It’s on GitHub.
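The real utility is a C program, and this is not it. As a minimal sketch of the same idea in Python, assuming nothing about how the original is written: walk the spool recursively, read every file, throw the data away, and report files per second and MB/s. The function names and the `sort_names` switch are made up for illustration; the switch is there because the results compare readdir() order with alphabetical order.

```python
import os
import time


def read_spool(root, sort_names=False):
    """Read every regular file under root and discard the contents.

    Returns (file_count, byte_count). With sort_names=True the entries
    of each directory are processed in name order instead of the order
    the file system hands them back.
    """
    files = 0
    total = 0
    with os.scandir(root) as it:
        entries = list(it)
    if sort_names:
        entries.sort(key=lambda e: e.name)
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            sub_files, sub_total = read_spool(entry.path, sort_names)
            files += sub_files
            total += sub_total
        elif entry.is_file(follow_symlinks=False):
            with open(entry.path, "rb") as f:
                total += len(f.read())  # read it all, keep nothing
            files += 1
    return files, total


def benchmark(root, sort_names=False):
    """Return (files per second, MB per second) for one pass over root."""
    start = time.perf_counter()
    files, total = read_spool(root, sort_names)
    elapsed = time.perf_counter() - start
    return files / elapsed, total / elapsed / 1e6
```

Like the original, this is strictly single-threaded, so it says more about per-file latency than about what a busy NNTP server with many clients would see.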

To test, I copied a 26GB portion of the real Gmane news spool (3.3M files) over to three different partitions: one btrfs on the MegaRAID, one ext4 on the MegaRAID, and one ext4/btrfs on the spinning system disk, just to get a baseline.

(And always do echo 3 > /proc/sys/vm/drop_caches before testing anything.)
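If the benchmark is driven from a script, the same cache drop can be done from there. This is a small sketch of the echo command above, not something from the original setup; the function name and the overridable path argument are assumptions (the path is a parameter only so the helper can be tried against an ordinary file). It needs root to write to the real file.

```python
import os


def drop_caches(path="/proc/sys/vm/drop_caches", value="3"):
    """Flush dirty data, then ask the kernel to drop its caches.

    Writing 3 drops the page cache plus dentries and inodes. The sync
    comes first because only clean pages can be dropped.
    """
    os.sync()
    with open(path, "w") as f:
        f.write(value + "\n")
```

Calling `drop_caches()` before each run keeps a previous pass from serving the files straight out of RAM.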

btrfs wastes a lot of room, though.  What takes 32GB on ext4 takes 42GB on btrfs.  But with max_inline=0 that shrinks to 36GB.  Still kinda sucky.

Anyway, the results are, when reading files in readdir() order:

btrfs on ssd: 10600 files per second, 84MB/s

ext4 on ssd: 4460 files per second, 35MB/s

btrfs on spinning disk: 5030 files per second, 40MB/s

ext4 on spinning disk: 238 files per second (yes, I know.  With noatime.  Yes.  Yes.  Try it yourself.)

And when sorting the files in alphabetical order:

btrfs on ssd: 7800 files per second, 62MB/s

ext2 on ssd: 19200 files per second, 152MB/s

ext4 on ssd: 19100 files per second, 152MB/s

ext4 on spinning disk: 6100 files per second, 48MB/s

So two things stand out here:

1) ext4 is really sensitive to the order you read files
2) the LSI MegaRAID SAS 9265-8I is quite slow on small files

I mean, when reading large files, I get 1.2GB/s!  This is bullshit!  Where are my IOPSes!  I want more IOPS!

Perhaps I should set the stripe size on the RAID to something smaller than the default, which is 128KB.  I mean, the mean file size in the spool is 8K, which means that it’s probably reading a lot more than it has to.

It has to!
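As a sanity check on that 8K figure, the benchmark numbers themselves imply a bytes-per-file value: MB/s divided by files per second. The sketch below just redoes that arithmetic with the results listed above, taking 1MB as 1000KB (an assumption). The last line is the worst case for the stripe theory, and it rests on the unverified guess that the card fetches a whole 128KB stripe unit for every small file.

```python
# (label, files per second, MB/s), copied from the results above.
# The ext4-on-spinning-disk readdir run is left out: no MB/s was given.
RESULTS = [
    ("btrfs on ssd, readdir order", 10600, 84),
    ("ext4 on ssd, readdir order", 4460, 35),
    ("btrfs on spinning disk, readdir order", 5030, 40),
    ("btrfs on ssd, sorted", 7800, 62),
    ("ext2 on ssd, sorted", 19200, 152),
    ("ext4 on ssd, sorted", 19100, 152),
    ("ext4 on spinning disk, sorted", 6100, 48),
]


def kb_per_file(files_per_s, mb_per_s):
    """Average KB read per file, assuming 1MB = 1000KB."""
    return mb_per_s * 1000 / files_per_s


for label, fps, mbps in RESULTS:
    print(f"{label}: {kb_per_file(fps, mbps):.1f}KB per file")

# Hypothetical worst case: a full 128KB stripe unit read per 8K file.
stripe_kb, mean_file_kb = 128, 8
print(f"read amplification if so: {stripe_kb / mean_file_kb:.0f}x")
```

Every row comes out a bit under 8KB per file, so the throughput and file-rate numbers agree with each other and with the mean file size. Whether a smaller stripe size actually helps is something only a re-run would show.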

January 13th, 2012

Dear diary,

today is the most joyful day of my entire life.  The Samsung 830s for my new server arrived!  Sort of out of the blue!  The web shop insisted upon them not arriving until the end of the month, and then they sent them anyway.

Look at how pretty they are.  One might even suspect Samsung having learnt something from Apple by manufacturing most of Apple’s stuff.

The disks are so pretty that it’s almost a shame to put them into the server, don’t you think, Diary?

Anyway, 5x 512GB of SSD-ey goodness.  I installed them in some 2.5″->3.5″ adapters that we had lying around.  They were originally used for some WD VelociRaptor disks that had all died over the years.  Not very good disks, but nice adapters.  And that’s what counts.

So the SSDs are plugged into adapters, which are then plugged into the backplane, which is then plugged into the RAID card.

I configured up the LSI MegaRAID SAS 9265-8I card in RAID5 mode, and then put ext4 on the partition.

Diary, you won’t believe this.  With a simple “cat /dev/zero > /mnt/ssd/file” command, iostat told me that I had a writing speed of 1.2GB/s.  Yes!  Jiggobytes!  Not bits!  Bytes!  Jiggobytes!

I only got 600MB/s when reading the same huge file back.

So to get slightly more serious about benchmarking, I installed iozone, and made it do its test on five 100GB files in parallel.  It reports 1.5GB/s on large writes, and 1.2GB/s on large reads!

So I wasn’t just imagining things!  This thing is blazingly fast on pointlessly huge files!  I mean, the theoretical max should be about 2GB/s, since each of the five SSDs has a writing speed of about half a gig per second, and it’s a five-disk RAID5 system (so four disks’ worth of data and one of parity), but I hadn’t actually expected the LSI RAID card to deliver anything like this, because, you know, Diary, all hardware sucks.
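The 2GB/s estimate is just the number of data disks times the per-disk speed. As a tiny sketch (the function name is made up, and it ignores controller overhead, parity computation and everything else that makes real arrays slower):

```python
def raid5_max_write_gb_s(disks, per_disk_gb_s):
    """Idealised RAID5 full-stripe write ceiling in GB/s.

    One disk's worth of each stripe is parity, so only disks - 1 of
    them carry data at any moment.
    """
    if disks < 3:
        raise ValueError("RAID5 needs at least three disks")
    return (disks - 1) * per_disk_gb_s


ceiling = raid5_max_write_gb_s(5, 0.5)   # five SSDs at about 0.5GB/s each
print(f"ceiling: {ceiling}GB/s, iozone writes reached {1.5 / ceiling:.0%} of it")
```

So the 1.5GB/s that iozone reports on large writes is about three quarters of the idealised ceiling.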

Now I just have to write something to emulate real NNTP server load (i.e., lots of small files in huge directories), and then do some benchmarking on different file systems to see what happens.

Oh, the excitement!

January 4th, 2012

Dear Diary,

today the RAID card arrived for my new server. I had apparently only ordered a single SATA cable instead of the six I had meant to order.

But it turned out that the card didn’t use normal SATA cables at all, but a weird one-to-four connector.  It’s a big connector on one side, and four separate SATA cables come out of it.

So for once messing up the order saved me some money.

I installed the card in my 2U machine.  As you can see, those weird connectors are pretty high up on the card, which means that the cables take a rather dramatic 90 degree turn.  I hope that’s not going to cause any problems…  At least the lid closes.

I connected it all up and booted the machine.

04:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2208 [Thunderbolt] (rev 01)

[    8.636577] megaraid_sas 0000:04:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    8.637236] megaraid_sas 0000:04:00.0: setting latency timer to 64
[    8.640920] megasas: FW now in Ready state
[    8.641300] megaraid_sas 0000:04:00.0: irq 46 for MSI/MSI-X
[    8.663036] megasas:IOC Init cmd success

It works!

After fiddling around in the glorious and very oddly named “WebBIOS” (it’s not actually on the web at all), I managed to set up my test SSD in RAID0.  That is, a single Corsair Force 3 disk.  Apparently it doesn’t like doing JBOD…  At least, I couldn’t find any setting for it.

Writing to the device gives me 480MB/s, while reading gives me 375MB/s.  Not all that impressive, but at least it demonstrates that the old backplane really really doesn’t lead to any major problems, since that’s clearly SATA 3 speed.

Now I just have to wait for the Samsung 830s to arrive.  Surely they can’t be serious about that horribly late arrival date… I’ll be in Rome by then…  I won’t know whether this all works until months and months from now…

Woe is me, Diary.
