Using Multiple Swap Partitions In 2.4

Posted: 10 Jun 2005

One day I was searching the web and found a site (either SuSE's or Red Hat's) saying that you could set up your swap in a RAID-like fashion. The example was a large server with a lot of disk drives: you put a swap partition on many of them and give all of these swap partitions the same priority. That way they work much like a RAID setup, and swap reads and writes to the disks get faster.

Also, near the end of the replies to the 'How to use RAM as Swap [1]' article, someone mentioned that the swap algorithm should be reworked so that it doesn't use such a simple and slow search method for finding and using swap slots. I believe this referred to something Andrew Morton himself said in an lkml email. I don't know whether that has been done, but swap in the 2.6 kernel is a lot different from 2.4, so maybe it was at least attempted.

In any case, in 2.4, if I break up my 512 MB swap partition into 4 different 128 MB partitions, and then give them all the same priority, it changes my swap-in wait period quite a bit. This is with all the swap partitions on the same drive, so the limiting factors are not IDE transfer speeds here. With the smaller partitions my max wait goes down from 60 seconds to 30 seconds (on the old laptop), and the average wait of 30 seconds is down to about 10. That makes it a lot more usable. It is much nicer. And on the old laptop there is no possibility of adding RAM, which would help more if it could be done.

Another interesting thing: when I recently installed MEPIS on a drive that already had the smaller swap partitions, it automatically picked them up and gave them all the same priority. That was great. I usually have to edit fstab by hand, and maybe run mkswap on some of the partitions, to get them activated with the same priority.

So apparently this technique is already in use. I don't know for certain why MEPIS (3.3.1, I believe) does this automatically, but I suspect it is for this RAID-like speed increase.

So this does indicate that the swap algorithms must be somewhat inefficient. If hdparm reports a transfer rate of 10 MB/sec (slow by today's standards), then swapping an application back in should take only a few seconds, but instead it takes 30 seconds (with just one swap partition). That is a small fraction of the read speed the drive can deliver. Since the processor and memory are so much faster than the drive, the delay points to some kind of tie-up in the swap algorithms themselves. The fact that using more swap partitions on the same drive speeds things up also shows that the swap algorithm is inefficient, but that the inefficiency can be worked around to some extent.
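As a rough worked version of that estimate, here is a small shell sketch. The 10 MB/sec rate and the 30 and 10 second waits come from the text above; the 30 MB application size is purely an assumed figure for illustration, and the helper name swap_in_efficiency is made up for this example.

```shell
# Percentage of the drive's raw transfer rate actually achieved during a swap-in.
# Usage: swap_in_efficiency DATA_MB RATE_MB_PER_SEC OBSERVED_SECONDS
swap_in_efficiency() {
    data_mb=$1
    rate=$2
    observed=$3
    # Ideal time is data / rate, and efficiency is ideal / observed.
    # Combined into one integer percentage: 100 * data / (rate * observed).
    echo $(( 100 * data_mb / (rate * observed) ))
}

# Assumed 30 MB application, 10 MB/sec drive:
swap_in_efficiency 30 10 30   # one swap partition, 30 second wait
swap_in_efficiency 30 10 10   # four swap partitions, about 10 second wait
```

Under that assumed size, the single partition case works out to about 10 percent of what the drive could deliver, and the four partition case to about 30 percent. A different application size changes the numbers but not the general picture.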

This helps only up to a point. When I tried 16 quite small swap partitions, the speed was a little worse than with just one large partition, so in that case it must be getting tied up in handling so many partitions. For my case, breaking up a 512 MB swap partition, four partitions of 128 MB was about optimum. Your mileage may vary.

To try this yourself, if the swap partition is the last partition on the drive, you can (when not booted into the Linux installation on the drive you will be working on) delete the swap partition and make 4 swap partitions of one quarter the original size. If the swap partition is not the last partition, adding partitions before the main Linux partitions causes problems: for example, the boot loader will point to the wrong partition and you will have to repair that.

One way to do this is to boot from a rescue CD or from Knoppix and do the partitioning from there.

But for a new installation, just put the swap partition(s) as the last partition(s) on the drive. Then you can slice it up later and not disturb the booting capability.

If the swap partition is the last partition, or you have free space at the end of the drive, or you can make space by shrinking the next-to-last partition, then you can put more swap partitions at the end of the drive and try this.

Your hdxx values will probably be different than those in the example below. To find your original swap partition value run:

fdisk -l

And look for the swap entries. (WARNING: If they aren't at the end of the disk, don't try this unless you know how to recover from changing the hdxx value of your booting partition from adding partitions before it.)
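If the fdisk listing is long, a small filter can pick out the swap entries. This is only a sketch: it assumes the partition lines start with a /dev/ name and contain the words "Linux swap", which is how fdisk usually labels them, but the exact column layout differs between fdisk versions. The function name list_swap_partitions is made up for this example.

```shell
# Print the device names of partitions that fdisk labels as Linux swap.
# Reads `fdisk -l` style text on standard input, one device per output line.
list_swap_partitions() {
    awk '$1 ~ /^\/dev\// && /Linux swap/ { print $1 }'
}

# Typical use (needs root to read the partition tables):
#   fdisk -l | list_swap_partitions
```

This only reads the listing and changes nothing on the disk, so it is safe to run before deciding whether to go further.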

Use your favorite partitioning software to delete your current x MB swap partition. Then make 4 swap partitions of equal size (that add up to the original partition size). Run mkswap on them if you need to. Then modify /etc/fstab so that it has 4 swap lines, and put pri=8 in the options field (the fourth field) of each line. Example lines are shown further below.

Make a copy of fstab before you try this, for example:

cp /etc/fstab /etc/fstab.bak

(If you run into trouble and just can't get this to work, delete the small partitions, remake your original swap partition, and then restore your original fstab file from the fstab.bak. Your original bootloader file should work again too, unless you modified that and didn't make a backup copy of it.)

The swap line in the unmodified fstab file for just one swap partition may read:

/dev/hda6     swap      swap    defaults   0  0

And you will then modify it and add entries so it reads, for example (there must be no space after the comma in sw,pri=8, because fstab fields are separated by whitespace):

/dev/hda6     swap      swap    sw,pri=8   0  0
/dev/hda7     swap      swap    sw,pri=8   0  0
/dev/hda8     swap      swap    sw,pri=8   0  0
/dev/hda9     swap      swap    sw,pri=8   0  0
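To avoid typos when there are several partitions, the fstab lines can be generated instead of typed. The helper below is a hypothetical sketch (the name swap_fstab_lines is made up); it only prints lines, and you still paste them into /etc/fstab yourself after checking that the device names match your own fdisk output.

```shell
# Print one fstab swap entry per device, all with the same priority.
# Usage: swap_fstab_lines PRIORITY DEVICE...
swap_fstab_lines() {
    pri=$1
    shift
    for dev in "$@"; do
        # Fields: device, mount point, type, options, dump, pass.
        # The options field has no space after the comma.
        printf '%s swap swap sw,pri=%s 0 0\n' "$dev" "$pri"
    done
}

# Example with the device names used in this article:
swap_fstab_lines 8 /dev/hda6 /dev/hda7 /dev/hda8 /dev/hda9
```

The single spaces between fields are fine for fstab, which accepts any run of spaces or tabs as a separator.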

Then, after saving this file, you can reboot and see whether everything worked. If it didn't, you may not have prepared all the partitions correctly and need to run mkswap, or you may have made a typo in fstab. Suffice it to say, if you don't feel comfortable working at this level, maybe you should have a more experienced Linux person help.
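If some partitions do still need mkswap, the commands look like the ones printed by the sketch below. It is deliberately a dry run: the hypothetical helper print_swap_setup only prints the commands so you can check the device names first, because mkswap overwrites whatever is on the partition it is given. The swapon -p form activates a partition immediately with the given priority, which lets you test without a reboot.

```shell
# Print (do not run) the commands to initialize and activate each swap partition.
# Usage: print_swap_setup PRIORITY DEVICE...
print_swap_setup() {
    pri=$1
    shift
    for dev in "$@"; do
        echo "mkswap $dev"             # writes a swap signature, destroying old contents
        echo "swapon -p $pri $dev"     # activates the partition with this priority
    done
}

# Example with the device names used in this article:
print_swap_setup 8 /dev/hda6 /dev/hda7 /dev/hda8 /dev/hda9
```

Once the printed commands look right, they can be run as root one at a time.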

You can check it with:

free -mt

That will tell you how much space there is in swap, and:

swapon -s

will tell you what swap partitions are active and what their priorities are, plus more information.
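With several partitions it is easy to miss one that came up with a different priority. The sketch below assumes the usual swapon -s layout, with one header line and the priority as the last column (cat /proc/swaps prints the same table). The function name swap_priorities is made up for this example.

```shell
# Print each distinct priority found in `swapon -s` style output, one per line.
# If all swap partitions share one priority, exactly one line is printed.
swap_priorities() {
    # Skip the header line, take the last column, then reduce to unique values.
    awk 'NR > 1 { print $NF }' | sort -n | uniq
}

# Typical use:
#   swapon -s | swap_priorities
```

If this prints more than one line, at least one partition is not at the priority you intended, and the fstab options for it are worth a second look.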

Anyway, with the original swap partition sliced into 4 smaller ones, and all of them active, there should be a definite decrease in time spent swapping. This might really help, especially if you can't add memory and must use swap. Of course, it won't speed up swap operations anywhere near as much as adding some RAM and replacing your swap partition with swap in a ramdisk or set of ramdisks.
