Question: I still don't understand why RAID5 is better than RAID4. I understand that both compute parity bits that are used for recovery when a failure occurs; the only difference is where those parity bits are stored. I have borrowed the diagrams from "How does parity work on a RAID-5 array".

A   B   (A XOR B)
0   0   0
1   1   0
0   1   1
1   0   1

RAID4

Disk1     Disk2     Disk3     Disk4
-------------------------------------
data1     data1     data1     parity1
data2     data2     data2     parity2
data3     data3     data3     parity3
data4     data4     data4     parity4

Let's say that the first row is:

data1 = 1
data1 = 0
data1 = 1
parity1 = 0 (COMPUTED: 1 XOR 0 XOR 1 = 0)

RAID5

Disk1     Disk2     Disk3     Disk4
-------------------------------------
parity1   data1     data1     data1
data2     parity2   data2     data2
data3     data3     parity3   data3
data4     data4     data4     parity4

Let's say that the first row is:

parity1 = 0 (COMPUTED: 1 XOR 0 XOR 1 = 0)
data1 = 1
data1 = 0
data1 = 1

Scenarios:

1. RAID4 – Disk3 FAILURE:

data1 = 1
data1 = 0
data1 = 1 (COMPUTED: 1 XOR 0 XOR 0 = 1)
parity1 = 0

2. RAID4 – Disk4 (parity) FAILURE:

data1 = 1
data1 = 0
data1 = 1
parity1 = 0 (COMPUTED: 1 XOR 0 XOR 1 = 0)

etc.

In general: when a RAID (4 or 5) array uses N disks and one fails, I can take the values from all the remaining (N-1) disks and XOR them together (the order does not matter, since XOR is associative and commutative), and I will get the missing value. So what is the benefit of rotating the parity across the disks rather than storing it on a dedicated disk? Is there a performance benefit, or something else? Thank you.
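The recovery rule described above can be sketched in a few lines of Python. This is only an illustration on single bits, using the example row from the diagrams; the names `parity` and `reconstruct` are made up for this sketch and do not come from any RAID tool.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR all data values of one stripe to get its parity value."""
    return reduce(xor, blocks, 0)


def reconstruct(stripe, failed_index):
    """Rebuild the value at failed_index by XOR-ing every surviving value.

    `stripe` holds the data values plus the parity value; it does not
    matter which position holds the parity.
    """
    survivors = [value for i, value in enumerate(stripe) if i != failed_index]
    return reduce(xor, survivors, 0)


# Example row from the diagrams: data 1, 0, 1 and the computed parity.
stripe = [1, 0, 1]
stripe.append(parity(stripe))

# Fail each disk in turn and rebuild its value from the other three.
recovered = [reconstruct(stripe, i) for i in range(len(stripe))]
print(stripe, recovered)
```

Because the rebuild never needs to know which position held the parity, the same function covers both the RAID4 and the RAID5 layout; correctness of recovery is therefore not where the two levels differ.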

Answer: There is a performance difference: with RAID 4, every write also requires an update on the single parity disk, so writes can queue up waiting for that one disk.

With RAID 5 this bottleneck is significantly reduced because the parity update load is spread across multiple disks, so there's less chance of getting stuck in a queue.
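To make the difference concrete, here is a minimal sketch that counts how many parity updates land on each disk. It assumes 4 disks, one small write per stripe, and the simple rotation shown in the question's diagram (parity for row 1 on Disk1, row 2 on Disk2, and so on); real RAID 5 implementations use various rotation schemes, and `parity_disk` and `parity_write_load` are hypothetical helper names.

```python
from collections import Counter


def parity_disk(level, stripe, n_disks):
    """Return the 0-based index of the disk holding parity for a stripe."""
    if level == 4:
        # RAID 4: parity always lives on the last (dedicated) disk.
        return n_disks - 1
    # RAID 5 (assumed rotation, as in the question's diagram):
    # parity moves one disk to the right with each stripe.
    return stripe % n_disks


def parity_write_load(level, stripes_written, n_disks):
    """Count parity updates per disk for a sequence of stripe writes."""
    return Counter(parity_disk(level, s, n_disks) for s in stripes_written)


writes = range(8)  # one write to each of stripes 0..7
raid4 = parity_write_load(4, writes, 4)
raid5 = parity_write_load(5, writes, 4)
print("RAID 4:", dict(raid4))
print("RAID 5:", dict(raid5))
```

Under these assumptions, all eight parity updates hit the same disk in the RAID 4 case, while in the RAID 5 case each disk receives two, which is the queueing difference described above.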

Here’s a nice link from Fujitsu with a short explanation and some nice animations to help clarify the performance/penalties of RAID 4 (as well as other RAID levels).