Time: 2020-12-08 www.sdyserver.cn
RAID10 and RAID01 comparison, RAID10 and RAID5 comparison

Comparison of RAID10 and RAID01


In RAID10, mirroring is done first, and then striping.


In RAID01, striping is done first, and then mirroring.


Taking 6 disks as an example, RAID10 first pairs the disks into 3 mirrored groups (RAID1) and then stripes across the 3 groups. RAID01 first builds a RAID0 stripe set from 3 disks and then uses the other 3 disks as a mirror of that stripe set.


The following uses 4 disks as an example to compare their data safety (tolerance of disk failures):


1. The RAID10 case


Assume DISK0 and DISK1 form one mirror pair and DISK0 fails. Of the remaining 3 disks, only a failure of DISK1, the mirror partner of DISK0, brings down the entire RAID. The chance that a second disk failure is fatal is therefore 1/3.


2. The RAID01 case


Here we again assume that DISK0 fails, so the stripe set on the left can no longer be read. Of the remaining 3 disks, a failure of either DISK2 or DISK3 brings down the entire RAID. The chance that a second disk failure is fatal is therefore 2/3.


Therefore, RAID10 offers better data safety than RAID01.
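The 1/3 and 2/3 figures can be checked by enumerating every possible second failure. The following is only a minimal sketch: the disk numbering (disks 0 and 1 on the left, disks 2 and 3 on the right) and the function names are assumptions made for illustration, not part of any RAID tool.

```python
from fractions import Fraction

# Assumed 4-disk layouts (numbering is hypothetical):
# RAID10: mirror pairs (0, 1) and (2, 3), striped across the pairs.
# RAID01: stripe sets (0, 1) and (2, 3), mirrored against each other.
GROUPS = [(0, 1), (2, 3)]

def raid10_alive(failed):
    # RAID10 survives while every mirror pair keeps at least one working disk.
    return all(any(d not in failed for d in pair) for pair in GROUPS)

def raid01_alive(failed):
    # RAID01 survives while at least one stripe set is fully intact.
    return any(all(d not in failed for d in stripe) for stripe in GROUPS)

def fatal_second_failure(alive, first=0, disks=4):
    # Fraction of possible second failures that bring down the array.
    others = [d for d in range(disks) if d != first]
    fatal = sum(1 for d in others if not alive({first, d}))
    return Fraction(fatal, len(others))

print(fatal_second_failure(raid10_alive))  # 1/3
print(fatal_second_failure(raid01_alive))  # 2/3
```

The same helper can be pointed at larger arrays by changing GROUPS and the disks argument.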


In terms of the logical placement of data, RAID01 and RAID10 are identical under normal conditions, and each read or write generates the same number of IOs, so the two do not differ in read and write performance. Once a disk fails, such as DISK0 in the example above, read performance does differ: RAID10 can still read from the 3 remaining disks, while RAID01 is left with only the 2 disks of the surviving stripe set, so RAID10 reads better than RAID01.





Comparison of RAID10 and RAID5


To make the comparison fair, both arrays use the same number of drives (4). RAID5 uses a 3D+1P scheme and RAID10 uses a 2D+2D scheme, as shown in the figure:


1. Comparison of safety


In terms of safety, there is no doubt that RAID10 is safer than RAID5, and a simple analysis shows why. When Disk 1 fails, a RAID10 array fails only if the mirror disk paired with Disk 1 also fails. For RAID5, the failure of any one of the remaining three disks causes the array to fail.


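RAID10 also rebuilds faster than RAID5 after a failure, since it only copies the surviving mirror, whereas RAID5 must read every remaining disk and recompute the lost data from parity.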


2. Comparison of space utilization


The space utilization of RAID10 is 50%, and that of the 4-disk RAID5 is 75%. The more disks a RAID5 array has, the higher its space utilization.
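The utilization figures follow from simple ratios: RAID10 stores every block twice, while RAID5 spends one disk's worth of capacity on parity. A minimal sketch, assuming equally sized disks (the function names are illustrative only):

```python
def raid10_utilization(disks):
    # Every block is stored twice, so half of the raw capacity is usable.
    if disks < 4 or disks % 2:
        raise ValueError("RAID10 needs an even number of disks, at least 4")
    return 0.5

def raid5_utilization(disks):
    # One disk's worth of capacity holds parity; the rest holds data.
    if disks < 3:
        raise ValueError("RAID5 needs at least 3 disks")
    return (disks - 1) / disks

for n in (4, 6, 8, 12):
    print(n, raid10_utilization(n), raid5_utilization(n))
```

With 4 disks this gives the 50% and 75% quoted above, and the RAID5 ratio climbs toward 100% as disks are added.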


3. Comparison of read and write performance


The analysis covers three cases: reads, continuous (sequential) writes, and discrete (random) writes.


Before looking at these three cases, one particularly important concept needs introducing: the cache.


The cache is now at the core of any storage system. Even low-end storage has a sizable cache, and even the simplest RAID cards generally carry tens or even hundreds of megabytes of RAID cache.


What does the cache do? Its role differs for reads and writes. For writes, a storage array generally treats a write as complete as soon as the data is in the cache, so array writes are very fast. Once enough data has accumulated, the array flushes it to disk, which allows writes to be batched. Cached data is generally protected by mirroring and by a battery (or UPS).


Read caching matters just as much. A read that hits the cache avoids a disk seek, and a seek generally takes more than 6 ms, which may be too slow for IO-intensive applications. A cache hit, by contrast, generally responds within 1 ms, and when the hit is served from memory in microseconds the gap approaches 3 orders of magnitude (1000 times).
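The effect of the hit rate on average read latency can be estimated with a weighted average. This is a rough sketch only: the 1 ms and 6 ms values are the bounds quoted above, and the hit rates are assumed purely for illustration.

```python
def average_read_latency_ms(hit_rate, hit_ms, miss_ms):
    # Weighted average of cache-hit and disk-seek response times.
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms

# Assumed figures: 1 ms on a cache hit, 6 ms on a miss that needs a disk seek.
for rate in (0.0, 0.3, 0.9):
    print(rate, average_read_latency_ms(rate, hit_ms=1.0, miss_ms=6.0))
```

The higher the hit rate, the closer the average moves to the cache response time.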


1) Performance differences in read operations


RAID10 can read valid data from 4 disks, and RAID5 can also read valid data from 4 disks (the parity information is distributed across all the disks), so the read performance of the two should be basically the same.


2) Performance difference in continuous writing


In continuous writes, provided there is a write cache and the algorithm is sound, RAID5 can even outperform RAID10, although the difference may be small. (This assumes the storage has a sufficiently large write cache and that the CPU's parity calculation is not a bottleneck.)


This is because the RAID parity is computed in the cache. With a 4-disk RAID5, for example, the parity can be calculated in memory first, and then 3 data blocks + 1 parity block are written at the same time. RAID10 can only write 2 data blocks + 2 mirror copies at the same time.


As shown in the figure above, a 4-disk RAID5 can take 1, 2, and 3 into the cache together. Once the parity has been calculated in the cache (assumed here to be 6), the three data blocks and the parity are written to disk at the same time. A 4-disk RAID10, with or without a cache, writes two data blocks and two mirror copies at the same time.
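The figure uses the sum 6 as a stand-in for the parity. RAID5 parity is in practice the bitwise XOR of the data blocks, so a full-stripe write can be sketched as below. Small integers stand in for whole blocks, and the helper name is an assumption for illustration.

```python
from functools import reduce
from operator import xor

def parity(blocks):
    # RAID5 parity is the XOR of all data blocks in the stripe.
    return reduce(xor, blocks)

# Full-stripe write: the three data blocks are in the cache, so the parity is
# computed in memory and all four blocks go to disk with no reads first.
data = [1, 2, 3]
p = parity(data)
print("parity:", p)  # 1 ^ 2 ^ 3 is 0 with XOR, not the illustrative 6

# Any single lost block can be rebuilt from the survivors plus the parity.
lost_index = 1
survivors = [b for i, b in enumerate(data) if i != lost_index]
print("recovered:", parity(survivors + [p]))  # 2
```

Because nothing has to be read back from disk, this full-stripe case costs 4 writes and 0 reads.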


As described in the caching section above, the write cache holds write operations until enough data has accumulated and only then writes it to disk. The write to the disk array still has to happen sooner or later, so under continuous writes RAID5 and RAID10 differ slightly in how fast the cache is written out to disk. Unless the continuous write load is sustained and heavy, however, the difference is small as long as the disks' write limit is not reached.


3) Performance differences in discrete writing


For example, an Oracle database writes one data block at a time, such as 8K. Each write is small and the writes are very frequent, so a workload such as the online log looks like continuous writing. However, because there is no guarantee that a full RAID5 stripe, such as 32K, is filled (so that every disk is written), the workload often behaves more like discrete writing (writing into a stripe that already holds data).


The figure above also shows how RAID5 and RAID10 differ for discrete writes. Suppose we want to change the number 2 into the number 4. For RAID5, 4 IOs actually occur: first the old data 2 and the old parity 6 are read (either read may hit the cache), then the new parity is calculated in the cache, and finally the new data 4 and the new parity 8 are written.


As the figure above shows, for the same single operation RAID10 needs only 2 IOs, while RAID5 needs 4 IOs.


This ignores the possibility that the two RAID5 reads hit the cache. If the data to be read is already in the cache, fewer than 4 IOs may be needed. This also shows how important the cache is to RAID5, not only for calculating parity but also for improving performance.
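The small-write behaviour described above can be sketched as a read-modify-write. XOR parity is used here rather than the figure's illustrative sum, and the cache-hit flags and function names are assumptions made for illustration.

```python
def raid5_small_write(old_data, old_parity, new_data,
                      data_cached=False, parity_cached=False):
    # Reads of the old data and old parity reach the disk only on a cache miss.
    disk_reads = (0 if data_cached else 1) + (0 if parity_cached else 1)
    # Remove the old data from the parity, then fold in the new data.
    new_parity = old_parity ^ old_data ^ new_data
    disk_writes = 2  # the new data block and the new parity block
    return new_parity, disk_reads + disk_writes

def raid10_small_write():
    # Mirroring needs no reads: one write to each copy of the block.
    return 2

# Stripe holds 1, 2, 3 with XOR parity 0; the block 2 is changed to 4.
new_parity, ios = raid5_small_write(old_data=2, old_parity=0, new_data=4)
print(new_parity, ios)  # 6 4, and 1 ^ 4 ^ 3 is also 6
print(raid5_small_write(2, 0, 4, data_cached=True, parity_cached=True)[1])  # 2
print(raid10_small_write())  # 2
```

When both reads hit the cache, the RAID5 small write costs the same 2 disk IOs as RAID10.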


Of course, this does not mean the cache is unimportant to RAID10, because write buffering and read hits are the key to improving speed. RAID10 simply does not depend on the cache as heavily as RAID5 does.


4) Comparison of disk IOPS


Consider a case where the business load is 10000 iops, the read cache hit rate is 30%, reads make up 60% of the iops and writes 40%, and there are 120 disks. The iops per disk for raid5 and raid10 can then be calculated as follows.


raid5:


The iops of a single disk = (10000*(1-0.3)*0.6 + 4 * (10000*0.4))/120


= (4200 + 16000)/120


= 168


Here 10000*(1-0.3)*0.6 is the read iops: reads account for 0.6 of the load, and after excluding cache hits only 4200 iops actually reach the disks.


4 * (10000*0.4) is the write iops: each write in raid5 actually causes 4 io, so the write iops is 16000.


To account for the fact that the two reads of a raid5 write may also hit the cache, a more accurate calculation is:


The iops of a single disk = (10000*(1-0.3)*0.6 + 2 * (10000*0.4)*(1-0.3) + 2 * (10000*0.4))/120


= (4200 + 5600 + 8000)/120


= 148


The iops of a single disk comes to 148, which is basically at the limit of the disk.


raid10:


The iops of a single disk = (10000*(1-0.3)*0.6 + 2 * (10000*0.4))/120


= (4200 + 8000)/120


= 102


Because raid10 needs only 2 io per write, each disk handles only 102 iops under the same load and with the same number of disks, which is far below the iops limit of the disk.
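The three calculations above can be reproduced with a short script. This is a sketch of the same arithmetic only: the function and parameter names are assumptions, and the write penalties (4 io for raid5, 2 io for raid10) are the ones used in the text.

```python
def raid5_disk_iops(total, read_ratio, write_ratio, hit_rate, disks,
                    write_reads_hit_cache=False):
    reads = total * read_ratio * (1 - hit_rate)  # reads that miss the cache
    writes = total * write_ratio
    if write_reads_hit_cache:
        # The 2 reads of each write can hit the cache; the 2 writes cannot.
        backend = reads + 2 * writes * (1 - hit_rate) + 2 * writes
    else:
        backend = reads + 4 * writes  # 2 reads + 2 writes per write
    return backend / disks

def raid10_disk_iops(total, read_ratio, write_ratio, hit_rate, disks):
    reads = total * read_ratio * (1 - hit_rate)
    writes = total * write_ratio
    return (reads + 2 * writes) / disks  # one write per mirror copy

args = dict(total=10000, read_ratio=0.6, write_ratio=0.4, hit_rate=0.3, disks=120)
print(f"{raid5_disk_iops(**args):.0f}")                              # 168
print(f"{raid5_disk_iops(**args, write_reads_hit_cache=True):.0f}")  # 148
print(f"{raid10_disk_iops(**args):.0f}")                             # 102
```

Changing the hit rate or the read/write mix in args shows how quickly the raid5 write penalty pushes a disk toward its limit.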


4. Summary


In summary, RAID5 is the better choice for systems that need high space utilization, have lower data-safety requirements, and store large files.


Conversely, RAID10 is the better choice for systems with very high data-safety requirements regardless of cost, small data volumes, and frequent writes.