Analyzing Backblaze’s Q3 2025 Stats


The Backblaze blog is awesome for finding out information about hard drives, especially if you are looking to build a NAS. They report quarterly information about the AFR, or annualized failure rate, of their hard drives grouped by hard drive model number. This metric is nice, but it doesn’t capture exactly what you want if you are looking to buy a reliable hard drive.

For example, a drive model with 0 failures over 10,000 drive days will have an AFR of 0%, but a model with 10 failures over 1,000,000 drive days will have a higher AFR of about 0.4%. Intuitively, we know there is not enough data on the first model to really know how reliable it is, so if I were building a NAS I would probably pick the second one. To capture this intuition, I re-analyzed the Backblaze Q3 2025 data and added two columns alongside the AFR mode: the mean and the 95% limit. I also put the results in a sortable HTML table so you can sort by whichever column you want.
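To make the comparison concrete, here is a rough Python sketch that applies the formulas from the Failure Model section further down to these two hypothetical drives. It assumes the exposure is counted in drive days (that is what makes the 0.4% figure come out), and `afr_summary` is just an illustrative helper name, not anything Backblaze provides.

```python
from scipy.stats import beta

def afr_summary(failures, drive_days, level=0.95):
    """Return (mode, mean, upper limit) of the AFR posterior as fractions."""
    a = failures + 1               # beta distribution alpha
    b = drive_days - failures + 1  # beta distribution beta
    mode = 365.0 * failures / drive_days
    mean = 365.0 * a / (a + b)
    limit = 365.0 * beta.ppf(level, a, b)
    return mode, mean, limit

# Hypothetical drives from the example (exposure assumed to be drive days).
sparse = afr_summary(failures=0, drive_days=10_000)
well_tested = afr_summary(failures=10, drive_days=1_000_000)

for name, (mode, mean, limit) in [("sparse", sparse), ("well tested", well_tested)]:
    print(f"{name:12s} mode={mode:.2%} mean={mean:.2%} 95% limit={limit:.2%}")
```

The drive with no failures wins on the mode, but its 95% limit is far larger than that of the well-tested drive, which is exactly the intuition the extra columns are meant to capture.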

Here is a short description of the columns:

AFR (mode)
This is what Backblaze reports as the AFR. It is the most likely value (mode) of the AFR posterior.
AFR (mean)
This is the mean of the AFR posterior. For drives with lots of data and some failures, this will be almost identical to the mode. However, for drives with no failures and little data, the posterior distribution is asymmetric (see the plot in the Failure Model section): the mode is 0, but the mean more accurately measures the center of the distribution.
AFR (95% limit)
This is the value that we are 95% sure the real AFR is below. If you are looking to buy a hard drive that is known to be reliable, this is probably the number you care about most.

Manufacturer  Model            Size (TB)  Drive Days  Failures  AFR (mode)  AFR (mean)  AFR (95% limit)
------------  ---------------  ---------  ----------  --------  ----------  ----------  ---------------
WDC           WUH722222ALE6L4  22         3,555,491   50        0.51%       0.52%       0.65%
Toshiba       MG10ACA20TE      20         1,416,127   21        0.54%       0.57%       0.78%
Seagate       ST16000NM001G    16         3,125,133   57        0.67%       0.68%       0.83%
Toshiba       MG08ACA16TA      16         3,686,376   85        0.84%       0.85%       1.01%
Seagate       ST12000NM001G    12         1,219,082   31        0.93%       0.96%       1.25%
WDC           WUH721414ALE6L4  14         794,781     20        0.92%       0.96%       1.33%
Toshiba       MG07ACA14TA      14         3,440,051   116       1.23%       1.24%       1.44%
Toshiba       MG08ACA16TE      16         553,332     16        1.06%       1.12%       1.60%
Seagate       ST8000DM002      8          824,787     27        1.19%       1.24%       1.65%
Seagate       ST14000NM001G    14         972,882     38        1.43%       1.46%       1.87%
Seagate       ST8000NM0055     8          1,224,440   53        1.58%       1.61%       1.99%
HGST          HUH721212ALE604  12         1,227,006   65        1.93%       1.96%       2.38%
HGST          HUH721212ALE600  12         239,677     9         1.37%       1.52%       2.39%
Seagate       ST12000NM000J    12         91,723      2         0.80%       1.19%       2.51%
Seagate       ST16000NM002J    16         42,581      0         0.00%       0.86%       2.57%
Seagate       ST12000NM0008    12         1,728,706   132       2.79%       2.81%       3.22%
Seagate       ST24000NM002H    24         601,539     46        2.79%       2.85%       3.57%
WDC           WUH721816ALE6L0  16         274,775     20        2.66%       2.79%       3.86%
HGST          HUH728080ALE600  8          98,985      6         2.21%       2.58%       4.37%
Toshiba       MG11ACA24TE      24         24,148      0         0.00%       1.51%       4.53%
Seagate       ST8000NM000A     8          22,724      0         0.00%       1.61%       4.81%
HGST          HUH721212ALN604  12         912,361     109       4.36%       4.40%       5.11%
Toshiba       MG09ACA16TE      16         17,852      0         0.00%       2.04%       6.12%
Toshiba       MG07ACA14TEY     14         85,530      8         3.41%       3.84%       6.16%
HGST          HMS5C4040BLE640  4          17,194      0         0.00%       2.12%       6.36%
Seagate       ST500LM030       1          15,345      0         0.00%       2.38%       7.12%
Seagate       ST12000NM0007    12         91,835      13        5.17%       5.56%       8.21%
Toshiba       MQ01ABF050       1          12,017      0         0.00%       3.04%       9.10%
Seagate       ST14000NM0138    14         117,131     22        6.86%       7.17%       9.79%
Seagate       ST14000NM000J    14         31,852      4         4.58%       5.73%       10.49%
Seagate       ST10000NM0086    10         91,650      20        7.97%       8.36%       11.57%
WDC           WUH721816ALE6L4  16         13,635      1         2.68%       5.35%       12.70%
Toshiba       MG08ACA16TEY     16         462,943     215       16.95%      17.03%      18.98%

Failure Model

To understand the raw data from Backblaze, we need to start with a model for hard drive failures. A realistic model would be too complicated (see for example Backblaze’s blog post here), so we’ll stick with a very simple one. Let’s assume that each drive of a given model has a probability p of failing on any given day, and that failures on different days are independent. We want to know the probability distribution of the failure rate p given that we observed t total drive days and f failures, or equivalently:

\begin{equation}P(p|t,f) = \frac{P(t,f|p)P(p)}{P(t,f)}\end{equation}

We’ll assume the prior P(p) is flat, and the denominator is just a normalization constant, so we have:

\begin{equation}P(p|t,f) \propto P(t,f|p)\end{equation}

The likelihood is just the probability of no failure on t-f drive days and a failure on f drive days (ignoring a combinatorial factor that doesn’t depend on p and so drops out of the proportionality), i.e.

\begin{equation}P(t,f|p) = (1-p)^{t-f}p^f\end{equation}

Combining these last two expressions we get that the posterior for the failure rate p is proportional to:

\begin{equation} P(p|t,f) \propto (1-p)^{t-f}p^f \end{equation}

This is just the beta distribution with parameters:

\begin{align} \beta &= t-f+1 \\ \alpha &= f+1 \end{align}
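As a quick numerical sanity check of this identification, the sketch below normalizes the unnormalized posterior on a grid and compares it with `scipy.stats.beta.pdf`. The grid bounds and resolution are my own choices for this particular drive (Toshiba MG08ACA16TA from the table), picked so that the grid covers essentially all of the posterior mass.

```python
import numpy as np
from scipy.stats import beta

t, f = 3_686_376, 85  # drive days and failures for Toshiba MG08ACA16TA

# Grid of daily failure probabilities around the bulk of the posterior.
p = np.linspace(1e-5, 4e-5, 2001)

# Unnormalized log posterior: (t - f) * log(1 - p) + f * log(p).
log_post = (t - f) * np.log1p(-p) + f * np.log(p)
post = np.exp(log_post - log_post.max())  # subtract the max to avoid underflow

# Normalize numerically with the trapezoid rule.
area = np.sum(0.5 * (post[1:] + post[:-1]) * np.diff(p))
grid_pdf = post / area

# Beta density with alpha = f + 1 and beta = t - f + 1.
exact_pdf = beta.pdf(p, f + 1, t - f + 1)
max_rel_err = np.max(np.abs(grid_pdf - exact_pdf)) / exact_pdf.max()
print(f"largest difference relative to the peak: {max_rel_err:.1e}")
```

Working in log space matters here: with millions of drive days, evaluating (1-p)^(t-f) p^f directly would underflow to zero.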

This is nice because now we can compute all sorts of things about the posterior. Since p is a daily failure probability, multiplying by 365 converts it to an annualized rate. For example, the mode (which Backblaze reports as the AFR) is:

\begin{equation}\mathrm{AFR (mode)} = 365\cdot\frac{f}{t}\end{equation}

We can also calculate the mean:

\begin{equation}\mathrm{AFR (mean)} = 365\cdot\frac{f+1}{t+2}\end{equation}
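These two formulas are simple enough to check by hand against the table. Here is a small sketch that does so for two rows; the function names are only illustrative.

```python
def afr_mode(failures, drive_days):
    """Mode of the AFR posterior: 365 * f / t."""
    return 365.0 * failures / drive_days

def afr_mean(failures, drive_days):
    """Mean of the AFR posterior: 365 * (f + 1) / (t + 2)."""
    return 365.0 * (failures + 1) / (drive_days + 2)

# (model, drive days, failures) copied from the table above.
rows = [
    ("Toshiba MG08ACA16TA", 3_686_376, 85),
    ("Seagate ST16000NM002J", 42_581, 0),
]
for model, t, f in rows:
    print(f"{model}: mode={afr_mode(f, t):.2%} mean={afr_mean(f, t):.2%}")
```

For the Toshiba drive the mode and mean come out as 0.84% and 0.85%, and for the Seagate drive with no failures they come out as 0.00% and 0.86%, matching the table.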

Finally, we can calculate the 95% limit using the scipy.stats.beta distribution:

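from scipy.stats import beta

# Example inputs: Toshiba MG08ACA16TA from the table above
t = 3686376  # total drive days
f = 85       # failures

a = f + 1      # alpha
b = t - f + 1  # beta
p95 = 365 * beta.ppf(0.95, a, b)  # 95% upper limit on the AFR, as a fraction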
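For drives with zero failures the 95% limit can also be worked out by hand, which is a handy cross-check. With f = 0 the posterior is a beta distribution with alpha = 1 and beta = t+1, whose CDF is 1-(1-p)^(t+1), so the limit is 1-0.05^(1/(t+1)), roughly 3/t. The sketch below (helper name is mine) compares that closed form with scipy for the Seagate ST16000NM002J row.

```python
from scipy.stats import beta

def afr_limit_zero_failures(drive_days, level=0.95):
    """Closed-form AFR upper limit when there are no failures.

    Solves 1 - (1 - p)**(t + 1) = level for p, then annualizes.
    """
    return 365.0 * (1.0 - (1.0 - level) ** (1.0 / (drive_days + 1)))

t = 42_581  # drive days for Seagate ST16000NM002J (0 failures)
closed_form = afr_limit_zero_failures(t)
from_scipy = 365.0 * beta.ppf(0.95, 1, t + 1)

print(f"closed form: {closed_form:.2%}, scipy: {from_scipy:.2%}")
```

Both routes give about 2.57%, the value in the table. The 3/t approximation also explains why drives with little data end up near the bottom of a table sorted by the 95% limit.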

It’s interesting to see how the posterior distribution changes over time. Here is a plot showing the posterior for the “Toshiba MG08ACA16TA” model after 1 day, 1 week, 1 month, and the full 3 months:

AFR Posterior for Toshiba MG08ACA16TA after 1 day, 1 week, 1 month, and 3 months

As you can see from the blue line, the posterior starts out with a most likely AFR of 0% after 1 day (40,025 drive days), but the distribution has a significant fraction of its weight reaching all the way out to an AFR of more than 2%. After 1 week (240,651 drive days), there have been 2 failures and the distribution is still asymmetric but peaks somewhere just below an AFR of 0.5%. After 1 month (1,243,220 drive days), the distribution is starting to look Gaussian with a peak somewhere near 0.75%. Finally, after the whole quarter (3,686,376 drive days), the distribution still looks Gaussian but the peak has shifted closer to an AFR of 1%.

One cool thing about this visualization is that it lets us sanity-check our assumptions. Although the peak shifted around quite a bit, all of the distributions had a significant fraction of their weight around the 1% AFR that the posterior eventually settled into, which suggests that our initial assumption of a failure rate independent of time is probably good for this drive over this time period (at least with the amount of data we have).
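The curves in a plot like this can be computed with a few lines of Python. The sketch below is not the exact plotting code, just one way to evaluate the posterior for the snapshots whose counts are quoted above. The 1-day failure count of 0 is inferred from the mode being 0%, and the 1-month snapshot is left out because its failure count isn't stated.

```python
import numpy as np
from scipy.stats import beta

# (label, drive days, failures) for Toshiba MG08ACA16TA.
snapshots = [
    ("1 day", 40_025, 0),      # 0 failures inferred from the mode of 0%
    ("1 week", 240_651, 2),
    ("3 months", 3_686_376, 85),
]

afr_grid = np.linspace(0.0, 0.03, 601)  # AFR from 0% to 3%

curves, tail_above_2pct = {}, {}
for label, t, f in snapshots:
    a, b = f + 1, t - f + 1
    # Change of variables AFR = 365 * p, so the density in AFR is pdf(p) / 365.
    curves[label] = beta.pdf(afr_grid / 365.0, a, b) / 365.0
    # Posterior probability that the AFR exceeds 2%.
    tail_above_2pct[label] = beta.sf(0.02 / 365.0, a, b)
    peak = afr_grid[np.argmax(curves[label])]
    print(f"{label:9s} peak at {peak:.2%}, P(AFR > 2%) = {tail_above_2pct[label]:.1%}")
```

Each entry in `curves` can be passed straight to a plotting library against `afr_grid`. After 1 day, roughly 11% of the posterior weight still sits above an AFR of 2%, which is the long tail visible in the blue line.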

Analyzing the Data

To analyze the data, I first downloaded the zip file containing all the CSV files from Backblaze’s Hard Drive Test Data page. I then created an SQLite database with the relevant columns using the following schema:

CREATE TABLE IF NOT EXISTS backblaze_stats (
    id INTEGER PRIMARY KEY,
    date TEXT,
    serial_number TEXT,
    model TEXT,
    capacity_bytes INTEGER,
    failure INTEGER,
    datacenter TEXT,
    cluster_id INTEGER,
    vault_id INTEGER,
    pod_id INTEGER,
    pod_slot_num INTEGER,
    is_legacy_format TEXT
);
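The import step itself could look something like the sketch below, which uses only the Python standard library. It assumes the CSV headers use the same names as the columns in the schema and simply ignores any other columns (such as the SMART attributes); the helper names and the sample rows are made up for illustration.

```python
import csv
import sqlite3

# Column names and types, mirroring the schema above.
COLUMN_TYPES = {
    "date": "TEXT", "serial_number": "TEXT", "model": "TEXT",
    "capacity_bytes": "INTEGER", "failure": "INTEGER", "datacenter": "TEXT",
    "cluster_id": "INTEGER", "vault_id": "INTEGER", "pod_id": "INTEGER",
    "pod_slot_num": "INTEGER", "is_legacy_format": "TEXT",
}

def create_table(conn):
    cols = ", ".join(f"{name} {sqltype}" for name, sqltype in COLUMN_TYPES.items())
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS backblaze_stats (id INTEGER PRIMARY KEY, {cols})"
    )

def load_rows(conn, lines):
    """Insert rows from an iterable of CSV lines (for example an open file)."""
    names = list(COLUMN_TYPES)
    sql = (f"INSERT INTO backblaze_stats ({', '.join(names)}) "
           f"VALUES ({', '.join('?' for _ in names)})")
    reader = csv.DictReader(lines)
    with conn:  # commit once after all rows are inserted
        # Columns missing from the CSV become NULL; extra columns are skipped.
        conn.executemany(sql, ([row.get(name) for name in names] for row in reader))

# Tiny made-up sample, just to exercise the loader.
sample = [
    "date,serial_number,model,capacity_bytes,failure,smart_1_raw",
    "2025-07-01,SN0001,EXAMPLE MODEL,16000900608000,0,123",
    "2025-07-02,SN0001,EXAMPLE MODEL,16000900608000,1,456",
]
conn = sqlite3.connect(":memory:")
create_table(conn)
load_rows(conn, sample)
print(conn.execute("SELECT COUNT(*), SUM(failure) FROM backblaze_stats").fetchone())
```

For the real data you would open each daily CSV with `open(path, newline="")` and pass the file object to `load_rows`, using a database file on disk instead of `:memory:`.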

Having the data in an SQLite database is nice because we can query the data by drive model and calculate the AFR all in a single command:

sqlite> SELECT model, SUM(failure) as failures, COUNT(*) AS drive_days, ROUND(365.0*SUM(failure)*100/COUNT(*),2) AS afr FROM backblaze_stats GROUP BY model ORDER BY drive_days DESC LIMIT 10;
model                 failures  drive_days  afr 
--------------------  --------  ----------  ----
TOSHIBA MG08ACA16TA   85        3686376     0.84
WDC WUH722222ALE6L4   50        3555491     0.51
TOSHIBA MG07ACA14TA   116       3440051     1.23
ST16000NM001G         57        3125133     0.67
WDC WUH721816ALE6L4   64        2425374     0.96
ST12000NM0008         132       1728706     2.79
TOSHIBA MG10ACA20TE   21        1416127     0.54
HGST HUH721212ALE604  65        1227006     1.93
ST8000NM0055          53        1224440     1.58
ST12000NM001G         31        1219082     0.93
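To get from this query to the three AFR columns, one option is to run it from Python and apply the beta posterior to each row. The sketch below assumes a `backblaze_stats` table like the one above; `afr_table` is a hypothetical helper, and sorting by the 95% limit is chosen to match the order of the table at the top of the post.

```python
import sqlite3
from scipy.stats import beta

QUERY = """
SELECT model, SUM(failure) AS failures, COUNT(*) AS drive_days
FROM backblaze_stats
GROUP BY model
"""

def afr_table(conn):
    """Return one dict per drive model, sorted by the 95% AFR limit."""
    rows = []
    for model, f, t in conn.execute(QUERY):
        rows.append({
            "model": model,
            "failures": f,
            "drive_days": t,
            "afr_mode": 365.0 * f / t,
            "afr_mean": 365.0 * (f + 1) / (t + 2),
            "afr_95": 365.0 * beta.ppf(0.95, f + 1, t - f + 1),
        })
    rows.sort(key=lambda r: r["afr_95"])
    return rows
```

Calling `afr_table(sqlite3.connect("drive_stats.db"))` (the file name is a placeholder) returns AFR values as fractions, so multiply by 100 to get percentages.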