(SLC) SSD Write Endurance Considered... Sufficient


Summary

No, you won't kill your SSD anytime soon. Seems hard to find actual, crunched numbers though.

The Numbers

Ever wonder how long your shiny new SLC SSD will last? It's funny how bad people are at estimating just how long "100,000 writes" will take when spread over a device with several thousand erase blocks across several gigabytes of memory. Not to mention newer drives that can take even more of a beating before giving out. So yeah, let's crunch some numbers for that:

[Figure 1: "Drive Wear After Continuous Writing to Flash Memory at 6Gb/s (w/ 8b/10b Coding) with Average Lifespan of Flash Cell Approximated at 100k Writes" - device wear (0 to 1) over time elapsed for 32GB, 64GB, 128GB and 256GB drives, with the 10% threshold line reached after roughly 53, 108, 215 and 430 days respectively. Source SVG: "Drive Integrity After Continuous Writing to Flash Memory at 6Gb/s with Average Lifespan of Flash Cell Approximated at 100k Writes".]

[Figure 2: the same plot with the average lifespan of a flash cell approximated at 1M writes - the 10% threshold line is reached after roughly 1.5, 2.9, 5.9 and 11.8 years respectively. Source SVG: "Drive Integrity After Continuous Writing to Flash Memory at 6Gb/s with Average Lifespan of Flash Cell Approximated at 1M Writes".]

Note how I crunched the numbers with the full 6Gbit/s throughput of SATA 3.0. We don't actually have drives that can deliver that kind of throughput, much less sustain it for several years. You also typically won't find any natural scenario where you keep writing gigabytes of random data to an SSD like that - and by "that" I mean "sustained for days, months or years." OS write caching hasn't been taken into consideration either.

As for the "10% Threshold" line, that's because SSDs typically contain about 10% of spare blocks hidden from the OS and used to replace dead blocks on the drive transparently. You typically won't find dead blocks in your fsck before that threshold is reached. Individual drives may contain fewer or more spares, but 10% seems a solid average at this time.

Net result: next time someone says "I wouldn't put my swap/log files on that flash drive," you'll know better. Might be a good idea to link them to this article while you're at it. I do have to admit that I had to make a few assumptions to plot those graphs, though - see below...

A Note on the Data

As a few helpful comments on Slashdot pointed out, I have in fact only considered SSDs using SLC NAND chips. That's because, personally, I've only ever bought high-end SSDs, so it came naturally to me. I took the 100k P/E cycle estimate from this datasheet, section 1.5 'Product Features'. As for the 1M figure, I assumed that for 'newer' SLCs following this press release.

A lot of consumer-grade SSDs use MLC NAND or worse, which will wear out a lot sooner. But then you probably still won't be writing to those constantly at speeds the drive doesn't even support, now will you? In any event, the calculations below should work for those as well - just plug fewer erase cycles and a lower maximum throughput into the equations.

The Crunching

As with all things in life, there are always several ways to calculate estimates. Personally, I think any graph that doesn't describe how it was calculated is worthless - so let's see how I came up with the two above.

Flash drives and SSDs only degrade when actually written to. More specifically we can assume they degrade during a block erase operation, as flash chips need to be erased in order to be written to. So in order to estimate the life span of an SSD we have to estimate the number of erased blocks for a given time span.

Write Speed

Since we're trying to make a point here, we'll use the theoretical maximum write speed we could possibly run across at this point: the maximum link speed of SATA 3.0, which happens to be higher than any other relevant interface or drive speed, so it's a solid upper bound.

$$\text{maximum write speed} = \text{maximum SATA 3.0 link speed} = 6\,\text{Gb/s (with 8b/10b coding)} = 600\,\text{MB/s}$$

Note: the original article did not take 8b/10b coding as used by SATA 3.0 into account, which decreases the throughput by 20%. The current graphs have been updated to reflect this new maximum throughput.
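
If you want to double-check that conversion, it's just the raw link rate times the 8b/10b coding efficiency, divided by 8 bits per byte - for instance as a quick gnuplot print (purely a sanity check, nothing more):

# 6 Gbit/s link rate, 80% efficient 8b/10b coding, 8 bits per byte
print 6.0*1000*1000*1000 * (8.0/10) / 8    # 600000000.0 bytes/s, i.e. 600 MB/s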

Block Sizes

For the calculations we'll assume the following:

$$\text{flash block size} = 8\,\text{KB}$$
$$\text{erase block size} = 256 \times \text{flash block size} = 256 \times 8\,\text{KB} = 2\,\text{MB}$$

This seems a reasonable estimate given typical contemporary flash chips.

Number of Erased Blocks by Time

The drive firmware should be able to compensate for suboptimal writes well enough to bundle small writes together to fill up whole erase blocks. At peak throughput this results in the drive needing to erase blocks at a rate of:

$$\text{block erase rate} = \frac{\text{write throughput}}{\text{erase block size}} = \frac{600\,\text{MB/s}}{2\,\text{MB}} = 300\,\text{blocks/s}$$

Notice that this rate decreases with the write throughput, meaning that at more realistic throughput levels the drive will have to erase fewer blocks than that per second. I'd assume that the higher throughput we're using for our calculations would thus compensate for a suboptimal write bundling algorithm in the drive's firmware.
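
As a quick sanity check with the numbers from above:

# 600 MB/s write throughput divided by 2 MB erase blocks
print (600.0*1000*1000) / (2*1000*1000)    # 300.0 blocks erased per second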

Number of Writes Per Erase Block

I seem to have trouble finding accurate statistics for this rather important metric. Contemporary flash chip descriptions seem to indicate an average of either 100,000 or 1,000,000 writes before a given block is likely to be dead. Unfortunately, most specification sheets on SSDs seem to omit the definition of "average" here. Thus I'll simply assume that number to be based on a Gaussian distribution and further assume a somewhat generous standard deviation of 10%.

This means that we should be able to use the following formula - a variant of the cumulative distribution function - to estimate the probability of a given block being broken after a given number of writes to it:

$$\text{probability of block being broken} = \frac{1 + \operatorname{erf}\left(\frac{\text{writes} - \text{write count}}{\sqrt{2 \times \left(\frac{\text{write count}}{10}\right)^2}}\right)}{2}$$

Note that erf is the Gauss error function, writes is the number of writes performed to a block and write count is either 100,000 or 1,000,000 - we'll crunch the numbers for both.
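
To make this a bit more tangible, here's that same expression evaluated at a single, purely illustrative sample point - a block that has seen 90,000 erase cycles on a 100k-rated chip - using the same gnuplot syntax as the plots further down:

writecount = 100000
writes = 90000
# Gaussian CDF with mean writecount and standard deviation writecount/10
print 0.5*(1+erf((writes-writecount)/sqrt(2*((writecount*0.1)**2))))    # roughly 0.16, i.e. about a 16% chance the block is already dead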

Number of Erase Blocks

The formula above gives us the probability that the blocks in a given erase block are broken after a given number of writes. Obviously any given SSD will have more than one erase block, and that number is easily calculated as:

$$\text{erase blocks} = \frac{\text{drive size}}{\text{erase block size}} = \frac{\text{drive size}}{2\,\text{MB}}$$
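
For example, for a 64GB drive (using the same 1000-based units as the gnuplot snippets below) that works out to:

# 64 GB drive divided into 2 MB erase blocks
print (64.0*1000*1000*1000) / (2*1000*1000)    # 32000 erase blocks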

Number of Broken Blocks by Number of Writes

To find out how many blocks are broken after a given number of writes, we'll further assume that we have a perfect wear leveling algorithm in the drive's firmware. That means that no matter which (logical) block we write to in software, the drive will distribute the erase and write operation to a truly random physical block. This greatly simplifies calculations by allowing us to apply the law of large numbers to our model, and it shouldn't be far off either - that is, after all, the purpose of wear leveling.

With this additional assumption we use the following formula:

$$\text{number of broken blocks} = \text{erase blocks} \times \text{probability of block being broken}$$

If we plot this with writes on the x axis, the y axis shows the average number of broken blocks - in essence we're running (number of erase blocks) separate probability experiments in parallel, and multiplying that count by the probability of an individual block breaking gives the number of blocks that have broken so far.
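
Continuing the purely illustrative 64GB / 90,000-writes example from above, that would mean something like:

eraseblocks = 32000
writecount = 100000
writes = 90000
# expected broken blocks = erase blocks x probability of an individual block being broken
print eraseblocks * 0.5*(1+erf((writes-writecount)/sqrt(2*((writecount*0.1)**2))))    # roughly 5080 broken blocks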

The Result

If we bring the previous formulas together, we find that we simply need to feed our probability calculation with the elapsed time scaled by the block erase rate per erase block - that is, the average number of writes each block has seen - to get an estimate of how long we need to write to a drive before it starts breaking.

The resulting formula for our plot is thus:

$$\text{probable device wear} = \frac{1 + \operatorname{erf}\left(\frac{\text{seconds} \times \frac{\text{block erase rate}}{\text{erase blocks}} - \text{write count}}{\sqrt{2 \times \left(\frac{\text{write count}}{10}\right)^2}}\right)}{2}$$

Substituting variables yields:

$$(\ldots) = \frac{1 + \operatorname{erf}\left(\frac{\text{seconds} \times \frac{\text{write throughput}\,/\,\text{erase block size}}{\text{drive size}\,/\,\text{erase block size}} - \text{write count}}{\sqrt{2 \times \left(\frac{\text{write count}}{10}\right)^2}}\right)}{2} = \frac{1 + \operatorname{erf}\left(\frac{\text{seconds} \times \frac{\text{write throughput}}{\text{drive size}} - \text{write count}}{\sqrt{2 \times \left(\frac{\text{write count}}{10}\right)^2}}\right)}{2}$$

This plot indicates how many of the drive's erase blocks have broken so far, on a scale from 0 (none) to 1 (all of them) on the y axis, with the elapsed time on the x axis. Interestingly, the exact erase block size we used to derive the formula is not needed for this calculation - it cancels out.

The plots include a guiding line at 0.1 - this is because SSDs are built with approximately 10% spare blocks to compensate for burnt-out flash blocks. After 10% of the blocks are broken you'll start to "see" them in software - before that you won't notice anything, because the drive firmware will hide them. Or at least it's supposed to, anyway.
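
If you only care about the point where a drive crosses that 0.1 line, you don't even need the full plot: the wear curve is just a Gaussian CDF, so its 10% point is the 10th percentile of a normal distribution with mean write count and standard deviation write count/10. Assuming your gnuplot build has the usual invnorm built-in, that percentile - and the corresponding time for the 64GB / 100k case from the graphs above - can be computed directly:

speed=6
size=64
writecount=100000
# writes per block at the 10% wear mark: the 10th percentile of the write count distribution
writes10 = writecount + invnorm(0.1)*writecount*0.1
# time (in days) to reach that many writes per block at full throughput
print writes10 * (size*1000.0*1000*1000) / (speed*1000.0*1000*1000/10) / 86400    # roughly 108 days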

We can now use a tool like gnuplot to render the plot with appropriate variables substituted, e.g. like this for a 64GB drive at 100k writes:

# link speed in Gbit/s, drive size in GB, average cell lifespan in writes
speed=6
size=64
writecount=100000
set yrange [0:1]
# x axis: elapsed time in seconds
set xrange [800000:20000000]
# speed*1000*1000*1000/10 converts Gbit/s to bytes/s, including the 8b/10b coding overhead
plot (0.5*(1+erf(((x*(speed*1000*1000*1000/10)/(size*1000*1000*1000))-writecount)/sqrt(2*((writecount*0.1)**2)))))
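
The same template works for the MLC drives mentioned earlier - just plug in that drive's own numbers. As a purely hypothetical example (the 10,000 erase cycles and 3Gb/s link speed below are placeholder values, not taken from any particular datasheet):

# hypothetical consumer drive: 3Gbit/s link, 64GB, 10,000 erase cycles per block
speed=3
size=64
writecount=10000
set yrange [0:1]
set xrange [0:4000000]
plot (0.5*(1+erf(((x*(speed*1000*1000*1000/10)/(size*1000*1000*1000))-writecount)/sqrt(2*((writecount*0.1)**2)))))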

Alternatively, we could also have plotted the number of broken blocks directly:

$$\text{probable number of broken blocks} = \text{erase blocks} \times \frac{1 + \operatorname{erf}\left(\frac{\text{seconds} \times \frac{\text{block erase rate}}{\text{erase blocks}} - \text{write count}}{\sqrt{2 \times \left(\frac{\text{write count}}{10}\right)^2}}\right)}{2}$$

Unfortunately this plot makes it harder to compare different SSD models, as the number of erase blocks depends on the size of the device. I still thought it'd be worth mentioning.
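
For the record, a gnuplot version of that alternative plot might look like this - it's the same wear expression as before, scaled by the erase block count of a 64GB drive:

speed=6
size=64
writecount=100000
# erase blocks: drive size divided by the 2MB erase block size
eraseblocks=(size*1000.0*1000*1000)/(2*1000*1000)
set yrange [0:eraseblocks]
set xrange [800000:20000000]
# same wear curve as before, multiplied by the number of erase blocks
plot (eraseblocks*0.5*(1+erf(((x*(speed*1000*1000*1000/10)/(size*1000*1000*1000))-writecount)/sqrt(2*((writecount*0.1)**2)))))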

Last Modified: 2013-03-03T18:45:00Z


Written by Magnus Achim Deininger