Reliability of Magnetic Data Tape
Investigation of reliability of LTO tapes based on over 1 Million records

An examination of the usage and error records of over one million used LTO cartridges reveals that the LTO technology is very robust. With very few exceptions all cartridges perform well. A few bad cartridges can, however, be the cause of many problems. Their elimination could greatly improve the performance of a data center.

The last line of defense

Restoring lost data from magnetic data tape is often the last line of defense against a permanent loss of critical data. Failure is likely to be noticed and remembered. The tape technology is expected to perform flawlessly. Is this expectation realistic? Can you be confident that your tape will perform without fail every time and that you will be able to restore lost data from an archived tape?

Little empirical data about the reliability of magnetic data tape seem to exist, at least not in the public domain. (For a notable exception, see the NERSC Study, a case study of tape reliability at a single but large data center.) Most statements about tape reliability are of a general nature and are based on an analysis of the tape technology. While stated error rates of one error in 1x10^17 to 1x10^19 bits are impressive and noteworthy, they may not tell data center professionals what they need to know.

In the following, we take a closer look at the quality data of over one million used tapes in order to find an answer.

Almost all tapes are LTO tapes

Today almost all data tapes are LTO tapes, and the LTO technology is the clear winner in the tape market. Smaller, lower-cost tape technologies are extinct, and so are high-end mainframe technologies, with the exception of IBM’s 3592, which is heavily based on LTO technology. Investigating the reliability of today’s magnetic data tapes therefore means investigating LTO tapes.

In the following, we examine the usage and error records of over one million LTO cartridges, or just less than one-half of one percent of all shipped LTO cartridges.

The cartridge records

Each LTO cartridge contains an auxiliary memory chip that stores cartridge and drive related data, such as the cartridge type, the location of data records on the tape, and importantly, the usage and error history of the cartridge. Tape drives update these statistics each time a cartridge is used. The size of this memory was initially 4 KB, but was doubled twice and is now 16 KB.

It contains many summary statistics accumulated over the lifetime of the cartridge and detailed statistics of the last 4 times the cartridge was used. The later LTO generations store over 200 numbers that are relevant for evaluating the cartridge reliability in this auxiliary memory.
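The split between lifetime summaries and a short history of recent uses can be sketched as a small data structure. This is only an illustration: the class names, field names, and the three example fields are assumptions made for this sketch and do not reflect the actual layout of the cartridge memory.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SessionStats:
    """Detailed statistics for one use of the cartridge (hypothetical fields)."""
    drive_serial: str
    megabytes_read: int
    unrecovered_read_errors: int


@dataclass
class CartridgeRecord:
    """Lifetime summary plus the most recent sessions (hypothetical layout)."""
    cartridge_type: str
    lifetime_loads: int = 0
    lifetime_unrecovered_read_errors: int = 0
    # Only the last 4 uses are kept in detail; older sessions fall off the end.
    recent_sessions: deque = field(default_factory=lambda: deque(maxlen=4))

    def record_session(self, session: SessionStats) -> None:
        """Update the record the way a drive would after each use."""
        self.lifetime_loads += 1
        self.lifetime_unrecovered_read_errors += session.unrecovered_read_errors
        self.recent_sessions.append(session)
```

After six recorded sessions, such a record would still report six lifetime loads but hold detailed statistics for only the four most recent ones.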

The cartridge memory chip is located in a corner of every LTO cartridge. The antenna allows wireless access from the tape drive and from other devices.

Cartridge Memory Chip
Contamination - a common cause of problems

Contamination of the tape is a frequent cause of problems. For a detailed study of tape contamination, see: How Contaminants Affect Tape Data Reliability at High Areal Densities.

How to score

Having such detailed statistics is very beneficial, but their sheer volume makes getting a clear picture of a cartridge difficult, if not impossible.

For our purposes we look at two numbers. The first is a composite number that is calculated from all relevant information. It ranges from 0 (very bad) to 100 (very good). We developed the algorithm based on the requirements of our customers. It is used in tape libraries and in one of our products called "VeriTape®". See veritape.mptapes.com

Contaminated Tape

Photo of a rectangular section of an LTO 3 tape measuring about 1 mm x 0.75 mm [0.04 x 0.03 inch]. In the center is a section of one of four servo bands, with data tracks on either side.

The contamination above the servo band is smaller than 0.2 mm in diameter and it covers about 10 data tracks. On an LTO 8 tape this contamination could impact over 100 data tracks.
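The figures in this caption imply a rough track pitch, which the following sketch works out. The pitches are derived only from the numbers quoted above (0.2 mm over about 10 tracks, and over 100 tracks on LTO 8); they are not taken from the LTO specification, and the function names are made up for this example.

```python
def implied_pitch_um(diameter_um: float, tracks_covered: int) -> float:
    """Track pitch implied by a particle of a given size covering N tracks."""
    return diameter_um / tracks_covered


def tracks_covered(diameter_um: float, track_pitch_um: float) -> int:
    """Number of whole tracks spanned by a particle of the given diameter."""
    return int(diameter_um // track_pitch_um)


# 0.2 mm = 200 micrometers, covering about 10 tracks on LTO 3.
lto3_pitch = implied_pitch_um(200, 10)

# For the same particle to cover 100 tracks, the pitch must be 2 micrometers;
# covering "over 100" tracks therefore implies a pitch below that.
lto8_pitch_upper_bound = implied_pitch_um(200, 100)
```

Under these assumptions the same particle that spans about 10 tracks at a 20 micrometer pitch spans 100 tracks at a 2 micrometer pitch, which is why a fixed amount of contamination affects far more data on later generations.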

It is important to note that the algorithm was developed for a specific purpose: to eliminate bad cartridges. The algorithm casts a wide net, since it is preferable to eliminate several marginal cartridges rather than let one bad cartridge go undetected. Many of the cartridges in the ‘bad’ category may still be able to perform adequately.

The second number is called ‘Hard Read Error’ or ‘Unrecovered Read Error’. An unrecovered read error occurs when all attempts to read a record fail. Most libraries will try the tape in one or several other drives before giving up. If the read operation in another drive is successful, a user may not even notice. If all attempts fail, a user probably will notice. For each of the failed attempts, the Hard Read Error counter is incremented. Even if the read eventually succeeds, the Hard Read Error counter in the cartridge holds the count of all previously failed attempts.
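The counting behavior described above can be sketched as follows. This is a simplified model, not drive or library firmware: each drive is represented by a hypothetical callable that returns True when the read succeeds and False when it ends in an unrecovered read error.

```python
from typing import Callable, Sequence, Tuple


def read_with_retries(
    hard_read_errors: int,
    drives: Sequence[Callable[[], bool]],
) -> Tuple[bool, int]:
    """Try the cartridge in each drive in turn.

    Returns (success, updated_counter). Every failed attempt increments the
    counter, and the counter is never reset when a later attempt succeeds.
    """
    for attempt_read in drives:
        if attempt_read():
            return True, hard_read_errors
        hard_read_errors += 1
    return False, hard_read_errors


# Two drives fail and the third succeeds: the user sees a successful restore,
# but the cartridge now carries two hard read errors in its history.
success, counter = read_with_retries(0, [lambda: False, lambda: False, lambda: True])
```

In this model a read that the user never notices as a problem still leaves a trace in the cartridge's counter, which is what makes the counter useful for spotting marginal cartridges.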

Media manufacturers like to point out that in many error cases, the problem could be with the drive and not the media. While this is correct, does it really matter to the user? Any failure is a failure of the tape technology, no matter which component is at fault.

Origin of the Records

While the Cartridge Memory contains a lot of information regarding the past cartridge performance, it does not contain any information relating to the user of the cartridge or to the user’s data. Most of our customers feel comfortable transmitting these records to us.

We receive the performance records from two sources.

One of our products is an LTO Eraser [eraser.mptapes.com], which erases the data from the tapes but leaves the magnetically recorded servo tracks intact. This allows the erased tapes to be used again. During the erasing process, our process management software collects the performance records of each erased cartridge and generates a report, showing details of the erased tapes, including their past performance.

The second source is the above-mentioned product, VeriTape® [veritape.mptapes.com]. Many customers use it to check large numbers of cartridges and occasionally send us their records for evaluation.

Screen of erasing management software

The software that manages the erasing process collects the CM record of each erased tape.

VeriTape® screen
Results
Score using composite algorithm

The score is based on an algorithm that uses all available data to calculate a single comprehensive number. For more details on how this number is calculated, see: VeriScore.

This precautionary quality score, based on an algorithm designed to weed out potential problem cartridges, would eliminate 3% of the cartridges.
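How a cut-off on a 0 to 100 score translates into a share of eliminated cartridges can be sketched as below. The cut-off value and the scores are invented for illustration; the actual VeriScore algorithm and its thresholds are not described on this page and are not reproduced here.

```python
from typing import Sequence


def fraction_eliminated(scores: Sequence[float], cutoff: float) -> float:
    """Share of cartridges whose score falls below the cut-off."""
    if not scores:
        return 0.0
    return sum(1 for score in scores if score < cutoff) / len(scores)


# Made-up population: 97 healthy cartridges and 3 with low scores.
example_scores = [95] * 97 + [10, 20, 30]
HYPOTHETICAL_CUTOFF = 50  # assumption for this sketch only

eliminated = fraction_eliminated(example_scores, HYPOTHETICAL_CUTOFF)
```

Raising the cut-off casts a wider net: more marginal cartridges are removed along with the bad ones, which is the trade-off the algorithm deliberately accepts.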

Score based on unrecoverable read error

This graph shows the percentage of cartridges that had up to the indicated number of unrecovered read errors.

Over 95% of cartridges never had an unrecovered read error in their lifetime.
Almost 99% had no more than three errors.
Fewer than 0.3% had more than 10 errors.
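Percentages of this kind are a cumulative distribution over the per-cartridge error counters. A minimal sketch of the calculation follows, using invented counts rather than the records examined here; the function name is an assumption of this example.

```python
from typing import Dict, Iterable, Sequence


def cumulative_percentages(
    error_counts: Iterable[int],
    limits: Sequence[int],
) -> Dict[int, float]:
    """For each limit, the percentage of cartridges with at most that many errors."""
    counts = list(error_counts)
    if not counts:
        raise ValueError("at least one cartridge record is required")
    return {
        limit: 100.0 * sum(1 for c in counts if c <= limit) / len(counts)
        for limit in limits
    }


# Made-up counters for 100 cartridges: 96 clean, 3 with a few errors, 1 bad.
example_counts = [0] * 96 + [1, 2, 3, 12]
table = cumulative_percentages(example_counts, limits=[0, 3, 10])
share_above_10 = 100.0 - table[10]
```

The share of cartridges with more than a given number of errors is simply 100% minus the cumulative value at that limit.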
Charts: Quality Score; Unrecovered Read Error

Are higher density tapes less reliable?

The native capacity of LTO tapes has increased more than 100-fold, from 100 GB for LTO 1 to 12 TB for LTO 8. More than 60% of this increase is due to the increased areal density.

Contamination, even by small particles, impacts many more bits on tapes of the later LTO generations. Does this lower their reliability? Perhaps surprisingly, our data suggest that this is not the case.

Later generation LTO tapes are as reliable as the earlier tapes.


We should note, however, that most of our records come from the older generations. Most are LTO 3 and LTO 4 cartridges. LTO 6 and LTO 7 combined comprise only about 0.1% of the total.

We continue to receive LTO performance records. For some time now, we have received almost no LTO 1 or LTO 2 records. LTO 5 and LTO 6 records are picking up fast, while LTO 3 and LTO 4 are slowly declining. We will update our statistics as we receive more records. Please check back on this page.

Charts: Native Capacity [TB]; Areal Density [MB/mm²]; Generational Mix

Statistical relevance

Although our sample size is quite large, the examined cartridges were not selected at random.

The results are therefore not statistically representative; they apply to the examined cartridges only.

Most tapes are used to archive data. The tapes are filled to capacity and then stored for a long time. Are most tapes just written once, archived, and never loaded into a drive again? Can tape reliability be determined from this very limited use?

The usage records of the examined cartridges reveal that the tapes were used more heavily than the “write once, archive, and never read” pattern may suggest.

Cartridge Usage


2% were loaded more than 500 times.
7% were loaded more than 250 times.
18% were loaded more than 100 times.
About 50% were loaded more than 20 times.
67% were loaded 10 times or more.
Conclusion

The examined tape records were collected from a mixture of large and small sites, from very well maintained sites, and from sites where maintenance is not a priority. Despite that, the results show that the LTO technology is remarkably robust and reliable.

Maintenance is still important, even for a technology as reliable as LTO. Removing damaged cartridges from a site will eliminate potential problems and improve the site’s performance. There is no place in any library for a cartridge with over 500 unrecovered read errors.

Questions or Comments?
We welcome any response you may have.
Please send your question to: LTO Reliability