Asked how long data should be archived, one could say “long enough to ensure they are available when you need them.” This statement illustrates the two most important variables in the archiving equation: time and accessibility.
Time, or more precisely the retention period, is the main factor when choosing a solution for a business’s archiving needs. However, data retention requirements can vary considerably, often from one application to another. For example, all companies must manage financial data, which are usually kept for seven to 10 years. For human resources data the period may be shorter, but regulations differ between countries and companies. Medical data may be kept for the patient’s life (or slightly longer), data on atomic energy for 70 years, and so on.
But what do all these retention periods have in common?
It’s actually very simple: compliance. In most cases, data retention requirements correspond to the limitation period, after which a party (whether public or private) can no longer bring an action against the company. Any failure to produce documents required by a court order is subject to civil penalties and, in some cases, criminal charges. Conversely, keeping records beyond the compulsory period may expose this information during a legal investigation and unnecessarily weaken the company’s legal position.
Unfortunately, most IT professionals have little or no legal knowledge. Therefore, the first step in developing an archiving strategy is to inventory the data and assign each type an official retention schedule that is easy for anyone to follow. The company’s legal advisers may be able to provide the necessary information on statutes of limitations for different types of data. If the lawyers cannot take this on, the heads of the departments that “own” the data should be able to provide it, since they are expected to know the regulatory environment of their activities. Sometimes lawyers and department heads prefer not to set a fixed schedule. In this case, IT departments should not try to guess. When no specific duration has been set, the default retention period is “ad infinitum”. This is not ideal, but IT managers do not always have a choice in the matter. It is, in most cases, better to keep something too long than to throw it away too early.
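As a rough illustration of such a schedule, the following Python sketch maps data categories to retention periods and falls back to indefinite retention when no period has been set. The category names, the function names and the choice of 10 years for financial data (the upper end of the range quoted above) are assumptions made for the example, not legal guidance; real values must come from the legal advisers or the data owners.

```python
from datetime import date
from typing import Optional

# Hypothetical schedule: category -> retention period in years.
# The figures echo the examples in the text and are illustrative only.
RETENTION_YEARS = {
    "financial": 10,      # assumed: upper end of "seven to 10 years"
    "atomic_energy": 70,
}


def disposal_date(category: str, created: date) -> Optional[date]:
    """Return the earliest date the record may be disposed of.

    Returns None when the category has no agreed period, which means
    "keep indefinitely" rather than guessing a duration.
    """
    years = RETENTION_YEARS.get(category)
    if years is None:
        return None
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # 29 February in a non-leap target year: move to 1 March.
        return date(created.year + years, 3, 1)


def is_expired(category: str, created: date, today: date) -> bool:
    """True only if a period is defined and it has fully elapsed."""
    due = disposal_date(category, created)
    return due is not None and today >= due
```

With this sketch, a record in a category missing from the schedule is never reported as expired, which matches the “ad infinitum” default described above.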
The term “archive” has been used somewhat loosely in recent years. Archiving can mean moving seldom-used data to inexpensive high-capacity drives, to tape backup, or to offline or off-site storage. Companies also have many data protection solutions at their disposal (combining snapshots, replication and backup) that can form part of an archiving arsenal. Such an infrastructure is needed to respond economically to the diverse retention requirements mentioned above, and it must balance these needs against the complexity of the solutions. A good archiving solution will therefore provide the automation needed to ensure the required granularity at the application level, while minimizing the impact on overall IT operations.