Of the 7 phases of data lifecycle management, we have now covered: Data Capture, Data Maintenance, Data Synthesis, Data Usage, and Data Publication.  On to Phase 6: Data Archival.

Data archival is one of the most misunderstood phases of data lifecycle management.  Archival is often vaguely lumped in with backup and viewed as a secondary necessity. But in fact, data archival is a key element in a holistic enterprise data lifecycle management strategy.

To operate legally, organizations must comply with thousands of regulatory directives. These include data retention regulations and data archival specifications such as SEC Rules 17a-4 & 17a-3, which specify retention requirements of records to be made by and preserved by certain exchange members, brokers, and dealers.

Key points of the SEC Rules:

  • Documented and enforceable retention policies.
  • Searchable index of stored enterprise data.
  • Data must be retrievable and readable.
  • Data should be stored in multiple/alternative locations.
  • Storage of data on WORM (Write Once Read Many) electronic media to facilitate preservation of unchanged data. 

 

What is Data Archival?

Data archival is the act of moving data that is no longer actively used to a separate storage area dedicated to data preservation and retention.

A data archive is a set of older data that is no longer in active use but must be preserved, either for future business reference or because regulatory requirements mandate it.
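To make the idea of "moving data that is no longer actively used" concrete, here is a minimal Python sketch. It assumes that a file's last-modified time is a reasonable stand-in for activity, which will not hold for every system. The function name `archive_inactive` and its parameters are hypothetical and chosen only for illustration.

```python
import shutil
import time
from pathlib import Path

SECONDS_PER_DAY = 86_400


def archive_inactive(active_dir, archive_dir, max_idle_days, now=None):
    """Move files not modified within max_idle_days into archive_dir.

    Returns the sorted names of the files that were moved.
    """
    now = time.time() if now is None else now
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    # sorted() builds the full listing first, so moving files is safe here.
    for path in sorted(Path(active_dir).iterdir()):
        # Assumption: last-modified time approximates "no longer in active use".
        if path.is_file() and path.stat().st_mtime < cutoff:
            shutil.move(str(path), str(archive_dir / path.name))
            moved.append(path.name)
    return moved
```

A real archive job would normally take the idle threshold from a documented retention policy rather than a hard-coded number, and would record each move in a searchable index.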

How is Archival different from Backup?

Data archival is a different process and has a different purpose from data backup.  The primary purpose for backup is data protection, while the primary purpose for archival is retention.

Backup is about quick recovery from a data or storage failure. Its success is measured by recovery point (did you lose any data?) and recovery time (how long were you down?).

Archival is about preserving data and protecting it from change. Its success is measured by policy adherence (how well did you follow your retention management policies?) and retention time (how promptly did you purge the data from all instances of your systems once it met its retention criteria?).
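The two archival measures above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each record carries a single retention period in days, and the `ArchivedRecord` class, its fields, and both helper functions are hypothetical names, not part of any product or regulation.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class ArchivedRecord:
    record_id: str
    archived_on: date
    retention_days: int               # hypothetical policy value
    purged_on: Optional[date] = None  # None means the record still exists

    @property
    def purge_due(self) -> date:
        """Date on which the retention period ends."""
        return self.archived_on + timedelta(days=self.retention_days)


def overdue_for_purge(records, today):
    """IDs of records whose retention has expired but that are not yet purged."""
    return [r.record_id for r in records
            if r.purged_on is None and r.purge_due <= today]


def purge_lag_days(record):
    """Days between the purge due date and the actual purge, or None if unpurged.

    A negative value means the record was removed before its retention ended.
    """
    if record.purged_on is None:
        return None
    return (record.purged_on - record.purge_due).days
```

A report built on `overdue_for_purge` speaks to policy adherence, while `purge_lag_days` gives a rough handle on how promptly expired data was actually removed.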

 

Why Archive?

There are three major reasons to archive:

  1. To preserve data for business reference. Examples include financial records, intellectual property records, patents filed, and so on.
  2. To preserve data in order to meet regulatory or legal requirements such as SEC Rule 17a-4 and 17 CFR Part 241.
  3. To manage cost and performance. The larger the production environment, the more complex and costly it is to manage.

What to consider when archiving?

  • Structured versus unstructured data.
  • Regulated versus non-regulated data.
  • Immutable versus non-immutable requirements for regulated data.

What are the desired outcomes of an archive strategy?

Every archive deployment should provide the capability to:

  1. Remove and reclaim storage allocations from production and backup environments.
  2. Move archive data from near-line to long-term archive media based on business and regulatory policies.
  3. Move data into immutable archives.
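As a rough illustration of the third outcome, the sketch below imitates write-once behavior on an ordinary file system. It is not real WORM storage: the read-only permission is advisory and can be reversed by an administrator, so treat this as a way to picture the idea rather than a compliant implementation. The function names `write_once` and `is_unchanged` are hypothetical.

```python
import hashlib
import os
import stat
from pathlib import Path


def write_once(archive_dir, name, data: bytes) -> str:
    """Store data under name, refuse to overwrite, and return its SHA-256 digest."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / name
    # Mode "xb" raises FileExistsError if the file is already present.
    with open(target, "xb") as fh:
        fh.write(data)
    # Drop write permission. This is advisory only, not true immutability.
    os.chmod(target, stat.S_IRUSR | stat.S_IRGRP)
    return hashlib.sha256(data).hexdigest()


def is_unchanged(archive_dir, name, expected_digest) -> bool:
    """Check that the archived bytes still match the digest recorded at write time."""
    data = (Path(archive_dir) / name).read_bytes()
    return hashlib.sha256(data).hexdigest() == expected_digest
```

Keeping the returned digest in a separate index is one way to show later that an archived object is unchanged, which is the property WORM media are meant to provide.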

There is more to archiving than this one blog post, so we will circle back later with more.  Feel free to comment or reach out if you want more before the next post on this subject.

Leave your thoughts in the comments section, or feel free to contact me at datafabricblog@gmail.com.

I am happy to send an editable version of the images of the 7 phases of the data lifecycle; just email me your request.

About the author:

Juliana Carroll is a problem solver and strategic thinker with 15+ years of Fortune 500 consulting experience delivering measurable results that align with both competitive business requirements and regulatory compliance.

Juliana delivered exceptional results for organizations including Morgan Stanley, Prudential, Merck, Guardian Life Insurance, Blue Cross Blue Shield of Florida, and Deutsche Bank.

Connect with Juliana Carroll via LinkedIn.