
Digital Archiving

PI Vermaes   verpister@gmail.com

We recommend that production master image files be stored on hard drive systems with a level of data redundancy, such as RAID drives, rather than on optical media, such as CD-R. An additional set of images with metadata stored on an open standard tape format (such as LTO) is recommended (CD-R as backup is a less desirable option), and a backup copy should be stored offsite. Regular backups of the images onto tape from the RAID drives are also recommended. A checksum should be generated for each image file and stored with the image files.
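As a rough illustration of storing checksums alongside the image files, the sketch below writes one manifest per directory. The choice of SHA-256, the manifest file name `MANIFEST.sha256`, and the function names are assumptions made for this example; the recommendation above does not prescribe a particular algorithm or layout.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(image_dir: Path, manifest_name: str = "MANIFEST.sha256") -> Path:
    """Write one '<digest>  <relative path>' line per file under image_dir.

    The manifest name is a hypothetical convention, not a standard.
    """
    manifest = image_dir / manifest_name
    lines = []
    for path in sorted(image_dir.rglob("*")):
        # Skip directories and the manifest itself.
        if path.is_file() and path != manifest:
            relative = path.relative_to(image_dir).as_posix()
            lines.append(f"{sha256_of(path)}  {relative}")
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Calling `write_manifest(Path("masters"))` on a directory of master files would leave the manifest next to the images, so it travels with them onto tape or offsite copies.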

Currently, we use CD-ROMs for distribution of images to external sources, not as a long-term storage medium. However, if images are stored on CD-ROMs, we recommend using high quality or "archival" quality CD-Rs (such as Mitsui Gold Archive CD-Rs). The term "archival" indicates the materials used to manufacture the CD-R (usually the dye layer where the data is recorded, a protective gold layer to prevent pollutants from attacking the dye, or a physically durable top-coat to protect the surface of the disk) are reasonably stable and have good durability, but this will not guarantee the longevity of the media itself. All disks need to be stored and handled properly. We have found files stored on brand name CD-Rs that we have not been able to open less than a year after they were written to the media. We recommend not using inexpensive or non-brand name CD-Rs, because generally they will be less stable, less durable, and more prone to recording problems. Two (or more) copies should be made; one copy should not be handled and should be stored offsite. Most importantly, a procedure for migration of the files off of the CD-ROMs should be in place. In addition, all copies of the CD-ROMs should be periodically checked using a metric such as a CRC (cyclic redundancy check) for data integrity. For large-scale projects or for projects that create very large image files, the limited capacity of CD-R storage will be problematic. DVD-Rs may be considered for large projects; however, DVD formats are not as standardized as the lower-capacity CD-ROM formats, and compatibility and obsolescence in the near future are likely to be a problem.
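The periodic CRC check mentioned above could look something like the following sketch. It assumes the CRC-32 values were recorded when the disc was written and are available as a simple name-to-value mapping; that mapping and the function names are hypothetical, chosen only for illustration.

```python
import zlib
from pathlib import Path


def crc32_of(path: Path, chunk_size: int = 1 << 20) -> int:
    """Compute the CRC-32 of a file incrementally."""
    crc = 0
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_damaged(disc_root: Path, recorded: dict[str, int]) -> list[str]:
    """Return names whose current CRC-32 differs from the recorded value.

    Files that are missing or cannot be read count as damaged.
    """
    damaged = []
    for name, expected in sorted(recorded.items()):
        try:
            intact = crc32_of(disc_root / name) == expected
        except OSError:
            intact = False
        if not intact:
            damaged.append(name)
    return damaged
```

Running `find_damaged` against each copy on a schedule gives an early signal that a disc is degrading, which is the point at which the migration procedure should be triggered.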

28-1-06
On the longevity of burned CDs (not to be confused with commercially stamped CDs such as music or software):

Factory-pressed CDs are totally different from recordable CDs. In a pressed CD, the data is literally "molded into" (actually pressed into) the media and will not disappear unless the CD is physically damaged. Recordable CDs use a dye that changes color or reflectivity when heated. There are different dye types commonly used in recordable CDs--phthalocyanine, azo, and cyanine, in particular--and they do not all have the same life expectancy and stability...

All of the studies that I have seen except one suggest that properly burned one-time media (-R media, but not -RW media; see below) has an expected life of decades to possibly even centuries. There was a study by NIST (a U.S. government agency, used to be the National Bureau of Standards) on the relative stability of different media here:

StabilityStudy.pdf

You can see some comparisons in the NIST study of the different dye types. But this study did not attempt to extrapolate the data to a life expectancy, although it did provide data about the relative stability of the different dyes and reflection layers behind them.

However, opinions still differ as to how long such media will last. The OSTA (Optical Storage Technology Association), in a report here:

cdqa13.htm

suggests that optical recordable media will last 50 to 200 years. This observation is backed by quite a number of studies that I have seen, done both by the media makers and by others. However, some storage experts suggest numbers more in line with your question; for example, the expert in this report suggests a life of only 2 to 5 years:

life_expectancy.html (I have a suspicion that this is the article that you read).

The bottom line is that you are not going to get one single answer that everyone agrees on, although I personally am confident that properly recorded CD-R media can last decades if not a century or two. These 3 articles provide a good starting point for understanding some of the variables involved, which include:

-Dye type
-Physical construction of the media
-Storage conditions (temperature, humidity, light exposure, mechanical stress, chemical exposure and air quality)
-Manufacturing conditions (can vary from batch to batch in otherwise identical media of the same brand)

Now let us mention some other things that are relevant and important:

-The quality of the burner. A borderline defective burner can "under expose" the media to the laser beam, producing a seemingly good recording (at the time of burning) that will "fade" over time (failing weeks, months, years or decades sooner than it should have, had the laser beam intensity been correct).

-Recording speed. Fast burns (52X) are probably less stable than somewhat slower burns (say 16x to 32x), but you can burn media too slowly also. There is a very good analogy here to photographic film and exposure levels. The dyes on a given media have a certain range of acceptable "exposures" and outside of that range, you can either under or over expose the media to the laser beam. However, mechanical jitter and certain other variables (largely a function of the quality of the drive) generally will be unconditionally worse at faster speeds.

-Your own handling and storage practices. On a CD, the data "exists" in a dye layer on the label side of the media. This can be scratched from the back (from the label side), which will literally and directly destroy the data. The front side is clear plastic but can also be scratched. While front side damage may make the data less readable or completely unreadable, the data is still intact and undamaged on the label side, and the scratches on the front can normally be removed by polishing the plastic. On recordable DVDs, the data is on a layer "inside" the media, but the media is a laminate of several layers and can delaminate, destroying the data. Flexing - even VERY minor flexing - is particularly bad at causing such damage. And, also, recordable DVDs tend to fail from the outside in, so you can increase your success rate and decrease the incidence of failures by not recording such media beyond 80% to 90% of capacity, leaving the outside edge, where the failure rate is greatest and failure occurs first, blank anyway.

-Labeling: The glues in adhesive labels, or the solvents in pen-type markers, both applied to the label side (the side containing the data) can SLOWLY penetrate the reflective backing and dye layers and destroy the data. Therefore, for archival media, the safest policy is to not label the CD or DVD itself at all. If you do label it, with either a label or a pen, you are, at best, taking a chance with your data. Hint: it is safe to write on the clear inner hub (where there is no data at all) with a suitable pen that will not rub off.

And, finally, I would be remiss if I did not mention one other factor which is really huge: Erasable "RW" media is FAR less stable than one-time "R" media and should absolutely not be used for any permanent recordings of any kind whatsoever. There is no question that RW media can and does "fade". Although I have never seen failure of "R" media that I could attribute with absolute certainty to dye instability, I routinely see "RW" recordings that are unreadable after periods of months to a year or two when there is really no other explanation for the failure. I see this both on CD-RW and DVD+/-RW media, and I advise people in the strongest possible terms not to use "RW" media for anything that they want to consider permanent. Since RW media is also both more expensive (a lot more expensive) and slower, from my perspective the decision to never even buy RW media at all is an easy one.

Submitted by: Barry W. of North Canton, OH

9-2-06
The licence agreements with manufacturers that want to use Philips' CD-R technology will from now on contain more information. The standard is being improved, which improves compatibility with other systems. In addition, the price drops from 4.5 to 2.5 dollars per disc.

11-2-06 Question:
How long do digital pictures last?

Answer:
Given the likely future of digital data storage, your best preservation for color photo images is in just that form: stored data. Barring the total breakdown of society and human civilization, there will continue to be institutional data preservation facilities available to everyone.
Putting your precious digital photo files "into the system" is likely to be the most surefire, long-term way to preserve them. Of course, you will always be able to produce physical media prints by whatever process is extant at the time you want to view the images, using whatever is the best cost/performance technology you wish to pay for at that time.
It is convenient to keep a backed-up working copy of your digital photo files in your own possession for the near term; this can be on the best media currently available, optical or magnetic (see previous articles about the longevity of CD media), but remember that any physical media is subject to deterioration, so you want redundancy. Keep more than one copy and keep the copies on different types of media.
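One way to act on the "more than one copy" advice is to copy each file to several destinations and confirm that every copy reads back identically. The sketch below is only an assumption about how that might be scripted; the destination directories stand in for mount points of different media, and the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_verification(source: Path, destinations: list[Path]) -> list[Path]:
    """Copy source into each destination directory.

    Returns only the copies whose digest matches the source, so a caller
    can tell which destinations still need attention.
    """
    expected = file_digest(source)
    verified = []
    for directory in destinations:
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if file_digest(target) == expected:
            verified.append(target)
    return verified
```

A verified write only shows the copy was good at the time it was made; the copies still need to be re-read from time to time, since the media can deteriorate later.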
Be careful to use data formats conformant to widely accepted, non-proprietary standards. Do not count on your digital camera's native format to be around for long. Get those images copied from the camera manufacturer's file format into some well-known open-standard format.
For physical prints, you have many choices, but it's a fact that all the color processes create prints that suffer from deterioration over the years. Most color dyes are not stable, especially when exposed to light. You can consult photographic experts to find the best "archival quality" print media and inks. Whether the color image is infused into the media, or layered onto the surface is not as important as the fundamental permanence of the chemistry used.
I personally know that Kodachrome transparencies, protected from light and humidity, last in excellent condition at least 60 years. Maybe longer - time will tell. I can scan my Kodachrome transparencies made around WW2 era and get good prints by various processes. Then I gently put the transparencies back in their special containers, out of the light. Even still, I know that those transparencies are aging - the dye particles are being hit by a few random bits of radiation and breaking down, from year to year. In a few centuries, these images will probably lose detail and color.
For color prints, there are a number of high quality processes from the photographic industry as well as the commercial printing world. These prints can be made from digital files just as easily as from traditional photographic negatives or positives.
In recent years, makers of inkjet, dye sublimation, and color laser printers have claimed archival permanence for their inks (toners). It remains to be proven, but such prints might be a good, low-cost way to keep your photographs at least for a decade or two.
If you can live with monochrome prints, things get more interesting. Various old photographic processes create images that are made up of very small particles of noble metals. Gold, silver, platinum (and other) processes create prints that, on archival quality papers, seem to be able to last for over a hundred years, perhaps much more if ambient conditions are controlled. And some of these old processes yield prints that are highly regarded aesthetically for their resolution and tonality. Simple carbon based inks are very stable - that's why you can still view old prints made from printing inks hundreds of years ago.
But the bottom line is this: Get your photos into open-standard-format digital files. Put those files on servers operated by reliable companies. Make some more temporary copies for your own use (magnetic and optical media) and keep those as working copies. Make prints from time to time, when you (or your descendants, or the future legal owners of the images) need them.
Let's hope that there will be humans around in a few hundred years who have leisure time (or work need) to enjoy your images.
Submitted by: Dion J.

Everything is going digital. Digital music, digital photos, digital movies. Is that a dangerous trend?

In 100 years, anything we put on electronic media will not exist. Yet anything published will still be around.

Someday flash memory will take over. But drives get so big, and they're so inexpensive, and so fast that memory hasn't been able to catch up. 200 gigs of flash memory would cost quite a bit.

19:18 2-3-06
Of course, each computer user in the family will always want their own local storage, but it may also be convenient and secure to have a central server where archives and backups live permanently.

When it comes to maintenance and storage, another wise investment is an uninterruptible power supply, both for that basement network server and for each computer in the house. When the power fails, the battery backup gives you at least a few minutes to save your work before the screen goes dark.

For larger systems, old-line UPS manufacturer APC offers a 1200VA system for a street price of about $149; it provides 8 outlets, surge protection, and battery runtime of well over an hour.

Companies like Intermatic, Leviton and Panamax all make high capacity surge protectors that mount right where your electrical service enters the house, and thus protect everything in your home from surge damage.

3-3-06
With government records, reports and documents increasingly being created and stored in digital form, there is a software threat to electronic access to government information and archives. The problem is that public information can be locked in proprietary software whose document formats become obsolete or cannot be read by people using software from another company.

To cope with the problem, 30 companies, trade groups, academic institutions and professional organizations are announcing today the formation of the OpenDocument Format Alliance, which will promote the adoption of open technology standards by governments.

But Microsoft supports another open standard for documents, called OpenXML Document Format. In Office 2007, which Microsoft will ship in the second half of the year, OpenXML will be the default format for saving documents instead of Microsoft's proprietary formats, said Alan Yates of the company's Office division.

The OpenXML format is supported by Intel, Apple, Toshiba, BP and the British Library, among others, Mr. Yates said. Microsoft submitted OpenXML to Ecma International, a standards body in Geneva, last year.

Open Archives Initiative (OAI)
OAI is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.

Archive
The term "archive" in the name Open Archives Initiative reflects the origins of the OAI in the e-prints community where the term archive is generally accepted as a synonym for repository of scholarly papers. Members of the archiving profession have justifiably noted the strict definition of an "archive" within their domain; with connotations of preservation of long-term value, statutory authorization and institutional policy. The OAI uses the term "archive" in a broader sense: as a repository for stored information. Language and terms are never unambiguous and uncontroversial and the OAI respectfully requests the indulgence of the professional archiving community with this broader use of "archive".
(OAI definition quoted from FAQ on OAI Web site)

OAI Protocol for Metadata Harvesting (OAI-PMH)
OAI-PMH is a lightweight harvesting protocol for sharing metadata between services.

Protocol
A protocol is a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are examples of other protocols used for communication between systems across the Internet.

Harvesting
In the OAI context, harvesting refers specifically to the gathering together of metadata from a number of distributed repositories into a combined data store.

Data Provider
A Data Provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata.
(OAI definition quoted from FAQ on OAI Web site)

Service Provider
A Service Provider issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services.
(OAI definition quoted from FAQ on OAI Web site)
A Service Provider in this manner is "harvesting" the metadata exposed by Data Providers.
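To make the Data Provider and Service Provider roles a little more concrete, here is a minimal sketch of what a Service Provider does: build a `ListRecords` request and pull identifiers and titles out of the response. The base URL `http://example.org/oai`, the sample response, and the function names are invented for illustration; a real harvester would also fetch the URL over HTTP and follow resumption tokens, which is left out here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# XML namespaces used by OAI-PMH 2.0 responses and unqualified Dublin Core.
OAI_NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hand-written, heavily trimmed response used only for this example.
SAMPLE_RESPONSE = """\
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def list_records_url(base_url: str, metadata_prefix: str = "oai_dc") -> str:
    """Build the URL a Service Provider would request from a Data Provider."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def harvest_titles(response_xml: str) -> list[tuple[str, str]]:
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(response_xml)
    harvested = []
    for record in root.iterfind(".//oai:record", OAI_NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=OAI_NS
        )
        title = record.findtext(".//dc:title", default="", namespaces=OAI_NS)
        harvested.append((identifier, title))
    return harvested
```

Collecting the pairs returned by `harvest_titles` from several repositories into one store is the "harvesting" step described above.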

PDFs can be a valid choice as long-term accessible documents. (Work is being done on a PDF variant based on PDF 1.4. The PDF/A or PDF-Archive is specifically scaled down for archival purposes.)
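Since PDF/A builds on PDF 1.4, a first rough screening step for an archive might be to read the version declared in a file's `%PDF-x.y` header. The sketch below does only that, and the function names are hypothetical. It is not a PDF/A conformance check, and the header version can be overridden elsewhere in the file, so treat the result as a hint at most.

```python
import re

# The header normally appears at the very start of the file; searching the
# first 1024 bytes tolerates a little leading junk.
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data: bytes):
    """Return (major, minor) from the '%PDF-x.y' header, or None if absent."""
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def declares_at_most_1_4(data: bytes) -> bool:
    """True if the header declares PDF 1.4 or an earlier version."""
    version = pdf_header_version(data)
    return version is not None and version <= (1, 4)
```

For a file on disk, passing `open(path, "rb").read(1024)` is enough, since only the start of the file is inspected.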

Microsoft Word documents can be converted into accessible PDFs, but only if the Word document is written with accessibility in mind - for example, using styles, correct paragraph mark-up and "alt" (alternative) text for images, and so on.

PDF on the WEB

Documents described in markup languages such as HTML/XHTML delegate responsibility for many display decisions to the renderer. This means that an XHTML document can render quite differently across various web browser platforms. While the end user experience of an XHTML document can vary significantly depending on browser, platform, and screen resolution, a PDF file can be reasonably expected to look exactly the same to every viewer. The desire for greater control over user experience has led many authors to use the PDF format to publish online content. This is particularly true for order forms, catalogues, brochures, and other documents which are primarily formatted for printing. The ubiquity of Adobe Reader and wide corporate availability of easy to use WYSIWYG PDF authoring have further enticed many (mostly corporate) web authors to publish a wider variety of information as PDF.

Critics of this practice cite several reasons for avoiding it. The major one is that the inflexibility of PDF rendering makes it difficult to read on screen: it does not adapt to the window size or to the reader's preferred font size and font family, as a classic XHTML web page does. PDF files tend to be significantly larger than XHTML/SVG files presenting the same information, making it difficult or impossible for users with low-bandwidth connections to view them. Adobe Reader, the de facto standard PDF viewer, has historically been slow to start and has caused browser instability, particularly when run alongside other browser plugins (Adobe Reader 7 addressed many of these concerns, but is not available under Windows 98/ME). Adobe Reader is also unavailable in current versions on many alternative operating systems and is distributed under a proprietary license unacceptable to some users. With each major release of Adobe (Acrobat) Reader, the installer package gets significantly larger to support extra features, but users are left without means to selectively install components.