The Internet Archive is attempting to preserve and digitize over 100,000 vinyl LPs
A gigantic task lies before them...
Kottke.org reports that the Internet Archive (IA) has started digitizing and archiving recordings found on vinyl LPs. The IA itself has explained the whole process here.
The organization began its huge quest in cooperation with the Boston Public Library (BPL), with the goal of digitizing more than 100,000 audio recordings from the library’s sound collection. As the IA explains, the recordings exist in a variety of historical formats, including wax cylinders, 78 RPMs, and LPs. They span musical genres including classical, pop, rock, and jazz, and include obscure recordings such as an album of music for baton twirlers and a record of radio’s all-time greatest bloopers.
But the process involves more than copying the audio recordings. The information about an LP exists only in print, so “the digitization process must begin by cataloging data. High-resolution scans are taken of the cover art, the disc itself and any inserts or accompanying materials. The record label, year recorded, tracklist and other metadata are supplemented and cross-checked against various external databases.”
The recordings themselves are copied in cooperation with Innodata Knowledge Services, which sends the LPs to its facility in the Philippines. There, the records have to be set up and turned over “by hand,” and each side is recorded “at normal speed.”
And that is only half of the job. “Once recorded, there is a large FLAC file for each side of the LP, which needs to be segmented so listeners can easily begin at the desired song. There are two different algorithms used for segmenting; the first one looks at images of the vinyl disc to locate gaps in its grooves, which usually line up with gaps between songs. A second algorithm listens to the audio file to find the silent spaces between songs. When these two algorithms align, our engineers have a good measure of confidence that the machine has found the proper tracks.”
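To make the second, audio-based approach more concrete, here is a minimal Python sketch of silence detection, plus a cross-check against gap times that the image-based approach might produce. This is not the IA's actual code, which the article does not show. The function names, window size, loudness threshold, minimum gap length, and tolerance are all assumptions chosen for illustration, and the groove-gap times are assumed to have already been converted from disc images into seconds.

```python
import math


def find_silent_gaps(samples, sample_rate, window_s=0.05,
                     threshold=0.01, min_gap_s=1.0):
    """Return (start_s, end_s) spans where the windowed RMS level stays
    below `threshold` for at least `min_gap_s` seconds.

    `samples` is a sequence of floats in the range [-1.0, 1.0].
    Silence at the very start (lead-in) or very end of the side is
    ignored, since it does not separate two songs.
    """
    window = max(1, int(sample_rate * window_s))
    gaps = []
    gap_start = None  # sample index where the current quiet run began
    for i in range(0, len(samples), window):
        chunk = samples[i:i + window]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        if rms < threshold:
            if gap_start is None:
                gap_start = i
        elif gap_start is not None:
            long_enough = (i - gap_start) / sample_rate >= min_gap_s
            if long_enough and gap_start > 0:
                gaps.append((gap_start / sample_rate, i / sample_rate))
            gap_start = None
    return gaps


def agreed_boundaries(audio_gaps, groove_gap_times, tolerance_s=2.0):
    """Keep the midpoint of each audio gap that lies within `tolerance_s`
    seconds of a gap time derived from the disc image (hypothetical input).
    """
    confirmed = []
    for start, end in audio_gaps:
        midpoint = (start + end) / 2
        if any(abs(midpoint - t) <= tolerance_s for t in groove_gap_times):
            confirmed.append(midpoint)
    return confirmed
```

Real vinyl transfers carry surface noise, so the quiet stretch between songs is rarely true digital silence; a practical threshold would need tuning per recording. Boundaries found by only one of the two methods would presumably be the ones left for a human to check.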
Multiply that process by more than 100,000 recordings and you can see how large an undertaking this is for the IA and its collaborators.