Monthly Archives: April 2012

Why were animated GIFs rarely used (outside of porn sites) to show film and video sequences in the second half of the 90’s? Because even half a second of heavily compressed and downscaled, barely recognizable footage would still have been too heavy for the slow networks of that time.

In 1998, Shocking Blue fan Greg converted one second of a TV performance of the group’s hit song “Venus” into a GIF. It is 160×120 pixels, contains 15 frames and weighs 93KB. Greg didn’t dare to confront the visitors of the page with such a huge file.[1] He uses a static image and suggests starting the animation with a click, but warns visitors to be ready that “it may take 20 sec”.

Starting image

Animated GIF

The moment in the original video

Original URL: http://www.geocities.com/ofmang/greg/shockblu.html


  1. Just for comparison, the third picture in the Alternative Animated GIF Timeline, a video loop as common in GIFs today, is 461×322 pixels, 2MB, 42 frames.


Original URL: http://www.geocities.com/vienna/4302/


Original URL: http://www.geocities.com/~johanh/

The page is still on(VRM)line.

Access to the remains of Geocities can be measured on two axes: authenticity (how realistically the harvested data can be presented again) and ease of access (what technical requirements and what knowledge are needed to gain access at a certain level of authenticity).

Graphical authenticity on the pixel level means that a Geocities web page will render exactly as it would have for visitors at the time the page was published. This might be considered esoteric on the grounds that web authors could never be sure how their creations would appear to their audience, by the nature of HTML itself. However, web authors often were not aware of this fact, and people of the 1990’s used different browsers on different operating systems than we do today. The dominance of, for example, Netscape browsers and a certain set of plugins meant that most people would experience a web page in a certain way that was very different from accessing it now with WebKit-based browsers.

What has changed most significantly in operating systems since the high times of Geocities is the display of text. All current operating systems render characters with smoothed out edges, and this is reflected as much in current web design as the historic aliased pixel text display influenced the web design of the past.

Low barrier access to original visual web culture can be provided by screenshots taken from virtual machines running an historic operating system and browser.

Today, MIDI music files do not sound at all as they did when they dominated web audio. These audio files can be recorded from historic hardware and operating systems or rendered with emulators for contemporary listeners. This strips them of any context, but can still give a good, easily accessible impression of how the web sounded at a certain point in time.[1]

The web pages’ original interactivity can be restored by accessing them from a mirror and employing a browser add-on that rewrites the original URLs contained in the HTML to match the mirror’s URLs.[2]
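The rewriting step can be sketched roughly as follows. This is not the add-on referred to in the footnote, only an illustration of the idea on a plain HTML string; the mirror prefix `http://mirror.local/`, the regular expression and the function name `rewrite_urls` are assumptions made for this example.

```python
import re

# Hypothetical mirror location; every original host becomes a top-level directory there.
MIRROR_PREFIX = "http://mirror.local/"

# Absolute http URLs pointing at geocities.com or any of its subdomains.
GEOCITIES_URL = re.compile(
    r"""http://((?:[\w-]+\.)*geocities\.com)(/[^\s"'<>]*)?""",
    re.IGNORECASE,
)


def rewrite_urls(html: str) -> str:
    """Point every absolute Geocities URL in the given HTML at the mirror."""

    def to_mirror(match: re.Match) -> str:
        host = match.group(1).lower()
        path = match.group(2) or "/"  # a bare host name gets the root path
        return f"{MIRROR_PREFIX}{host}{path}"

    return GEOCITIES_URL.sub(to_mirror, html)


if __name__ == "__main__":
    page = '<a href="http://www.geocities.com/vienna/4302/">Vienna</a>'
    print(rewrite_urls(page))
```

Relative links need no treatment, since they already resolve against the mirror once the page itself is loaded from there.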

If the data should not be tampered with, neither on the client nor on the mirror server, and URLs shown in the browser’s address bar should be the exact originals, a proxy server must be put in front of the mirror server. This proxy can transparently redirect requests to, for example, http://www.geocities.com/ to http://mirror.local/www.geocities.com/ without the need for any name server tricks or changing historic HTML code. It is trivial to write such a proxy in Node.js, for example.
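A minimal sketch of such a proxy, written here in Python with only the standard library rather than in Node.js, could look like this. The names `mirror_url`, `MirrorProxy` and `serve`, the port 8080 and the handling of GET requests only are assumptions for illustration, not a description of an existing setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError, URLError
from urllib.parse import urlsplit
from urllib.request import urlopen

# Mirror location taken from the example above.
MIRROR_BASE = "http://mirror.local/"


def mirror_url(requested: str) -> str:
    """Map an original URL to the place where the mirror keeps its copy."""
    parts = urlsplit(requested)
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{MIRROR_BASE}{parts.netloc.lower()}{path}{query}"


class MirrorProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # A browser configured to use an HTTP proxy sends the full
        # original URL in the request line, so self.path holds it.
        try:
            with urlopen(mirror_url(self.path)) as upstream:
                body = upstream.read()
                status = upstream.status
                content_type = upstream.headers.get(
                    "Content-Type", "application/octet-stream"
                )
        except HTTPError as err:
            self.send_error(err.code)
            return
        except URLError:
            self.send_error(502)
            return
        # Pass the mirror's answer on unchanged; the browser never sees mirror.local.
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    # Listen on localhost only, so the proxy is not open to the outside.
    HTTPServer(("127.0.0.1", port), MirrorProxy).serve_forever()
```

Calling `serve()` and entering `127.0.0.1:8080` as the HTTP proxy in the browser’s settings would then keep the original URLs in the address bar while the data comes from the mirror.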

As even the oldest browsers support proxy servers, one can employ this system on a virtualized historic operating system like Windows 95 or Windows 98 with authentic web browsers. This significantly raises the access barrier though, as access to such a proxy must be technically restricted, or the Internet will collapse.[3] Also, virtual machine software, historic operating systems and historic browsers and plugins are not easily available to most of the web users who might be interested in looking at their cultural past.

The largest authenticity surface is covered by using all the aforementioned techniques and running historic hardware. Current virtualization and emulation systems are quite good at re-creating all aspects of 1990’s computers, except audio. There are just not enough business-critical MIDI files out there to make companies like VMware emulate OPL3 sound chips. Additionally, all graphical output looks very different on CRT monitors and their special surface-to-pixel ratios than it does on contemporary flat screens. For example, when looking at historic web pages on an 800×600 pixel 14″ CRT screen with a 60Hz refresh rate, it becomes clear why many people decided to use dark backgrounds and bright text for their designs instead of emulating paper with black text on a white background.

Balancing

While restoration work must be done on the right end of the scale to provide a very authentic re-creation of the web’s past, it is just as important to work on every point of the scale in between to allow the broadest possible audience to experience the most authentic re-enactment of Geocities that is comfortable for consumption on many levels of expertise and interest.


  1. The artist Ryder Ripps for example published MP3 recordings of MIDI files as online mix tapes and a vinyl record containing recordings of a selection of classic MIDI files. Of course these projects present only “snapshots” of a certain kind of MIDI playback sound and the selection process definitely targets musically interested people of today, but no doubt these efforts will keep the sound present.
  2. See Tips for Torrenters on this blog for a version of a browser add-on that does this. reocities takes a similar approach and even automatically changes all HTML code so it validates and is more likely to “work” in contemporary browsers. While this seems questionable from an archivist’s point of view, it surely allows broader access to the historic data.
  3. The most common problem with open proxies is, as experience from my project insert_coin shows, that people from Saudi Arabia use up a lot of bandwidth while accessing pornographic material that is blocked in their home country.

Many people started the adventure of downloading the Geocities Torrent, and while constantly seeding it, I saw the peer numbers drop to zero a long time ago. Anyway, I do not recommend using the torrent at the moment, since Jason Scott, now working for the Internet Archive, put all the Geocities data into the Geocities Valhalla.

These are still the same files, hard to handle in their sheer amount, and not as neatly organized as the contents of the original torrent. If you would like to complete your canceled torrent download, or if you wish to download the whole thing via HTTP, you might like this script:

I mostly wrote it for myself to check data integrity before the first serious database ingest, but extended it to support downloading. It requires Perl 5.1x, XML::TreePP and md5sum. Hope you will find it useful.
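For readers who cannot see the embedded script, the integrity check it performs can be sketched along these lines. This is not the Perl script itself but a rough Python illustration; it assumes an XML manifest in which each `<file name="…">` element carries an `<md5>` child, and the names `md5_of` and `check_manifest` are made up for this example.

```python
import hashlib
import xml.etree.ElementTree as ET
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file without loading it into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_manifest(manifest_xml: str, root_dir) -> dict:
    """Return {file name: "ok" | "missing" | "corrupt"} for every manifest entry.

    Assumed manifest layout: <file name="..."><md5>...</md5></file>
    """
    results = {}
    for entry in ET.fromstring(manifest_xml).iter("file"):
        name = entry.get("name")
        expected = entry.findtext("md5")
        if name is None or expected is None:
            continue  # entries without a checksum cannot be verified
        local = Path(root_dir) / name
        if not local.is_file():
            results[name] = "missing"
        elif md5_of(local) == expected.strip().lower():
            results[name] = "ok"
        else:
            results[name] = "corrupt"
    return results
```

Everything reported as "missing" or "corrupt" would be a candidate for a fresh download via HTTP.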

(To all the peeps reading this with a feed reader: The post contains a Perl script, embedded via gist from github.)