Early Web Datasets

A number of early web datasets have been made public at the Internet Archive to celebrate a partnership between Archive-It (a commercial service of the Internet Archive) and Archives Unleashed, a global research initiative.

The collections may be accessed directly here:

GeoCities Collection (1994–2009)

Friendster (2003–2015)

Early Web Language Datasets (1996–1999)

More information on the collections may be found here.