Web Archives: Exploring the Digital Past: Major Web Archives

How to find, make use of, and create your own web archives!

Important Archives and Initiatives

The Internet Archive

One of the oldest, and by far the largest, web archiving organizations in the world. The Internet Archive began in 1996 and has worked continuously since then to archive the internet itself. It has also pioneered technologies with an outsized impact on computing as a whole, including efficient web crawlers and extremely high-density storage.

According to their website, the Internet Archive's combined collections contain:

  • 330 billion web pages
  • 20 million books and texts
  • 4.5 million audio recordings
  • 4 million videos
  • 3 million images
  • 200,000 software programs

Main Services

The Wayback Machine - The Internet Archive's first and most popular project, and the world's largest web archive. It locates the archived snapshots of a specific URL, which can then be narrowed by date.

Archive-It - A subscription-based web archiving service that helps organizations build collections of digital content via a web application.
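The Wayback Machine's lookup by URL and date can also be done from a script. The following is a minimal sketch, not an official client. It assumes the Internet Archive's public availability endpoint (https://archive.org/wayback/available), the usual replay URL pattern, and the JSON shape that endpoint is understood to return; none of these are described in this guide, and all function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoints (not taken from this guide).
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"
WAYBACK_REPLAY = "https://web.archive.org/web"


def availability_query(url, timestamp=None):
    """Build a query asking for the snapshot of `url` closest to `timestamp`.

    `timestamp` is a digit string such as "2006", "20060101" or
    "20060101120000" (YYYYMMDDhhmmss, truncated to the precision wanted).
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return f"{WAYBACK_AVAILABILITY}?{urlencode(params)}"


def replay_url(url, timestamp):
    """Build a Wayback Machine address for `url` near `timestamp`."""
    return f"{WAYBACK_REPLAY}/{timestamp}/{url}"


def closest_snapshot(response):
    """Return (timestamp, snapshot_url) from a parsed response, or None."""
    closest = response.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def fetch_closest(url, timestamp=None):
    """Query the live service (requires network access)."""
    with urlopen(availability_query(url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))
```

For example, fetch_closest("example.com", "20060101") would ask for the capture of example.com nearest to 1 January 2006. The result depends on what the archive holds at the time of the request.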

The Memento Project

The Memento Project is an archiving meta-project which seeks to use HTTP to intuitively link URLs with corresponding archived resources (called “mementos”) located across many separate web archives. Though full implementation of the project would require wide-scale restructuring of how servers handle HTTP requests, a current version of the project exists which allows for searching many different web archives at once. There is also a Google Chrome plugin which works similarly. Both are linked below:

Further Reading

Memento: Time Travel for the Web, by Van de Sompel et al. - A paper that outlines the proposed functionality, organization, and mission of the Memento Project.
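The HTTP-based linking that the Memento Project describes comes down to sending an ordinary request with one extra header, Accept-Datetime, to a "TimeGate" that redirects to the memento closest to that date. The sketch below only builds such a request. The aggregator TimeGate address and the function names are assumptions for illustration and are not taken from this guide; any Memento-compliant TimeGate could be substituted.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# Assumed aggregator TimeGate; replace with any Memento-compliant TimeGate.
TIMEGATE = "http://timetravel.mementoweb.org/timegate/"


def accept_datetime(when):
    """Format a timezone-aware datetime as an HTTP date in GMT.

    A naive datetime would be interpreted as local time by astimezone(),
    so pass an aware value to avoid surprises.
    """
    return format_datetime(when.astimezone(timezone.utc), usegmt=True)


def timegate_request(url, when, timegate=TIMEGATE):
    """Build (but do not send) a request for `url` as it was near `when`."""
    return Request(timegate + url,
                   headers={"Accept-Datetime": accept_datetime(when)})


request = timegate_request(
    "http://example.com/",
    datetime(2010, 4, 1, 12, 0, tzinfo=timezone.utc),
)
print(request.full_url)
print(request.get_header("Accept-datetime"))  # Thu, 01 Apr 2010 12:00:00 GMT
```

Passing the request to urllib.request.urlopen would send it. A compliant server is expected to redirect to an archived copy and report that copy's capture time in a Memento-Datetime response header.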

Wikipedia

Wikipedia is one of the largest and most popular websites on the internet. It is also a self-archiving website: virtually every version of every page hosted across the network of wikis which make up Wikipedia is easily accessible.

To access an individual page's history, simply click the "View History" tab at the top of the page (see diagram).

Past versions of Wikipedia pages, considered together, can show how our public understanding of a topic has changed over time, as well as how Wikipedia's own standards and community have changed. Past versions of these pages may also refer to sources which are no longer considered current, but which may be useful for more creative forms of research.
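A page's history can also be reached by URL rather than by clicking through the tab. The sketch below builds three kinds of address for English Wikipedia: the history view, a MediaWiki API query listing recent revisions, and a permanent link to one revision. The helper names are hypothetical, and the parameters shown are an assumption based on the standard MediaWiki web interface and API rather than anything stated in this guide.

```python
from urllib.parse import urlencode

# English Wikipedia entry points; other language editions use other hosts.
INDEX = "https://en.wikipedia.org/w/index.php"
API = "https://en.wikipedia.org/w/api.php"


def history_url(title):
    """Address of the page's history view (the "View History" tab)."""
    params = {"title": title.replace(" ", "_"), "action": "history"}
    return f"{INDEX}?{urlencode(params)}"


def revisions_query(title, limit=10):
    """API query for the most recent `limit` revisions of a page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title.replace(" ", "_"),
        "rvprop": "ids|timestamp|user|comment",
        "rvlimit": limit,
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def permalink(revision_id):
    """Permanent link to one specific past version of a page."""
    return f"{INDEX}?{urlencode({'oldid': revision_id})}"


print(history_url("Web archiving"))
# https://en.wikipedia.org/w/index.php?title=Web_archiving&action=history
```

Each revision returned by the API query carries a revision ID, which can be passed to permalink to cite the exact version of a page that was consulted.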