Preserving The Cloud: The Wayback Machine

News broke this week that The Wayback Machine has now archived 400 billion webpages. The Wayback Machine enables users to see how a website looked in the past, and since launching in 1996 the project has grown to require an incredible 5 petabytes of storage to maintain all its data.
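
For readers who want to try this themselves, here is a minimal sketch of how an address for an archived page can be put together. It assumes the publicly visible address pattern on web.archive.org (a 14-digit timestamp followed by the original address); the function name `wayback_url` is made up for this illustration, and the sketch only builds the address without contacting the archive.

```python
from datetime import datetime

# Assumed address prefix, based on the pattern visible in archived page links.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(site: str, when: datetime) -> str:
    """Build an address asking for `site` as captured around `when`.

    Hypothetical helper for illustration: it formats the date as
    YYYYMMDDhhmmss and places it between the prefix and the site address.
    """
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{site}"


if __name__ == "__main__":
    # Ask for example.com as it looked around 24 October 2001.
    print(wayback_url("http://example.com/", datetime(2001, 10, 24)))
```

If no capture exists for the exact moment requested, the archive appears to serve the nearest one it holds, so the date works as a target rather than an exact match.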

The figures behind the project are impressive; The Wayback Machine’s database is queried over 1,000 times every second by over 500,000 people a day – making Archive.org the 250th most popular site on the entire internet.

The organisation’s mission is equally impressive, and its website sets out the ideology explicitly: “Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars”.

Brewster Kahle launched the project in 1996 at the same time he started the now-famous web crawling company Alexa Internet. The project has its roots in the development of software that could crawl and download all publicly accessible World Wide Web pages, the Gopher hierarchy, the Netnews bulletin board system, and downloadable software. The archived content itself wasn’t available until 2001 but by 1999 the archive had already expanded its collections to include texts, audio, moving images and software.

Uses of such an archive are widespread. There is a day-to-day practical use, as evidenced when The Wayback Machine provided access to important Federal Government sites that went dark during the Federal Government shutdown in the United States. There is also an educational aspect, with important lessons to be learned from the vast amount of big data stored within its archives. Finally, there is a historical aspect, as the development of our internet has been preserved for future generations to enjoy.

The cloud-based project has not been without its controversies. In 2012 China restored access to the database after blocking it for several years, while in the USA an activist sued the organisation for $100,000 after claiming that archiving her site breached her terms of service. The dispute was ultimately settled out of court.

The success of the project has started to attract imitators. Online companies such as Archive.It, Freezepage, and iTools all offer similar services, but none of them can offer the same quality and depth of content as The Wayback Machine.

Is this a vital project or a waste of valuable storage space? Are there ethical questions surrounding the uninhibited archiving of so many sites, or are they simply taking a virtual photograph of events? Let us know in the comments below.

By Daniel Price

About Daniel Price

Daniel is a Manchester-born UK native who has abandoned cold and wet Northern Europe and currently lives on the Caribbean coast of Mexico. A former Financial Consultant, he now balances his time between writing articles for several industry-leading tech (CloudTweaks.com & MakeUseOf.com), sports, and travel sites and looking after his three dogs.
