Preserving The Cloud: The Wayback Machine

News broke this week that The Wayback Machine has now archived 400 billion webpages. The Wayback Machine lets users see how a website looked in the past. The project launched in 1996 and now requires an incredible 5 petabytes of storage to hold all its data.
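For readers who want to try it, here is a minimal sketch of how a link to an archived copy can be put together. It relies on the publicly visible address pattern of Wayback Machine snapshots (a 14-digit timestamp followed by the original address); the function name `wayback_url` and the example date are our own illustrations, not something from the project itself. If no capture exists for the exact moment requested, the service normally redirects to the closest one it has.

```python
from datetime import datetime

# Publicly visible prefix used by Wayback Machine snapshot links.
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(page_url: str, when: datetime) -> str:
    """Build a snapshot link for page_url at (or near) the given moment.

    The helper name is hypothetical; only the URL pattern comes from
    the Wayback Machine itself.
    """
    # Snapshots are addressed by a YYYYMMDDhhmmss timestamp.
    timestamp = when.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{page_url}"


if __name__ == "__main__":
    # Example: how did example.com look around New Year 2006?
    print(wayback_url("http://example.com/", datetime(2006, 1, 1)))
```

Pasting the printed link into a browser should open the capture nearest to the requested date, if one exists.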

The figures behind the project are impressive; The Wayback Machine’s database is queried over 1,000 times every second by over 500,000 people a day – making Archive.org the 250th most popular site on the entire internet.
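To put those numbers in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it treats the quoted figures as exact (the article says "over" for several of them), assumes a decimal petabyte of 10^15 bytes, and assumes all 5 petabytes hold webpages, which overstates the per-page figure because the archive also stores other media. The constant and function names are our own.

```python
# Figures quoted in the article, treated as exact for a rough estimate.
PAGES_ARCHIVED = 400_000_000_000   # 400 billion webpages
STORAGE_BYTES = 5 * 10**15         # 5 petabytes (assumption: decimal PB)
QUERIES_PER_SECOND = 1_000
VISITORS_PER_DAY = 500_000
SECONDS_PER_DAY = 24 * 60 * 60


def bytes_per_page() -> float:
    """Average storage per archived page, if all storage held pages."""
    return STORAGE_BYTES / PAGES_ARCHIVED


def queries_per_day() -> int:
    """Database queries per day at the quoted per-second rate."""
    return QUERIES_PER_SECOND * SECONDS_PER_DAY


def queries_per_visitor() -> float:
    """Average database queries generated by each daily visitor."""
    return queries_per_day() / VISITORS_PER_DAY


if __name__ == "__main__":
    print(f"{bytes_per_page():,.0f} bytes per page")
    print(f"{queries_per_day():,} queries per day")
    print(f"{queries_per_visitor():.1f} queries per visitor")
```

On these assumptions the archive averages roughly 12.5 kilobytes per page and handles about 86 million queries a day, or around 170 database queries for each daily visitor.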

The organisation’s mission is equally impressive, and its website spells out the thinking behind it: “Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars.”

Brewster Kahle launched the project in 1996, at the same time as he started the now-famous web crawling company Alexa Internet. The project has its roots in the development of software that could crawl and download all publicly accessible World Wide Web pages, the Gopher hierarchy, the Netnews bulletin board system, and downloadable software. The archived content itself wasn’t available until 2001, but by 1999 the archive had already expanded its collections to include texts, audio, moving images and software.

Uses of such an archive are widespread. There is a day-to-day practical use, as evidenced when The Wayback Machine provided access to important Federal Government sites that went dark during the Federal Government shutdown in the United States. There is also an educational aspect, with important lessons to be learned from the vast amount of big data stored within its archives. Finally, there is a historical aspect, as the development of our internet has been preserved for future generations to enjoy.

The cloud-based project has not been without its controversies. In 2012 China restored access to the database after blocking it for several years, while in the USA an activist sued the organisation for $100,000 after claiming that archiving her site breached her terms of service. The dispute was ultimately settled out of court.

The success of the project has started to attract imitators. Online companies such as Archive.It, Freezepage, and iTools all offer similar services, but none of them can offer the same quality and depth of content as The Wayback Machine.

Is this a vital project or a waste of valuable storage space? Are there ethical questions surrounding the uninhibited archiving of so many sites, or is the project simply taking a virtual photograph of events? Let us know in the comments below.

By Daniel Price

About Daniel Price

Daniel is a Manchester-born UK native who has abandoned cold and wet Northern Europe and currently lives on the Caribbean coast of Mexico. A former Financial Consultant, he now divides his time between looking after his three dogs and writing articles for several industry-leading tech (CloudTweaks.com & MakeUseOf.com), sports, and travel sites.
