Preserving The Cloud: The Wayback Machine

Preserving The Cloud: The Wayback Machine

Preserving The Cloud: The Wayback Machine

News broke this week that The Wayback Machine has now archived 4 hundred billion webpages. The Wayback Machine enables users to see how another website looked in the past, and after launching in 1996 the project now requires an incredible 5 petabytes of storage to maintain all its data.

The figures behind the project are impressive; The Wayback Machine’s database is queried over 1,000 times every second by over 500,000 people a day – making Archive.org the 250th most popular site on the entire internet.

wayback_machine_logo

The company’s mission is equally impressive, explicitly expressing their ideology on their website “Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars”.

Brewster Kahle launched the project in 1996 at the same time he started the now-famous web crawling company Alexa Internet. The project has its roots in the development of software that could crawl and download all publicly accessible World Wide Web pages, the Gopher hierarchy, the Netnews bulletin board system, and downloadable software. The archived content itself wasn’t available until 2001 but by 1999 the archive had already expanded its collections to include texts, audio, moving images and software.

Uses of such an archive are widespread. There is a day-today practical use, as evidenced when The Wayback Machine provided access to important Federal Government sites that went dark during the Federal Government shutdown in the United States. There is also an educational aspect, with importance lessons to be learned from the vast amount of big data stored within its archives. Finally, there is a historical aspect, as the development of our internet has been preserved for future generations to enjoy.

The cloud-based project has not been without its controversies. In 2012 China restored access to the database after blocking it for several years, while in the USA an activist sued the organisation for $100,000 after claiming that archiving her site breached her terms of service. The dispute was ultimately settled out of court.

The success of the project has started to attract imitators. Online companies such as Archive.It, Freezepage, and iTools all offer similar services, but not of them can offer the same quality and depth of content as The Wayback Machine.

Is this a vital project or a waste of valuable storage space? Are there ethical questions surrounding the unhibited archiving of so many sites, or are there taking a virtual photograph of events? Let us know in the comments below.

By Daniel Price

Follow Me!

Daniel Price

Daniel is a Manchester-born UK native who has abandoned cold and wet Northern Europe and currently lives on the Caribbean coast of Mexico. A former Financial Consultant, he now balances his time between writing articles for several industry-leading tech (CloudTweaks.com & MakeUseOf.com), sports, and travel sites and looking after his three dogs.
Follow Me!

Sorry, comments are closed for this post.

Comics

At CloudTweaks, we're plugged into the cloud, the internet of things and all that the web has to offer. From wearable technology, to mobile computing, cloud computing and big data, CloudTweaks is your source for updates and news on the most innovative technology.

Popular

Top Viral Impact

Cloud Infographic – The Internet Of Things In 2020

Cloud Infographic – The Internet Of Things In 2020

Cloud Infographic –  The Internet Of Things In 2020 The growing interest in the Internet of Things is amongst us and there is much discussion. Attached is an archived but still relevant infographic by Intel which has produced a memorizing snapshot at how the number of connected devices have exploded since the birth of the…

The Industries That The Cloud Will Change The Most

The Industries That The Cloud Will Change The Most

The Industries That The Cloud Will Change The Most Cloud computing is rapidly revolutionizing the way we do business. Instead of being a blurry buzzword, it has become a facet of everyday life. Most people may not quite understand how the cloud works, but electricity is quite difficult to fathom as well. Anyway, regardless of…

5 Ways The Internet of Things Will Drive Cloud Growth

5 Ways The Internet of Things Will Drive Cloud Growth

5 Ways The Internet of Things Will Drive Cloud Growth The Internet of Things is the latest term to describe the interconnectivity of all our devices and home appliances. The goal of the internet of things is to create universal applications that are connected to all of the lights, TVs, door locks, air conditioning, and…

The Lighter Side Of The Cloud – Holiday Photos

The Lighter Side Of The Cloud – Holiday Photos

The Lighter Side Of The Cloud – Holiday Photos Enjoy our weekly comics provided by our talented cartoonists. By David Fletcher About Latest Posts Follow Me!Daniel PriceDaniel is a Manchester-born UK native who has abandoned cold and wet Northern Europe and currently lives on the Caribbean coast of Mexico. A former Financial Consultant, he now…

Featured Sponsors

The Internet of Everything Opens Up The World

The Internet of Everything Opens Up The World

Shaping The World With New Technologies As a connected collection of intelligent objects, the Internet of Everything promises to open up those areas of the world hardest hit by economic, political and agricultural blights. Relatively inexpensive devices, paired with revolutionary energy sources and unprecedented access to information offer great promise to farmers and workers in…

Sponsors

Built For The Cloud vs. Adapting To The Cloud

Built For The Cloud vs. Adapting To The Cloud

Built For The Cloud vs. Adapting To The Cloud It may just sound like semantics, but there is a real difference between something that was “built for” versus “adapted to.” Would you rather buy a house that was “designed for” energy efficiency or one that was “adapted to” be energy efficient? “Built for” describes a…

Placement Opportunities - Find Out!

Established in 2009, CloudTweaks is recognized as one of the leading influencers in cloud computing, big data and internet of things (IoT) information. Our goal is to continue to build our growing information portal, by providing the best in-depth articles, interviews, event listings, whitepapers, infographics and much more.

You can help continue to support our community by social sharing, sponsoring, partnering or contributing to this great educational resource.

Contact

CloudTweaks Media
Phone: 1 (212) 763-0021

Join Our Newsletter