Preserving The Cloud: The Wayback Machine

News broke this week that The Wayback Machine has now archived 400 billion webpages. The Wayback Machine enables users to see how a website looked in the past, and having launched in 1996, the project now requires an incredible 5 petabytes of storage to maintain all its data.

The figures behind the project are impressive; The Wayback Machine’s database is queried over 1,000 times every second by over 500,000 people a day – making Archive.org the 250th most popular site on the entire internet.

The company’s mission is equally impressive, and it is spelled out explicitly on its website: “Most societies place importance on preserving artifacts of their culture and heritage. Without such artifacts, civilization has no memory and no mechanism to learn from its successes and failures. Our culture now produces more and more artifacts in digital form. The Archive’s mission is to help preserve those artifacts and create an Internet library for researchers, historians, and scholars.”

Brewster Kahle launched the project in 1996 at the same time he started the now-famous web crawling company Alexa Internet. The project has its roots in the development of software that could crawl and download all publicly accessible World Wide Web pages, the Gopher hierarchy, the Netnews bulletin board system, and downloadable software. The archived content itself wasn’t available until 2001, but by 1999 the archive had already expanded its collections to include texts, audio, moving images and software.

Uses of such an archive are widespread. There is a day-to-day practical use, as evidenced when The Wayback Machine provided access to important Federal Government sites that went dark during the Federal Government shutdown in the United States. There is also an educational aspect, with important lessons to be learned from the vast amount of big data stored within its archives. Finally, there is a historical aspect, as the development of our internet has been preserved for future generations to enjoy.

The cloud-based project has not been without its controversies. In 2012 China restored access to the database after blocking it for several years, while in the USA an activist sued the organisation for $100,000 after claiming that archiving her site breached her terms of service. The dispute was ultimately settled out of court.

The success of the project has started to attract imitators. Online companies such as Archive.It, Freezepage, and iTools all offer similar services, but none of them can offer the same quality and depth of content as The Wayback Machine.

Is this a vital project or a waste of valuable storage space? Are there ethical questions surrounding the uninhibited archiving of so many sites, or is the project simply taking a virtual photograph of events? Let us know in the comments below.

By Daniel Price

About Daniel Price

Daniel is a Manchester-born UK native who has abandoned cold and wet Northern Europe and currently lives on the Caribbean coast of Mexico. A former Financial Consultant, he now balances his time between writing articles for several industry-leading tech (CloudTweaks.com & MakeUseOf.com), sports, and travel sites and looking after his three dogs.

