How Oxylabs delivers data scraping solutions pro-bono to help people hunt for bad actors

Data scraping solutions

When people hear the term data scraping, their first thought is often about how companies use this technology for competitive reasons, specifically to pull publicly available data from millions of websites in order to stay competitive on pricing or market awareness. This could include monitoring a competitor’s inventory, especially in these times of supply chain malaise. But data scraping has other uses, too, and one of these is helping to make people’s internet experience safer.

There is, sadly, a great deal of offensive material online. The internet, after all, is a technology with no filters other than those put up by individual operators. Instagram, YouTube, and Twitter, for example, all have rules about content, and they routinely ban offensive material, but even they cannot keep track of all of it, and they rely on users to bring issues to their attention. Patrolling for offensive material is made even more difficult when it comes to images, since they tend to slip through text-based filters.

Illegal content detection

Public data scraping solutions provider Oxylabs has contributed to the cleanup effort. As one example, described in more detail in a blog post on its website, the company entered a pro bono partnership with the Communications Regulatory Authority of Lithuania (RRT) after winning a hackathon focused on automating the detection of illegal content, such as child sexual abuse material or pornography, specifically in the Lithuanian IP address space.

Their blend of web scraping technology and AI-driven recognition tools made the previously impossible possible: monitoring most sites on the Lithuanian web, scanning thousands of pages and searching for potentially harmful images. Once such images are located, the technology forwards them to specialists for review. The Oxylabs-created tool reports the detected images to a hotline that had, up until then, depended solely on voluntary reports. By monitoring the web in the background and reporting continuously, the tool makes the process more proactive and less dependent on the changing habits and research styles of volunteers.
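The scan-classify-forward flow described above can be illustrated with a minimal Python sketch. It is not Oxylabs' actual tool: `ImageCollector`, `Flag`, `scan_page`, the `score_image` callable, and the 0.8 threshold are all hypothetical names and values chosen for illustration, and the image recognition model is stood in for by any function that returns a score between 0 and 1. Fetching pages and submitting reports to a hotline are left out.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from typing import Callable
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collects absolute image URLs from <img src="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.image_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.image_urls.append(urljoin(self.base_url, src))


@dataclass
class Flag:
    """One image queued for human review (hypothetical record shape)."""
    page_url: str
    image_url: str
    score: float


def scan_page(
    page_url: str,
    html: str,
    score_image: Callable[[str], float],
    threshold: float = 0.8,  # assumed cut-off, not taken from the article
) -> list[Flag]:
    """Return the images on a page whose score meets the threshold."""
    collector = ImageCollector(page_url)
    collector.feed(html)
    flags = []
    for image_url in collector.image_urls:
        score = score_image(image_url)
        if score >= threshold:
            flags.append(Flag(page_url, image_url, score))
    return flags
```

In this sketch the scanner only produces a review queue. The decision about whether an image is actually harmful stays with the specialists, as in the process the article describes.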

This type of technology has great potential for the global internet, too. With some additional AI, it can be used by retailers and manufacturers to identify and track counterfeit goods and the companies that traffic in them.

Searching for insider risk

Such developments go a long way towards turning the tables on cybercriminals of all types. A great deal of cybersecurity is based on preventing attacks and searching for insider risk. But with such an elusive and shapeless criminal element trafficking in data and stolen goods, it becomes necessary to go out and search proactively for the offending material, with the same energy and drive that goes into reactive searches and mitigation.

Data scraping has the capacity to search publicly visible websites. It can apply the same persistence that spammers and other dark-side actors have always used, creating a force of bots constantly on the hunt for illegal material.

Data gathering solutions

In this way, the technologies made available by Oxylabs represent a vital new approach to cybersecurity overall, a proactive style of hunting, to complement the more traditional forensic style of fixing. Established in 2015, Oxylabs describes itself as a premium proxy and public web data acquisition solution provider, which enables its clients, companies of all sizes, to fully leverage the power of big data. Its tradition of constant innovation, backed up by a large patent portfolio, and a focus on ethics have allowed Oxylabs to become a global leader in the web data acquisition industry and forge close ties with dozens of Fortune Global 500 companies. In 2022, Oxylabs was named the fastest-growing public data gathering solutions company in Europe in the Financial Times’ FT 1000 list.

For more information about Oxylabs, visit the company’s website.

By Steve Prentice

Steve Prentice is a project manager, writer, speaker and expert on productivity in the workplace, specifically the juncture where people and technology intersect. He is a senior writer for CloudTweaks.
