How Oxylabs delivers data scraping solutions pro bono to help people hunt for bad actors

Data scraping solutions

When people hear the term data scraping, their first thought is often about how companies use this technology for competitive reasons, specifically to pull publicly available data from millions of websites in order to stay competitive on pricing or to maintain market awareness. This could include monitoring a competitor’s inventory, especially in these times of supply chain malaise. But data scraping has other uses, too, and one of these is helping to make people’s internet experience safer.

There is, sadly, a great deal of offensive material online. The internet, after all, is a technology with no filters other than those put up by individual operators. Instagram, YouTube, and Twitter, for example, all have rules about content, and they routinely ban offensive material, but even they cannot keep track of all of it, and they rely on users to bring issues to their attention. Patrolling for offensive material is made even more difficult when it comes to images, since they tend to slip through text-based filters.

Illegal content detection

Public data scraping solutions provider Oxylabs has contributed to the cleanup effort. As one example, described in more detail in a blog post available on their website, they entered a pro bono partnership with the Communications Regulatory Authority of Lithuania (RRT) after winning a hackathon that focused on automating the detection of illegal content, such as child sexual abuse material or pornography, specifically in the Lithuanian IP address space.

Their blend of web scraping technology and AI-driven recognition tools made possible what had previously been impossible: monitoring most sites on the Lithuanian web, scanning thousands of pages and searching for potentially harmful images. Once such images are located, the technology forwards them to specialists for review. The Oxylabs-created tool reports the detected images to a hotline that had, up until then, depended solely on voluntary reports. By monitoring the web in the background, the tool makes the process more proactive and the flow of reports constant, so it depends less on the changing habits and research styles of volunteers.
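
The scrape, recognize, and forward-for-review flow described above can be sketched roughly as follows. This is an illustrative Python sketch only, not Oxylabs' actual tool: the names `Report` and `find_suspect_images`, the `classify` callable, and the 0.8 threshold are all assumptions, and both the scraper and the image-recognition model are stubbed out.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple


@dataclass(frozen=True)
class Report:
    """One flagged image, queued for a human specialist to review."""
    page_url: str
    image_url: str
    score: float


def find_suspect_images(
    pages: Iterable[Tuple[str, Sequence[str]]],
    classify: Callable[[str], float],
    threshold: float = 0.8,  # assumed cut-off, not a documented value
) -> List[Report]:
    """Score every image found on the scraped pages and collect the ones
    at or above the threshold.

    `pages` is assumed to be the scraper's output: pairs of
    (page URL, image URLs found on that page).
    `classify` is a hypothetical stand-in for an image-recognition model
    that returns a score between 0 and 1.
    """
    reports: List[Report] = []
    seen = set()
    for page_url, image_urls in pages:
        for image_url in image_urls:
            if image_url in seen:
                continue  # score each distinct image only once
            seen.add(image_url)
            score = classify(image_url)
            if score >= threshold:
                reports.append(Report(page_url, image_url, score))
    return reports
```

In a sketch like this, the returned list would be handed to human reviewers rather than acted on automatically, which mirrors the article's point that detected images go to specialists for review.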

This type of technology has great potential for the global internet, too. With the addition of AI, retailers and manufacturers can use it to identify and track counterfeit goods and the companies that traffic in them.

Searching for insider risk

Such developments go a long way towards turning the tables on cybercriminals of all types. A great deal of cybersecurity is based on preventing attacks and searching for insider risk. But with such an elusive and shapeless criminal element trafficking in data and stolen goods, it becomes necessary to go out and search for the offending material proactively, with the same energy and drive that goes into reactive searches and mitigation.

Data scraping can search publicly visible websites with the same persistence that spammers and other dark-side actors have always used. The result is a force of its own: a collection of bots constantly on the hunt for illegal material.

Data gathering solutions

In this way, the technologies made available by Oxylabs represent a vital new approach to cybersecurity overall, a proactive style of hunting, to complement the more traditional forensic style of fixing. Established in 2015, Oxylabs describes itself as a premium proxy and public web data acquisition solution provider, which enables its clients, companies of all sizes, to fully leverage the power of big data. Its tradition of constant innovation, backed up by a large patent portfolio, and a focus on ethics have allowed Oxylabs to become a global leader in the web data acquisition industry and forge close ties with dozens of Fortune Global 500 companies. In 2022, Oxylabs was named the fastest-growing public data gathering solutions company in Europe in the Financial Times’ FT 1000 list.

For more information about Oxylabs, you can visit them here.

By Steve Prentice
