Write Once, Run Anywhere: The IoT Machine Learning Shift From Proprietary Technology To Data

The IoT Machine Learning Shift

While early artificial intelligence (AI) programs were one-trick ponies, typically able to excel at only one task, today’s aim is to become a jack of all trades. Or at least, that’s the intention. The goal is to write one program that can solve multi-variant problems without needing to be rewritten when conditions change: write once, run anywhere. Digital heavyweights, notably Amazon, Google, IBM, and Microsoft, are now open sourcing their machine learning (ML) libraries in pursuit of that goal, as competitive pressures shift the focus of differentiation from proprietary technologies to proprietary data.

Machine learning is the study of algorithms that learn from examples and experience rather than relying on hard-coded rules, which do not always adapt well to real-world environments. ABI Research forecasts that ML-based IoT analytics revenues will grow from $2 billion in 2016 to more than $19 billion in 2021, with more than 90% of 2021 revenue attributed to more advanced analytics phases. Yet while ML is an intuitive and organic approach to what was once a very rudimentary way of analyzing data, the ML/AI model creation process itself can be very complex.

The techniques used to develop machine learning algorithms fall under two umbrellas:

  • How they learn: based on the type of input data provided to the algorithm (supervised learning, unsupervised learning, reinforcement learning, and semi-supervised learning)

  • How they work: based on the type of operation, task, or problem performed on I/O data (classification, regression, clustering, anomaly detection, and recommendation engines)

Once the basic principles are established, a classifier can be trained to automate the creation of rules for a model. The challenge lies in learning and implementing the complex algorithms required to build these ML models, which can be costly, difficult, and time-consuming.
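To make the contrast between hand-written rules and learned rules concrete, here is a minimal sketch of supervised classification in plain Python. It is an illustration only, not a method described by any of the companies named in this article: the sensor readings, the 90-degree hand-written rule, and the function names (`hard_coded_rule`, `train_stump`, `accuracy`) are all hypothetical. The "classifier" is a decision stump, the simplest possible learned rule, which picks a single threshold from labeled examples.

```python
# Hypothetical labeled sensor readings: (temperature in degrees C, label),
# where 1 = fault and 0 = normal. The values are invented for illustration.
TRAINING_DATA = [
    (61.0, 0), (64.5, 0), (66.0, 0), (70.2, 0),
    (74.8, 1), (77.1, 1), (80.3, 1), (85.0, 1),
]


def hard_coded_rule(temperature):
    """A fixed, hand-written rule; it never adapts to the data."""
    return 1 if temperature > 90.0 else 0


def train_stump(examples):
    """Learn a threshold rule (a one-level decision tree) from labeled examples.

    Tries the midpoint between each pair of neighboring readings and keeps
    the threshold that misclassifies the fewest training examples.
    """
    values = sorted(t for t, _ in examples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum((1 if t > threshold else 0) != label for t, label in examples)

    return min(candidates, key=errors)


def accuracy(predict, examples):
    """Fraction of examples for which predict(temperature) matches the label."""
    correct = sum(predict(t) == label for t, label in examples)
    return correct / len(examples)


threshold = train_stump(TRAINING_DATA)


def learned_rule(temperature):
    """The rule produced by training, rather than written by hand."""
    return 1 if temperature > threshold else 0


print(f"learned threshold: {threshold:.1f}")
print(f"hard-coded accuracy: {accuracy(hard_coded_rule, TRAINING_DATA):.2f}")
print(f"learned accuracy: {accuracy(learned_rule, TRAINING_DATA):.2f}")
```

On this toy data the learned threshold lands between the highest normal reading and the lowest fault reading, while the hand-written rule misses every fault. Real IoT models involve many more features and far noisier data, which is where the cost and complexity described above come from.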

Engaging the open-source community brings an order-of-magnitude boost to the development and integration of machine learning technologies without the need to expose proprietary data, a trend that Amazon, Google, IBM, and Microsoft swiftly pioneered.

At more than $1 trillion, these four companies have a combined market cap that dwarfs the annual gross domestic product of more than 90% of countries in the world. Each also open sourced its own deep learning library in the past 12 to 18 months: Amazon’s Deep Scalable Sparse Tensor Network Engine (DSSTNE; pronounced “destiny”), Google’s TensorFlow, IBM’s SystemML, and Microsoft’s Computational Network Toolkit (CNTK). And others are quickly following suit, including Baidu, Facebook, and OpenAI.

But this is just the beginning. To take the most advanced ML models used in IoT to the next level (artificial intelligence), modeling and neural network toolsets (e.g., syntactic parsers) must improve. Open sourcing such toolsets is again a viable option, and Google is taking the lead by open sourcing its neural network framework, SyntaxNet, driving the next evolution in IoT from advanced analytics to smart, autonomous machines.

But should others continue to jump on this bandwagon and attempt to shift away from proprietary technology and toward proprietary data? Not all companies own the kind of data that Google collects through Android or Search, or that IBM picked up with its acquisition of The Weather Company’s B2B, mobile, and cloud-based web properties. Fortunately, a proprietary data strategy is not the only route to competitive advantage in data and analytics. As more devices get connected, technology will play an increasingly important role in balancing two demands: generating insight from previously untapped datasets, and deriving value from the highly variable, high-volume data that comes with these new endpoints, at cloud scale and with zero manual tuning.

Collaboration 

Collaborative economics is an important component in the analytics product and service strategies of these four leading digital companies, all of which seek to build a greater presence in IoT and, more broadly, in the convergence of the digital and the physical. But “collaboration” should be placed in context. Once one company open sourced its ML libraries, other companies were forced to release theirs as well. Millions of developers are far more powerful than a few thousand in-house employees. Open sourcing also offers these companies tremendous benefits because they can use the new tools to enhance their own operations. For example, Baidu’s Paddle ML software is being used in 30 different online and offline Baidu businesses ranging from health to financial services.

There are also other areas, beyond the analytics toolsets, where these companies can invest resources. Identity management services, data exchange services, and data chain of custody are three key areas that will be critical to the growth of IoT and the digital/physical convergence. Pursuing ownership of, or proprietary access to, important data has its appeal. But the new opportunities in the IoT landscape will rely on great technology and on the scale these companies possess, in a connected world that will reach hundreds of billions of endpoints in the decades to come.

By Dan Shey
