Machine Learning Open-Source Tools
We are in the middle of a machine learning, AI and big data renaissance — at least, that’s what we’re calling it.
Seemingly everyone is interested in this technology these days, and for good reason. AI and machine learning bots are projected to power 85% of customer service interactions by 2020.
(Infographic source: pwc.com)
One problem the industry is seeing, however, is that there’s a severe lack of developers and new talent. It’s a problem for the entire development and programming industry, not just Machine Learning. Many companies and brands are vying for new employees, leaving the startups and newer names in a bit of hot water.
Luckily, this can be offset by adopting an open-source development model. More importantly, you can open your projects, future and present, to an even broader development community and audience by making them open source.
Open-source tools allow anyone to contribute to a project and work on bug fixes, new features and new builds. As the maintainer, you can reconcile separate versions, selecting the content and elements you want in an official release. This way, even though there’s a development community behind the project, you still have a great deal of control over its central path. A demonstration of the open-source tool BloodHound at the 2016 DEF CON hacking conference helped throw the doors open to this kind of shared development platform, at least when it comes to AI and machine learning.
Here are six of the best open-source tools developers and scientists can use to do more with machine learning platforms.
The problem with AI systems like Siri, Alexa, Google Assistant and even IBM’s Watson is that they’re all proprietary. In other words, no matter what data is collected and what enhancements are made, progress is not shared with the greater community. That also means if one of those parties makes an incredible leap forward regarding technology and power, only they will benefit.
Worried about this, Elon Musk and his army of investors and partners have helped fund the OpenAI project. It’s a non-profit AI-centric research initiative that aims to advance the technology and systems involved for everyone.
Since its founding, the team’s researchers have been responsible for publishing many insightful papers on advances in AI, and even a few tools of their own. Gym, for example, is a developer toolkit that allows you to compare reinforcement learning algorithms. Universe, another tool created by the team, offers a collection of Gym environments that monitor and measure an AI platform’s general intelligence.
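To make the idea of comparing algorithms on a shared environment concrete, here is a minimal sketch in plain Python. It does not use the Gym library itself; it only mirrors the shape of the classic Gym interface, where `reset()` starts an episode and `step(action)` returns an observation, a reward, a done flag and an info dictionary. The `CoinGuessEnv` environment, the two agent classes and `run_episode` are hypothetical names made up for illustration.

```python
import random


class CoinGuessEnv:
    """Hypothetical toy environment: guess the outcome of a biased coin."""

    def __init__(self, bias=0.8, episode_length=1000, seed=0):
        self.bias = bias                      # probability the coin lands on 1
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return 0                              # single dummy observation

    def step(self, action):
        outcome = 1 if self.rng.random() < self.bias else 0
        reward = 1.0 if action == outcome else 0.0
        self.steps += 1
        done = self.steps >= self.episode_length
        return 0, reward, done, {}


class RandomAgent:
    """Baseline: guesses 0 or 1 uniformly at random and never learns."""

    def __init__(self, seed=1):
        self.rng = random.Random(seed)

    def act(self, observation):
        return self.rng.randint(0, 1)

    def learn(self, observation, action, reward):
        pass


class EpsilonGreedyAgent:
    """Tracks the average reward per action and mostly picks the best one."""

    def __init__(self, epsilon=0.1, seed=2):
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.totals = [0.0, 0.0]
        self.counts = [0, 0]

    def act(self, observation):
        if self.rng.random() < self.epsilon:  # explore occasionally
            return self.rng.randint(0, 1)
        # Untried actions count as infinitely good so each is tried once.
        means = [t / c if c else float("inf")
                 for t, c in zip(self.totals, self.counts)]
        return 0 if means[0] >= means[1] else 1

    def learn(self, observation, action, reward):
        self.totals[action] += reward
        self.counts[action] += 1


def run_episode(env, agent):
    """Run one full episode and return the total reward collected."""
    observation = env.reset()
    total, done = 0.0, False
    while not done:
        action = agent.act(observation)
        observation, reward, done, _info = env.step(action)
        agent.learn(observation, action, reward)
        total += reward
    return total


if __name__ == "__main__":
    for agent in (RandomAgent(), EpsilonGreedyAgent()):
        score = run_episode(CoinGuessEnv(seed=0), agent)
        print(type(agent).__name__, score)
```

Because both agents talk to the environment through the same two methods, swapping one algorithm for another takes a single line, which is the convenience a shared toolkit is meant to provide.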
Char-RNN is a somewhat unique neural net built on Torch and Lua; Torch, the framework underneath it, is supported by the Facebook team. Char-RNN stands for Character Recurrent Neural Network, and it’s designed to predict the next character in a sequence, using the characters that came before it.
It’s a deep learning framework that can be leveraged in many ways. One researcher, Janelle Shane, has been able to create some incredibly fun entertainment projects through the system. Her whole shtick is to bring out the weirdness in AI and foster the personality quirks.
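The next-character prediction task that Char-RNN tackles can be illustrated without a neural network at all. The sketch below is not Char-RNN and not an RNN: it swaps in a simple count-based model that looks at the last few characters and picks whichever character followed them most often in the training text. The function names `train`, `predict_next` and `generate` are made up for this example. A recurrent network learns a far richer notion of context, but the input and output have the same shape.

```python
from collections import Counter, defaultdict


def train(text, order=3):
    """Count which character follows each `order`-character context."""
    counts = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        counts[context][text[i + order]] += 1
    return counts


def predict_next(counts, history, order=3):
    """Return the most frequent follower of the last `order` characters.

    Returns None when that context never appeared in the training text.
    """
    followers = counts.get(history[-order:])
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, seed, length, order=3):
    """Extend `seed` one predicted character at a time."""
    out = seed
    for _ in range(length):
        nxt = predict_next(counts, out, order)
        if nxt is None:
            break
        out += nxt
    return out


if __name__ == "__main__":
    model = train("abcabcabcabc", order=2)
    print(generate(model, "ab", 4, order=2))
```

Feeding each predicted character back in as input, as `generate` does, is also how character-level models produce long stretches of text from a short prompt.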
We couldn’t have an open-source list without a tool or entry from Google, could we? TensorFlow is an open-source ML library and the incumbent AI framework. Its core is written in C++, but most developers work with it through Python. Beyond that, you have access to several experimental APIs in other languages, including Java and Go.
Like the other frameworks here, it is general-purpose, so you can build almost any kind of model within the confines of the system. It’s also one of the most accessible open-source ML tools and has a ton of resources for beginners. If you’re new to these programs, start here. TensorFlow also has an incredibly large and supportive community of developers and researchers behind it, which is always good to tap into in a pinch.
CNTK, also known as Microsoft’s Cognitive Toolkit, is another excellent deep-learning platform for developers and researchers. It’s designed to build and train deep learning algorithms, which in turn let AI systems and machine learning devices learn from data in a way loosely modeled on the human brain. If you want your AI to grow smarter, stronger and more capable, this is a valuable tool.
It’s written in C++, with a Python API available alongside the C++ one, and is highly customizable. With support through GitHub and an active community, you’ll find plenty of help when you need it.
The only downside to CNTK is that it’s in active use at Microsoft and still very much a work in progress, so you’ll probably encounter bugs and software issues. Luckily, the open-source and GitHub support means you don’t have to wait around for the official dev team to iron out the kinks.
Chinese company Baidu — a significant force in the Asian market — is responsible for PaddlePaddle, a capable and advanced AI toolset. Advancements are being made by the team thanks to an incredible AI lab and experienced professionals, including an ex-Stanford professor.
If any of the platforms on this list can give Google and TensorFlow a run for their money, it’s this crew. Paddle stands for PArallel Distributed Deep LEarning. It’s a deep learning platform designed to be efficient, flexible and scalable enough to meet the needs of any project scope. Adding to that full-featured package is the fact that it’s easy to use, making it great for beginners.
The PaddlePaddle team has made a vast selection of resources and guides available for anyone who wants to dive in, including a “getting started” guide.
Amazon’s DSSTNE (pronounced “destiny”) stands for Deep Scalable Sparse Tensor Network Engine. It’s a deep learning library meant to train and run neural networks on GPUs, or graphics hardware. DSSTNE is a response to Google’s open-source TensorFlow platform and includes plenty of support from the retail giant.
DSSTNE was initially created by Amazon’s highly capable engineering team to drive the recommendations system that delivers product suggestions and promotional materials to customers using the retailer’s website. Every time you search for an item and then see it listed as a recommended product on Amazon, that’s thanks to DSSTNE. The tool is also being used to enhance the Echo lineup, backed by the equally impressive Alexa AI.
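The “sparse” in the name matters for recommendations because any one shopper has interacted with only a tiny fraction of the catalog, so a network’s input is almost entirely zeros. The sketch below is not DSSTNE’s API or implementation. It is a plain-Python illustration, with made-up function names and toy weights, of how the first layer of a network can skip the zeros and still produce the same result as the dense calculation.

```python
def dense_forward(x, weights, biases):
    """First-layer output from a full-length 0/1 input vector.

    weights[i][j] connects input item i to hidden unit j.
    """
    out = list(biases)
    for i, xi in enumerate(x):
        for j, w in enumerate(weights[i]):
            out[j] += xi * w
    return out


def sparse_forward(active_items, weights, biases):
    """Same output, but touching only the items the shopper interacted with."""
    out = list(biases)
    for i in active_items:
        for j, w in enumerate(weights[i]):
            out[j] += w
    return out


if __name__ == "__main__":
    # Toy catalog of four items and two hidden units (illustrative numbers).
    weights = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0], [0.0, 4.0]]
    biases = [0.5, -0.5]

    full_vector = [0, 1, 0, 1]   # shopper interacted with items 1 and 3
    active = [1, 3]              # the same information, stored sparsely

    print(dense_forward(full_vector, weights, biases))
    print(sparse_forward(active, weights, biases))
```

With a catalog of millions of items and a handful of active ones per shopper, the sparse version does a few rows of work where the dense version loops over every item.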
Amazon has made the platform open-source, like Google, to boost innovation in the industry and hopefully make some significant advancements in the technology.
By Kayla Matthews
Kayla Matthews is a technology writer dedicated to exploring issues related to the Cloud, Cybersecurity, IoT and the use of tech in daily life.
Her work can be seen on such sites as The Huffington Post, MakeUseOf, and VMBlog. You can read more from Kayla on her personal website.