HITInfrastructure

Cloudera Releases Machine Learning Platform Built on Open Source

Cloudera launches self-service platform based on open source and machine learning technology.

By Elizabeth O'Dowd

Cloudera has announced the general availability of Cloudera Data Science Workbench, a self-service data analytics platform built on open source technologies and machine learning.

The platform operates directly in the web browser with Python, R, and Scala, giving users the ability to download and experiment with libraries and frameworks in customizable project environments. Cloudera Data Science Workbench is secure and compliant, with support for Hadoop authentication, authorization, encryption, and governance.

"We are entering the golden age of machine learning and it's all about the data. However, data scientists continue to struggle to build and test new analytics projects as fast as they would like, particularly in large scale environments," Cloudera Products Senior Vice President Charles Zedlewski said in a statement. "The Data Science Workbench is a self-service tool that accelerates the ability to build, scale and deploy machine learning solutions."

"This means that data scientists now have the freedom to share, collaborate and manage their data in a way that best suits them and their enterprise, resulting in an easier and faster path to production."

Cloudera Data Science Workbench integrates with many existing deep learning frameworks, including BigDL, a deep learning library for Apache Spark that was open sourced by Intel. BigDL works directly within Data Science Workbench. It is built to run on distributed Spark/Hadoop infrastructure, with performance optimized for Intel Xeon processors.

Integrating BigDL into Data Science Workbench lets organizations apply deep learning libraries and techniques on CPU architecture without adding hardware or maintaining separate environments. Together, BigDL and Data Science Workbench give organizations a way to create native Spark data science pipelines and integrate them with other Spark/Hadoop components.

"Enterprise customers require a cohesive platform to scale their analytics solutions and maximize their investments," Michael Greene, Intel VP and General Manager of System Technologies and Optimization in the Software and Services Group, said in a statement. "BigDL's native integration with Apache Spark brings the world of deep learning to the Apache Spark ecosystem and higher value to enterprise customers."

"The BigDL framework will help enterprise customers better utilize existing investments to build their analytics capabilities with optimized performance on Intel architecture."

Healthcare organizations need better big data analytics tools to handle the unstructured data that connected medical and Internet of Things (IoT) devices collect continuously. Artificial intelligence (AI) tools give organizations a way to analyze the unstructured data stored in their repositories and gain insight into patient health.

Open source software appeals to healthcare organizations because they face budget constraints and need new, more advanced technology to integrate with their existing health IT infrastructure.

Open source code is source code made freely available to any developer for building software. Developers receive the code under a free license and retain the right to study and change it to suit their individual needs.

Many software vendors, including major companies such as Google, provide open source code to developers. Open source software is distributed under licenses such as the Apache License 2.0, which make the entire source code available to the public for free.

Open source code is important to the advancement of healthcare machine learning and artificial intelligence because it allows organizations to customize the source code and share innovations and improvements with other organizations.

Late last year, Health Catalyst announced the launch of its open source analytics solution to encourage healthcare organizations that are already implementing successful AI solutions for predictive analytics to make their solutions widely available.

Health Catalyst’s healthcare.ai makes machine learning accessible to healthcare organizations with limited means to develop their own AI analytics.

The healthcare.ai site is a central location where healthcare professionals can download free algorithms and tools and participate in forum discussions with data scientists and other healthcare professionals to exchange ideas, make requests, and contribute code of their own.

Integrating open source software into healthcare artificial intelligence solutions allows more organizations to participate in building solutions. It also speeds up development, because organizations keep improving their machine learning solutions on the same framework.