Integrating Apache Spark with H2O

Integrating Apache Spark with H2O is surprisingly simple once you know the steps. In this post, we'll apply several data transformations with Spark, feed the transformed data into the H2O AutoML algorithm, and finally use the best H2O model to make a prediction. The entire workflow will result in a flow such as this:

We encourage you to open this notebook and click Import Notebook to execute it in your own Databricks Community Edition account (it's free, and you can sign up here). Remember, the objective of this notebook is to leverage the advantages of both Spark and H2O to solve your specific use case.

Also, make sure you check out our Apache Spark and Databricks Certified Spark course offerings at the Datapao training site.