Apache Spark Hands-On Training

Want to learn Spark fast, practice it, and get off to a flying start?

25-26 May 2016 (14-21h)
Location: Golden Tulip Brussels Airport (Diegem)
Presented in English by Geert Van Landeghem
Price: 1250 EUR (excl. 21% VAT)

This event has already taken place. Please check out the List of Upcoming Seminars, or send us an email.

Why do we organise this workshop about Apache Spark?

Big Data is the hype of the moment in ICT and marketing. Since its inception in 2007, Apache Hadoop has been looked at as the de facto standard for the storage and processing of big data volumes in batch.

But every technology has its limitations, and this is no different for Hadoop: it is batch-oriented and the MapReduce framework is too limited for handling all types of data analysis within the same technology stack.

As the volume and speed of data generation keep increasing, so does the need for faster data processing and analysis to meet the expectations of end users.

IBM calls Apache Spark the "most important new open source project in a decade"

Apache Spark solves the problem of speed and versatility by offering an "open source data analytics cluster computing framework". Spark was developed in 2009 at the AMPLab (Algorithms, Machines, and People Lab) of the University of California, Berkeley, and donated to the open source community in 2010. It is faster than Hadoop MapReduce, in some cases up to 100 times faster, and it offers a framework that supports different types of data analysis within the same technology stack: fast interactive queries, streaming analysis, graph analysis and machine learning. During this two-day hands-on workshop, we discuss the theory and practice of several data analysis applications.

Who should attend this workshop?

This workshop is mainly aimed at developers, data analysts and data scientists who want to know more about Apache Spark. This course uses a hands-on approach to teach you the basics of Spark and give you a flying start.

You get an introduction to all Spark components from the perspective of the "data developer". Some experience with programming is necessary to get the most out of this course.

You work through the exercises on your own laptop in Scala (unfortunately, PySpark, the Spark Python API, still gives problems). The exercises range from easy to complex, gradually adding functionality.

We also offer this training as an in-house course for a minimum of 5 people from your company.


Questions about this? Interested but you can't attend? Send us an email!