The Hadoop Ecosystem: a Practical Workshop

Explore and understand the power of the Hadoop ecosystem

3-4 July 2012 (10.00h - 18.00h)
Location: Golden Tulip Brussels Airport (Diegem)
Presented in English by
Price: 1050 EUR (excl. 21% VAT)


This event has already taken place; please check out the List of Upcoming Seminars.



Full Programme:
9.30h - 10.00h
Registration, coffee/tea and croissants
10.00h (DAY 1)
Introduction

During the introduction, we will dive deeper into the concepts used in the Big Data world. We'll explore the history behind Hadoop and get a feeling for what can be done with a Hadoop environment.

Additionally, we will get our virtual machine up and running so we can use it during the exercises later on. We will introduce Hue, a web-based desktop that serves as the gateway to our cluster.

Hadoop Storage Technologies

The first main topic we will cover is the Hadoop Distributed File System (HDFS). We will go into detail on how it differs from a "regular" file system and what the consequences are of choosing this approach.
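
To give a concrete idea of how applications talk to HDFS, here is a minimal sketch that writes and reads a file through the Java FileSystem API. This is an illustration rather than course material: the path is hypothetical, and it assumes the Hadoop client libraries are on the classpath and that core-site.xml points fs.defaultFS at your cluster.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsHello {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();   // picks up core-site.xml from the classpath
            FileSystem fs = FileSystem.get(conf);       // HDFS if configured, otherwise the local FS

            // Hypothetical path used purely for illustration.
            Path path = new Path("/user/workshop/hello.txt");

            // Write a small file; HDFS takes care of block placement and replication.
            try (FSDataOutputStream out = fs.create(path, true)) {
                out.writeUTF("Hello HDFS");
            }

            // Read the same file back and print its contents.
            try (FSDataInputStream in = fs.open(path)) {
                System.out.println(in.readUTF());
            }

            fs.close();
        }
    }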

The second topic deals with HBase, which serves as a distributed key-value store on top of HDFS. We'll go deeper into why you would need HBase when you already have HDFS, also touching on HBase data modelling.

During the exercises we will install Hadoop and get a feel for the file system, and you'll get hands-on with HBase to perform simple operations.
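
As a taste of the kind of simple operations you will try with HBase, below is a minimal sketch using the classic Java client API of the HBase 0.9x era (roughly matching this workshop's timeframe). The table name, column family, and values are illustrative assumptions; the sketch presumes the table already exists and that hbase-site.xml is on the classpath.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseHello {
        public static void main(String[] args) throws Exception {
            // Reads hbase-site.xml from the classpath to locate ZooKeeper and the cluster.
            Configuration conf = HBaseConfiguration.create();

            // Assumes a table 'workshop' with column family 'cf' already exists.
            HTable table = new HTable(conf, "workshop");

            // Store one cell: row key, column family, qualifier, value.
            Put put = new Put(Bytes.toBytes("row1"));
            put.add(Bytes.toBytes("cf"), Bytes.toBytes("greeting"), Bytes.toBytes("Hello HBase"));
            table.put(put);

            // Read the same cell back.
            Result result = table.get(new Get(Bytes.toBytes("row1")));
            byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("greeting"));
            System.out.println(Bytes.toString(value));

            table.close();
        }
    }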

18.00h (DAY 1)
End of Day 1 of this Workshop
9.30h - 10.00h (DAY 2)
Coffee/Tea and croissants
10.00h (DAY 2)
Hadoop Processing Technologies

Besides the storage of data, this second main part deals with processing the stored information. MapReduce will be explained as the main programming model used to process large amounts of information.
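
To make the model concrete, here is the canonical "word count" example in Java: the map phase emits (word, 1) pairs and the reduce phase sums the counts per word. This is a sketch only; the input and output paths are passed as arguments, and it assumes the MapReduce client libraries of the Hadoop 1.x era (newer versions prefer Job.getInstance over the Job constructor).

    import java.io.IOException;
    import java.util.StringTokenizer;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

        // Map phase: emit (word, 1) for every word in a line of input.
        public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();

            public void map(Object key, Text value, Context context)
                    throws IOException, InterruptedException {
                StringTokenizer itr = new StringTokenizer(value.toString());
                while (itr.hasMoreTokens()) {
                    word.set(itr.nextToken());
                    context.write(word, ONE);
                }
            }
        }

        // Reduce phase: sum the counts emitted for each word.
        public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            public void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable val : values) {
                    sum += val.get();
                }
                context.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = new Job(new Configuration(), "word count");
            job.setJarByClass(WordCount.class);
            job.setMapperClass(TokenizerMapper.class);
            job.setCombinerClass(IntSumReducer.class);
            job.setReducerClass(IntSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }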

We will also show that you don't need to know Java to run MapReduce jobs: tools like Hive and Pig greatly simplify this task.

The exercises will process the same datasets with both Hive and Pig, highlighting the similarities and differences between the two technologies.
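
As an example of how a single HiveQL statement can replace a hand-written MapReduce job, the sketch below runs such a query from Java over JDBC. It is only an illustration: the host, port, table name, and query are assumptions, and the driver class and URL depend on your Hive version (older HiveServer1 deployments use org.apache.hadoop.hive.jdbc.HiveDriver and jdbc:hive://).

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class HiveQueryExample {
        public static void main(String[] args) throws Exception {
            // HiveServer2 JDBC driver; adjust for your Hive version.
            Class.forName("org.apache.hive.jdbc.HiveDriver");

            // Hypothetical connection details; adjust host, port, and database.
            Connection con = DriverManager.getConnection(
                    "jdbc:hive2://localhost:10000/default", "", "");
            try (Statement stmt = con.createStatement()) {
                // Hive compiles this query down to MapReduce jobs behind the scenes.
                ResultSet rs = stmt.executeQuery(
                        "SELECT word, COUNT(*) AS cnt FROM words GROUP BY word");
                while (rs.next()) {
                    System.out.println(rs.getString("word") + "\t" + rs.getLong("cnt"));
                }
            } finally {
                con.close();
            }
        }
    }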

Importing and Exporting data

Most likely, you will want to integrate your Hadoop environment with your existing infrastructure, which means importing and exporting information from and to your existing systems.

This part of the course deals with importing and exporting data from existing systems and relational databases, using tools like Sqoop and Flume.
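
As a rough illustration of what such an import might look like, the sketch below launches a Sqoop import of one relational table into HDFS by shelling out to the sqoop command line from Java. The connection string, credentials, table name, and target directory are purely hypothetical, and it assumes sqoop is installed and on the PATH.

    import java.util.Arrays;
    import java.util.List;

    public class SqoopImportExample {
        public static void main(String[] args) throws Exception {
            // Hypothetical import: copy the 'orders' table from MySQL into HDFS.
            List<String> command = Arrays.asList(
                    "sqoop", "import",
                    "--connect", "jdbc:mysql://dbhost/sales",
                    "--username", "workshop",
                    "--table", "orders",
                    "--target-dir", "/user/workshop/orders");

            // Run the command, forwarding its output, and wait for it to finish.
            Process process = new ProcessBuilder(command).inheritIO().start();
            int exitCode = process.waitFor();
            System.out.println("sqoop exited with code " + exitCode);
        }
    }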

Managing a Hadoop Cluster

So you've got your cluster up and running, but how do you manage it and make sure it stays up? We'll introduce Cloudera Manager, a tool for managing your Hadoop environment. For monitoring, we'll have a look at Ganglia.

18.00h
End of this Workshop

The number of participants is limited to 16 to guarantee optimal interaction and an effective learning experience.



Questions about this? Interested but you can't attend? Send us an email!