This foundation course is designed to develop a basic understanding of Big Data storage and processing concepts for unstructured, semi-structured and structured data using the core technologies of Hadoop and Spark. Delivered over a period of 3 days, the course covers the basics of core Hadoop ecosystem components such as HDFS, Sqoop, Hive, HBase and MapReduce, as well as the fundamentals of the Apache Spark framework, including Spark core APIs such as RDDs. A basic understanding of Linux and the Java programming language is desirable as a prerequisite, but both can also be picked up during the course.
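To give a flavor of the RDD-based hands-on work, below is a minimal word-count sketch using Spark's Java RDD API; the class name, sample data and the local master setting are illustrative only and not taken from the course material.

    import java.util.Arrays;
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import scala.Tuple2;

    public class RddWordCount {
        public static void main(String[] args) {
            // "local[*]" is for illustration; on a real HDP/AWS cluster the
            // master would typically be provided by YARN at submit time.
            SparkConf conf = new SparkConf()
                    .setAppName("RddWordCount")
                    .setMaster("local[*]");
            try (JavaSparkContext sc = new JavaSparkContext(conf)) {
                // Build an RDD from an in-memory list of sample lines.
                JavaRDD<String> lines = sc.parallelize(
                        Arrays.asList("big data basics", "big spark basics"));
                // Classic word count: split lines into words, pair each
                // word with 1, then sum the counts per word.
                lines.flatMap(line -> Arrays.asList(line.split(" ")).iterator())
                     .mapToPair(word -> new Tuple2<>(word, 1))
                     .reduceByKey(Integer::sum)
                     .collect()
                     .forEach(t -> System.out.println(t._1() + ": " + t._2()));
            }
        }
    }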
The course will be delivered as a mix of theory and hands-on sessions, with participants performing exercises on each topic alongside the instructor. By the end of the course, participants will be familiar with Big Data concepts and with the Hadoop and Spark tools and APIs mentioned above, building a foundation for advanced learning of Big Data frameworks and technologies. The course will be delivered on a Hortonworks Data Platform (HDP) cluster running on the AWS cloud.
Target audience - Database designers, developers, architects, data analysts, and data engineering and data science professionals.