CDOSS Certificate

Big Data Machine Learning with Apache Spark  

Profiles suited to preparing for this certification:

Data scientist, data engineer, full-stack developer, web developer, business intelligence consultant, big data consultant, and others with adequate algorithmic skills

Overall knowledge required to pass this certification:

+ Understand Apache Spark architecture and data management

+ Use basic Apache Spark functionality with Python:

– Extract, Transform, Load (ETL) with PySpark

– Spark SQL

– Scalable Data Science

– Machine learning (basic notions) with MLlib and ML

Detailed preparation plan:

+ Hadoop Architecture and MapReduce

+ Apache Spark scalability

+ Apache Spark architecture

+ Resilient Distributed Dataset (RDD) and DataFrame

+ Spark SQL

+ Extract, Transform, Load (ETL) with Spark

+ Basic notions of machine learning: supervised learning (example: decision tree) and unsupervised learning (example: K-means)

  • Machine learning with RDD (MLlib with PySpark)
  • Machine learning with DataFrame (ML with PySpark)

CDOSS Association

(Compliance for Data Open Source Software)

📌 Association CDOSS, ZI de Franchepré, Centre d’activités Économiques de Franchepré
54240 JOEUF (FRANCE)

✉  contact@cdoss.org