Mini Program 20 - Placement data analysis using Spark

Apache Spark is an open-source unified analytics engine for large-scale data processing. It is often cited as running up to 100 times faster than Hadoop MapReduce on in-memory workloads, which is why it is preferred for big data analysis. Spark's commands differ somewhat from those of pandas. In this analysis we aim to familiarize you with common Spark commands that come in handy during exploratory data analysis. We use the Placement dataset from Kaggle to compare pandas and Spark operations. Watch the video to learn more.

#CodeWithUs to find out more and do it yourself!

You can find the code at Python Code - GitHub
