Mini Program 20 - Placement data analysis using Spark

September 05, 2021

Apache Spark is an open-source unified analytics engine for large-scale data processing. It is often cited as running up to 100 times faster than Hadoop MapReduce for in-memory workloads, which is why it is preferred for big data analysis. Spark's commands are a bit different from those of pandas. In this analysis we aim to familiarize you with common Spark commands that can prove handy during exploratory data analysis. We use the Placement dataset from Kaggle to compare pandas and Spark operations. Watch the video to know more.
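To give a feel for how the two libraries differ, here is a minimal sketch that runs the same small exploration in pandas and in PySpark. It is not the code from the video: the column names (`gender`, `status`, `salary`) and the four sample rows are assumptions made for illustration, and the real Kaggle file has more columns. Each function imports its library inside the function so that either half can be tried on its own.

```python
# Hypothetical sample rows; column names and values are assumptions.
ROWS = [
    {"gender": "M", "status": "Placed", "salary": 270000.0},
    {"gender": "F", "status": "Placed", "salary": 250000.0},
    {"gender": "M", "status": "Not Placed", "salary": None},
    {"gender": "F", "status": "Placed", "salary": 300000.0},
]


def pandas_summary(rows):
    """Average salary of placed students per gender, using pandas."""
    import pandas as pd  # imported here so this half runs without Spark

    df = pd.DataFrame(rows)
    print(df.head())          # first rows
    print(df.dtypes)          # column types
    print(df.isnull().sum())  # missing values per column

    placed = df[df["status"] == "Placed"]
    return placed.groupby("gender")["salary"].mean().to_dict()


def spark_summary(rows):
    """The same steps with the PySpark DataFrame API."""
    from pyspark.sql import SparkSession, functions as F

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("placement-eda")
        .getOrCreate()
    )
    try:
        data = [(r["gender"], r["status"], r["salary"]) for r in rows]
        sdf = spark.createDataFrame(
            data, schema="gender string, status string, salary double"
        )
        sdf.show(5)        # roughly df.head()
        sdf.printSchema()  # roughly df.dtypes

        # Roughly df.isnull().sum(): count the null entries in each column.
        sdf.select(
            [F.count(F.when(F.col(c).isNull(), 1)).alias(c) for c in sdf.columns]
        ).show()

        placed = sdf.filter(F.col("status") == "Placed")
        result = (
            placed.groupBy("gender")
            .agg(F.avg("salary").alias("avg_salary"))
            .collect()
        )
        return {row["gender"]: row["avg_salary"] for row in result}
    finally:
        spark.stop()
```

Calling `pandas_summary(ROWS)` and `spark_summary(ROWS)` should give the same per-gender averages. The main difference to notice is that pandas evaluates each step immediately, while Spark only runs the work when an action such as `show()` or `collect()` is called.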

#CodeWithUs to find out more and do it yourself!

You can find the code at Python Code - GitHub
