Showing posts from September, 2021

Mini Program 20 - Placement data analysis using Spark

Apache Spark is an open-source unified analytics engine for large-scale data processing. It is often cited as running up to 100 times faster than Hadoop MapReduce for in-memory workloads, which is why it is a popular choice for big data analysis. The commands used in Spark are a bit different from those in pandas. In this analysis we aim to familiarize you with common Spark commands that can come in handy during exploratory data analysis. We have used the Placement dataset from Kaggle to compare pandas and Spark operations side by side. Watch the video to learn more. #CodeWithUs to find out more and do it yourself! You can find the code at Python Code - GitHub
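To give a feel for how the two APIs differ, here is a minimal sketch of the same filter-and-aggregate step written once in pandas and once in PySpark. It is not the code from the GitHub repository: the column names (`gender`, `degree_pct`, `status`), the toy rows, and the function names are all assumptions made up for illustration, and the Spark half assumes `pyspark` is installed and can start a local session.

```python
import pandas as pd

# Hypothetical toy rows for illustration only, NOT taken from the Kaggle
# Placement dataset. Column names are assumptions as well.
ROWS = [
    ("M", 67.0, "Placed"),
    ("F", 79.3, "Placed"),
    ("M", 52.0, "Not Placed"),
    ("F", 85.8, "Placed"),
]
COLUMNS = ["gender", "degree_pct", "status"]


def pandas_eda(rows=ROWS):
    """Average degree percentage of placed students per gender, in pandas."""
    pdf = pd.DataFrame(rows, columns=COLUMNS)
    placed = pdf[pdf["status"] == "Placed"]          # boolean-mask filter
    return placed.groupby("gender")["degree_pct"].mean().to_dict()


def spark_eda(rows=ROWS):
    """The same computation in PySpark (requires pyspark to be installed)."""
    # Imported here so the pandas half still works without pyspark.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("placement-eda-sketch")
        .getOrCreate()
    )
    try:
        sdf = spark.createDataFrame(rows, schema=COLUMNS)
        result = (
            sdf.filter(F.col("status") == "Placed")   # filter() instead of a mask
            .groupBy("gender")                        # groupBy, camelCase
            .agg(F.avg("degree_pct").alias("avg_pct"))
            .collect()                                # triggers the computation
        )
        return {row["gender"]: row["avg_pct"] for row in result}
    finally:
        spark.stop()


if __name__ == "__main__":
    print("pandas:", pandas_eda())
    print("spark: ", spark_eda())
```

The main thing to notice is that pandas evaluates each line immediately, while Spark only builds up a plan and does the work when an action such as `collect()` or `show()` is called.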