Introduction
In this post we discuss Exploratory Data Analysis (EDA) and how it is used to analyze univariate, bivariate, and multivariate data sets. EDA is the initial investigation of the data before any kind of model is built. Many different techniques can be employed in EDA, and there is no fixed set of rules to follow; it is more of a philosophy, and more an art than a science. The main purpose of EDA is to gain insight into the data and its underlying structure. It also helps identify the variables that are likely to have the greatest impact on the model.
Some of the steps typically involved in EDA are:
- Handling null or erroneous data
- Finding the outliers in the dataset
- Running hypothesis tests on sample data to verify assumptions about the data
- Defining and estimating parameters of the data and finding the associated confidence intervals
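As a rough illustration of the steps above, here is a minimal sketch using only the Python standard library. The temperature readings are made up for the example, dropping missing values and using the 1.5 * IQR rule are just one common choice each, and the confidence interval uses a normal approximation (a t-based interval would be more appropriate for a sample this small). Hypothesis testing is left out to keep the sketch short.

```python
import math
import statistics

# Hypothetical sample: daily temperature readings with one missing value
# and one suspicious reading (made-up data for illustration only).
raw = [21.5, 22.0, None, 20.8, 23.1, 21.9, 55.0, 22.4, 21.2, 22.7]

# Step 1: handle null data. Here we simply drop the missing readings.
values = [v for v in raw if v is not None]

# Step 2: flag outliers with the 1.5 * IQR rule (one common convention).
q1, _, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [v for v in values if v < low or v > high]
clean = [v for v in values if low <= v <= high]

# Step 3: estimate the mean and an approximate 95% confidence interval.
# Assumption: normal approximation (z = 1.96), which is rough for small samples.
mean = statistics.mean(clean)
sem = statistics.stdev(clean) / math.sqrt(len(clean))
ci = (mean - 1.96 * sem, mean + 1.96 * sem)

print("outliers:", outliers)
print("mean:", round(mean, 2), "95% CI:", tuple(round(c, 2) for c in ci))
```

In practice the same steps are usually done with a data-frame library such as pandas; the standard library is used here only to keep the example dependency-free.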
Techniques for EDA
Below are a few techniques that can be used in EDA:
- Univariate analysis of data (discussed in the section below)
- Bivariate analysis of data (discussed in the section below)
- Multivariate data visualization (discussed in the section below)
- Covariance and Correlation between different features in a dataset
- Predictive modeling, for example with linear regression
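To make a few of these techniques concrete, the sketch below computes a univariate summary, then the sample covariance and Pearson correlation between two features, and finally fits a simple least-squares line. The `hours` and `score` values are invented for illustration, and the helper functions `covariance`, `correlation`, and `fit_line` are hypothetical names written out by hand so that the arithmetic is visible.

```python
import statistics

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    slope = covariance(xs, ys) / statistics.variance(xs)
    intercept = statistics.mean(ys) - slope * statistics.mean(xs)
    return slope, intercept

# Made-up data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
score = [52, 55, 61, 64, 68]

# Univariate analysis: summarize one feature on its own.
summary = {
    "mean": statistics.mean(score),
    "median": statistics.median(score),
    "stdev": statistics.stdev(score),
}

# Bivariate analysis: how do the two features move together?
cov = covariance(hours, score)
r = correlation(hours, score)

# Simple predictive model: a straight line through the data.
slope, intercept = fit_line(hours, score)

print(summary)
print("covariance:", cov, "correlation:", round(r, 4))
print("score ~", round(slope, 2), "* hours +", round(intercept, 2))
```

A correlation close to 1 here suggests a strong linear relationship, which is the kind of signal that would justify trying a linear model on these two features.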
EDA in Univariate, Bivariate and Multivariate Data
Reference:
- https://en.wikipedia.org/wiki/Exploratory_data_analysis#:~:text=In%20statistics%2C%20exploratory%20data%20analysis,modeling%20or%20hypothesis%20testing%20task.
- https://www.statisticshowto.com/probability-and-statistics/data-analysis/
- https://www.analyticsvidhya.com/blog/2020/04/beginners-guide-exploratory-data-analysis-text-data/