AutoViz Visualization of the Boston Housing Data Set

We ran AutoViz on the famous Boston Housing data set. Here's what we got...

The Boston Housing data set poses a Regression problem: the Target Variable, "MV" (Median Value), is continuous.
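
For readers who want to reproduce a run like this one, a minimal sketch follows. It assumes the autoviz package is installed and that the data sits in a local CSV file; the file name boston.csv and the two helper functions are hypothetical, and only the filename, sep, depVar, dfte, header and verbose arguments of AutoViz_Class.AutoViz are used.

```python
def autoviz_kwargs(csv_path, target="MV"):
    """Collect the arguments for an AutoViz run (hypothetical helper)."""
    return {
        "filename": csv_path,  # path to a CSV copy of the data (assumed)
        "sep": ",",            # column separator in that file
        "depVar": target,      # the target variable, "MV" in this report
        "dfte": None,          # no in-memory DataFrame; read from the file
        "header": 0,           # first row holds the column names
        "verbose": 0,
    }


def run_autoviz(csv_path, target="MV"):
    # Imported lazily so this sketch loads even where autoviz is not installed.
    from autoviz.AutoViz_Class import AutoViz_Class

    return AutoViz_Class().AutoViz(**autoviz_kwargs(csv_path, target))


# Example call (assumes a local file named boston.csv exists):
# run_autoviz("boston.csv", target="MV")
```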

Scatter Plot of each Continuous Variable against Target Variable

Scatter Plots can show positive or negative trends and linear or non-linear relationships between a numeric variable and the target variable.

From the Scatter Plots that AutoViz shows, we can see that RM (average number of rooms per dwelling) has a nearly linear, positive relationship with Median Value (the target variable).
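
One way to put a number on "nearly linear" is to fit a least-squares line and look at R^2. The sketch below does this on synthetic stand-in data, not the Boston values; the variable names and the chosen slope and noise level are assumptions made purely for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept, plus R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Synthetic stand-in data, NOT the Boston values: a linear trend plus noise.
rng = random.Random(0)
rooms = [rng.uniform(4.0, 8.5) for _ in range(200)]
value = [9.0 * x - 34.0 + rng.gauss(0.0, 2.0) for x in rooms]

slope, intercept, r_squared = fit_line(rooms, value)
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r_squared:.3f}")
```

An R^2 close to 1 means a straight line explains most of the variation; a curved relationship would leave a visible pattern in the residuals even when R^2 is high.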

Plot Image


Pairwise Scatter Plot of each Continuous Variable against other Continuous Variables

Pairwise Scatter Plots can show positive or negative interactions and linear or non-linear relationships between a numeric variable and another numeric variable.
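
A full pairwise grid contains one panel per unordered pair of variables, so the number of panels grows quadratically with the number of columns. The short sketch below enumerates the pairs for a handful of the columns named in this report; the selection of columns is an arbitrary illustration.

```python
from itertools import combinations

# A few of the continuous columns mentioned in this report.
columns = ["CRIM", "RM", "AGE", "LSTAT"]

# Each unordered pair gets one scatter plot: n * (n - 1) / 2 plots in total.
pairs = list(combinations(columns, 2))
for x_name, y_name in pairs:
    print(f"{x_name} vs {y_name}")
```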

Plot Image


Histogram Plots of all Continuous Variables

A Histogram shows the distribution of variables individually.

From the Histograms that AutoViz shows, we can see that the distributions of some variables, such as AGE and LSTAT, are skewed and may benefit from a transformation such as a Log Transformation...
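
Skewness can also be measured directly. The sketch below computes the third standardized moment before and after a log transformation on a synthetic right-skewed sample (not the Boston data). Note that a log transformation compresses a long right tail; it does not help a variable whose tail is on the left.

```python
import math
import random


def skewness(values):
    """Population skewness: the third standardized moment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Synthetic right-skewed sample, NOT the Boston data.
rng = random.Random(0)
raw = [rng.lognormvariate(0.0, 0.8) for _ in range(1000)]

# All values are strictly positive here, so a plain log is safe.
logged = [math.log(v) for v in raw]

print(f"skewness before: {skewness(raw):.2f}")
print(f"skewness after:  {skewness(logged):.2f}")
```

For columns that contain zeros, math.log1p is a common substitute for math.log.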

Plot Image


Violin Plots of all Continuous Variables

A Violin Plot combines a box plot with a Kernel Density estimate of the distribution of a Continuous Variable.

From the Violin Plots that AutoViz shows, we can see that the CRIM (crime rate) variable is close to zero for most observations, but there are some large outliers...
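
A quick numeric check for outliers of this kind is the 1.5 x IQR rule used by box plots. This is a swapped-in technique for illustration, not necessarily what AutoViz does internally, and the values below are synthetic rather than the real CRIM column.

```python
import statistics

# Synthetic values, NOT the CRIM column: mostly near zero, a few very large.
mostly_small = [0.01 * i for i in range(1, 96)]
large = [20.0, 35.0, 50.0, 70.0, 89.0]
values = mostly_small + large

# Quartiles and the interquartile range (IQR).
q1, _median, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Box-plot convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

print(f"upper fence: {upper_fence:.3f}")
print(f"outliers: {outliers}")
```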

Plot Image
Plot Image


Heatmap of all Continuous Variables for target Variable

Heatmaps help us visualise the correlation between every pair of variables in the dataset.

From this, we can see that the two variables with the strongest correlation to MV are LSTAT (% lower status of the population), which is negatively correlated, and RM (average number of rooms per dwelling), which is positively correlated.
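
The numbers behind such a heatmap are Pearson correlation coefficients. The sketch below ranks a few columns by the absolute value of their correlation with a target, which is how the "strongest" variables can be read off regardless of sign. The column names and values are made up for illustration and are not taken from the Boston data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative data, NOT the Boston values.
target = [10, 12, 15, 19, 24, 30]
features = {
    "feature_a": [9, 8, 6.5, 5, 3, 1],  # falls as the target rises
    "feature_b": [1, 3, 2, 5, 4, 6],    # rises with the target, with noise
    "feature_c": [3, 1, 4, 1, 5, 2],    # little relationship
}

# Sort by absolute correlation so strong negative links rank alongside positive ones.
ranking = sorted(
    ((name, pearson(col, target)) for name, col in features.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)
for name, r in ranking:
    print(f"{name}: r = {r:+.2f}")
```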

Plot Image


Bar Plots of Average of each Continuous Variable by Target Variable

Bar Plots are used to compare and summarize numeric data by different groups or categories in a data set.

From the Bar Plots of Average MV against CHAS (1 if the tract bounds a river; 0 otherwise), we can see that average MV differs between the two CHAS groups, which suggests that CHAS is a useful predictor of Median Value (the Target Variable).
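
Each bar in such a plot is simply the mean of the target within one group. A minimal sketch of that computation follows; the rows are made up for illustration and are not taken from the Boston data.

```python
from collections import defaultdict

# Made-up (group flag, target value) rows, NOT the Boston data.
rows = [(0, 20.0), (0, 22.0), (1, 30.0), (0, 24.0), (1, 26.0)]

# Accumulate the sum and count of the target per group.
totals = defaultdict(float)
counts = defaultdict(int)
for flag, value in rows:
    totals[flag] += value
    counts[flag] += 1

# One bar per group: the average target value in that group.
group_means = {flag: totals[flag] / counts[flag] for flag in totals}
for flag in sorted(group_means):
    print(f"group {flag}: mean = {group_means[flag]:.1f} (n = {counts[flag]})")
```

A difference in means is only suggestive on its own; the group sizes and the spread within each group determine how much weight it deserves.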

Plot Image
Plot Image


Time Series Plots of Two Continuous Variables against a Date/Time Variable

Time Series Plots are used to track changes in continuous variables over a date/time variable in a data set.

There are no Time Series Plots in Boston Housing since there is no Date/Time variable in this data set.
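
Whether a column counts as a Date/Time variable can be checked by trying to parse it. The sketch below is a hypothetical check, not AutoViz's own detection logic; the function name and the single assumed date format are illustrative choices.

```python
from datetime import datetime


def looks_like_dates(values, fmt="%Y-%m-%d"):
    """Return True if every non-empty string in the column parses with fmt."""
    parsed_any = False
    for text in values:
        if not text:
            continue  # skip missing entries
        try:
            datetime.strptime(text, fmt)
        except ValueError:
            return False
        parsed_any = True
    return parsed_any


print(looks_like_dates(["2020-01-31", "2020-02-29"]))  # date-like strings
print(looks_like_dates(["6.575", "6.421"]))            # plain numeric strings
```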

