Dataset: http://archive.ics.uci.edu/ml/datasets/Communities+and+Crime
This project uses the R programming language to predict "Violent Crimes per Population" with PCA followed by linear regression, evaluated with 10-fold cross-validation (k = 10).
The Crime dataset contains 127 attributes, with ViolentCrimesPerPop as the target attribute. The analysis is carried out in the following steps:
- Step 1: Read the dataset.
- Step 2: Clean the dataset.
- Step 3: Perform PCA on the predictors.
- Step 4: Apply linear regression with 10-fold cross-validation (k = 10) using glmnet.
- Step 5: Calculate MSE
  - MSE on test data: 0.05366028
  - MSE on train data: 0.05791931
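
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script. It runs on synthetic data rather than the Communities and Crime file, and it uses base R's `prcomp` and `lm` with a hand-written 10-fold loop instead of glmnet so that it needs no extra packages. The row-dropping cleaning rule, the 80/20 train/test split, and the choice of five components (`n_comp`) are all assumptions made for the example.

```r
# Sketch of the five steps on SYNTHETIC data (not the Communities and Crime data).
set.seed(42)

# Step 1: "read" the dataset. Synthetic stand-in: 200 rows, 10 numeric predictors.
n <- 200
p <- 10
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
colnames(X) <- paste0("x", seq_len(p))
y <- as.numeric(X %*% runif(p) + rnorm(n, sd = 0.5))
crime <- data.frame(X, ViolentCrimesPerPop = y)
crime[sample(n, 5), "x1"] <- NA  # inject missing values so there is something to clean

# Step 2: clean the dataset. Assumption: simply drop rows with missing values.
crime <- na.omit(crime)

# Hold out 20% of the rows as test data (assumed split).
test_idx <- sample(nrow(crime), size = floor(0.2 * nrow(crime)))
train <- crime[-test_idx, ]
test  <- crime[test_idx, ]

# Step 3: PCA, fitted on the training predictors only, then applied to test data.
predictors <- setdiff(names(crime), "ViolentCrimesPerPop")
pca <- prcomp(train[, predictors], center = TRUE, scale. = TRUE)
n_comp <- 5  # hypothetical number of components to keep
train_pc <- data.frame(pca$x[, 1:n_comp],
                       ViolentCrimesPerPop = train$ViolentCrimesPerPop)
test_pc <- data.frame(predict(pca, newdata = test[, predictors])[, 1:n_comp, drop = FALSE],
                      ViolentCrimesPerPop = test$ViolentCrimesPerPop)

# Step 4: linear regression with 10-fold cross-validation on the training data.
k <- 10
folds <- sample(rep(seq_len(k), length.out = nrow(train_pc)))
cv_mse <- sapply(seq_len(k), function(i) {
  fit  <- lm(ViolentCrimesPerPop ~ ., data = train_pc[folds != i, ])
  held <- train_pc[folds == i, ]
  mean((held$ViolentCrimesPerPop - predict(fit, newdata = held))^2)
})

# Step 5: refit on all training rows and compute MSE on train and test data.
final_fit <- lm(ViolentCrimesPerPop ~ ., data = train_pc)
mse_train <- mean((train_pc$ViolentCrimesPerPop - predict(final_fit, newdata = train_pc))^2)
mse_test  <- mean((test_pc$ViolentCrimesPerPop - predict(final_fit, newdata = test_pc))^2)

cat("Mean 10-fold CV MSE:", mean(cv_mse), "\n")
cat("MSE on train data:  ", mse_train, "\n")
cat("MSE on test data:   ", mse_test, "\n")
```

Because the data here are synthetic, the printed values will not match the MSE figures reported above. With glmnet installed, the manual fold loop could presumably be replaced by `cv.glmnet(x, y, nfolds = 10)`, which fits a penalized linear model and selects the penalty by cross-validation.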