Analysis on Crime Dataset.

Dataset : http://archive.ics.uci.edu/ml/datasets/Communities+and+Crime

This analysis uses the R programming language to apply principal component analysis (PCA) followed by linear regression with 10-fold cross-validation (k = 10) to predict "Violent Crimes per Population".

The Crime dataset contains 127 attributes, with ViolentCrimesPerPop as the target attribute. The analysis is carried out in the following steps:

  • Step 1: Read the dataset.
  • Step 2: Clean the dataset.
  • Step 3: Perform PCA.
  • Step 4: Apply linear regression with 10-fold cross-validation (k = 10) using glmnet.
  • Step 5: Calculate MSE
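
The five steps above can be sketched in R roughly as follows. This is a minimal illustration, not the script from this repository. Several things are assumptions: the local file name `communities.data`, the use of "?" as the missing-value marker, five leading identifier columns, the 80/20 train/test split, the 95% variance cutoff for PCA, and the helper names `read_crime`, `clean_crime` and `fit_pca_cv`. It also assumes Step 4 refers to `cv.glmnet` from the glmnet package, which by default fits a lasso-penalized linear model rather than plain least squares. The usage lines at the end run on simulated data so the sketch works without the dataset.

```r
library(glmnet)  # assumption: Step 4 refers to the glmnet package

set.seed(42)

# Step 1: read the dataset. File name and "?" missing marker are assumptions
# about a local copy of the UCI file, which has no header row.
read_crime <- function(path = "communities.data") {
  read.csv(path, header = FALSE, na.strings = "?")
}

# Step 2: clean. Drop leading identifier columns (assumed to be 5), keep
# numeric columns only, and drop any column that still has missing values.
clean_crime <- function(df, n_id_cols = 5) {
  if (n_id_cols > 0) df <- df[, -seq_len(n_id_cols), drop = FALSE]
  df <- df[, sapply(df, is.numeric), drop = FALSE]
  df[, colSums(is.na(df)) == 0, drop = FALSE]
}

# Steps 3 to 5. The target is assumed to be the last column.
fit_pca_cv <- function(df, train_frac = 0.8, var_kept = 0.95) {
  x <- as.matrix(df[, -ncol(df), drop = FALSE])
  y <- df[[ncol(df)]]
  train_idx <- sample(nrow(x), floor(train_frac * nrow(x)))

  # Step 3: PCA fitted on the training rows only, then applied to test rows.
  pca <- prcomp(x[train_idx, , drop = FALSE], center = TRUE, scale. = TRUE)
  cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
  n_comp <- max(2, which(cum_var >= var_kept)[1])  # glmnet needs >= 2 columns
  z_train <- pca$x[, 1:n_comp, drop = FALSE]
  z_test <- predict(pca, x[-train_idx, , drop = FALSE])[, 1:n_comp, drop = FALSE]

  # Step 4: penalized linear regression with 10-fold cross-validation.
  cv_fit <- cv.glmnet(z_train, y[train_idx], nfolds = 10)

  # Step 5: mean squared error on the training and test rows.
  mse <- function(actual, predicted) mean((actual - predicted)^2)
  list(
    n_components = n_comp,
    mse_train = mse(y[train_idx], predict(cv_fit, newx = z_train, s = "lambda.min")),
    mse_test = mse(y[-train_idx], predict(cv_fit, newx = z_test, s = "lambda.min"))
  )
}

# Usage on simulated data (hypothetical; replace with read_crime() output).
demo <- as.data.frame(matrix(rnorm(200 * 10), nrow = 200))
demo$target <- 0.5 * demo$V1 - 0.3 * demo$V2 + rnorm(200, sd = 0.1)
res <- fit_pca_cv(clean_crime(demo, n_id_cols = 0))
print(res)
```

With a local copy of the data, the call would look like `fit_pca_cv(clean_crime(read_crime()))`. Fitting the PCA on the training rows only is a design choice in this sketch; it keeps information from the test rows out of the components.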

Results

  • MSE on test data: 0.05366028
  • MSE on train data: 0.05791931

The Weka tool was also used to compare different analysis methods.