- The problem: given a set of features that describe a house in Boston, our machine learning model must predict the house price.
- To train the model, we use the Boston housing dataset.
- We use the Random Forest Regression algorithm to build the prediction model.
The project includes two datasets: a training dataset and a testing dataset. In both, each row describes a Boston town or suburb. The training dataset has 14 attributes (features) plus a target column (MEDV, the price).
Attributes | Details |
---|---|
ID | ID for each row |
CRIM | per capita crime rate by town |
ZN | proportion of residential land zoned for lots over 25,000 sq. ft. |
INDUS | proportion of non-retail business acres per town |
CHAS | Charles River dummy variable (1 if tract bounds river; else 0) |
NOX | nitric oxides concentration (parts per 10 million) |
RM | average number of rooms per dwelling |
AGE | proportion of owner-occupied units built prior to 1940 |
DIS | weighted distances to five Boston employment centres |
RAD | index of accessibility to radial highways |
TAX | full-value property-tax rate per $10,000 |
PTRATIO | pupil-teacher ratio by town |
BLACK | 1000(Bk - 0.63)^2 where Bk is the proportion of Black residents by town |
LSTAT | percentage of the population classed as lower status |
MEDV | median value of owner-occupied homes in $1000s (the target) |
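
The setup described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn and pandas are available, and because the real file names are not given here, it generates placeholder data with the same column names as the table. The helper names (`make_placeholder_data`, `train_model`) and the hyperparameters are hypothetical choices. The `ID` column is left out of the features since it only identifies the row.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Column names taken from the attribute table; ID is excluded from the features.
FEATURES = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
            "DIS", "RAD", "TAX", "PTRATIO", "BLACK", "LSTAT"]
TARGET = "MEDV"


def make_placeholder_data(n_rows=200, seed=0):
    """Build synthetic stand-in data (NOT the real Boston values)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame(rng.random((n_rows, len(FEATURES))), columns=FEATURES)
    df.insert(0, "ID", np.arange(1, n_rows + 1))
    # Made-up relationship so the model has something to learn.
    df[TARGET] = (20 + 10 * df["RM"] - 8 * df["LSTAT"]
                  + rng.normal(0, 0.5, n_rows))
    return df


def train_model(df, seed=0):
    """Fit a random forest on the feature columns and report validation RMSE."""
    X = df[FEATURES]
    y = df[TARGET]
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    rmse = float(np.sqrt(mean_squared_error(y_val, model.predict(X_val))))
    return model, rmse


df = make_placeholder_data()
model, rmse = train_model(df)
print(f"Validation RMSE on placeholder data: {rmse:.2f}")
```

With the real training dataset, `make_placeholder_data()` would be replaced by loading the file into a DataFrame (for example with `pd.read_csv`), and the fitted model would then be applied to the feature columns of the testing dataset.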