set up cross validation script
Daniel Berry authored and Daniel Berry committed Nov 21, 2016
1 parent 9ce569e commit 0bbf362
Showing 6 changed files with 68 additions and 15 deletions.
4 changes: 3 additions & 1 deletion auto/paper.el
@@ -12,6 +12,8 @@
(LaTeX-add-labels
"cpresult"
"npresult"
"ppresult"))
"ppresult"
"AICs"
"MSEs"))
:latex)

25 changes: 21 additions & 4 deletions model.r
@@ -339,10 +339,10 @@ model4 <- glmer(desert ~ CTA_counts +
verbose = TRUE,
control = glmerControl(calc.derivs = FALSE, optCtrl=list(maxfun=5000)))

cp_mse <- c(); np_mse <- c(); pp_mse <- c(); mlm_mse <- c();
cp_mses <- c(); np_mses <- c(); pp_mses <- c(); mlm_mses <- c();

for (i in 1:10) {
cv_ind <- runif(nrow(model_data_scale)) < .8
cv_ind <- sample.split(model_data_scale$Neighborhood, SplitRatio = .8)
train <- model_data_scale[cv_ind,]
test <- model_data_scale[!cv_ind,]

@@ -376,12 +376,29 @@ for (i in 1:10) {
(1|Neighborhood),
data = train,
family = 'binomial',
control = glmerControl(calc.derivs = FALSE, optCtrl=list(maxfun=5000)))
control = glmerControl(calc.derivs = FALSE, optCtrl=list(maxfun=1000)))

print(paste('AIC mlm:', AIC(mlm)))

print(paste('EVALUATING: ', i))


cp_pred <- predict(cp, test, allow.new.levels = TRUE, type = 'response')

print(cp_mse <- mean((cp_pred - test$desert)^2))
cp_mses <- c(cp_mses, cp_mse)


np_pred <- predict(np, test, allow.new.levels = TRUE, type = 'response')
print(np_mse <- mean((np_pred - test$desert)^2))
np_mses <- c(np_mses, np_mse)

pp_pred <- predict(pp, test, allow.new.levels = TRUE, type = 'response')
print(pp_mse <- mean((pp_pred - test$desert)^2))
pp_mses <- c(pp_mses, pp_mse)

mlm_pred <- predict(mlm, test, allow.new.levels = TRUE, type = 'response')
print(mlm_mse <- mean((mlm_pred - test$desert)^2, na.rm = TRUE))
mlm_mses <- c(mlm_mses, mlm_mse)

}
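
The loop above only collects one MSE per split into `cp_mses`, `np_mses`, `pp_mses`, and `mlm_mses`. A minimal sketch of how those vectors might be condensed into one row per model for the MSE table follows. The helper name `summarise_cv_mse` and the toy input are hypothetical and not part of this commit.

```r
# Hypothetical helper (not part of the commit): collapse per-split MSEs
# into one mean and one standard deviation per model.
summarise_cv_mse <- function(mse_list) {
  data.frame(
    model    = names(mse_list),
    mean_mse = unname(vapply(mse_list, mean, numeric(1), na.rm = TRUE)),
    sd_mse   = unname(vapply(mse_list, sd, numeric(1), na.rm = TRUE)),
    stringsAsFactors = FALSE
  )
}

# Intended use once the loop has finished:
# summarise_cv_mse(list(`Complete Pooling` = cp_mses,
#                       `No Pooling`       = np_mses,
#                       `Partial Pooling`  = pp_mses,
#                       Hierarchical       = mlm_mses))

# Toy values purely to show the call shape; these are NOT results from the data.
toy <- list(a = c(0.10, 0.20, 0.30), b = c(0.25, 0.25, NA))
toy_summary <- summarise_cv_mse(toy)
print(toy_summary)
```

Passing `na.rm = TRUE` mirrors the handling already used for `mlm_mse` in the loop, so a split with a missing value does not turn a model's average into `NA`.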

4 changes: 4 additions & 0 deletions paper.aux
@@ -11,3 +11,7 @@
\newlabel{npresult}{{2}{5}}
\@writefile{lot}{\contentsline {table}{\numberline {3}{\ignorespaces Partial pooling model summary}}{6}}
\newlabel{ppresult}{{3}{6}}
\@writefile{lot}{\contentsline {table}{\numberline {4}{\ignorespaces Model AICs}}{6}}
\newlabel{AICs}{{4}{6}}
\@writefile{lot}{\contentsline {table}{\numberline {5}{\ignorespaces Model Cross Validated MSEs}}{6}}
\newlabel{MSEs}{{5}{6}}
14 changes: 7 additions & 7 deletions paper.log
@@ -1,4 +1,4 @@
This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) (preloaded format=pdflatex 2016.5.22) 21 NOV 2016 13:56
This is pdfTeX, Version 3.14159265-2.6-1.40.17 (TeX Live 2016) (preloaded format=pdflatex 2016.5.22) 21 NOV 2016 15:12
entering extended mode
restricted \write18 enabled.
file:line:error style messages enabled.
@@ -169,14 +169,14 @@ Underfull \hbox (badness 10000) in paragraph at lines 65--66

[2]

LaTeX Warning: Float too large for page by 408.55894pt on input line 384.
LaTeX Warning: Float too large for page by 408.55894pt on input line 382.

[3] [4] [5] [6] (./paper.aux) )
Here is how much of TeX's memory you used:
1381 strings out of 493014
17385 string characters out of 6133351
97656 words of memory out of 5000000
4960 multiletter control sequences out of 15000+600000
1383 strings out of 493014
17397 string characters out of 6133351
99672 words of memory out of 5000000
4962 multiletter control sequences out of 15000+600000
9369 words of font info for 34 fonts, out of 8000000 for 9000
1141 hyphenation exceptions out of 8191
27i,9n,32p,1165b,268s stack positions out of 5000i,500n,10000p,200000b,80000s
@@ -193,7 +193,7 @@ nts/cm/cmr7.pfb></usr/local/texlive/2016/texmf-dist/fonts/type1/public/amsfonts
/cm/cmsy10.pfb></usr/local/texlive/2016/texmf-dist/fonts/type1/public/amsfonts/
cm/cmsy7.pfb></usr/local/texlive/2016/texmf-dist/fonts/type1/public/amsfonts/cm
/cmti10.pfb>
Output written on paper.pdf (8 pages, 148983 bytes).
Output written on paper.pdf (8 pages, 150024 bytes).
PDF statistics:
79 PDF objects out of 1000 (max. 8388607)
56 compressed objects within 1 object stream
Binary file modified paper.pdf
Binary file not shown.
36 changes: 33 additions & 3 deletions paper.tex
@@ -116,9 +116,7 @@ \subsubsection*{Hierarchical}

\subsection*{Model Comparison}

Models were compared using AIC and the cross-validated Brier score (the mean squared error of the predicted probabilities in the case of 2-class logistic regression).


Models were compared using AIC and the cross-validated Brier score (the mean squared error of the predicted probabilities in the case of 2-class logistic regression). Lower AIC values indicate a better balance of fit and complexity, so AIC can be used to compare the models to each other. The cross validation was repeated 10 times: in each repetition the model was fit on a random 80\% of the data and evaluated on the remaining 20\%. This quantifies the predictive ability of the model on new, unseen data.
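
The score named in this paragraph can be written out explicitly. A sketch, assuming $N$ held-out observations with binary outcomes $y_i \in \{0, 1\}$ and predicted probabilities $\hat{p}_i$ (these symbols are introduced here for illustration and do not appear in the original text):

```latex
\[
  \mathrm{BS} = \frac{1}{N} \sum_{i=1}^{N} \left( \hat{p}_i - y_i \right)^2
\]
```

This matches what `mean((cp_pred - test$desert)^2)` computes for each split in model.r; a score of 0 would mean every predicted probability equals the observed outcome.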

\section*{Results}

@@ -443,8 +441,40 @@ \subsubsection*{Partial Pooling}

\subsubsection*{Hierarchical}

\subsection*{Model Comparison}

We can see from Table~\ref{AICs} that the No Pooling model has the lowest AIC, which is to be expected because, in a certain sense, this model has the most flexibility. The intercept term for each neighborhood is estimated from only the observations in that neighborhood and is not ``shrunk'' toward any sort of common mean.

As we can see from table \ref{MSEs}, %TODO: write section on which MSEs are the smallest.

\begin{table}[!htbp] \centering
\caption{Model AICs}
\label{AICs}
\begin{tabular}[c]{c|c}
\\ Model & AIC \\
\hline \\
Complete Pooling & 0 \\
No Pooling & 0 \\
Partial Pooling & 0 \\
Hierarchical & 0 \\
\end{tabular}
\end{table}

\begin{table}[!htbp] \centering
\caption{Model Cross Validated MSEs}
\label{MSEs}
\begin{tabular}[c]{c|c}
\\ Model & MSE \\
\hline \\
Complete Pooling & 0 \\
No Pooling & 0 \\
Partial Pooling & 0 \\
Hierarchical & 0 \\
\end{tabular}
\end{table}

\section*{Conclusions}



\end{document}
