Add learning curves to presentation
Witiko committed Sep 3, 2022
1 parent dde26cc commit 7ef8eaf
Showing 3 changed files with 50 additions and 6 deletions.
paper/markdownthemewitiko_beamer_MU.sty (18 additions, 1 deletion)
@@ -132,13 +132,30 @@
 % Headings
 \markdownSetupSnippet{headingOne/empty}{
   renderers = {
-    headingOne = {\vspace*{0.5cm}},
+    headingOne = {},
   },
 }
 \markdownSetupSnippet{headingTwo/empty}{
   renderers = {
     headingTwo = {},
   },
 }
+
+\mode
+<presentation>
+
+% Headings
+\markdownSetupSnippet{headingOne/empty}{
+  renderers = {
+    headingOne = {\vspace*{0.5cm}},
+  },
+}
+\markdownSetupSnippet{headingTwo/empty}{
+  renderers = {
+    headingTwo = {},
+  },
+}
 
 \markdownSetup{
   renderers = {
     headingOne = {%
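
For context, the `\mode<presentation>` guard added above makes the second pair of snippet definitions take effect only in beamer's presentation mode, so the `\vspace*{0.5cm}` above first-level headings appears on slides but not in other modes (e.g., a beamerarticle build). A minimal sketch of how such a named snippet is defined and then activated by name; the file `example.md` is hypothetical and the sketch is not part of this commit:

    % Minimal sketch of the snippet mechanism of the markdown package;
    % assumes the markdown package is installed and example.md exists.
    \documentclass{beamer}
    \usepackage{markdown}
    \markdownSetupSnippet{headingOne/empty}{
      renderers = {
        % Render level-one headings as vertical space, discarding their text.
        headingOne = {\vspace*{0.5cm}},
      },
    }
    \begin{document}
    \begin{frame}[fragile]
      % The snippet applies only to inputs that request it by name.
      \markdownInput[snippet=headingOne/empty]{example.md}
    \end{frame}
    \end{document}
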
paper/presentation.md (22 additions, 5 deletions)
@@ -161,11 +161,8 @@ In our experiments, we also used two different types of language models:
 To model text, we used a pre-trained `roberta-base` model [@liu2019roberta].
 
 To model text and math in the LaTeX format, we replaced the tokenizer of
-`roberta-base` with our text + LaTeX tokenizer, we randomly initialized
-weights for the new tokens, and we fine-tuned our model on our text + LaTeX
-dataset for one epoch using the autoregressive masked language modeling
-objective. We called our model MathBERTa and we released it to the Hugging
-Face Model Hub.
+`roberta-base` with our text + LaTeX tokenizer and we randomly initialized
+weights for the new tokens.
 
 * * *

@@ -183,6 +180,26 @@ Deep transformer models
 - Pre-trained `roberta-base` model for text
 - Fine-tuned MathBERTa model for text + LaTeX
 
+## Fine-tuned MathBERTa model for text + LaTeX {#learning-curves}
+
+Then, we fine-tuned our model on our text + LaTeX dataset for one epoch using
+the autoregressive masked language modeling objective. The figure shows the
+learning curves of our model on our in-domain text + LaTeX dataset and also
+on the out-of-domain dataset of the European constitution. The ongoing descent
+of the in-domain validation loss indicates that the performance of the model
+improved over time but has not yet converged and would benefit from further
+training. The ongoing descent of the out-of-domain validation loss shows that
+improvements on scientific texts do not come at the price of performance on
+non-scientific domains.
+
+We called our model MathBERTa and we released it to the Hugging Face Model Hub.
+Besides our work, MathBERTa has already been used in the systems of the MIRMU
+team and for the automatic evaluation of Task 3.
+
+* * *
+
+/learning-curves.pdf
+
 ## Token Similarity {#token-similarity}
 
 To determine the similarity of text and math tokens, we first extracted their
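
For reference, the masked language modeling objective mentioned on the added slide has the following standard form; this is the textbook formulation, not an excerpt from the commit:

    % Masked language modeling loss: M is the set of masked positions and
    % \tilde{x} is the input sequence with those positions replaced by masks.
    \[
      \mathcal{L}_{\mathrm{MLM}}(\theta)
        = -\sum_{i \in M} \log p_\theta\bigl(x_i \mid \tilde{x}\bigr)
    \]

Minimizing this loss for one epoch over the text + LaTeX dataset is the fine-tuning step the slide describes; the randomly initialized embeddings of the new tokens receive their training signal through the same objective.
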
paper/presentation.tex (10 additions, 0 deletions)
@@ -4,6 +4,7 @@
 %% Markup
 \markdownSetupSnippet{horizontalRule/singleFrame}{snippet=witiko/beamer/MU/horizontalRule/singleFrame}
 \markdownSetupSnippet{headingOne/empty}{snippet=witiko/beamer/MU/headingOne/empty}
+\markdownSetupSnippet{headingTwo/empty}{snippet=witiko/beamer/MU/headingTwo/empty}
 
 %% Title page
 \maketitle
@@ -14,6 +15,7 @@
 %% Markup
 \markdownSetupSnippet{horizontalRule/frameBreak}{snippet=witiko/beamer/MU/horizontalRule/frameBreak}
 \markdownSetupSnippet{headingOne/empty}{snippet=witiko/beamer/MU/headingOne/empty}
+\markdownSetupSnippet{headingTwo/empty}{snippet=witiko/beamer/MU/headingTwo/empty}
 \markdownSetupSnippet{headingTwo/several}{snippet=witiko/beamer/MU/headingTwo/several}
 
 %% Title page
@@ -44,6 +46,7 @@
 \markdownInput[slice=datasets]{presentation.md}
 \markdownInput[slice=tokenization, snippet=horizontalRule/singleFrame]{presentation.md}
 \markdownInput[slice=language-modeling]{presentation.md}
+\markdownInput[slice=learning-curves, snippet=headingTwo/empty]{presentation.md}
 \markdownInput[slice=token-similarity, snippet=horizontalRule/singleFrame]{presentation.md}
 \markdownInput[slice=soft-vector-space-modeling]{presentation.md}
@@ -76,6 +79,13 @@ \subsection{Tokenization and Language Modeling}
 \markdownInput[slice=language-modeling, snippet=headingTwo/several]{presentation.md}
 \end{frame}
+\begingroup
+\setbeamertemplate{footline}{}
+\begin{frame}[fragile]{Fine-tuned MathBERTa model for text + LaTeX}
+\markdownInput[slice=learning-curves, snippet=headingTwo/empty]{presentation.md}
+\end{frame}
+\endgroup
 \subsection{Token Similarity and Soft Vector Space Modeling}
 \begingroup
 \setbeamertemplate{footline}{}
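
The `\begingroup ... \endgroup` pair around the added frame scopes the empty footline template to the learning-curves slide alone, so the footer reappears on later slides. The same pattern in isolation, as a sketch that assumes an ordinary beamer document (the figure file name follows the commit):

    \begingroup
    \setbeamertemplate{footline}{}  % suppress the footline locally
    \begin{frame}
      \centering
      % Leave the whole frame to the figure, with no footer competing for space.
      \includegraphics[width=\textwidth]{learning-curves.pdf}
    \end{frame}
    \endgroup  % the previous footline template is restored here
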
