Hi, I failed to reproduce the Llama2-7b-4k (w/o SFT) results from the paper.
Here are our results:
| Methods | Tokens | Coursera | GSM | QuALITY | TOEFL | CodeU | SFiction | Avg |
|---|---|---|---|---|---|---|---|---|
| (L-Eval) Llama2-7b-4k (w/o SFT) | 4k | 20.05 | 2.0 | 28.71 | 24.53 | 0.00 | 40.62 | 19.31 |
| (Ours) Llama2-7b-4k (w/o SFT) | 4k | 15.26 | 19.0 | 30.69 | 13.01 | 3.33 | 35.93 | 19.54 |
Here is our experimental setting:
We modified the llama2-chat-test.py file, disabled the NTK parameters, and used Llama2-7b for the evaluation.
And we run it like this:

```bash
python3 Baselines/llama2-chat-test.py \
    --scale 7b \
    --max_length 4k \
    --metric exam_eval
```
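In case it helps, this is roughly how we load the base model with NTK scaling disabled. It is a minimal sketch using the standard Hugging Face transformers API rather than the exact diff to llama2-chat-test.py, and the model path is just an example:

```python
import torch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # example path; base model, not the chat variant

# Use the stock config: no RoPE/NTK scaling, native 4k context window.
config = AutoConfig.from_pretrained(model_name)
config.rope_scaling = None              # disable NTK/dynamic RoPE scaling
config.max_position_embeddings = 4096   # Llama-2's native context length

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    config=config,
    torch_dtype=torch.float16,
    device_map="auto",
)
```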
What could be the reason for this discrepancy? Should I adjust the prompt or other parameters?