How to set the model's inference parameters, such as top_k? How to set the generation config of the model, like the temperature. #329
-
For fair evaluation, the generation configs (https://huggingface.co/docs/transformers/main_classes/text_generation#transformers.GenerationConfig) should be the same. I tried to find the setting, but setting it directly in the model configuration fails. I noticed that there is a huggingface.py file at opencompass/models, in which the generation function is called with some kwargs, but the specific parameters are unknown.
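For reference, the linked `GenerationConfig` API can be driven directly in plain transformers code. A minimal sketch (the gpt2 checkpoint and prompt are just placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig

# Any causal LM checkpoint works here; gpt2 is only a small placeholder.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Explicit generation config: sampling must be enabled (do_sample=True)
# for temperature/top_p/top_k to have any effect.
gen_cfg = GenerationConfig(do_sample=True, temperature=0.7, top_p=0.85, top_k=40)

inputs = tokenizer("The capital of France is", return_tensors="pt")
output_ids = model.generate(**inputs, generation_config=gen_cfg, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```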
-
Note that by default HF uses greedy decoding if no extra kwargs are passed to the generation function, so the evaluation should be fair. In case you do want to customize the generation parameters, you may modify the `HuggingFace` class and add `generation_kwargs` to its `__init__` function, which saves the kwargs on construction and then uses them in `generate()`. With such a modification, you can set these parameters in the model configuration the same way `model_kwargs` or `tokenizer_kwargs` are set.
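A minimal standalone sketch of that idea (this is not the actual OpenCompass class; the wrapper name and `generate()` signature here are hypothetical):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

class HuggingFaceWithGenKwargs:
    """Hypothetical wrapper: save generation kwargs on construction,
    then reuse them on every generate() call."""

    def __init__(self, path, generation_kwargs=None, **model_kwargs):
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForCausalLM.from_pretrained(path, **model_kwargs)
        self.generation_kwargs = generation_kwargs or {}  # saved once here

    def generate(self, prompt, max_out_len=512):
        inputs = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
        output_ids = self.model.generate(
            **inputs,
            max_new_tokens=max_out_len,
            **self.generation_kwargs,  # e.g. do_sample=True, temperature=0.7
        )
        return self.tokenizer.decode(output_ids[0], skip_special_tokens=True)
```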
-
For now I simply add the generation configs into `model_kwargs`. I don't know whether this is right. The args in `model_kwargs` are used this way (from the `_load_model` function in the `HuggingFaceCausalLM` class): `self.model = AutoModelForCausalLM.from_pretrained(path, **model_kwargs)`. HuggingFace would then set the generation config automatically; at least no errors are raised this way.
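One way to check whether this actually works: kwargs that `from_pretrained` does not consume are typically forwarded to the model config, which is why no error is raised; whether they reach decoding is another matter. A quick verification sketch, assuming a recent transformers release (the path is the checkpoint from the config in the next reply):

```python
from transformers import AutoModelForCausalLM

# Same checkpoint as in the config below; substitute your own path.
path = "/data/share_user/cls/models/Baichuan-13B-Base"

model = AutoModelForCausalLM.from_pretrained(
    path,
    trust_remote_code=True,
    temperature=0.7,
    top_p=0.85,
    top_k=40,
)

# If from_pretrained absorbed the extra kwargs into the model config,
# they show up here; otherwise generate() never sees them. Note that
# greedy decoding ignores temperature/top_p/top_k unless do_sample=True.
print(model.generation_config)
```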
-
The config setting is below. I wonder whether it actually sets the generation config as I expect (at least it doesn't raise any error):

```python
models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='baichuan-13b-base',
        path="/data/share_user/cls/models/Baichuan-13B-Base",
        tokenizer_path='/data/share_user/cls/models/Baichuan-13B-Base',
        tokenizer_kwargs=dict(padding_side='left',
                              truncation_side='left',
                              trust_remote_code=True,
                              use_fast=False),
        max_out_len=512,
        max_seq_len=2048,
        batch_size=1,
        model_kwargs=dict(device_map='auto',
                          trust_remote_code=True,
                          revision='77d74f449c4b2882eac9d061b5a0c4b7c1936898',
                          temperature=0.7,
                          top_p=0.85,
                          top_k=40,
                          num_beams=1,
                          repetition_penalty=1.2),
        run_cfg=dict(num_gpus=1),
    )
]
```
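If the `HuggingFace` class were modified as suggested in the first reply, the same intent could be expressed with a separate `generation_kwargs` entry instead of overloading `model_kwargs` (note this field is hypothetical until that modification exists):

```python
models = [
    dict(
        type=HuggingFaceCausalLM,
        abbr='baichuan-13b-base',
        path="/data/share_user/cls/models/Baichuan-13B-Base",
        tokenizer_path='/data/share_user/cls/models/Baichuan-13B-Base',
        tokenizer_kwargs=dict(padding_side='left',
                              truncation_side='left',
                              trust_remote_code=True,
                              use_fast=False),
        max_out_len=512,
        max_seq_len=2048,
        batch_size=1,
        # Model loading only; no sampling parameters here.
        model_kwargs=dict(device_map='auto',
                          trust_remote_code=True,
                          revision='77d74f449c4b2882eac9d061b5a0c4b7c1936898'),
        # Hypothetical field: requires the __init__/generate() change above.
        # do_sample=True is needed for temperature/top_p/top_k to take effect.
        generation_kwargs=dict(do_sample=True,
                               temperature=0.7,
                               top_p=0.85,
                               top_k=40,
                               num_beams=1,
                               repetition_penalty=1.2),
        run_cfg=dict(num_gpus=1),
    )
]
```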
-
The models perform poorly with the default generation settings. If these settings could be customized, it would be enlightening for how to better use the models.