Carry out a local job (local training / local inference / local interaction) with a configuration file. You can specify which GPUs to use via `export CUDA_VISIBLE_DEVICES=XXX` in `./scripts/local/job.sh`. You can also specify other environment variables in that script.
```shell
./scripts/local/job.sh ${JOB_CONF}
```

- An example training configuration file is `./package/dialog_en/24L_train.conf`. It contains three sections: `job`, `task` and `training`.
- An example evaluation configuration file is `./package/dialog_en/24L_evaluate.conf`. It contains three sections: `job`, `task` and `evaluation`.
- An example inference configuration file is `./package/dialog_en/24L_infer.conf`. It contains three sections: `job`, `task` and `inference`.
- An example interaction configuration file is `./package/dialog_en/24L_interact.conf`. It contains three sections: `job`, `task` and `interaction`.
- An example self-chat configuration file is `./package/dialog_en/self_chat.conf`. It contains three sections: `job`, `task` and `self-chat`.
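Putting the sections together, a minimal training configuration might look like the sketch below. The section layout and field names follow the examples above, but the concrete paths and values are illustrative assumptions, not real files from the repository; see `./package/dialog_en/24L_train.conf` for a real example.

```shell
# Hypothetical minimal training config (illustrative values only).

# job section: which script carries out the job
job_script="./scripts/single_gpu/train.sh"

# task section: model, task and data settings
model="UnifiedTransformer"
task="DialogGeneration"
vocab_path="./package/dialog_en/vocab.txt"   # assumed path
config_path="./package/dialog_en/24L.json"   # assumed path
train_file="./data/example/train.tsv"
valid_file="./data/example/valid.tsv"        # assumed path
data_format="raw"
file_format="file"

# training section
lr=1e-5
num_epochs=20
log_dir="./log"
save_path="./output"
```

With such a file saved as, say, `my_train.conf` (hypothetical name), the job is launched with `./scripts/local/job.sh my_train.conf`.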
The `job` section defines:

- `job_script`: the main script of this job.
  - `./scripts/single_gpu/train.sh`: single-GPU training.
  - `./scripts/distributed/train.sh`: distributed GPU training (NCCL required).
  - `./scripts/single_gpu/infer.sh`: single-GPU inference.
  - `./scripts/distributed/infer.sh`: distributed GPU inference on a single machine; it merges the inference results from each GPU.
  - `./scripts/single_gpu/interact.sh`: running interaction with a dialogue model.
The `task` section defines:

- `model`: the model used in the specified task, such as:
  - `UnifiedTransformer`: used in the `DialogGeneration` task.
  - `Plato`: used in the `DialogGeneration` task.
  - `NSPModel`: used in the `NextSentencePrediction` task.
- `task`: the task name, such as:
  - `DialogGeneration`: generate a response for the given context.
  - `NextSentencePrediction`: judge whether a sentence is the next sentence of the given context.
- `vocab_path`: the vocabulary path.
- tokenizer related settings: `spm_model_file` for the SentencePiece tokenizer, and so on.
- `config_path`: the model configuration file.
- dataset file related settings: `train_file` / `valid_file` (for training), `infer_file` (for inference), `data_format` and `file_format`.
  - Choices of `data_format`:
    - `raw`: an untokenized tsv file where each column is a field, example: `./data/example/train.tsv`.
    - `tokenized`: a tokenized tsv file, example: `./data/example/train_tokenized.tsv`, which is generated by `./knover/tools/pre_tokenized.sh`.
    - `numerical`: each line contains numerical data (`token_ids`, `type_ids` and `pos_ids`), example: `./data/example/train.numerical.tsv`, which is generated by `./knover/tools/pre_numericalize.sh`.
  - Choices of `file_format`:
    - `file`: a single file.
    - `filelist`: a file list where each line is a data file, example: `./data/example/train_filelist`.
  - Files with a `.gz` suffix, compressed by the `gzip` command, are also supported.
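To make these formats concrete, the sketch below builds a tiny raw-format tsv file, a matching filelist, and a gzip-compressed copy. The file names are made up for illustration; only the tsv layout (one sample per line, tab-separated fields) and the `.gz` support come from the description above.

```shell
# Build a tiny raw-format dataset: each line is a sample, each tab-separated
# column is a field (e.g. context and response).
printf 'hello there\thi , nice to meet you\n' > tiny_train.tsv
printf 'how are you ?\ti am fine .\n' >> tiny_train.tsv

# A filelist: one data file per line (use with file_format="filelist").
echo "tiny_train.tsv" > tiny_train_filelist

# A gzip-compressed copy (-k keeps the original file).
gzip -kf tiny_train.tsv
```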
The `training` section defines training related settings:

- `init_params`: initialized parameters.
- `init_checkpoint`: an initialized checkpoint (containing not only the model parameters but also the persistables of the optimizer). When you continue training from step 1000, you can also set `train_args="--start_step 1000"` for better log display, but this is not necessary.
- `batch_size`, `lr`, `num_epochs` and so on.
- `log_dir`: the output path of training logs, including the log file (`${log_dir}/workerlog.${DEV_ID}`) of each GPU trainer.
- `save_path`: the output path of saved parameters.
- You can define other arguments in the training configuration, such as:

  ```shell
  train_args="--max_src_len 384 --max_seq_len 512"
  ```

- You can find more arguments in `knover/tasks/${TASK_NAME}.py` and `knover/models/${MODEL_NAME}.py`.
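For instance, a `training` section that resumes from a saved checkpoint might look like this config fragment. All paths and values here are illustrative assumptions, not recommended settings:

```shell
# Hypothetical training section resuming from an earlier run.
init_checkpoint="./output/step_1000"   # assumed checkpoint directory
batch_size=8192
lr=1e-5
num_epochs=20
log_dir="./log"
save_path="./output"
# extra arguments forwarded to the training script
train_args="--max_src_len 384 --max_seq_len 512 --start_step 1000"
```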
The `inference` section defines inference related settings:

- `init_params`: initialized parameters.
- `nsp_init_params`: initialized NSP model parameters, used to re-rank candidate responses.
- `batch_size`, `in_tokens` and so on.
- `output_name`: the name of the output field.
- `save_path`: the output path of inference results.
- You can define other arguments in the inference configuration, such as:
  - re-rank by NSP score:

    ```shell
    infer_args="--ranking_score nsp_score"  # re-rank candidate responses by scores given by the NSP model
    ```

  - top-k sampling and re-rank:

    ```shell
    infer_args="--decoding_strategy topk_sampling --num_samples 20 --topk 10 --length_average true"
    ```

  - top-p sampling and re-rank:

    ```shell
    infer_args="--decoding_strategy topp_sampling --num_samples 20 --topp 0.9 --length_average true"
    ```

  - beam search:

    ```shell
    infer_args="--decoding_strategy beam_search --beam_size 10 --length_average true"
    ```

- You can find more arguments in `knover/tasks/${TASK_NAME}.py` and `knover/models/${MODEL_NAME}.py`.
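A complete `inference` section might combine a sampling strategy with NSP re-ranking, as in the fragment below. The parameter paths are hypothetical, and the assumption that the sampling and ranking flags can be passed together in one `infer_args` follows the "sampling and re-rank" examples above:

```shell
# Hypothetical inference section: top-k sampling plus NSP re-ranking.
init_params="./models/24L/params"       # assumed dialogue model parameters
nsp_init_params="./models/NSP/params"   # assumed NSP model parameters
output_name="response"
save_path="./output"
infer_args="--decoding_strategy topk_sampling --num_samples 20 --topk 10 --length_average true --ranking_score nsp_score"
```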
The `interaction` section defines interaction related settings:

- `init_params`: initialized parameters.
- `nsp_init_params`: initialized NSP model parameters, used to re-rank candidate responses.
- You can define other arguments in the interaction configuration, such as:

  ```shell
  infer_args="--ranking_score nsp_score"  # re-rank candidate responses by scores given by the NSP model
  ```

- You can find more arguments in `knover/tasks/${TASK_NAME}.py` and `knover/models/${MODEL_NAME}.py`.
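A minimal `interaction` section could therefore be as small as the fragment below; both parameter paths are illustrative assumptions:

```shell
# Hypothetical interaction section with NSP re-ranking enabled.
init_params="./models/PLATO/params"     # assumed dialogue model parameters
nsp_init_params="./models/NSP/params"   # assumed NSP model parameters
infer_args="--ranking_score nsp_score"  # re-rank candidates by NSP score
```

As described at the top of this document, such a configuration is run with `./scripts/local/job.sh ${JOB_CONF}` after exporting `CUDA_VISIBLE_DEVICES` in `./scripts/local/job.sh`.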