SquadForRoBERTaTensorizer.Config
Component: SquadForRoBERTaTensorizer
class SquadForRoBERTaTensorizer.Config
Bases: RoBERTaTensorizer.Config
All Attributes (including base classes)
- is_input: bool = True
- columns: list[str] = ['question', 'doc']
- tokenizer: Tokenizer.Config = GPT2BPETokenizer.Config()
- base_tokenizer: Optional[Tokenizer.Config] = None
- vocab_file: str = 'gpt2_bpe_dict'
- max_seq_len: int = 256
- add_selfie_token: bool = False
- answers_column: str = 'answers'
- answer_starts_column: str = 'answer_starts'
Subclasses
- SquadForRoBERTaTensorizerForKD.Config
Default JSON
{
    "is_input": true,
    "columns": [
        "question",
        "doc"
    ],
    "tokenizer": {
        "GPT2BPETokenizer": {
            "bpe_encoder_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/encoder.json",
            "bpe_vocab_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/vocab.bpe",
            "lowercase": false
        }
    },
    "base_tokenizer": null,
    "vocab_file": "gpt2_bpe_dict",
    "max_seq_len": 256,
    "add_selfie_token": false,
    "answers_column": "answers",
    "answer_starts_column": "answer_starts"
}
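
Example: overriding a default

The defaults above can be adjusted by replacing individual top-level keys. The following is a minimal sketch using only the Python standard library. It does not call PyText itself. The helper name with_overrides and the value 384 are illustrative assumptions, not part of the PyText API, and where this fragment sits inside a full task config is not shown here.

```python
import copy
import json

# Defaults copied from the "Default JSON" listing for
# SquadForRoBERTaTensorizer.Config.
DEFAULT_CONFIG = {
    "is_input": True,
    "columns": ["question", "doc"],
    "tokenizer": {
        "GPT2BPETokenizer": {
            "bpe_encoder_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/encoder.json",
            "bpe_vocab_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/vocab.bpe",
            "lowercase": False,
        }
    },
    "base_tokenizer": None,
    "vocab_file": "gpt2_bpe_dict",
    "max_seq_len": 256,
    "add_selfie_token": False,
    "answers_column": "answers",
    "answer_starts_column": "answer_starts",
}


def with_overrides(defaults, overrides):
    """Hypothetical helper: return a deep copy of `defaults` with the
    given top-level keys replaced. Unknown keys are rejected so that a
    typo does not silently create a new attribute."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    merged = copy.deepcopy(defaults)
    merged.update(overrides)
    return merged


# 384 is an illustrative value, not a recommendation from the source.
config = with_overrides(DEFAULT_CONFIG, {"max_seq_len": 384})

# Serialize to JSON text (True/False/None become true/false/null).
config_json = json.dumps(config, indent=4)
print(config_json)
```

Rejecting unknown keys is a design choice of this sketch: it keeps the fragment limited to the attributes listed for this Config class. The deep copy leaves DEFAULT_CONFIG untouched, so several variants can be derived from the same defaults.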