SquadForRoBERTaTensorizer.Config

Component: SquadForRoBERTaTensorizer

class SquadForRoBERTaTensorizer.Config

Bases: RoBERTaTensorizer.Config

All Attributes (including base classes)

is_input: bool = True
columns: list[str] = ['question', 'doc']
tokenizer: Tokenizer.Config = GPT2BPETokenizer.Config()
base_tokenizer: Optional[Tokenizer.Config] = None
vocab_file: str = 'gpt2_bpe_dict'
max_seq_len: int = 256
add_selfie_token: bool = False
answers_column: str = 'answers'
answer_starts_column: str = 'answer_starts'

Subclasses

  • SquadForRoBERTaTensorizerForKD.Config
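
The column attributes above (columns, answers_column, answer_starts_column) name the fields the tensorizer reads from each input row. The sketch below shows what one such row might look like. It does not import PyText: the row contents and the spans_match helper are hypothetical, and it assumes the usual SQuAD convention that each answer start is a character offset into the document text.

```python
# Hypothetical input row keyed by the default column names listed above.
doc = "Hamlet is a tragedy written by William Shakespeare."
answer = "William Shakespeare"

row = {
    "question": "Who wrote Hamlet?",
    "doc": doc,
    "answers": [answer],
    # Assumption: answer starts are character offsets into the doc text.
    "answer_starts": [doc.index(answer)],
}


def spans_match(row, doc_column="doc", answers_column="answers",
                answer_starts_column="answer_starts"):
    """Hypothetical check: each answer must appear in the doc at its stated offset."""
    text = row[doc_column]
    return all(
        text[start:start + len(ans)] == ans
        for ans, start in zip(row[answers_column], row[answer_starts_column])
    )


print(spans_match(row))
```

A check like this can catch misaligned offsets before training, which would otherwise produce wrong answer spans without any error.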

Default JSON

{
    "is_input": true,
    "columns": [
        "question",
        "doc"
    ],
    "tokenizer": {
        "GPT2BPETokenizer": {
            "bpe_encoder_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/encoder.json",
            "bpe_vocab_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/vocab.bpe",
            "lowercase": false
        }
    },
    "base_tokenizer": null,
    "vocab_file": "gpt2_bpe_dict",
    "max_seq_len": 256,
    "add_selfie_token": false,
    "answers_column": "answers",
    "answer_starts_column": "answer_starts"
}
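
As a rough illustration of overriding these defaults, the sketch below loads a trimmed copy of the default JSON with Python's standard json module and replaces two fields. It does not use PyText itself: the override helper and the new values (512 and "gold_answers") are hypothetical, and the tokenizer block is omitted for brevity.

```python
import json

# Trimmed copy of the default JSON above (tokenizer block omitted for brevity).
DEFAULT_CONFIG = json.loads("""
{
    "is_input": true,
    "columns": ["question", "doc"],
    "base_tokenizer": null,
    "vocab_file": "gpt2_bpe_dict",
    "max_seq_len": 256,
    "add_selfie_token": false,
    "answers_column": "answers",
    "answer_starts_column": "answer_starts"
}
""")


def override(defaults: dict, **changes) -> dict:
    """Hypothetical helper: return a copy of `defaults` with known keys replaced."""
    unknown = set(changes) - set(defaults)
    if unknown:
        # Reject misspelled field names instead of silently adding them.
        raise KeyError(f"not a known config field: {sorted(unknown)}")
    return {**defaults, **changes}


# Illustrative values only; pick ones that suit the model and dataset.
custom = override(DEFAULT_CONFIG, max_seq_len=512, answers_column="gold_answers")
print(json.dumps(custom, indent=4))
```

Rejecting unknown keys keeps a misspelled field name from silently falling back to its default value.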