RoBERTaTokenLevelTensorizer.Config
Component: RoBERTaTokenLevelTensorizer
class RoBERTaTokenLevelTensorizer.Config
Bases: RoBERTaTensorizer.Config
All Attributes (including base classes)
- is_input: bool = True
- columns: list[str] = ['text']
- tokenizer: Tokenizer.Config = GPT2BPETokenizer.Config()
- base_tokenizer: Optional[Tokenizer.Config] = None
- vocab_file: str = 'manifold://pytext_training/tree/static/vocabs/bpe/gpt2/dict.txt'
- max_seq_len: int = 256
- labels_columns: list[str] = ['label']
- labels: list[str] = []
Default JSON
{
    "is_input": true,
    "columns": [
        "text"
    ],
    "tokenizer": {
        "GPT2BPETokenizer": {
            "bpe_encoder_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/encoder.json",
            "bpe_vocab_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/vocab.bpe"
        }
    },
    "base_tokenizer": null,
    "vocab_file": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/dict.txt",
    "max_seq_len": 256,
    "labels_columns": [
        "label"
    ],
    "labels": []
}
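
Example: overriding the defaults

The default for labels is an empty list, so a token-level task will usually need its own values for some of these fields. The following is a minimal sketch of one way to derive a customized config from the Default JSON listing above, using only the Python standard library. The helper override_config, the shortened max_seq_len of 128, and the label names are illustrative assumptions and are not part of PyText.

```python
import copy
import json

# Default config, copied from the "Default JSON" listing above.
DEFAULT_CONFIG = json.loads("""
{
    "is_input": true,
    "columns": ["text"],
    "tokenizer": {
        "GPT2BPETokenizer": {
            "bpe_encoder_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/encoder.json",
            "bpe_vocab_path": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/vocab.bpe"
        }
    },
    "base_tokenizer": null,
    "vocab_file": "manifold://pytext_training/tree/static/vocabs/bpe/gpt2/dict.txt",
    "max_seq_len": 256,
    "labels_columns": ["label"],
    "labels": []
}
""")


def override_config(defaults, **overrides):
    """Return a deep copy of `defaults` with top-level keys replaced.

    Hypothetical helper for illustration only. It rejects keys that do
    not appear in the defaults, so a misspelled attribute name fails
    loudly instead of being silently added.
    """
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    merged = copy.deepcopy(defaults)
    merged.update(overrides)
    return merged


# Assumed values: a shorter sequence limit and a small tag set.
custom = override_config(
    DEFAULT_CONFIG,
    max_seq_len=128,
    labels=["B-PER", "I-PER", "O"],
)

# Serialize the result so it can be pasted into a larger config file.
print(json.dumps(custom, indent=4))
```

The helper copies the defaults before updating them, so DEFAULT_CONFIG stays unchanged and can be reused for other variants.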