BERTInitialTokenizer.Config
Component: BERTInitialTokenizer
class BERTInitialTokenizer.Config
Bases: Tokenizer.Config
Configuration options for BERTInitialTokenizer.
All Attributes (including base classes)
- split_regex: str = '\\s+'
- lowercase: bool = True
- use_byte_offsets: bool = False
Default JSON
{
"split_regex": "\\s+",
"lowercase": true,
"use_byte_offsets": false
}
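
To illustrate what the default values mean, here is a minimal Python sketch that parses the default JSON and applies the two text-facing options using only the standard library. The helper `sketch_tokenize` is hypothetical and is not part of the library; the real BERTInitialTokenizer may do additional processing that is not shown here, and `use_byte_offsets` is not modeled at all.

```python
import json
import re

# Default config as listed above. The raw string keeps the JSON escape
# "\\s+" intact, so json.loads decodes it to the regex \s+.
DEFAULT_JSON = r'''
{
  "split_regex": "\\s+",
  "lowercase": true,
  "use_byte_offsets": false
}
'''


def sketch_tokenize(text, config):
    """Hypothetical helper (an assumption, not library code):
    lowercase the text if requested, then split it on split_regex."""
    if config["lowercase"]:
        text = text.lower()
    # re.split yields empty strings for leading/trailing separators; drop them.
    return [piece for piece in re.split(config["split_regex"], text) if piece]


config = json.loads(DEFAULT_JSON)
tokens = sketch_tokenize("Hello  BERT World", config)
print(tokens)  # ['hello', 'bert', 'world']
```

With the defaults, runs of whitespace act as a single separator and all pieces are lowercased. Setting "lowercase" to false in the JSON would keep the original casing.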