BERTInitialTokenizer.Config
Component: BERTInitialTokenizer
class BERTInitialTokenizer.Config
Bases: Tokenizer.Config
Config for this class.
All Attributes (including base classes)
- split_regex: str = '\\s+'
- lowercase: bool = True
- use_byte_offsets: bool = False
Default JSON
{
"split_regex": "\\s+",
"lowercase": true,
"use_byte_offsets": false
}
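As a rough illustration of what the defaults above mean, the sketch below parses the default JSON with the standard library and applies `split_regex` and `lowercase` to a sample string. It is not the library's implementation: `simple_tokenize` is a hypothetical helper, it assumes that `split_regex` is used as a delimiter pattern and that `lowercase` lowercases the input before splitting, and it does not model `use_byte_offsets`.

```python
import json
import re

# The default JSON from this page. The raw string keeps the JSON escape "\\s+",
# which json.loads decodes to the regex \s+ (one or more whitespace characters).
DEFAULT_CONFIG_JSON = r'''
{
  "split_regex": "\\s+",
  "lowercase": true,
  "use_byte_offsets": false
}
'''

config = json.loads(DEFAULT_CONFIG_JSON)


def simple_tokenize(text, config):
    """Hypothetical helper: lowercase (if enabled), then split on split_regex.

    Assumption: this mirrors the usual meaning of the two fields; the real
    tokenizer may do additional work that is not shown here.
    """
    if config["lowercase"]:
        text = text.lower()
    # Drop empty strings produced by leading or trailing delimiters.
    return [tok for tok in re.split(config["split_regex"], text) if tok]


tokens = simple_tokenize("Hello  BERT World", config)
print(tokens)
```

Overriding a field amounts to changing the corresponding key, for example setting `"lowercase": false` to keep the original casing.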