GazetteerTensorizer.Config
Component: GazetteerTensorizer
class GazetteerTensorizer.Config
Bases: Tensorizer.Config
All Attributes (including base classes)
- is_input: bool = True
- text_column: str = 'text'
- dict_column: str = 'dict'
- tokenizer: Tokenizer.Config = Tokenizer.Config()
  Tokenizer used to split the text and create dict tensors of the same size.
Default JSON
{
    "is_input": true,
    "text_column": "text",
    "dict_column": "dict",
    "tokenizer": {
        "Tokenizer": {
            "split_regex": "\\s+",
            "lowercase": true,
            "use_byte_offsets": false
        }
    }
}
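
To illustrate how the default configuration might be overridden, here is a minimal sketch that uses only the Python standard library. It does not call PyText itself. The helpers `with_overrides` and `approximate_tokenize` and the column name `"utterance"` are hypothetical, and the tokenization shown is only a rough approximation of what `split_regex` and `lowercase` imply, not the library's actual implementation.

```python
import json
import re

# Default config, copied from the "Default JSON" listing in this section.
DEFAULT_CONFIG = json.loads(r"""
{
    "is_input": true,
    "text_column": "text",
    "dict_column": "dict",
    "tokenizer": {
        "Tokenizer": {
            "split_regex": "\\s+",
            "lowercase": true,
            "use_byte_offsets": false
        }
    }
}
""")


def with_overrides(base, overrides):
    """Hypothetical helper: return a copy of `base` with `overrides` merged in recursively."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = with_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


def approximate_tokenize(text, tokenizer_settings):
    """Assumed behaviour only: optional lowercasing followed by a regex split."""
    if tokenizer_settings["lowercase"]:
        text = text.lower()
    return [tok for tok in re.split(tokenizer_settings["split_regex"], text) if tok]


# Read text from a hypothetical "utterance" column and keep the original casing.
config = with_overrides(
    DEFAULT_CONFIG,
    {"text_column": "utterance", "tokenizer": {"Tokenizer": {"lowercase": False}}},
)
tokens = approximate_tokenize(
    "Play Yesterday by The Beatles", config["tokenizer"]["Tokenizer"]
)
print(json.dumps(config, indent=4))
print(tokens)
```

Fields that are not overridden, such as `dict_column`, keep their default values, and `DEFAULT_CONFIG` itself is left unmodified.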