SimpleFeaturizer.Config

Component: SimpleFeaturizer

class SimpleFeaturizer.Config

Bases: ConfigBase

All Attributes (including base classes)

sentence_markers: Optional[tuple[str, str]] = None
lowercase_tokens: bool = True
split_regex: str = '\\s+'
convert_to_bytes: bool = False
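
The attribute names suggest how a whitespace-style featurizer might use them. The sketch below is an illustration only, not the library's implementation: `SketchConfig` and `sketch_tokenize` are hypothetical names, and the behavior assigned to each field is an assumption inferred from its name and default. `convert_to_bytes` is carried as a field but not modeled, because this page does not describe its effect.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SketchConfig:
    """Hypothetical stand-in that mirrors the documented fields and defaults."""

    sentence_markers: Optional[Tuple[str, str]] = None
    lowercase_tokens: bool = True
    split_regex: str = r"\s+"
    convert_to_bytes: bool = False  # not modeled in this sketch


def sketch_tokenize(text: str, config: SketchConfig) -> List[str]:
    """Tokenize `text` under the assumed meaning of each config field."""
    # Assumption: lowercasing is applied to the whole input before splitting.
    if config.lowercase_tokens:
        text = text.lower()

    # Split on the configured pattern and drop empty strings that appear
    # when the text starts or ends with a separator.
    tokens = [tok for tok in re.split(config.split_regex, text) if tok]

    # Assumption: the marker pair is (begin, end) and wraps the token list.
    if config.sentence_markers is not None:
        begin, end = config.sentence_markers
        tokens = [begin] + tokens + [end]

    return tokens


default_tokens = sketch_tokenize("  Hello   World ", SketchConfig())
marked_tokens = sketch_tokenize(
    "Hello World",
    SketchConfig(sentence_markers=("<s>", "</s>"), lowercase_tokens=False),
)
```

With the defaults, runs of whitespace act as a single separator and case is folded; supplying a marker pair adds one token at each end of the result.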

Default JSON

{
    "sentence_markers": null,
    "lowercase_tokens": true,
    "split_regex": "\\s+",
    "convert_to_bytes": false
}
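
As a rough illustration of how a partial override might be layered on top of the default JSON, the sketch below uses only the standard `json` module. `merge_overrides` is a hypothetical helper, not part of the library, and rejecting unknown keys is an assumption about reasonable behavior rather than something this page documents.

```python
import json

# The default JSON from this page, kept verbatim in a raw string so the
# doubled backslash reaches the JSON parser unchanged.
DEFAULT_JSON = r"""
{
    "sentence_markers": null,
    "lowercase_tokens": true,
    "split_regex": "\\s+",
    "convert_to_bytes": false
}
"""


def merge_overrides(defaults_json: str, overrides_json: str) -> dict:
    """Return the defaults with the given overrides applied (hypothetical helper)."""
    merged = json.loads(defaults_json)
    overrides = json.loads(overrides_json)

    # Assumption: a key that is not a documented attribute is a mistake.
    unknown = set(overrides) - set(merged)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")

    merged.update(overrides)
    return merged


config = merge_overrides(
    DEFAULT_JSON,
    '{"lowercase_tokens": false, "sentence_markers": ["<s>", "</s>"]}',
)
```

After decoding, `"\\s+"` in JSON and `'\\s+'` in the Python attribute listing both denote the same three-character pattern: a backslash, `s`, and `+`. JSON `null` becomes `None`, and a JSON array such as `["<s>", "</s>"]` arrives as a list, so code that expects a tuple would need to convert it.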