VocabConfig
Component: Component
class pytext.data.tensorizers.VocabConfig
Bases: Component.Config
All Attributes (including base classes)
- build_from_data: bool = True
  Whether to add tokens from the training data to the vocab.
- size_from_data: int = 0
  Add the size_from_data most frequent tokens in the training data to the vocab (if this is 0, add all tokens from the training data).
- min_counts: int = 0
  Filter out tokens in the training data whose count is smaller than min_counts.
- vocab_files: list[VocabFileConfig] = []
Default JSON
{
"build_from_data": true,
"size_from_data": 0,
"min_counts": 0,
"vocab_files": []
}
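
The interaction of the three data-driven fields can be illustrated with a small standalone sketch. This is not PyText's implementation: `select_tokens` is a hypothetical helper written only to mirror the field descriptions above, and the order in which the `min_counts` filter and the `size_from_data` cap are applied is an assumption.

```python
from collections import Counter
from typing import Iterable, List


def select_tokens(
    tokens: Iterable[str],
    build_from_data: bool = True,
    size_from_data: int = 0,
    min_counts: int = 0,
) -> List[str]:
    """Hypothetical helper mirroring the documented field semantics."""
    if not build_from_data:
        # No tokens are taken from the training data.
        return []
    counts = Counter(tokens)
    # Drop tokens whose count is smaller than min_counts.
    # most_common() returns tokens ordered from most to least frequent.
    kept = [(tok, n) for tok, n in counts.most_common() if n >= min_counts]
    # size_from_data == 0 means "keep everything"; otherwise keep the top N.
    # (Assumption: the cap is applied after the min_counts filter.)
    if size_from_data > 0:
        kept = kept[:size_from_data]
    return [tok for tok, _ in kept]


corpus = "the cat sat on the mat the cat".split()
print(select_tokens(corpus))                    # all five distinct tokens
print(select_tokens(corpus, size_from_data=2))  # the two most frequent tokens
print(select_tokens(corpus, min_counts=2))      # tokens seen at least twice
```

With the defaults (`size_from_data=0`, `min_counts=0`) every distinct token is kept, which matches the "add all tokens" behavior described for `size_from_data`.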