VocabFileConfig

Component: Component

class pytext.data.tensorizers.VocabFileConfig

Bases: Component.Config

All Attributes (including base classes)

filepath: str = ''

File containing tokens to add to vocab (first whitespace-separated entry per line)

skip_header_line: bool = False

Whether to skip the first line of the file (e.g. if it is a header line)

lowercase_tokens: bool = False

Whether to lowercase each of the tokens in the file

size_limit: int = 0

The maximum number of tokens to add to the vocab
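The four attributes above describe how a vocab file is read. The sketch below illustrates one plausible reading of them; it is not PyText's implementation. The helper name `read_vocab_tokens` is hypothetical, and two behaviors are assumptions rather than documented facts: that `size_limit = 0` means "no limit", and that repeated tokens are kept only once.

```python
from typing import Iterable, List


def read_vocab_tokens(
    lines: Iterable[str],
    skip_header_line: bool = False,
    lowercase_tokens: bool = False,
    size_limit: int = 0,
) -> List[str]:
    """Hypothetical helper (not part of PyText) mirroring the config fields."""
    tokens: List[str] = []
    seen = set()
    for index, line in enumerate(lines):
        # skip_header_line: ignore the first line of the file.
        if skip_header_line and index == 0:
            continue
        # ASSUMPTION: a size_limit of 0 is treated as "no limit".
        if size_limit and len(tokens) >= size_limit:
            break
        parts = line.split()  # split on any run of whitespace
        if not parts:
            continue  # blank line, nothing to add
        token = parts[0]  # first whitespace-separated entry on the line
        if lowercase_tokens:
            token = token.lower()
        # ASSUMPTION: duplicates are added only once.
        if token not in seen:
            seen.add(token)
            tokens.append(token)
    return tokens


# Made-up sample: a header line followed by "token count" rows.
sample = ["token count", "The 120", "the 95", "", "Cat 40", "dog 12"]
tokens = read_vocab_tokens(
    sample, skip_header_line=True, lowercase_tokens=True, size_limit=2
)
print(tokens)
```

With the made-up sample, the header is skipped, "The" and "the" collapse into a single lowercased entry, and reading stops once two tokens have been collected.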

Default JSON

{
    "filepath": "",
    "skip_header_line": false,
    "lowercase_tokens": false,
    "size_limit": 0
}
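
As a rough illustration of how the defaults above combine with user-supplied values, the following sketch overlays a partial config on the default JSON using only the standard library. The helper `with_overrides` and the path `vocab.txt` are hypothetical. PyText's own config loading is not shown here and may behave differently.

```python
import json

# Defaults copied from the "Default JSON" listing above.
DEFAULTS = {
    "filepath": "",
    "skip_header_line": False,
    "lowercase_tokens": False,
    "size_limit": 0,
}


def with_overrides(overrides: dict) -> dict:
    """Hypothetical helper: overlay user values on the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Reject keys that are not attributes of VocabFileConfig.
        raise KeyError(f"unknown VocabFileConfig fields: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


# "vocab.txt" is a placeholder path used only for illustration.
config = with_overrides({"filepath": "vocab.txt", "skip_header_line": True})
print(json.dumps(config, indent=4))
```

Fields that are not overridden keep their documented defaults, so `lowercase_tokens` stays `false` and `size_limit` stays `0` in the printed JSON.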