SlotLabelTensorizer.Config

Component: SlotLabelTensorizer

class SlotLabelTensorizer.Config

Bases: Tensorizer.Config

All Attributes (including base classes)

is_input: bool = False
slot_column: str = 'slots'
The name of the slot label column to parse from the data source.
text_column: str = 'text'
The name of the text column to parse from the data source. It is needed to generate tensors that correspond to the input text.
tokenizer: Tokenizer.Config = Tokenizer.Config()
The tokenizer used to split input text into tokens. Configure it so that it yields the same tokens that the model takes as input or produces as output; otherwise the labels generated by this tensorizer will not line up with the indices of the model's tokens.
allow_unknown: bool = False
Whether to allow unknown labels at test/prediction time.
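The tokenizer note above is easier to see with a concrete case. The following is a minimal standalone sketch, not PyText's implementation: it assumes slot annotations are given as character spans and uses only Python's `re` module. The helpers `tokenize` and `label_tokens`, the `NoLabel` placeholder, and the example spans are all hypothetical names chosen for illustration.

```python
import re


def tokenize(text, split_regex=r"\s+", lowercase=True):
    """Split text on split_regex and return (token, start, end) triples."""
    tokens = []
    pos = 0
    for match in re.finditer(split_regex, text):
        if match.start() > pos:
            tokens.append((text[pos:match.start()], pos, match.start()))
        pos = match.end()
    if pos < len(text):
        tokens.append((text[pos:], pos, len(text)))
    if lowercase:
        tokens = [(tok.lower(), start, end) for tok, start, end in tokens]
    return tokens


def label_tokens(tokens, slots, no_label="NoLabel"):
    """Give each token the label of the first slot span that overlaps it."""
    labels = []
    for _, start, end in tokens:
        label = no_label
        for slot_start, slot_end, slot_label in slots:
            if start < slot_end and end > slot_start:
                label = slot_label
                break
        labels.append(label)
    return labels


text = "Play Yesterday by the Beatles"
# Hypothetical character spans (start, end, label); end is exclusive.
slots = [(5, 14, "song"), (22, 29, "artist")]

tokens = tokenize(text)
labels = label_tokens(tokens, slots)
for (token, _, _), label in zip(tokens, labels):
    print(token, label)
```

Each label is attached to a token position, so a tokenizer that splits the text differently from the model's own tokenizer would shift every label after the first mismatch.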

Subclasses

  • SlotLabelTensorizerExpansible.Config

Default JSON

{
    "is_input": false,
    "slot_column": "slots",
    "text_column": "text",
    "tokenizer": {
        "Tokenizer": {
            "split_regex": "\\s+",
            "lowercase": true,
            "use_byte_offsets": false
        }
    },
    "allow_unknown": false
}
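
As a rough illustration of how these defaults read once parsed, the sketch below loads the JSON with Python's standard `json` module and overrides two fields. It does not use PyText's own config loader; the `slot_labels` column name is a hypothetical value, and the dictionary merge is only meant to show the shape of an override.

```python
import json
import re

# The default JSON from above, kept verbatim in a raw string.
default_json = r'''
{
    "is_input": false,
    "slot_column": "slots",
    "text_column": "text",
    "tokenizer": {
        "Tokenizer": {
            "split_regex": "\\s+",
            "lowercase": true,
            "use_byte_offsets": false
        }
    },
    "allow_unknown": false
}
'''

config = json.loads(default_json)
tokenizer_cfg = config["tokenizer"]["Tokenizer"]

# In JSON the backslash is escaped, so "\\s+" decodes to the regex \s+.
split_regex = tokenizer_cfg["split_regex"]
pieces = re.split(split_regex, "Book a  table")

# Hypothetical overrides: read labels from another column, allow unknown labels.
overrides = {"slot_column": "slot_labels", "allow_unknown": True}
custom = {**config, **overrides}

print(split_regex, pieces)
print(custom["slot_column"], custom["allow_unknown"])
```

Fields that are not overridden, such as `text_column` and the nested tokenizer settings, keep their default values in `custom`.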