SlotLabelTensorizer.Config

Component: SlotLabelTensorizer

class SlotLabelTensorizer.Config

Bases: Tensorizer.Config

All Attributes (including base classes)

is_input: bool = False

slot_column: str = 'slots'

The name of the slot label column to parse from the data source.

text_column: str = 'text'

The name of the text column to parse from the data source. It is needed to generate tensors that correspond to the input text.

tokenizer: Tokenizer.Config = Tokenizer.Config()

The tokenizer used to split input text into tokens. Configure it so that it yields the same tokens that the model takes as input or produces as output; otherwise the labels generated by this tensorizer will not match the indices of the model’s tokens.

allow_unknown: bool = False

Whether to allow unknown labels at test/prediction time.
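To illustrate why the tokenizer settings matter for label alignment, here is a minimal sketch in plain Python. It is not PyText's implementation. The helper names (tokenize_with_offsets, align_slots), the (start, end, label) character-span format for slots, and the "NoLabel" default are all assumptions made for this example; only the split_regex and lowercase values come from the default configuration.

```python
import re


def tokenize_with_offsets(text, split_regex=r"\s+", lowercase=True):
    """Split text on split_regex and return (token, start, end) triples.

    Hypothetical helper: offsets are character positions in the original text.
    """
    tokens = []
    pos = 0
    for match in re.finditer(split_regex, text):
        if match.start() > pos:
            tokens.append((text[pos:match.start()], pos, match.start()))
        pos = match.end()
    if pos < len(text):
        tokens.append((text[pos:], pos, len(text)))
    if lowercase:
        tokens = [(tok.lower(), start, end) for tok, start, end in tokens]
    return tokens


def align_slots(tokens, slots, default="NoLabel"):
    """Give each token the label of the first slot span it overlaps.

    Assumed slot format: (start, end, label) with an exclusive end offset.
    """
    labels = []
    for _, tok_start, tok_end in tokens:
        label = default
        for slot_start, slot_end, slot_label in slots:
            if tok_start < slot_end and tok_end > slot_start:
                label = slot_label
                break
        labels.append(label)
    return labels


text = "Play Yesterday by the Beatles"
slots = [(5, 14, "song"), (18, 29, "artist")]  # assumed annotation format

tokens = tokenize_with_offsets(text)
labels = align_slots(tokens, slots)

for (token, _, _), label in zip(tokens, labels):
    print(token, label)
```

Because one label is produced per token, a tokenizer that splits the text differently from the model's tokenizer would yield a label sequence of a different length or with shifted positions.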

Subclasses

SlotLabelTensorizerExpansible.Config

Default JSON

{
    "is_input": false,
    "slot_column": "slots",
    "text_column": "text",
    "tokenizer": {
        "Tokenizer": {
            "split_regex": "\\s+",
            "lowercase": true,
            "use_byte_offsets": false
        }
    },
    "allow_unknown": false
}
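As an illustration of how a partial override could be layered over these defaults, the sketch below parses the default JSON with Python's standard json module and merges a small override into it. The with_overrides helper and the "slot_labels" column name are hypothetical and are not part of any library API; this only shows the resulting config shape.

```python
import copy
import json

# The default configuration shown above, parsed into a dict.
DEFAULT_CONFIG = json.loads(r'''
{
    "is_input": false,
    "slot_column": "slots",
    "text_column": "text",
    "tokenizer": {
        "Tokenizer": {
            "split_regex": "\\s+",
            "lowercase": true,
            "use_byte_offsets": false
        }
    },
    "allow_unknown": false
}
''')


def with_overrides(base, overrides):
    """Return a copy of base with overrides merged in recursively.

    Hypothetical helper for this example only.
    """
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = with_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Read slot labels from a differently named column (hypothetical name)
# and keep the original casing of the tokens.
config = with_overrides(
    DEFAULT_CONFIG,
    {
        "slot_column": "slot_labels",
        "tokenizer": {"Tokenizer": {"lowercase": False}},
    },
)

print(json.dumps(config, indent=4))
```

Fields that are not overridden, such as split_regex and allow_unknown, keep their default values.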