pytext.data.featurizer package
Submodules
pytext.data.featurizer.featurizer module
class pytext.data.featurizer.featurizer.Featurizer(config, feature_config: pytext.config.field_config.FeatureConfig)
Bases: pytext.config.component.Component
Featurizer performs the data preprocessing that must be shared between training and inference, namely tokenization and gazetteer feature alignment.
This is an interface: a subclass must implement featurize() so that it can be used with the appropriate data handler.
- featurize(input_record: pytext.data.featurizer.featurizer.InputRecord) → pytext.data.featurizer.featurizer.OutputRecord
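To illustrate the interface contract described above, here is a minimal sketch of a subclass that implements featurize(). It does not import PyText: the InputRecord, OutputRecord, and Featurizer classes below are simplified stand-ins, the field defaults and the (start, end) shape of token_ranges are assumptions, and WhitespaceFeaturizer is a hypothetical name rather than anything the library ships.

```python
import re
from abc import ABC, abstractmethod
from typing import List, NamedTuple, Optional, Tuple


class InputRecord(NamedTuple):
    # Stand-in mirroring the documented field order (0, 1, 2).
    raw_text: str
    raw_gazetteer_feats: str = ""  # default value is an assumption
    locale: str = ""               # default value is an assumption


class OutputRecord(NamedTuple):
    # Only the first two documented fields are modeled in this sketch.
    tokens: List[str]
    token_ranges: Optional[List[Tuple[int, int]]] = None  # shape is an assumption


class Featurizer(ABC):
    """Stand-in for the interface: subclasses must implement featurize()."""

    @abstractmethod
    def featurize(self, input_record: InputRecord) -> OutputRecord:
        ...


class WhitespaceFeaturizer(Featurizer):
    """Hypothetical implementation that tokenizes on whitespace."""

    def featurize(self, input_record: InputRecord) -> OutputRecord:
        tokens: List[str] = []
        ranges: List[Tuple[int, int]] = []
        # Record each token together with its character span in raw_text.
        for match in re.finditer(r"\S+", input_record.raw_text):
            tokens.append(match.group())
            ranges.append((match.start(), match.end()))
        return OutputRecord(tokens=tokens, token_ranges=ranges)


record = WhitespaceFeaturizer().featurize(InputRecord("play some jazz"))
print(record.tokens)
print(record.token_ranges)
```

Because the same featurize() call would run at training and inference time, both paths see identical tokens, which is the point of sharing the preprocessing step.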
class pytext.data.featurizer.featurizer.InputRecord
Bases: tuple
Input data contract between Featurizer and DataHandler.
- locale: Alias for field number 2
- raw_gazetteer_feats: Alias for field number 1
- raw_text: Alias for field number 0
class pytext.data.featurizer.featurizer.OutputRecord
Bases: tuple
Output data contract between Featurizer and DataHandler.
- characters: Alias for field number 5
- contextual_token_embedding: Alias for field number 6
- dense_feats: Alias for field number 7
- gazetteer_feat_lengths: Alias for field number 3
- gazetteer_feat_weights: Alias for field number 4
- gazetteer_feats: Alias for field number 2
- token_ranges: Alias for field number 1
- tokens: Alias for field number 0
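The "Alias for field number N" entries mean that both records are named tuples: each attribute is just a name for a tuple position. The sketch below rebuilds the two records with collections.namedtuple, using the field order documented above, to show that positional and attribute access are interchangeable. These are stand-ins, not the PyText classes themselves, and the sample values are made up for illustration.

```python
from collections import namedtuple

# Stand-ins that follow the documented field numbering.
InputRecord = namedtuple(
    "InputRecord", ["raw_text", "raw_gazetteer_feats", "locale"]
)
OutputRecord = namedtuple(
    "OutputRecord",
    [
        "tokens",                      # field 0
        "token_ranges",                # field 1
        "gazetteer_feats",             # field 2
        "gazetteer_feat_lengths",      # field 3
        "gazetteer_feat_weights",      # field 4
        "characters",                  # field 5
        "contextual_token_embedding",  # field 6
        "dense_feats",                 # field 7
    ],
)

# Illustrative values only; unused features are left as None.
out = OutputRecord(
    tokens=["hi"],
    token_ranges=[(0, 2)],
    gazetteer_feats=None,
    gazetteer_feat_lengths=None,
    gazetteer_feat_weights=None,
    characters=[["h", "i"]],
    contextual_token_embedding=None,
    dense_feats=None,
)

# Attribute access and index access return the same object.
print(out.tokens is out[0])
print(OutputRecord._fields.index("dense_feats"))
```

Since the records are plain tuples, they can also be unpacked positionally, although attribute access is less fragile if fields are ever added.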
pytext.data.featurizer.simple_featurizer module
class pytext.data.featurizer.simple_featurizer.SimpleFeaturizer(config, feature_config: pytext.config.field_config.FeatureConfig)
Bases: pytext.data.featurizer.featurizer.Featurizer
Simple featurizer for basic tokenization and gazetteer feature alignment.
- featurize(input_record: pytext.data.featurizer.featurizer.InputRecord) → pytext.data.featurizer.featurizer.OutputRecord
  Featurize one instance/example only.
- featurize_batch(input_records: Sequence[pytext.data.featurizer.featurizer.InputRecord]) → Sequence[pytext.data.featurizer.featurizer.OutputRecord]
  Featurize a batch of instances/examples.
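The relationship between the two methods can be sketched as follows: a batch call that applies the single-record method to each input and preserves order. This page does not say how SimpleFeaturizer actually implements featurize_batch, so treat this as one plausible shape rather than the library's code. ToyFeaturizer and the trimmed-down record types are hypothetical stand-ins.

```python
from typing import List, NamedTuple, Sequence


class InputRecord(NamedTuple):
    raw_text: str  # stand-in with only the first documented field


class OutputRecord(NamedTuple):
    tokens: List[str]  # stand-in with only the first documented field


class ToyFeaturizer:
    """Hypothetical featurizer used only to show the batch pattern."""

    def featurize(self, input_record: InputRecord) -> OutputRecord:
        # Featurize one example: split the raw text on whitespace.
        return OutputRecord(tokens=input_record.raw_text.split())

    def featurize_batch(
        self, input_records: Sequence[InputRecord]
    ) -> Sequence[OutputRecord]:
        # Featurize each example in order, one output per input.
        return [self.featurize(record) for record in input_records]


batch = ToyFeaturizer().featurize_batch(
    [InputRecord("set an alarm"), InputRecord("stop")]
)
for output in batch:
    print(output.tokens)
```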
Module contents
class pytext.data.featurizer.Featurizer(config, feature_config: pytext.config.field_config.FeatureConfig)
Bases: pytext.config.component.Component
Featurizer performs the data preprocessing that must be shared between training and inference, namely tokenization and gazetteer feature alignment.
This is an interface: a subclass must implement featurize() so that it can be used with the appropriate data handler.
- featurize(input_record: pytext.data.featurizer.featurizer.InputRecord) → pytext.data.featurizer.featurizer.OutputRecord
class pytext.data.featurizer.InputRecord
Bases: tuple
Input data contract between Featurizer and DataHandler.
- locale: Alias for field number 2
- raw_gazetteer_feats: Alias for field number 1
- raw_text: Alias for field number 0
class pytext.data.featurizer.OutputRecord
Bases: tuple
Output data contract between Featurizer and DataHandler.
- characters: Alias for field number 5
- contextual_token_embedding: Alias for field number 6
- dense_feats: Alias for field number 7
- gazetteer_feat_lengths: Alias for field number 3
- gazetteer_feat_weights: Alias for field number 4
- gazetteer_feats: Alias for field number 2
- token_ranges: Alias for field number 1
- tokens: Alias for field number 0
class pytext.data.featurizer.SimpleFeaturizer(config, feature_config: pytext.config.field_config.FeatureConfig)
Bases: pytext.data.featurizer.featurizer.Featurizer
Simple featurizer for basic tokenization and gazetteer feature alignment.
- featurize(input_record: pytext.data.featurizer.featurizer.InputRecord) → pytext.data.featurizer.featurizer.OutputRecord
  Featurize one instance/example only.
- featurize_batch(input_records: Sequence[pytext.data.featurizer.featurizer.InputRecord]) → Sequence[pytext.data.featurizer.featurizer.OutputRecord]
  Featurize a batch of instances/examples.