TFMA Extractors
tensorflow_model_analysis.extractors
Init module for TensorFlow Model Analysis extractors.
Attributes
Extractor
module-attribute
Extractor = NamedTuple(
    "Extractor",
    [("stage_name", str), ("ptransform", PTransform)],
)
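Because an Extractor is only a named (stage_name, ptransform) pair, an extraction pipeline can be viewed as an ordered list of such pairs. The following is a minimal sketch of that idea in plain Python, not TFMA's implementation: a callable stands in for the Beam PTransform so the example runs without Apache Beam, and add_key, run_stages, and the stage names are hypothetical.

```python
from typing import Any, Callable, Dict, List, NamedTuple

Extracts = Dict[str, Any]
Transform = Callable[[List[Extracts]], List[Extracts]]


class Extractor(NamedTuple):
    """Stand-in for the documented tuple; ptransform is a plain callable here."""
    stage_name: str
    ptransform: Transform


def add_key(key: str, value: Any) -> Transform:
    """Returns a fake transform that adds one key to a copy of every extract."""
    def transform(batch: List[Extracts]) -> List[Extracts]:
        return [{**extracts, key: value} for extracts in batch]
    return transform


def run_stages(stages: List[Extractor], batch: List[Extracts]) -> List[Extracts]:
    """Applies each stage in order, the way a pipeline chains extractors."""
    for stage in stages:
        batch = stage.ptransform(batch)
    return batch


stages = [
    Extractor(stage_name="AddFeatures", ptransform=add_key("features", {"age": 3})),
    Extractor(stage_name="AddLabels", ptransform=add_key("labels", 1)),
]
result = run_stages(stages, [{"input": b"raw"}])
stage_names = [stage.stage_name for stage in stages]
```

Each stage receives the output of the previous one, so the order of the list determines which keys are available to later stages.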
SLICE_KEY_EXTRACTOR_STAGE_NAME
module-attribute
Functions
ExampleWeightsExtractor
ExampleWeightsExtractor(
    eval_config: EvalConfig,
) -> Extractor
Creates an extractor for extracting example weights.
The extractor's PTransform uses the config's ModelSpec.example_weight_key(s) to look up the associated example weight values stored as features under the tfma.FEATURES_KEY (and optionally tfma.TRANSFORMED_FEATURES_KEY) in extracts. The resulting values are then added to the extracts under the key tfma.EXAMPLE_WEIGHTS_KEY.
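The lookup described above can be sketched on a single extracts dict. This is an illustration in plain Python rather than TFMA code: the string constants are placeholders for tfma.FEATURES_KEY and tfma.EXAMPLE_WEIGHTS_KEY, add_example_weights is a hypothetical helper, and the behavior when no weight key is configured is a choice made for the sketch.

```python
from typing import Any, Dict, Optional

# Placeholder strings standing in for tfma.FEATURES_KEY and tfma.EXAMPLE_WEIGHTS_KEY.
FEATURES_KEY = "features"
EXAMPLE_WEIGHTS_KEY = "example_weights"


def add_example_weights(
    extracts: Dict[str, Any], example_weight_key: Optional[str]
) -> Dict[str, Any]:
    """Copies the feature named by example_weight_key into the extracts."""
    result = dict(extracts)  # shallow copy; the incoming extracts are not modified
    if example_weight_key:
        result[EXAMPLE_WEIGHTS_KEY] = extracts[FEATURES_KEY].get(example_weight_key)
    return result


extracts = {FEATURES_KEY: {"age": [32.0], "weight": [0.5]}}
with_weights = add_example_weights(extracts, example_weight_key="weight")
```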
PARAMETER | DESCRIPTION |
---|---|
eval_config | Eval config. TYPE: EvalConfig |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for extracting example weights. |
Source code in tensorflow_model_analysis/extractors/example_weights_extractor.py
FeatureExtractor
FeatureExtractor(
    additional_extracts: Optional[List[str]] = None,
    excludes: Optional[List[bytes]] = None,
    extract_source: str = FEATURES_PREDICTIONS_LABELS_KEY,
    extract_dest: str = MATERIALIZE_COLUMNS,
)
Source code in tensorflow_model_analysis/extractors/legacy_feature_extractor.py
FeaturesExtractor
FeaturesExtractor(
    eval_config: EvalConfig,
    tensor_representations: Optional[
        Mapping[Text, TensorRepresentation]
    ] = None,
) -> Extractor
Creates an extractor for extracting features.
The extractor acts as follows depending on the existence of certain keys within the incoming extracts:
1) Extracts contains tfma.ARROW_RECORD_BATCH_KEY
The features stored in the RecordBatch will be extracted and added to the output extract under the key tfma.FEATURES_KEY and the raw serialized inputs will be added under the tfma.INPUT_KEY. Any extracts that already exist will be merged with the values from the RecordBatch with the RecordBatch values taking precedence when duplicate keys are detected. The tfma.ARROW_RECORD_BATCH_KEY key will be removed from the output extracts.
2) Extracts contains tfma.FEATURES_KEY (but not tfma.ARROW_RECORD_BATCH_KEY)
The operation will be a no-op and the incoming extracts will be passed as is to the output.
3) Extracts contains neither tfma.FEATURES_KEY nor tfma.ARROW_RECORD_BATCH_KEY
An exception will be raised.
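The three cases above amount to a dispatch on which keys are present. The sketch below illustrates that dispatch in plain Python under several assumptions: the string constants are placeholders for the tfma.*_KEY values, a dict with "features" and "input" entries stands in for the Arrow RecordBatch, the merge is interpreted as a merge of the feature dicts, and ValueError is an arbitrary choice since the text only says that an exception is raised.

```python
from typing import Any, Dict

# Placeholder strings standing in for the tfma.*_KEY constants.
ARROW_RECORD_BATCH_KEY = "arrow_record_batch"
FEATURES_KEY = "features"
INPUT_KEY = "input"


def extract_features(extracts: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatches on which keys are present in the incoming extracts."""
    if ARROW_RECORD_BATCH_KEY in extracts:
        # Case 1: a plain dict stands in for the RecordBatch in this sketch.
        batch = extracts[ARROW_RECORD_BATCH_KEY]
        result = {k: v for k, v in extracts.items() if k != ARROW_RECORD_BATCH_KEY}
        # Existing features are merged; values from the batch win on duplicates.
        result[FEATURES_KEY] = {**result.get(FEATURES_KEY, {}), **batch["features"]}
        result[INPUT_KEY] = batch["input"]
        return result
    if FEATURES_KEY in extracts:
        # Case 2: no-op, the extracts are passed through as is.
        return extracts
    # Case 3: nothing to extract features from.
    raise ValueError("extracts contain neither features nor a record batch")


from_batch = extract_features({
    ARROW_RECORD_BATCH_KEY: {"features": {"age": [1]}, "input": [b"raw"]},
    FEATURES_KEY: {"age": [0], "id": [7]},
})
passthrough_in = {FEATURES_KEY: {"age": [2]}}
passthrough_out = extract_features(passthrough_in)
try:
    extract_features({"labels": [1]})
    raised = False
except ValueError:
    raised = True
```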
PARAMETER | DESCRIPTION |
---|---|
eval_config | Eval config. TYPE: EvalConfig |
tensor_representations | Optional tensor representations to use when parsing the data. If tensor_representations are not passed or a representation is not found for a given feature name a default representation will be used where possible, otherwise an exception will be raised. TYPE: Optional[Mapping[Text, TensorRepresentation]] |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for extracting features. |
Source code in tensorflow_model_analysis/extractors/features_extractor.py
Filter
Filter(
    extracts: PCollection,
    include: Optional[
        Union[Iterable[str], Dict[str, Any]]
    ] = None,
    exclude: Optional[
        Union[Iterable[str], Dict[str, Any]]
    ] = None,
) -> PCollection
Filters extracts to include/exclude specified keys.
PARAMETER | DESCRIPTION |
---|---|
extracts | PCollection of extracts. TYPE: PCollection |
include | List or map of keys to include in output. If a map of keys is passed then the keys and sub-keys that exist in the map will be included in the output. An empty dict behaves as a wildcard matching all keys or the value itself. Since matching on feature values is not currently supported, an empty dict must be used to represent the leaf nodes. For example, {'key1': {'key1-subkey': {}}, 'key2': {}}. TYPE: Optional[Union[Iterable[str], Dict[str, Any]]] |
exclude | List or map of keys to exclude from output. If a map of keys is passed then the keys and sub-keys that exist in the map will be excluded from the output. An empty dict behaves as a wildcard matching all keys or the value itself. Since matching on feature values is not currently supported, an empty dict must be used to represent the leaf nodes. For example, {'key1': {'key1-subkey': {}}, 'key2': {}}. TYPE: Optional[Union[Iterable[str], Dict[str, Any]]] |
RETURNS | DESCRIPTION |
---|---|
PCollection | Filtered PCollection of Extracts. |
RAISES | DESCRIPTION |
---|---|
ValueError | If both include and exclude are used. |
Source code in tensorflow_model_analysis/extractors/extractor.py
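The include and exclude semantics for Filter, with an empty dict acting as a wildcard at the leaves, can be illustrated on a single extracts dict. This is a sketch in plain Python under stated assumptions: the real function operates on a PCollection, filter_extracts and its helpers are hypothetical names, and the handling of a sub-key spec applied to a non-dict value is a choice made here.

```python
from typing import Any, Dict, Iterable, Optional, Union

KeySpec = Union[Iterable[str], Dict[str, Any]]


def _as_map(spec: KeySpec) -> Dict[str, Any]:
    """Normalizes a list of keys into the map form, using {} for every leaf."""
    return spec if isinstance(spec, dict) else {key: {} for key in spec}


def _include(extracts: Dict[str, Any], spec: Dict[str, Any]) -> Dict[str, Any]:
    result = {}
    for key, sub_spec in spec.items():
        if key not in extracts:
            continue
        value = extracts[key]
        if not sub_spec:
            result[key] = value  # empty dict: wildcard, keep the whole value
        elif isinstance(value, dict):
            result[key] = _include(value, sub_spec)
    return result


def _exclude(extracts: Dict[str, Any], spec: Dict[str, Any]) -> Dict[str, Any]:
    result = {}
    for key, value in extracts.items():
        if key not in spec:
            result[key] = value
        elif spec[key] and isinstance(value, dict):
            result[key] = _exclude(value, spec[key])
        elif spec[key]:
            result[key] = value  # sub-keys cannot match a non-dict value
        # Otherwise the spec is an empty dict (wildcard): drop the entry.
    return result


def filter_extracts(
    extracts: Dict[str, Any],
    include: Optional[KeySpec] = None,
    exclude: Optional[KeySpec] = None,
) -> Dict[str, Any]:
    if include and exclude:
        raise ValueError("include and exclude cannot both be used")
    if include:
        return _include(extracts, _as_map(include))
    if exclude:
        return _exclude(extracts, _as_map(exclude))
    return dict(extracts)


extracts = {"features": {"age": 3, "name": "x"}, "labels": 1, "predictions": 0.9}
by_list = filter_extracts(extracts, include=["labels", "predictions"])
by_map = filter_extracts(extracts, include={"features": {"age": {}}, "labels": {}})
without_name = filter_extracts(extracts, exclude={"features": {"name": {}}})
```

Passing a list is treated as shorthand for a map whose values are all empty dicts, which keeps a single code path for both argument forms.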
InputExtractor
InputExtractor(eval_config: EvalConfig) -> Extractor
Creates an extractor for extracting features, labels, and example weights.
The extractor's PTransform parses tf.train.Example protos stored under the tfma.INPUT_KEY in the incoming extracts and adds the resulting features, labels, and example weights to the extracts under the keys tfma.FEATURES_KEY, tfma.LABELS_KEY, and tfma.EXAMPLE_WEIGHTS_KEY. If the eval_config contains a prediction_key and a corresponding key is found in the parsed example, then predictions will also be extracted and stored under the tfma.PREDICTIONS_KEY. Any extracts that already exist will be merged with the values parsed by this extractor, with this extractor's values taking precedence when duplicate keys are detected.
Note that a prediction_key in an eval_config serves two use cases: (1) as a key into the dict of predictions output by the predict extractor, and (2) as the key for a pre-computed prediction stored as a feature.
The InputExtractor can be used to handle case (2). These cases are meant to be mutually exclusive: if approach (2) is used, then a predict extractor would not be configured, and if (1) is used, then a key matching the predictions would not be stored in the features. However, if a feature key happens to have the same name as the prediction output key, then both paths may be executed. In this case, the value stored here will be replaced by the predict extractor (though it will still be popped from the features).
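Case (2) and the merge precedence can be sketched on already-parsed features. The example below is illustrative plain Python, not the legacy extractor's code: parsing of tf.train.Example protos is skipped, the string constants are placeholders for the tfma.*_KEY values, and merge_parsed, the "score" feature, and the sample values are hypothetical. Labels and example weights are left out to keep the focus on the prediction key.

```python
from typing import Any, Dict, Optional

# Placeholder strings standing in for tfma.FEATURES_KEY and tfma.PREDICTIONS_KEY.
FEATURES_KEY = "features"
PREDICTIONS_KEY = "predictions"


def merge_parsed(
    existing: Dict[str, Any],
    parsed_features: Dict[str, Any],
    prediction_key: Optional[str] = None,
) -> Dict[str, Any]:
    """Merges parsed values into existing extracts; parsed values win."""
    features = dict(parsed_features)
    parsed: Dict[str, Any] = {FEATURES_KEY: features}
    if prediction_key and prediction_key in features:
        # Pre-computed prediction stored as a feature: move it out of the features.
        parsed[PREDICTIONS_KEY] = features.pop(prediction_key)
    return {**existing, **parsed}


existing = {"input": b"serialized", PREDICTIONS_KEY: 0.1}
merged = merge_parsed(existing, {"age": 3, "score": 0.8}, prediction_key="score")
```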
PARAMETER | DESCRIPTION |
---|---|
eval_config | Eval config. TYPE: EvalConfig |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for extracting features, labels, and example weights inputs. |
Source code in tensorflow_model_analysis/extractors/legacy_input_extractor.py
LabelsExtractor
LabelsExtractor(eval_config: EvalConfig) -> Extractor
Creates an extractor for extracting labels.
The extractor's PTransform uses the config's ModelSpec.label_key(s) to look up the associated label values stored as features under the tfma.FEATURES_KEY (and optionally tfma.TRANSFORMED_FEATURES_KEY) in extracts. The resulting values are then added to the extracts under the key tfma.LABELS_KEY.
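For a model with several outputs, the plural form of the label key maps each output to its own feature. The sketch below illustrates one way such a lookup could behave; it is plain Python rather than TFMA code. The string constants are placeholders for the tfma.*_KEY values, lookup_labels and the output names are hypothetical, and both the lookup order (raw features first, then transformed features) and the dict-per-output result are assumptions.

```python
from typing import Any, Dict

# Placeholder strings standing in for the tfma.*_KEY constants.
FEATURES_KEY = "features"
TRANSFORMED_FEATURES_KEY = "transformed_features"
LABELS_KEY = "labels"


def lookup_labels(extracts: Dict[str, Any], label_keys: Dict[str, str]) -> Dict[str, Any]:
    """Looks up one label per model output and stores them under LABELS_KEY."""
    features = extracts.get(FEATURES_KEY, {})
    transformed = extracts.get(TRANSFORMED_FEATURES_KEY, {})
    labels = {}
    for output_name, key in label_keys.items():
        # Assumed order: raw features first, transformed features as a fallback.
        labels[output_name] = features[key] if key in features else transformed.get(key)
    return {**extracts, LABELS_KEY: labels}


extracts = {
    FEATURES_KEY: {"clicked": [1]},
    TRANSFORMED_FEATURES_KEY: {"rating_bucket": [3]},
}
labelled = lookup_labels(
    extracts, {"click_head": "clicked", "rating_head": "rating_bucket"}
)
```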
PARAMETER | DESCRIPTION |
---|---|
eval_config | Eval config. TYPE: EvalConfig |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for extracting labels. |
Source code in tensorflow_model_analysis/extractors/labels_extractor.py
PredictExtractor
PredictExtractor(
    eval_shared_model: MaybeMultipleEvalSharedModels,
    desired_batch_size: Optional[int] = None,
    materialize: Optional[bool] = True,
    eval_config: Optional[EvalConfig] = None,
) -> Extractor
Creates an Extractor for TFMAPredict.
The extractor's PTransform loads and runs the eval_saved_model against every example, yielding a copy of the Extracts input with an additional extract of type FeaturesPredictionsLabels keyed by tfma.FEATURES_PREDICTIONS_LABELS_KEY. If eval_config is not None, the features, predictions, and labels are instead stored separately under tfma.FEATURES_KEY, tfma.PREDICTIONS_KEY, and tfma.LABELS_KEY, respectively.
PARAMETER | DESCRIPTION |
---|---|
eval_shared_model | Shared model (single-model evaluation) or list of shared models (multi-model evaluation). |
desired_batch_size | Optional batch size for batching in Aggregate. |
materialize | True to call the FeatureExtractor to add MaterializedColumn entries for the features, predictions, and labels. |
eval_config | Eval config. TYPE: Optional[EvalConfig] |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for extracting features, predictions, labels, and other tensors during predict. |
Source code in tensorflow_model_analysis/extractors/legacy_predict_extractor.py
PredictionsExtractor
PredictionsExtractor(
    eval_config: EvalConfig,
    eval_shared_model: Optional[
        MaybeMultipleEvalSharedModels
    ] = None,
    output_keypath: Sequence[str] = (PREDICTIONS_KEY,),
) -> Extractor
Creates an extractor for performing predictions over a batch.
The extractor's PTransform loads and runs the serving saved_model(s) against every Extracts, yielding a copy of the incoming Extracts with an additional extract added for the predictions keyed by tfma.PREDICTIONS_KEY. The model inputs are searched for under tfma.FEATURES_KEY (keras only) or tfma.INPUT_KEY (if tfma.FEATURES_KEY is not set or the model is non-keras). If multiple models are used, the predictions will be stored in a dict keyed by model name.
Note that the prediction_key in the ModelSpecs also serves as a key into the dict of the prediction's output.
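How an output_keypath places predictions in the extracts, and how multi-model predictions are keyed by model name, can be sketched without running a model. The following is a minimal illustration in plain Python, not TFMA's implementation: the string constant is a placeholder for tfma.PREDICTIONS_KEY, and insert_at_keypath, the "debug" keypath, and the model names are hypothetical.

```python
from typing import Any, Dict, Sequence

# Placeholder string standing in for tfma.PREDICTIONS_KEY.
PREDICTIONS_KEY = "predictions"


def insert_at_keypath(
    extracts: Dict[str, Any], keypath: Sequence[str], value: Any
) -> Dict[str, Any]:
    """Returns a copy of extracts with value stored at the nested keypath."""
    result = dict(extracts)
    node = result
    for key in keypath[:-1]:
        # Copy each nested dict on the way down so the input is not mutated.
        node[key] = dict(node.get(key, {}))
        node = node[key]
    node[keypath[-1]] = value
    return result


# Single model, default keypath.
single = insert_at_keypath({"features": {"age": [3]}}, (PREDICTIONS_KEY,), [0.7])

# Multiple models: predictions are a dict keyed by model name.
multi = insert_at_keypath(
    {}, (PREDICTIONS_KEY,), {"candidate": [0.7], "baseline": [0.6]}
)

# Custom nested keypath.
original = {"debug": {"step": 1}}
nested = insert_at_keypath(original, ("debug", PREDICTIONS_KEY), [0.7])
```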
PARAMETER | DESCRIPTION |
---|---|
eval_config | Eval config. TYPE: EvalConfig |
eval_shared_model | Shared model (single-model evaluation) or list of shared models (multi-model evaluation) or None (predictions obtained from features). TYPE: Optional[MaybeMultipleEvalSharedModels] |
output_keypath | A sequence of keys to be used as the path to traverse and insert the outputs in the extract. TYPE: Sequence[str] |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for extracting predictions. |
Source code in tensorflow_model_analysis/extractors/predictions_extractor.py
SliceKeyExtractor
SliceKeyExtractor(
    slice_spec: Optional[List[SingleSliceSpec]] = None,
    eval_config: Optional[EvalConfig] = None,
    materialize: Optional[bool] = True,
) -> Extractor
Creates an extractor for extracting slice keys.
The incoming Extracts must contain features stored under tfma.FEATURES_KEY and optionally under tfma.TRANSFORMED_FEATURES_KEY.
The extractor's PTransform yields a copy of the Extracts input with an additional extract pointing at the list of SliceKeyType values keyed by tfma.SLICE_KEY_TYPES_KEY. If materialize is True then a materialized version of the slice keys will be added under the key tfma.SLICE_KEYS_KEY.
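The relationship between slicing specs, slice key tuples, and their materialized form can be sketched for a single example. This is an illustration in plain Python under several assumptions, not TFMA's slicer: a slicing spec is reduced to a list of column names, feature values are scalars, the string constants are placeholders for the tfma.*_KEY values, and the "column:value" string format and the "Overall" label used for the materialized keys are invented for the sketch.

```python
from typing import Any, Dict, List, Optional, Tuple

# Placeholder strings standing in for the tfma.*_KEY constants.
FEATURES_KEY = "features"
SLICE_KEY_TYPES_KEY = "slice_key_types"
SLICE_KEYS_KEY = "slice_keys"

SliceKey = Tuple[Tuple[str, Any], ...]


def slice_keys_for(
    features: Dict[str, Any], slicing_specs: Optional[List[List[str]]]
) -> List[SliceKey]:
    """Builds one (column, value) tuple per spec whose columns are all present."""
    keys = []
    # With no specs, fall back to the overall slice, i.e. the empty tuple.
    for columns in slicing_specs or [[]]:
        if all(column in features for column in columns):
            keys.append(tuple((column, features[column]) for column in columns))
    return keys


def add_slice_keys(
    extracts: Dict[str, Any],
    slicing_specs: Optional[List[List[str]]] = None,
    materialize: bool = True,
) -> Dict[str, Any]:
    result = dict(extracts)
    keys = slice_keys_for(extracts[FEATURES_KEY], slicing_specs)
    result[SLICE_KEY_TYPES_KEY] = keys
    if materialize:
        # Hypothetical string form; the real materialized format may differ.
        result[SLICE_KEYS_KEY] = [
            (",".join(f"{c}:{v}" for c, v in key) or "Overall") for key in keys
        ]
    return result


extracts = {FEATURES_KEY: {"country": "US", "age": 3}}
specs = [[], ["country"], ["country", "age"], ["device"]]
sliced = add_slice_keys(extracts, specs)
overall_only = add_slice_keys(extracts, materialize=False)
```

The spec that names the missing "device" column produces no slice key, so the example falls into three slices rather than four.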
PARAMETER | DESCRIPTION |
---|---|
slice_spec | Deprecated (use EvalConfig). |
eval_config | Optional EvalConfig containing slicing_specs specifying the slices to slice the data into. If slicing_specs are empty, defaults to overall slice. TYPE: Optional[EvalConfig] |
materialize | True to add MaterializedColumn entries for the slice keys. |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for slice keys. |
Source code in tensorflow_model_analysis/extractors/slice_key_extractor.py
TransformedFeaturesExtractor
TransformedFeaturesExtractor(
    eval_config: EvalConfig,
    eval_shared_model: MaybeMultipleEvalSharedModels,
) -> Extractor
Creates an extractor for extracting transformed features.
The extractor's PTransform loads the saved_model(s) and invokes the preprocessing functions against every extract, yielding a copy of the incoming extracts with a tfma.TRANSFORMED_FEATURES_KEY entry containing the output from the preprocessing functions.
PARAMETER | DESCRIPTION |
---|---|
eval_config | Eval config. TYPE: EvalConfig |
eval_shared_model | Shared model (single-model evaluation) or list of shared models (multi-model evaluation). |
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for extracting preprocessed features. |
Source code in tensorflow_model_analysis/extractors/transformed_features_extractor.py
UnbatchExtractor
UnbatchExtractor() -> Extractor
Creates an extractor for unbatching batched extracts.
This extractor removes the Arrow RecordBatch from the batched extract and outputs per-example extracts with the remaining keys. The remaining keys in the input extract are assumed to contain lists of objects (one per example).
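Unbatching as described here turns one dict of per-key lists into a list of per-example dicts. The sketch below shows that transposition in plain Python; it is not the extractor's code. The string constant is a placeholder for tfma.ARROW_RECORD_BATCH_KEY, unbatch is a hypothetical helper, and an arbitrary object stands in for the RecordBatch.

```python
from typing import Any, Dict, List

# Placeholder string standing in for tfma.ARROW_RECORD_BATCH_KEY.
ARROW_RECORD_BATCH_KEY = "arrow_record_batch"


def unbatch(batched: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Drops the record batch and splits the remaining lists per example."""
    remaining = {k: v for k, v in batched.items() if k != ARROW_RECORD_BATCH_KEY}
    if not remaining:
        return []
    # Every remaining value is assumed to be a list with one entry per example.
    batch_size = len(next(iter(remaining.values())))
    return [{k: v[i] for k, v in remaining.items()} for i in range(batch_size)]


batched = {
    ARROW_RECORD_BATCH_KEY: object(),  # stand-in for the RecordBatch
    "features": [{"age": 1}, {"age": 2}],
    "labels": [0, 1],
}
per_example = unbatch(batched)
```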
RETURNS | DESCRIPTION |
---|---|
Extractor | Extractor for unbatching batched extracts. |