TFMA Extractors

tensorflow_model_analysis.extractors

Init module for TensorFlow Model Analysis extractors.

Attributes

Extractor module-attribute

Extractor = NamedTuple(
    "Extractor",
    [("stage_name", str), ("ptransform", PTransform)],
)
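
For illustration, a custom extractor pairs a stage name with any Beam PTransform that maps Extracts to Extracts. A minimal sketch, assuming a pass-through transform and a hypothetical stage name:

import apache_beam as beam
from tensorflow_model_analysis import extractors

class _PassThroughExtracts(beam.PTransform):
  # Placeholder transform; a real extractor would add or modify extracts.
  def expand(self, extracts):
    return extracts | 'Identity' >> beam.Map(lambda x: x)

custom_extractor = extractors.Extractor(
    stage_name='MyCustomStage',  # hypothetical stage name
    ptransform=_PassThroughExtracts(),
)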

SLICE_KEY_EXTRACTOR_STAGE_NAME module-attribute

SLICE_KEY_EXTRACTOR_STAGE_NAME = 'ExtractSliceKeys'

Functions

ExampleWeightsExtractor

ExampleWeightsExtractor(
    eval_config: EvalConfig,
) -> Extractor

Creates an extractor for extracting example weights.

The extractor's PTransform uses the config's ModelSpec.example_weight_key(s) to look up the associated example weight values stored as features under the tfma.FEATURES_KEY (and optionally tfma.TRANSFORMED_FEATURES_KEY) in extracts. The resulting values are then added to the extracts under the key tfma.EXAMPLE_WEIGHTS_KEY.

PARAMETER DESCRIPTION
eval_config

Eval config.

TYPE: EvalConfig

RETURNS DESCRIPTION
Extractor

Extractor for extracting example weights.

Source code in tensorflow_model_analysis/extractors/example_weights_extractor.py
def ExampleWeightsExtractor(
    eval_config: config_pb2.EvalConfig,
) -> extractor.Extractor:
  """Creates an extractor for extracting example weights.

  The extractor's PTransform uses the config's ModelSpec.example_weight_key(s)
  to lookup the associated example weight values stored as features under the
  tfma.FEATURES_KEY (and optionally tfma.TRANSFORMED_FEATURES_KEY) in extracts.
  The resulting values are then added to the extracts under the key
  tfma.EXAMPLE_WEIGHTS_KEY.

  Args:
    eval_config: Eval config.

  Returns:
    Extractor for extracting example weights.
  """
  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=_EXAMPLE_WEIGHTS_EXTRACTOR_STAGE_NAME,
      ptransform=_ExtractExampleWeights(eval_config=eval_config),
  )
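
A minimal usage sketch; the 'example_weight' feature name and single unnamed model are assumptions for illustration:

import tensorflow_model_analysis as tfma

# Hypothetical config: one model whose example weights are stored under
# the 'example_weight' feature.
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(example_weight_key='example_weight')])
example_weights_extractor = tfma.extractors.ExampleWeightsExtractor(eval_config)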

FeatureExtractor

FeatureExtractor(
    additional_extracts: Optional[List[str]] = None,
    excludes: Optional[List[bytes]] = None,
    extract_source: str = FEATURES_PREDICTIONS_LABELS_KEY,
    extract_dest: str = MATERIALIZE_COLUMNS,
)
Source code in tensorflow_model_analysis/extractors/legacy_feature_extractor.py
def FeatureExtractor(
    additional_extracts: Optional[List[str]] = None,
    excludes: Optional[List[bytes]] = None,
    extract_source: str = constants.FEATURES_PREDICTIONS_LABELS_KEY,
    extract_dest: str = constants.MATERIALIZE_COLUMNS,
):
  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=_FEATURE_EXTRACTOR_STAGE_NAME,
      ptransform=_ExtractFeatures(
          additional_extracts=additional_extracts,
          excludes=excludes,
          source=extract_source,
          dest=extract_dest,
      ),
  )

FeaturesExtractor

FeaturesExtractor(
    eval_config: EvalConfig,
    tensor_representations: Optional[
        Mapping[Text, TensorRepresentation]
    ] = None,
) -> Extractor

Creates an extractor for extracting features.

The extractor acts as follows depending on the existence of certain keys within the incoming extracts:

1) Extracts contains tfma.ARROW_RECORD_BATCH_KEY

The features stored in the RecordBatch will be extracted and added to the output extract under the key tfma.FEATURES_KEY and the raw serialized inputs will be added under the tfma.INPUT_KEY. Any extracts that already exist will be merged with the values from the RecordBatch with the RecordBatch values taking precedence when duplicate keys are detected. The tfma.ARROW_RECORD_BATCH_KEY key will be removed from the output extracts.

2) Extracts contains tfma.FEATURES_KEY (but not tfma.ARROW_RECORD_BATCH_KEY)

The operation will be a no-op and the incoming extracts will be passed as is to the output.

3) Extracts contains neither tfma.FEATURES_KEY nor tfma.ARROW_RECORD_BATCH_KEY

An exception will be raised.

PARAMETER DESCRIPTION
eval_config

Eval config.

TYPE: EvalConfig

tensor_representations

Optional tensor representations to use when parsing the data. If tensor_representations are not passed, or a representation is not found for a given feature name, a default representation will be used where possible; otherwise an exception will be raised.

TYPE: Optional[Mapping[Text, TensorRepresentation]] DEFAULT: None

RETURNS DESCRIPTION
Extractor

Extractor for extracting features.

Source code in tensorflow_model_analysis/extractors/features_extractor.py
def FeaturesExtractor(  # pylint: disable=invalid-name
    eval_config: config_pb2.EvalConfig,
    tensor_representations: Optional[
        Mapping[Text, schema_pb2.TensorRepresentation]
    ] = None,
) -> extractor.Extractor:
  """Creates an extractor for extracting features.

  The extractor acts as follows depending on the existence of certain keys
  within the incoming extracts:

    1) Extracts contains tfma.ARROW_RECORD_BATCH_KEY

    The features stored in the RecordBatch will be extracted and added to the
    output extract under the key tfma.FEATURES_KEY and the raw serialized inputs
    will be added under the tfma.INPUT_KEY. Any extracts that already exist will
    be merged with the values from the RecordBatch with the RecordBatch values
    taking precedence when duplicate keys are detected. The
    tfma.ARROW_RECORD_BATCH_KEY key will be removed from the output extracts.

    2) Extracts contains tfma.FEATURES_KEY (but not tfma.ARROW_RECORD_BATCH_KEY)

    The operation will be a no-op and the incoming extracts will be passed as is
    to the output.

    3) Extracts contains neither tfma.FEATURES_KEY | tfma.ARROW_RECORD_BATCH_KEY

    An exception will be raised.

  Args:
    eval_config: Eval config.
    tensor_representations: Optional tensor representations to use when parsing
      the data. If tensor_representations are not passed or a representation is
      not found for a given feature name a default representation will be used
      where possible, otherwise an exception will be raised.

  Returns:
    Extractor for extracting features.
  """
  del eval_config
  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=_FEATURES_EXTRACTOR_STAGE_NAME,
      ptransform=_ExtractFeatures(tensor_representations or {}),
  )
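
A minimal usage sketch. Note the current implementation ignores eval_config (see the source above), so an empty ModelSpec suffices here, and tensor_representations is left at its default:

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(model_specs=[tfma.ModelSpec()])
features_extractor = tfma.extractors.FeaturesExtractor(eval_config=eval_config)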

Filter

Filter(
    extracts: PCollection,
    include: Optional[
        Union[Iterable[str], Dict[str, Any]]
    ] = None,
    exclude: Optional[
        Union[Iterable[str], Dict[str, Any]]
    ] = None,
) -> PCollection

Filters extracts to include/exclude specified keys.

PARAMETER DESCRIPTION
extracts

PCollection of extracts.

TYPE: PCollection

include

List or map of keys to include in output. If a map of keys is passed then the keys and sub-keys that exist in the map will be included in the output. An empty dict behaves as a wildcard matching all keys or the value itself. Since matching on feature values is not currently supported, an empty dict must be used to represent the leaf nodes. For example, {'key1': {'key1-subkey': {}}, 'key2': {}}.

TYPE: Optional[Union[Iterable[str], Dict[str, Any]]] DEFAULT: None

exclude

List or map of keys to exclude from output. If a map of keys is passed then the keys and sub-keys that exist in the map will be excluded from the output. An empty dict behaves as a wildcard matching all keys or the value itself. Since matching on feature values is not currently supported, an empty dict must be used to represent the leaf nodes. For example, {'key1': {'key1-subkey': {}}, 'key2': {}}.

TYPE: Optional[Union[Iterable[str], Dict[str, Any]]] DEFAULT: None

RETURNS DESCRIPTION
PCollection

Filtered PCollection of Extracts.

RAISES DESCRIPTION
ValueError

If both include and exclude are used.

Source code in tensorflow_model_analysis/extractors/extractor.py
@beam.ptransform_fn
@beam.typehints.with_input_types(types.Extracts)
@beam.typehints.with_output_types(types.Extracts)
def Filter(  # pylint: disable=invalid-name
    extracts: beam.pvalue.PCollection,
    include: Optional[Union[Iterable[str], Dict[str, Any]]] = None,
    exclude: Optional[Union[Iterable[str], Dict[str, Any]]] = None,
) -> beam.pvalue.PCollection:
  """Filters extracts to include/exclude specified keys.

  Args:
    extracts: PCollection of extracts.
    include: List or map of keys to include in output. If a map of keys is
      passed then the keys and sub-keys that exist in the map will be included
      in the output. An empty dict behaves as a wildcard matching all keys or
      the value itself. Since matching on feature values is not currently
      supported, an empty dict must be used to represent the leaf nodes. For
      example, {'key1': {'key1-subkey': {}}, 'key2': {}}.
    exclude: List or map of keys to exclude from output. If a map of keys is
      passed then the keys and sub-keys that exist in the map will be excluded
      from the output. An empty dict behaves as a wildcard matching all keys or
      the value itself. Since matching on feature values is not currently
      supported, an empty dict must be used to represent the leaf nodes. For
      example, {'key1': {'key1-subkey': {}}, 'key2': {}}.

  Returns:
    Filtered PCollection of Extracts.

  Raises:
    ValueError: If both include and exclude are used.
  """
  if include and exclude:
    raise ValueError('only one of include or exclude should be used.')

  if not isinstance(include, dict):
    include = {k: {} for k in include or []}
  if not isinstance(exclude, dict):
    exclude = {k: {} for k in exclude or []}

  def filter_extracts(extracts: types.Extracts) -> types.Extracts:  # pylint: disable=invalid-name
    """Filters extracts."""
    if not include and not exclude:
      return extracts
    elif include:
      return util.include_filter(include, extracts)
    else:
      return util.exclude_filter(exclude, extracts)

  return extracts | beam.Map(filter_extracts)
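
A usage sketch with assumed, in-memory extracts (the feature values are illustrative only). Because Filter is a PTransform factory, it is applied with the | operator:

import apache_beam as beam
from tensorflow_model_analysis import extractors

with beam.Pipeline() as pipeline:
  extracts = pipeline | 'CreateExtracts' >> beam.Create(
      [{'features': {'age': [29]}, 'labels': [1]}])
  # Keep only the 'features' key in each extract.
  features_only = extracts | 'KeepFeatures' >> extractors.Filter(
      include=['features'])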

InputExtractor

InputExtractor(eval_config: EvalConfig) -> Extractor

Creates an extractor for extracting features, labels, and example weights.

The extractor's PTransform parses tf.train.Example protos stored under the tfma.INPUT_KEY in the incoming extracts and adds the resulting features, labels, and example weights to the extracts under the keys tfma.FEATURES_KEY, tfma.LABELS_KEY, and tfma.EXAMPLE_WEIGHTS_KEY. If the eval_config contains a prediction_key and a corresponding key is found in the parsed example, then predictions will also be extracted and stored under the tfma.PREDICTIONS_KEY. Any extracts that already exist will be merged with the values parsed by this extractor, with this extractor's values taking precedence when duplicate keys are detected.

Note that the use of a prediction_key in an eval_config serves two use cases:

(1) as a key into the dict of predictions output by the predict extractor
(2) as the key for a pre-computed prediction stored as a feature.

The InputExtractor can be used to handle case (2). These cases are meant to be exclusive (i.e. if approach (2) is used then a predict extractor would not be configured, and if (1) is used then a key matching the predictions would not be stored in the features). However, if a feature key happens to match the same name as the prediction output key then both paths may be executed. In this case, the value stored here will be replaced by the predict extractor (though it will still be popped from the features).

PARAMETER DESCRIPTION
eval_config

Eval config.

TYPE: EvalConfig

RETURNS DESCRIPTION
Extractor

Extractor for extracting features, labels, and example weights inputs.

Source code in tensorflow_model_analysis/extractors/legacy_input_extractor.py
def InputExtractor(eval_config: config_pb2.EvalConfig) -> extractor.Extractor:
  """Creates an extractor for extracting features, labels, and example weights.

  The extractor's PTransform parses tf.train.Example protos stored under the
  tfma.INPUT_KEY in the incoming extracts and adds the resulting features,
  labels, and example weights to the extracts under the keys tfma.FEATURES_KEY,
  tfma.LABELS_KEY, and tfma.EXAMPLE_WEIGHTS_KEY. If the eval_config contains a
  prediction_key and a corresponding key is found in the parse example, then
  predictions will also be extracted and stored under the tfma.PREDICTIONS_KEY.
  Any extracts that already exist will be merged with the values parsed by this
  extractor with this extractor's values taking precedence when duplicate keys
  are detected.

  Note that the use of a prediction_key in an eval_config serves two use cases:
    (1) as a key into the dict of predictions output by predict extractor
    (2) as the key for a pre-computed prediction stored as a feature.
  The InputExtractor can be used to handle case (2). These cases are meant to be
  exclusive (i.e. if approach (2) is used then a predict extractor would not be
  configured and if (1) is used then a key matching the predictons would not be
  stored in the features). However, if a feature key happens to match the same
  name as the prediction output key then both paths may be executed. In this
  case, the value stored here will be replaced by the predict extractor (though
  it will still be popped from the features).

  Args:
    eval_config: Eval config.

  Returns:
    Extractor for extracting features, labels, and example weights inputs.
  """
  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=_INPUT_EXTRACTOR_STAGE_NAME,
      ptransform=_ExtractInputs(eval_config=eval_config),
  )
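
A minimal sketch, assuming a single model whose labels are stored under a hypothetical 'label' feature:

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(model_specs=[tfma.ModelSpec(label_key='label')])
input_extractor = tfma.extractors.InputExtractor(eval_config=eval_config)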

LabelsExtractor

LabelsExtractor(eval_config: EvalConfig) -> Extractor

Creates an extractor for extracting labels.

The extractor's PTransform uses the config's ModelSpec.label_key(s) to look up the associated label values stored as features under the tfma.FEATURES_KEY (and optionally tfma.TRANSFORMED_FEATURES_KEY) in extracts. The resulting values are then added to the extracts under the key tfma.LABELS_KEY.

PARAMETER DESCRIPTION
eval_config

Eval config.

TYPE: EvalConfig

RETURNS DESCRIPTION
Extractor

Extractor for extracting labels.

Source code in tensorflow_model_analysis/extractors/labels_extractor.py
def LabelsExtractor(eval_config: config_pb2.EvalConfig) -> extractor.Extractor:
  """Creates an extractor for extracting labels.

  The extractor's PTransform uses the config's ModelSpec.label_key(s) to lookup
  the associated label values stored as features under the tfma.FEATURES_KEY
  (and optionally tfma.TRANSFORMED_FEATURES_KEY) in extracts. The resulting
  values are then added to the extracts under the key tfma.LABELS_KEY.

  Args:
    eval_config: Eval config.

  Returns:
    Extractor for extracting labels.
  """
  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=LABELS_EXTRACTOR_STAGE_NAME,
      ptransform=_ExtractLabels(eval_config=eval_config),
  )
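
A minimal sketch, again assuming a hypothetical 'label' feature name:

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(model_specs=[tfma.ModelSpec(label_key='label')])
labels_extractor = tfma.extractors.LabelsExtractor(eval_config)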

PredictExtractor

PredictExtractor(
    eval_shared_model: MaybeMultipleEvalSharedModels,
    desired_batch_size: Optional[int] = None,
    materialize: Optional[bool] = True,
    eval_config: Optional[EvalConfig] = None,
) -> Extractor

Creates an Extractor for TFMAPredict.

The extractor's PTransform loads and runs the eval_saved_model against every example, yielding a copy of the Extracts input with an additional extract of type FeaturesPredictionsLabels keyed by tfma.FEATURES_PREDICTIONS_LABELS_KEY, unless eval_config is not None, in which case the features, predictions, and labels will be stored separately under tfma.FEATURES_KEY, tfma.PREDICTIONS_KEY, and tfma.LABELS_KEY respectively.

PARAMETER DESCRIPTION
eval_shared_model

Shared model (single-model evaluation) or list of shared models (multi-model evaluation).

TYPE: MaybeMultipleEvalSharedModels

desired_batch_size

Optional batch size for batching in Aggregate.

TYPE: Optional[int] DEFAULT: None

materialize

True to call the FeatureExtractor to add MaterializedColumn entries for the features, predictions, and labels.

TYPE: Optional[bool] DEFAULT: True

eval_config

Eval config.

TYPE: Optional[EvalConfig] DEFAULT: None

RETURNS DESCRIPTION
Extractor

Extractor for extracting features, predictions, labels, and other tensors during predict.

Source code in tensorflow_model_analysis/extractors/legacy_predict_extractor.py
def PredictExtractor(  # pylint: disable=invalid-name
    eval_shared_model: types.MaybeMultipleEvalSharedModels,
    desired_batch_size: Optional[int] = None,
    materialize: Optional[bool] = True,
    eval_config: Optional[config_pb2.EvalConfig] = None,
) -> extractor.Extractor:
  """Creates an Extractor for TFMAPredict.

  The extractor's PTransform loads and runs the eval_saved_model against every
  example yielding a copy of the Extracts input with an additional extract
  of type FeaturesPredictionsLabels keyed by
  tfma.FEATURES_PREDICTIONS_LABELS_KEY unless eval_config is not None in which
  case the features, predictions, and labels will be stored separately under
  tfma.FEATURES_KEY, tfma.PREDICTIONS_KEY, and tfma.LABELS_KEY respectively.

  Args:
    eval_shared_model: Shared model (single-model evaluation) or list of shared
      models (multi-model evaluation).
    desired_batch_size: Optional batch size for batching in Aggregate.
    materialize: True to call the FeatureExtractor to add MaterializedColumn
      entries for the features, predictions, and labels.
    eval_config: Eval config.

  Returns:
    Extractor for extracting features, predictions, labels, and other tensors
    during predict.
  """
  eval_shared_models = model_util.verify_and_update_eval_shared_models(
      eval_shared_model
  )

  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=_PREDICT_EXTRACTOR_STAGE_NAME,
      ptransform=_TFMAPredict(
          eval_shared_models={m.model_name: m for m in eval_shared_models},
          desired_batch_size=desired_batch_size,
          materialize=materialize,
          eval_config=eval_config,
      ),
  )
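
A usage sketch; the eval_saved_model path is hypothetical and the batch size is illustrative:

import tensorflow_model_analysis as tfma

eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path='/path/to/eval_saved_model')  # hypothetical path
predict_extractor = tfma.extractors.PredictExtractor(
    eval_shared_model=eval_shared_model, desired_batch_size=100)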

PredictionsExtractor

PredictionsExtractor(
    eval_config: EvalConfig,
    eval_shared_model: Optional[
        MaybeMultipleEvalSharedModels
    ] = None,
    output_keypath: Sequence[str] = (PREDICTIONS_KEY,),
) -> Extractor

Creates an extractor for performing predictions over a batch.

The extractor's PTransform loads and runs the serving saved_model(s) against every Extracts yielding a copy of the incoming Extracts with an additional Extracts added for the predictions keyed by tfma.PREDICTIONS_KEY. The model inputs are searched for under tfma.FEATURES_KEY (keras only) or tfma.INPUT_KEY (if tfma.FEATURES_KEY is not set or the model is non-keras). If multiple models are used the predictions will be stored in a dict keyed by model name.

Note that the prediction_key in the ModelSpecs also serves as a key into the dict of the prediction's output.

PARAMETER DESCRIPTION
eval_config

Eval config.

TYPE: EvalConfig

eval_shared_model

Shared model (single-model evaluation) or list of shared models (multi-model evaluation) or None (predictions obtained from features).

TYPE: Optional[MaybeMultipleEvalSharedModels] DEFAULT: None

output_keypath

A sequence of keys to be used as the path to traverse and insert the outputs in the extract.

TYPE: Sequence[str] DEFAULT: (PREDICTIONS_KEY,)

RETURNS DESCRIPTION
Extractor

Extractor for extracting predictions.

Source code in tensorflow_model_analysis/extractors/predictions_extractor.py
def PredictionsExtractor(
    eval_config: config_pb2.EvalConfig,
    eval_shared_model: Optional[types.MaybeMultipleEvalSharedModels] = None,
    output_keypath: Sequence[str] = (constants.PREDICTIONS_KEY,),
) -> extractor.Extractor:
  """Creates an extractor for performing predictions over a batch.

  The extractor's PTransform loads and runs the serving saved_model(s) against
  every Extracts yielding a copy of the incoming Extracts with an additional
  Extracts added for the predictions keyed by tfma.PREDICTIONS_KEY. The model
  inputs are searched for under tfma.FEATURES_KEY (keras only) or tfma.INPUT_KEY
  (if tfma.FEATURES_KEY is not set or the model is non-keras). If multiple
  models are used the predictions will be stored in a dict keyed by model name.

  Note that the prediction_key in the ModelSpecs also serves as a key into the
  dict of the prediction's output.

  Args:
    eval_config: Eval config.
    eval_shared_model: Shared model (single-model evaluation) or list of shared
      models (multi-model evaluation) or None (predictions obtained from
      features).
    output_keypath: A sequence of keys to be used as the path to traverse and
      insert the outputs in the extract.

  Returns:
    Extractor for extracting predictions.
  """
  # TODO(b/239975835): Remove this Optional support for version 1.0.
  if eval_shared_model is None:
    logging.warning(
        'Calling the PredictionsExtractor with eval_shared_model=None is '
        'deprecated and no longer supported. This will break in version 1.0. '
        'Please update your implementation to call '
        'MaterializedPredictionsExtractor directly.'
    )
    _, ptransform = (
        materialized_predictions_extractor.MaterializedPredictionsExtractor(
            eval_config, output_keypath=output_keypath
        )
    )
    # Note we are changing the stage name here for backwards compatibility. Old
    # clients expect these code paths to have the same stage name. New clients
    # should never reference the private stage name.
    return extractor.Extractor(
        stage_name=PREDICTIONS_EXTRACTOR_STAGE_NAME, ptransform=ptransform
    )

  return extractor.Extractor(
      stage_name=PREDICTIONS_EXTRACTOR_STAGE_NAME,
      ptransform=_ModelSignaturesInferenceWrapper(  # pylint: disable=no-value-for-parameter
          model_specs=list(eval_config.model_specs),
          eval_shared_model=eval_shared_model,
          output_keypath=output_keypath,
      ),
  )
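
A usage sketch; the saved model path and signature name are assumptions:

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(signature_name='serving_default')])
eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path='/path/to/saved_model',  # hypothetical path
    eval_config=eval_config)
predictions_extractor = tfma.extractors.PredictionsExtractor(
    eval_config=eval_config, eval_shared_model=eval_shared_model)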

SliceKeyExtractor

SliceKeyExtractor(
    slice_spec: Optional[List[SingleSliceSpec]] = None,
    eval_config: Optional[EvalConfig] = None,
    materialize: Optional[bool] = True,
) -> Extractor

Creates an extractor for extracting slice keys.

The incoming Extracts must contain features stored under tfma.FEATURES_KEY and optionally under tfma.TRANSFORMED_FEATURES_KEY.

The extractor's PTransform yields a copy of the Extracts input with an additional extract pointing at the list of SliceKeyType values keyed by tfma.SLICE_KEY_TYPES_KEY. If materialize is True then a materialized version of the slice keys will be added under the key tfma.SLICE_KEYS_KEY.

PARAMETER DESCRIPTION
slice_spec

Deprecated (use EvalConfig).

TYPE: Optional[List[SingleSliceSpec]] DEFAULT: None

eval_config

Optional EvalConfig containing slicing_specs specifying the slices to slice the data into. If slicing_specs are empty, defaults to overall slice.

TYPE: Optional[EvalConfig] DEFAULT: None

materialize

True to add MaterializedColumn entries for the slice keys.

TYPE: Optional[bool] DEFAULT: True

RETURNS DESCRIPTION
Extractor

Extractor for slice keys.

Source code in tensorflow_model_analysis/extractors/slice_key_extractor.py
def SliceKeyExtractor(
    slice_spec: Optional[List[slicer.SingleSliceSpec]] = None,
    eval_config: Optional[config_pb2.EvalConfig] = None,
    materialize: Optional[bool] = True,
) -> extractor.Extractor:
  """Creates an extractor for extracting slice keys.

  The incoming Extracts must contain features stored under tfma.FEATURES_KEY
  and optionally under tfma.TRANSFORMED_FEATURES.

  The extractor's PTransform yields a copy of the Extracts input with an
  additional extract pointing at the list of SliceKeyType values keyed by
  tfma.SLICE_KEY_TYPES_KEY. If materialize is True then a materialized version
  of the slice keys will be added under the key tfma.SLICE_KEYS_KEY.

  Args:
    slice_spec: Deprecated (use EvalConfig).
    eval_config: Optional EvalConfig containing slicing_specs specifying the
      slices to slice the data into. If slicing_specs are empty, defaults to
      overall slice.
    materialize: True to add MaterializedColumn entries for the slice keys.

  Returns:
    Extractor for slice keys.
  """
  if slice_spec and eval_config:
    raise ValueError('slice_spec is deprecated, only use eval_config')
  if eval_config:
    slice_spec = [
        slicer.SingleSliceSpec(spec=spec) for spec in eval_config.slicing_specs
    ]
    for cross_slice_spec in eval_config.cross_slicing_specs:
      baseline_slice_spec = slicer.SingleSliceSpec(
          spec=cross_slice_spec.baseline_spec
      )
      if baseline_slice_spec not in slice_spec:
        slice_spec.append(baseline_slice_spec)
      for spec in cross_slice_spec.slicing_specs:
        comparison_slice_spec = slicer.SingleSliceSpec(spec=spec)
        if comparison_slice_spec not in slice_spec:
          slice_spec.append(comparison_slice_spec)
  if not slice_spec:
    slice_spec = [slicer.SingleSliceSpec()]
  return extractor.Extractor(
      stage_name=SLICE_KEY_EXTRACTOR_STAGE_NAME,
      ptransform=ExtractSliceKeys(slice_spec, eval_config, materialize),
  )
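
A usage sketch; the 'country' feature used for slicing is hypothetical, and an empty SlicingSpec requests the overall slice:

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec()],
    slicing_specs=[
        tfma.SlicingSpec(),  # overall slice
        tfma.SlicingSpec(feature_keys=['country']),  # hypothetical feature
    ])
slice_key_extractor = tfma.extractors.SliceKeyExtractor(eval_config=eval_config)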

TransformedFeaturesExtractor

TransformedFeaturesExtractor(
    eval_config: EvalConfig,
    eval_shared_model: MaybeMultipleEvalSharedModels,
) -> Extractor

Creates an extractor for extracting transformed features.

The extractor's PTransform loads the saved_model(s) and invokes the preprocessing functions against every extract, yielding a copy of the incoming extracts with a tfma.TRANSFORMED_FEATURES_KEY entry containing the output from the preprocessing functions.

PARAMETER DESCRIPTION
eval_config

Eval config.

TYPE: EvalConfig

eval_shared_model

Shared model (single-model evaluation) or list of shared models (multi-model evaluation).

TYPE: MaybeMultipleEvalSharedModels

RETURNS DESCRIPTION
Extractor

Extractor for extracting preprocessed features.

Source code in tensorflow_model_analysis/extractors/transformed_features_extractor.py
def TransformedFeaturesExtractor(
    eval_config: config_pb2.EvalConfig,
    eval_shared_model: types.MaybeMultipleEvalSharedModels,
) -> extractor.Extractor:
  """Creates an extractor for extracting transformed features.

  The extractor's PTransform loads the saved_model(s) invoking the preprocessing
  functions against every extract yielding a copy of the incoming extracts with
  a tfma.TRANSFORMED_FEATURES_KEY containing the output from the preprocessing
  functions.

  Args:
    eval_config: Eval config.
    eval_shared_model: Shared model (single-model evaluation) or list of shared
      models (multi-model evaluation).

  Returns:
    Extractor for extracting preprocessed features.
  """
  eval_shared_models = model_util.verify_and_update_eval_shared_models(
      eval_shared_model
  )

  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=_TRANSFORMED_FEATURES_EXTRACTOR_STAGE_NAME,
      ptransform=_ExtractTransformedFeatures(
          eval_config=eval_config,
          eval_shared_models={m.model_name: m for m in eval_shared_models},
      ),
  )
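
A usage sketch; the saved model path is hypothetical and the model is assumed to export preprocessing functions:

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(model_specs=[tfma.ModelSpec()])
eval_shared_model = tfma.default_eval_shared_model(
    eval_saved_model_path='/path/to/saved_model',  # hypothetical path
    eval_config=eval_config)
transformed_features_extractor = tfma.extractors.TransformedFeaturesExtractor(
    eval_config=eval_config, eval_shared_model=eval_shared_model)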

UnbatchExtractor

UnbatchExtractor() -> Extractor

Creates an extractor for unbatching batched extracts.

This extractor removes the Arrow RecordBatch from the batched extract and outputs per-example extracts with the remaining keys. It assumes that the remaining keys in the input extract each contain a list of objects (one per example).

RETURNS DESCRIPTION
Extractor

Extractor for unbatching batched extracts.

Source code in tensorflow_model_analysis/extractors/unbatch_extractor.py
def UnbatchExtractor() -> extractor.Extractor:
  """Creates an extractor for unbatching batched extracts.

  This extractor removes Arrow RecordBatch from the batched extract and outputs
  per-example extracts with the remaining keys. We assume that the remaining
  keys in the input extract contain list of objects (one per example).

  Returns:
    Extractor for unbatching batched extracts.
  """
  # pylint: disable=no-value-for-parameter
  return extractor.Extractor(
      stage_name=UNBATCH_EXTRACTOR_STAGE_NAME, ptransform=_UnbatchInputs()
  )
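
A minimal sketch showing UnbatchExtractor placed after FeaturesExtractor in a list of extractors, which is how per-example extracts can be produced from batched inputs (the ordering shown is an assumption for illustration):

import tensorflow_model_analysis as tfma

eval_config = tfma.EvalConfig(model_specs=[tfma.ModelSpec()])
extractors_list = [
    tfma.extractors.FeaturesExtractor(eval_config=eval_config),
    tfma.extractors.UnbatchExtractor(),
]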