DataFrame.Observe(String, Column, Column[]) Method

Definition

Defines (named) metrics to observe on the DataFrame. This method returns an 'observed' DataFrame that returns the same result as the input, with the following guarantees:

  1. It will compute the defined aggregates (metrics) on all the data that is flowing through the DataFrame at that point.
  2. It will report the value of the defined aggregate columns as soon as a completion point is reached. A completion point is either the end of a query (batch mode) or the end of a streaming epoch. The value of the aggregates only reflects the data processed since the previous completion point.

Please note that continuous execution is currently not supported.
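
The following is a minimal sketch of a batch-mode call, not a definitive usage pattern. It assumes Spark 3.0.0 or later and a working .NET for Apache Spark setup. The helper name WithRowMetrics, the observation name "row_metrics", and the metric aliases "row_count" and "max_id" are hypothetical and chosen only for illustration.

```csharp
using System;
using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

public static class ObserveExample
{
    // Hypothetical helper: attaches two named aggregates to the input.
    // The returned DataFrame yields the same rows as the input.
    public static DataFrame WithRowMetrics(DataFrame input)
    {
        return input.Observe(
            "row_metrics",                    // name: label for this group of metrics
            Count(Lit(1)).As("row_count"),    // expr: first aggregate
            Max(Col("id")).As("max_id"));     // exprs: any further aggregates
    }

    public static void Main()
    {
        SparkSession spark = SparkSession
            .Builder()
            .AppName("observe-sketch")
            .GetOrCreate();

        // Range produces a single column named "id".
        DataFrame input = spark.Range(0, 100);
        DataFrame observed = WithRowMetrics(input);

        // Running an action executes the query; the end of the query is the
        // completion point at which the aggregates are reported.
        long rows = observed.Count();
        Console.WriteLine($"Rows returned by the observed DataFrame: {rows}");

        spark.Stop();
    }
}
```

Observe only defines the metrics. In Apache Spark itself, the reported values are delivered to query listeners (a QueryExecutionListener for batch queries, a StreamingQueryListener for streaming queries); whether your version of .NET for Apache Spark exposes those listeners is not covered by this page and should be checked separately.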

[Microsoft.Spark.Since("3.0.0")]
public Microsoft.Spark.Sql.DataFrame Observe (string name, Microsoft.Spark.Sql.Column expr, params Microsoft.Spark.Sql.Column[] exprs);
[<Microsoft.Spark.Since("3.0.0")>]
member this.Observe : string * Microsoft.Spark.Sql.Column * Microsoft.Spark.Sql.Column[] -> Microsoft.Spark.Sql.DataFrame
Public Function Observe (name As String, expr As Column, ParamArray exprs As Column()) As DataFrame

Parameters

name
String

Name of the metrics to observe

expr
Column

Defined aggregate to observe

exprs
Column[]

Additional defined aggregates to observe

Returns

The observed DataFrame, which returns the same result as the input
