whether to delete the checkpoint if the query is stopped without errors
Tracks the offsets that are available to be processed, but have not yet been committed to the sink. Only the scheduler thread should modify this field, and only in atomic steps. Other threads should make a shallow copy if they are going to access this field more than once, since the field's value may change at any time.
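The "modify atomically, read via shallow copy" rule above can be sketched in plain Scala. This is an illustrative model, not Spark's internals; the object and method names here are hypothetical.

```scala
// Illustrative sketch of the pattern described above: the scheduler thread
// swaps in a whole new immutable map in one atomic reference assignment, so
// readers never observe a partially updated value.
object OffsetTrackingSketch {
  @volatile private var availableOffsets: Map[String, Long] = Map.empty

  // Scheduler thread only: a single atomic assignment per update.
  def advance(source: String, offset: Long): Unit =
    availableOffsets = availableOffsets + (source -> offset)

  // Any other thread: take one shallow copy and do every read against it,
  // since the field may be replaced at any time.
  def snapshot(): Map[String, Long] = availableOffsets
}
```

Because the map is immutable, the snapshot a reader holds can never change underneath it, even while the scheduler keeps advancing the field.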
Awaits until all fields of the query have been initialized.
Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown. Otherwise, it returns whether the query has terminated or not within the timeoutMs milliseconds.
If the query has terminated, then all subsequent calls to this method will either return true immediately (if the query was terminated by stop()), or throw the exception immediately (if the query has terminated with an exception).
2.0.0
StreamingQueryException if the query has terminated with an exception
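The contract above (time out with false, return true after stop(), rethrow a stored failure on every later call) can be modeled with a latch. A hypothetical sketch, not Spark's implementation:

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Minimal model of awaitTermination(timeoutMs): a latch records termination,
// and a stored exception is rethrown by every subsequent call.
class TerminationSketch {
  private val terminated = new CountDownLatch(1)
  @volatile private var failure: Option[Exception] = None

  // Called once, when the query stops cleanly or fails.
  def markTerminated(error: Option[Exception]): Unit = {
    failure = error
    terminated.countDown()
  }

  // Returns true iff the query terminated within timeoutMs; if the query
  // terminated with an exception, that exception is thrown instead.
  def awaitTermination(timeoutMs: Long): Boolean = {
    val done = terminated.await(timeoutMs, TimeUnit.MILLISECONDS)
    if (done) failure.foreach(e => throw e)
    done
  }
}
```

Once the latch has counted down, every later call observes the terminal state immediately, which is exactly the "all subsequent calls" behavior described above.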
Waits for the termination of this query, either by query.stop() or by an exception. If the query has terminated with an exception, then the exception will be thrown.
If the query has terminated, then all subsequent calls to this method will either return immediately (if the query was terminated by stop()), or throw the exception immediately (if the query has terminated with an exception).
2.0.0
StreamingQueryException if the query has terminated with an exception.
Tracks how much data we have processed and committed to the sink or state store from each input source. Only the scheduler thread should modify this field, and only in atomic steps. Other threads should make a shallow copy if they are going to access this field more than once, since the field's value may change at any time.
The current batchId or -1 if execution has not yet been initialized.
Returns the StreamingQueryException if the query was terminated by an exception.
Prints the physical plan to the console for debugging purposes.
2.0.0
Prints the physical plan to the console for debugging purposes.
whether to do an extended explain or not
2.0.0
Exposed for tests.
Finalizes the query progress and adds it to the list of recent status updates.
Returns the unique id of this query that persists across restarts from checkpoint data. That is, this id is generated when a query is started for the first time, and will be the same every time it is restarted from checkpoint data. Also see runId.
2.1.0
Whether the query is currently active or not
Returns the most recent query progress update or null if there were no progress updates.
The thread that runs the micro-batches of this stream. Note that this thread must be an org.apache.spark.util.UninterruptibleThread to work around KAFKA-1894 (interrupting a running KafkaConsumer may cause an endless loop) and HADOOP-10622 (interrupting Shell.runCommand causes a deadlock). (SPARK-14131)
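The idea behind deferring interrupts can be sketched in plain Scala. This is a simplified, hypothetical model of the technique, not Spark's actual org.apache.spark.util.UninterruptibleThread:

```scala
// Sketch of interrupt deferral: while a sensitive call (e.g. a KafkaConsumer
// poll or Shell.runCommand) is in flight, interrupt() is held back and
// delivered only after the protected block finishes.
abstract class DeferInterruptThread extends Thread {
  private val lock = new Object
  private var inCritical = false
  private var pending = false

  // Run `f` with interrupts deferred; an interrupt that arrived meanwhile
  // is delivered right after `f` returns.
  def runUninterruptibly[T](f: => T): T = {
    lock.synchronized { inCritical = true }
    try f
    finally lock.synchronized {
      inCritical = false
      if (pending) { pending = false; super.interrupt() }
    }
  }

  override def interrupt(): Unit = lock.synchronized {
    if (inCritical) pending = true // remember, deliver later
    else super.interrupt()
  }
}
```

A caller that interrupts the thread during a protected block sees the interrupt take effect only once the block completes, so the sensitive call is never interrupted mid-flight.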
Returns the user-specified name of the query, or null if not specified. This name can be specified in the org.apache.spark.sql.streaming.DataStreamWriter as dataframe.writeStream.queryName("query").start(). This name, if set, must be unique across all active queries.
2.0.0
Holds the most recent input data for each source.
A write-ahead-log that records the offsets that are present in each batch. In order to ensure that a given batch will always consist of the same data, we write to this log *before* any processing is done. Thus, the Nth record in this log indicates data that is currently being processed and the N-1th entry indicates which offsets have been durably committed to the sink.
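The "log before process" ordering above can be sketched with an in-memory buffer. A real log would persist entries durably; the class and method names here are illustrative:

```scala
import scala.collection.mutable.ArrayBuffer

// Sketch of an offset write-ahead log: offsets for a batch are recorded
// BEFORE the batch is processed, so a restart can replay exactly the same
// data. The latest (Nth) entry is the batch in flight; the N-1th entry is
// already durably committed to the sink.
class OffsetLogSketch {
  private val entries = ArrayBuffer.empty[(Long, Map[String, Long])]

  // Write planned offsets for `batchId` before any processing happens.
  def add(batchId: Long, offsets: Map[String, Long]): Unit =
    entries += ((batchId, offsets))

  // Nth (latest) entry: the batch currently being processed.
  def latest: Option[(Long, Map[String, Long])] = entries.lastOption

  // N-1th entry: offsets that have been durably committed to the sink.
  def committedThrough: Option[(Long, Map[String, Long])] =
    if (entries.size >= 2) Some(entries(entries.size - 2)) else None
}
```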
Metadata associated with the offset seq of a batch in the query.
Blocks until all available data in the source has been processed and committed to the sink. This method is intended for testing. Note that in the case of continually arriving data, this method may block forever. Additionally, this method is only guaranteed to block until data that has been synchronously appended to a org.apache.spark.sql.execution.streaming.Source prior to invocation (i.e., getOffset must immediately reflect the addition).
2.0.0
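The blocking behavior above — wait until whatever was available at call time is committed, without chasing newly arriving data — can be sketched with a monitor. A hypothetical model, not Spark's implementation:

```scala
// Sketch of processAllAvailable-style blocking: the target offset is captured
// at invocation time, so data arriving afterwards does not extend the wait.
class ProgressGate {
  private val lock = new Object
  private var available = 0L
  private var committed = 0L

  // Record that data up to `offset` is available to process.
  def addData(offset: Long): Unit = lock.synchronized { available = offset }

  // Record that data up to `offset` has been committed to the sink.
  def commitUpTo(offset: Long): Unit = lock.synchronized {
    committed = offset
    lock.notifyAll()
  }

  // Blocks until everything available *at invocation time* is committed.
  def processAllAvailable(): Unit = lock.synchronized {
    val target = available // snapshot: later arrivals are not waited for
    while (committed < target) lock.wait()
  }
}
```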
Returns an array containing the most recent query progress updates.
Records the duration of running body for the next query progress update.
Returns the unique id of this run of the query. That is, every start/restart of a query will generate a unique runId. Therefore, every time a query is restarted from checkpoint, it will have the same id but different runIds.
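The id/runId distinction above can be sketched in a few lines: the id is read from (or first written to) the checkpoint, while the runId is freshly generated on every start. Names here are illustrative, not Spark's internals:

```scala
import java.util.UUID
import scala.collection.mutable

// Sketch: `id` survives restarts via the checkpoint; `runId` is new per run.
object QueryIdSketch {
  def startQuery(checkpoint: mutable.Map[String, UUID]): (UUID, UUID) = {
    // Stable across restarts: created on first start, then reloaded.
    val id = checkpoint.getOrElseUpdate("id", UUID.randomUUID())
    // Fresh on every start/restart.
    val runId = UUID.randomUUID()
    (id, runId)
  }
}
```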
All stream sources present in the query plan. This will be set when generating the logical plan.
Returns the SparkSession associated with this query.
2.0.0
Starts the execution. This returns only after the thread has started and QueryStartedEvent has been posted to all the listeners.
Begins recording statistics about query progress for a given trigger.
Returns the current status of the query.
Signals to the thread executing micro-batches that it should stop running after the next batch. This method blocks until the thread stops running.
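The signal-then-block handshake above can be sketched with a volatile flag and a join. This is a simplified model (Spark's stop also handles interruption), with illustrative names:

```scala
// Sketch of the stop() handshake: set a flag the batch loop checks between
// batches, then block until the thread has actually exited.
class StoppableLoop extends Thread {
  @volatile private var shouldStop = false
  @volatile var batchesRun = 0 // written only by the loop thread

  override def run(): Unit =
    while (!shouldStop) {   // checked between batches, never mid-batch
      batchesRun += 1       // stand-in for running one micro-batch
      Thread.sleep(5)
    }

  def stopAndWait(): Unit = {
    shouldStop = true       // signal: stop after the current batch
    join()                  // block until the thread stops running
  }
}
```

Because the flag is only checked between iterations, a batch in progress always completes before the thread exits, matching the "after the next batch" wording above.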
Metadata associated with the whole query
Used to report metrics to coda-hale. This uses id for easier tracking across restarts.
Updates the message returned in status.
Manages the execution of a streaming Spark SQL query that is occurring in a separate thread. Unlike a standard query, a streaming query executes repeatedly each time new data arrives at any Source present in the query plan. Whenever new data arrives, a QueryExecution is created and the results are committed transactionally to the given Sink.