JoinSelection

Selects the proper physical plan for a join based on the joining keys and the size of the logical plan.

First, the ExtractEquiJoinKeys pattern is used to find joins where at least some of the predicates can be evaluated by matching join keys. If such a join is found, Join implementations are chosen with the following precedence:

- Broadcast: if one side of the join has an estimated physical size that is smaller than the user-configurable SQLConf.AUTO_BROADCASTJOIN_THRESHOLD threshold or if that side has an explicit broadcast hint (e.g. the user applied the org.apache.spark.sql.functions.broadcast() function to a DataFrame), then that side of the join will be broadcasted and the other side will be streamed, with no shuffling performed. If both sides of the join are eligible to be broadcasted then the
- Shuffle hash join: if the average size of a single partition is small enough to build a hash table.
- Sort merge: if the matching join keys are sortable.
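The precedence above can be sketched as a small decision function. This is a simplified model in plain Scala, not Spark's implementation: the types `Side` and `Settings`, the `maxHashTablePartitionBytes` limit, and the way the average partition size is estimated are all assumptions made for illustration, and join-type restrictions are ignored. The text does not say what happens when none of the three conditions holds, so that case is modeled as `None`.

```scala
// Hypothetical model of the equi-join precedence; none of these names are Spark APIs.
object EquiJoinPrecedenceSketch {
  sealed trait JoinImpl
  case object BroadcastHashJoin extends JoinImpl
  case object ShuffledHashJoin extends JoinImpl
  case object SortMergeJoin extends JoinImpl

  // Assumed summary of one side of the join.
  final case class Side(estimatedSizeInBytes: Long, hasBroadcastHint: Boolean = false)

  // Assumed configuration values; only the broadcast threshold is named in the text.
  final case class Settings(
      autoBroadcastThresholdBytes: Long,
      maxHashTablePartitionBytes: Long,
      numShufflePartitions: Int)

  // A side is broadcastable if it is hinted or smaller than the threshold.
  def canBroadcast(side: Side, s: Settings): Boolean =
    side.hasBroadcastHint || side.estimatedSizeInBytes < s.autoBroadcastThresholdBytes

  // Assumption: average partition size is total size divided by partition count.
  def fitsInPartitionHashTable(side: Side, s: Settings): Boolean =
    side.estimatedSizeInBytes / s.numShufflePartitions < s.maxHashTablePartitionBytes

  // Checks the three conditions in the documented order of precedence.
  def choose(left: Side, right: Side, keysSortable: Boolean, s: Settings): Option[JoinImpl] =
    if (canBroadcast(left, s) || canBroadcast(right, s)) Some(BroadcastHashJoin)
    else if (fitsInPartitionHashTable(left, s) || fitsInPartitionHashTable(right, s))
      Some(ShuffledHashJoin)
    else if (keysSortable) Some(SortMergeJoin)
    else None
}
```

Because the checks run in order, a side with an explicit broadcast hint selects the broadcast branch in this model regardless of its estimated size.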

If there are no joining keys, Join implementations are chosen with the following precedence:
- BroadcastNestedLoopJoin: if one side of the join could be broadcasted
- CartesianProduct: for Inner join
- BroadcastNestedLoopJoin
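The fallback order for joins without joining keys can be modeled the same way. The sketch below is plain Scala with hypothetical types (`JoinType`, `JoinImpl`) that only mirror the names used in the text; it is not Spark's code, and whether a side is broadcastable is passed in as a flag rather than computed.

```scala
// Hypothetical model of the precedence for joins without joining keys.
object NonEquiJoinPrecedenceSketch {
  sealed trait JoinImpl
  case object BroadcastNestedLoopJoin extends JoinImpl
  case object CartesianProduct extends JoinImpl

  // Only the join types needed for the illustration.
  sealed trait JoinType
  case object Inner extends JoinType
  case object LeftOuter extends JoinType

  def choose(
      joinType: JoinType,
      leftBroadcastable: Boolean,
      rightBroadcastable: Boolean): JoinImpl =
    // 1. Prefer a nested loop join when one side could be broadcasted.
    if (leftBroadcastable || rightBroadcastable) BroadcastNestedLoopJoin
    // 2. Otherwise an inner join becomes a cartesian product.
    else if (joinType == Inner) CartesianProduct
    // 3. Everything else falls back to a nested loop join.
    else BroadcastNestedLoopJoin
}
```

In this model the last branch acts as the catch-all, which is why BroadcastNestedLoopJoin appears twice in the list.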

Linear Supertypes

PredicateHelper, SparkStrategy, GenericStrategy[SparkPlan], internal.Logging, AnyRef, Any

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def apply(plan: LogicalPlan): Seq[SparkPlan]

Definition Classes
JoinSelection → GenericStrategy
final def asInstanceOf[T0]: T0

Definition Classes
Any
def canEvaluate(expr: Expression, plan: LogicalPlan): Boolean

Returns true if expr can be evaluated using only the output of plan. This method can be used to determine when it is acceptable to move expression evaluation within a query plan.
For example consider a join between two relations R(a, b) and S(c, d).
- canEvaluate(EqualTo(a,b), R) returns true
- canEvaluate(EqualTo(a,c), R) returns false
- canEvaluate(Literal(1), R) returns true as literals CAN be evaluated on any plan
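One way to read these examples is as a subset check: an expression can be evaluated on a plan when every attribute it references appears in that plan's output. The sketch below models this with a tiny, hypothetical expression tree in plain Scala (`Expr`, `Attr`, `Lit`, and `Relation` are illustrative stand-ins, not the Catalyst classes), so it shows the idea rather than the actual implementation.

```scala
// Hypothetical stand-ins for expressions and plans; not the Catalyst classes.
object CanEvaluateSketch {
  sealed trait Expr
  final case class Attr(name: String) extends Expr
  final case class Lit(value: Any) extends Expr
  final case class EqualTo(left: Expr, right: Expr) extends Expr

  // A relation is reduced to its name and the attribute names it outputs.
  final case class Relation(name: String, output: Set[String])

  // Collects the attribute names an expression refers to; literals refer to none.
  def references(e: Expr): Set[String] = e match {
    case Attr(n)       => Set(n)
    case Lit(_)        => Set.empty
    case EqualTo(l, r) => references(l) ++ references(r)
  }

  // True when every referenced attribute is produced by the plan.
  def canEvaluate(e: Expr, plan: Relation): Boolean =
    references(e).subsetOf(plan.output)
}
```

With `R` modeled as `Relation("R", Set("a", "b"))`, the three cases above give true, false, and true respectively, since a literal references no attributes and the empty set is a subset of any output.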

Attributes
protected
Definition Classes
PredicateHelper
def canEvaluateWithinJoin(expr: Expression): Boolean

Returns true iff expr could be evaluated as a condition within join.

Attributes
protected
Definition Classes
PredicateHelper
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean = false): Boolean

Attributes
protected
Definition Classes
Logging
def initializeLogIfNecessary(isInterpreter: Boolean): Unit

Attributes
protected
Definition Classes
Logging
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def isTraceEnabled(): Boolean

Attributes
protected
Definition Classes
Logging
def log: Logger

Attributes
protected
Definition Classes
Logging
def logDebug(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logDebug(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logError(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logError(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logInfo(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logInfo(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logName: String

Attributes
protected
Definition Classes
Logging
def logTrace(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logTrace(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
def logWarning(msg: ⇒ String, throwable: Throwable): Unit

Attributes
protected
Definition Classes
Logging
def logWarning(msg: ⇒ String): Unit

Attributes
protected
Definition Classes
Logging
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
def planLater(plan: LogicalPlan): SparkPlan

Returns a placeholder for a physical plan that executes plan. This placeholder will be filled in automatically by the QueryPlanner using the other execution strategies that are available.

Attributes
protected
Definition Classes
SparkStrategy → GenericStrategy
def replaceAlias(condition: Expression, aliases: AttributeMap[Expression]): Expression

Attributes
protected
Definition Classes
PredicateHelper
def splitConjunctivePredicates(condition: Expression): Seq[Expression]

Attributes
protected
Definition Classes
PredicateHelper
def splitDisjunctivePredicates(condition: Expression): Seq[Expression]

Attributes
protected
Definition Classes
PredicateHelper
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Doc: package SparkStrategies

object JoinSelection extends Strategy with PredicateHelper
