DictionaryOptimizedMapAccessor

Uses dictionary indexes for string columns when they are available. The optimization depends only on a dictionary being present for each batch of rows, and it helps only when the batch is substantially larger than its dictionary.

For single-column hash maps (groups or joins), the map can be replaced by a flat indexed array. Create an array of the class objects stored in ObjectHashSet, with the same length as the dictionary, so that a dictionary index can be used to look up the array directly. On the first lookup of a given dictionary index, look up the actual ObjectHashSet by key to find the map entry object and store it in the array. An alternative would be to pre-populate the array with one pass through the dictionary, but that may be inefficient if many dictionary entries are filtered out by query predicates and never need to consult the array.
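The lazy array population described above can be sketched roughly as follows. This is a minimal illustration, not the generated code itself: `GroupEntry`, `DictionaryArrayCache` and the `fullMapLookups` counter are hypothetical names, and a plain `mutable.HashMap` stands in for ObjectHashSet.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the entry objects stored in ObjectHashSet.
final class GroupEntry(val key: String) { var count: Long = 0L }

// Hypothetical per-batch cache: one slot per dictionary entry.
final class DictionaryArrayCache(
    dictionary: Array[String],
    fullMap: mutable.HashMap[String, GroupEntry]) {

  // null means "this dictionary index has not been looked up yet in this batch"
  private val entries = new Array[GroupEntry](dictionary.length)

  // counts how often the string-keyed map had to be consulted
  var fullMapLookups = 0

  def getOrInsert(dictIndex: Int): GroupEntry = {
    var entry = entries(dictIndex)
    if (entry eq null) {
      // first use of this index: fall back to the full map, then remember the entry
      fullMapLookups += 1
      val key = dictionary(dictIndex)
      entry = fullMap.getOrElseUpdate(key, new GroupEntry(key))
      entries(dictIndex) = entry
    }
    entry
  }
}
```

With this shape, a batch of many rows over a small dictionary touches the string-keyed map at most once per distinct dictionary index, and dictionary entries that never pass the query predicates cost nothing.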

For multi-column hash maps with one or more dictionary-indexed columns, slightly more work is needed. Instead of an array as in the single-column case, create a new hash map in which the values of the dictionary-encoded key columns are replaced by their dictionary index values. The map entry remains identical to the one in the original map, so to save space the additional index column is added to the full map itself. When a new value is inserted into this hash map, look up the full hash map to locate its map entry, then point to the same entry from the new hash map. Subsequent lookups can then be served entirely from the new hash map using integer dictionary indexes instead of strings.
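A rough sketch of the multi-column idea, under simplifying assumptions: the key has one dictionary-encoded string column and one long column, tuple keys in `mutable.HashMap` stand in for the real map layout, and `JoinEntry`, `DictionaryIndexedMap` and `fullMapLookups` are hypothetical names introduced only for illustration.

```scala
import scala.collection.mutable

// Hypothetical entry shared by both maps.
final class JoinEntry(val name: String, val id: Long) { var count: Long = 0L }

// Hypothetical per-batch map keyed by (dictionary index, other key column).
final class DictionaryIndexedMap(
    dictionary: Array[String],
    fullMap: mutable.HashMap[(String, Long), JoinEntry]) {

  // keys carry the integer dictionary index in place of the string value
  private val indexMap = mutable.HashMap.empty[(Int, Long), JoinEntry]

  // counts how often the string-keyed map had to be consulted
  var fullMapLookups = 0

  def getOrInsert(dictIndex: Int, id: Long): JoinEntry =
    indexMap.getOrElseUpdate((dictIndex, id), {
      // first time this (index, id) pair is seen: resolve through the full map
      // and point the index-keyed map at the very same entry object
      fullMapLookups += 1
      val name = dictionary(dictIndex)
      fullMap.getOrElseUpdate((name, id), new JoinEntry(name, id))
    })
}
```

Both maps reference the same entry objects, so updates made through the index-keyed map are visible in the full map without copying.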

An alternative approach is to store only the hash codes, in a separate array for each dictionary column, indexed identically to the dictionary. These hash codes are used to look up the main map, which also has additional columns for dictionary indexes that are cleared at the start of each new batch. On the first lookup of a key whose dictionary indexes are missing in the map, insert the dictionary index into those additional columns. Later lookups then use those indexes for equality comparisons instead of comparing strings.
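The hash-code-array alternative might look roughly like the sketch below for a single dictionary column. Everything here is illustrative: `Entry`, `HashCachingMap`, `startBatch` and `stringComparisons` are hypothetical names, the fixed-size chained table is only a toy replacement for the real map, and the sketch assumes the values within one dictionary are distinct.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical entry: the string key plus an extra "dictionary index" column.
final class Entry(val key: String, val hash: Int) {
  var dictIndex: Int = -1 // -1 means "not yet known for the current batch"
  var count: Long = 0L
}

final class HashCachingMap(numBuckets: Int = 16) {
  private val buckets = Array.fill(numBuckets)(ArrayBuffer.empty[Entry])
  private var dictionary: Array[String] = Array.empty
  private var hashes: Array[Int] = Array.empty

  // counts full string equality checks
  var stringComparisons = 0

  def startBatch(dict: Array[String]): Unit = {
    dictionary = dict
    hashes = dict.map(_.hashCode) // hash codes indexed identically to the dictionary
    // clear the dictionary index column of every entry for the new batch
    for (bucket <- buckets; e <- bucket) e.dictIndex = -1
  }

  def getOrInsert(dictIndex: Int): Entry = {
    val hash = hashes(dictIndex)
    val bucket = buckets((hash & Int.MaxValue) % numBuckets)
    var i = 0
    while (i < bucket.length) {
      val e = bucket(i)
      if (e.dictIndex == dictIndex) return e // integer comparison only
      if (e.dictIndex == -1 && e.hash == hash) {
        stringComparisons += 1
        if (e.key == dictionary(dictIndex)) {
          e.dictIndex = dictIndex // remember the index for later lookups
          return e
        }
      }
      i += 1
    }
    val e = new Entry(dictionary(dictIndex), hash)
    e.dictIndex = dictIndex
    bucket += e
    e
  }
}
```

In this sketch a string comparison happens at most once per existing entry per batch; every later lookup of the same dictionary index is resolved by comparing integers.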

The multi-column dictionary optimization is useful only for string dictionary types, where the cost of looking up a string in a hash map is substantially higher than that of an integer lookup. The single-column optimization can improve performance for other dictionary types too, but for integer/long types its benefit is reduced to avoiding the hash code calculation. In that case the additional overhead of maintaining the array may not be worth the effort, and could even reduce overall performance in some cases, so this optimization is currently applied only to the string type.

Linear Supertypes

AnyRef, Any

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def canHaveSingleKeyCase(keyExpressions: Seq[Expression]): Boolean
def checkSingleKeyCase(keyExpressions: Seq[Expression], keyVars: ⇒ Seq[ExprCode], ctx: CodegenContext, session: SnappySession): Option[DictionaryCode]
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( ... )
def dictionaryArrayGetOrInsert(ctx: CodegenContext, keyExpr: Seq[Expression], keyVar: ExprCode, keyDictVar: DictionaryCode, arrayVar: String, resultVar: String, valueInit: String, continueOnNull: Boolean, accessor: ObjectHashMapAccessor): String
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
def hashCode(): Int

Definition Classes
AnyRef → Any
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
final def notifyAll(): Unit

Definition Classes
AnyRef
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )

Related Doc: package execution

object DictionaryOptimizedMapAccessor
