impl

Type Members

abstract class BaseColumnFormatRelation extends JDBCAppendableRelation with PartitionedDataSourceScan with RowInsertableRelation with MutableRelation

This class acts as a DataSource provider for column format tables provided Snappy.
This class acts as a DataSource provider for column format tables provided Snappy. It uses GemFireXD as actual datastore to physically locate the tables. Column tables can be used for storing data in columnar compressed format. A example usage is given below.
val data = Seq(Data(1, 2, 3), Data(7, 8, 9), Data(9, 2, 3), Data(4, 2, 3), Data(5, 6, 7)) val dataDF = snc.createDataset(data)(Encoders.product) snc.createTable(tableName, "column", dataDF.schema, props) dataDF.write.insertInto(tableName)
This provider scans underlying tables in parallel and is aware of the data partition. It does not introduces a shuffle if simple table query is fired. One can insert a single or multiple rows into this table as well as do a bulk insert by a Spark DataFrame. Bulk insert example is shown above.
abstract class ClusteredColumnIterator extends CloseableIterator[RegionEntry]

Base trait for iterators that are capable of reading and returning the entire set of columns of a column batch.
Base trait for iterators that are capable of reading and returning the entire set of columns of a column batch. These can be local region iterators or those fetching entries from remote nodes.
final class ColumnDelta extends ColumnFormatValue with Delta

Encapsulates a delta for update to be applied to column table and also is stored in the region.
Encapsulates a delta for update to be applied to column table and also is stored in the region. The key for a delta is a negative columnIndex evaluated as (ColumnFormatEntry.DELTA_STATROW_COL_INDEX - 1 + MAX_DEPTH * -columnIndex) where columnIndex is the 0-based index of the underlying table column.
Note that this delta is for carrying the delta update and applying on existing delta, if any, while the actual value that is stored in the region is a ColumnFormatValue. This is to ensure clean working of the delta mechanism where store-layer code checks the type of object for Delta and makes assumptions about it (like it being a temporary value that should not go into region etc).
For a description of column delta format see the class comments in org.apache.spark.sql.execution.columnar.encoding.ColumnDeltaEncoder.
final class ColumnFormatEncoder extends RowEncoder with Logging

A RowEncoder implementation for ColumnFormatValue and child classes.
final class ColumnFormatIterator extends ClusteredColumnIterator with DiskRegionIterator

A customized iterator for column store tables that projects out the required columns and returns those column batches first that have all their columns in the memory.
A customized iterator for column store tables that projects out the required columns and returns those column batches first that have all their columns in the memory. Further this will make use of DiskBlockSortManager to allow for concurrent partition iterators to do cross-partition disk block sorting and fault-in for best disk read performance (SNAP-2012).
final class ColumnFormatKey extends GfxdDataSerializable with ColumnBatchKey with RegionKey with Serializable

Key object in the column store.
class ColumnFormatRelation extends BaseColumnFormatRelation with BulkPutRelation
class ColumnFormatValue extends SerializedDiskBuffer with GfxdSerializable with Sizeable

Value object in the column store simply encapsulates binary data as a ByteBuffer.
Value object in the column store simply encapsulates binary data as a ByteBuffer. This can be either a direct buffer or a heap buffer depending on the system off-heap configuration. The reason for a separate type is to easily store data off-heap without any major changes to engine otherwise as well as efficiently serialize/deserialize them directly to Oplog/socket.
This class extends SerializedDiskBuffer to avoid a copy when reading/writing from Oplog. Consequently it writes the serialization header itself (typeID + classID + size) into stream as would be written by DataSerializer.writeObject. This helps it avoid additional byte writes when transferring data to the channels.
final class ColumnPartitionResolver extends InternalPartitionResolver[ColumnFormatKey, ColumnFormatValue]

Partition resolver for the column store.
final class ColumnarStorePartitionedRDD extends RDDKryo[Any] with KryoSerializable
case class CompactionResult(batchKey: ColumnFormatKey, bucketId: Int, success: Boolean) extends Product with Serializable

Result of compaction of a column batch added to transaction pre-commit results.
Result of compaction of a column batch added to transaction pre-commit results.
NOTE: if the layout of this class or ColumnFormatKey changes, then update the regex pattern in SnapshotConnectionListener.parseCompactionResult that parses the toString() of this class
final class DefaultSource extends ExternalSchemaRelationProvider with SchemaRelationProvider with CreatableRelationProvider with DataSourceRegister with Logging

Column tables don't support any extensions over regular Spark schema syntax, but the support for ExternalSchemaRelationProvider has been added as a workaround to allow for specifying schema in a CREATE TABLE AS SELECT statement.
Column tables don't support any extensions over regular Spark schema syntax, but the support for ExternalSchemaRelationProvider has been added as a workaround to allow for specifying schema in a CREATE TABLE AS SELECT statement.
Normally Spark does not allow specifying schema in a CTAS statement for DataSources (except its special "hive" provider), so schema is passed here as string which is parsed locally in the CreatableRelationProvider implementation.
class IndexColumnFormatRelation extends BaseColumnFormatRelation

Currently this is same as ColumnFormatRelation but has kept it as a separate class to allow adding of any index specific functionality in future.
class JDBCSourceAsColumnarStore extends ExternalStore with KryoSerializable

Column Store implementation for GemFireXD.
final class LongObjectHashMapWithState[V] extends Long2ObjectOpenHashMap[V]
final class RemoteEntriesIterator extends ClusteredColumnIterator with Logging

A ClusteredColumnIterator that fetches entries from a remote bucket.
A ClusteredColumnIterator that fetches entries from a remote bucket.
TODO: PERF: instead of fetching using getAll, this should instead open a named ColumnFormatIterator on the remote node hosting the bucket, then step through the iterator to fetch batch (or batches) at a time using Function/GfxdFunctionMessage invocations. As of now, the getAll invocation does not honour ordered disk reads, proper fault-in etc.
final class SingleColumnBatchIterator extends ColumnBatchIterator

Provides a ColumnBatchIterator over a single column batch for ColumnTableScan.
final class SmartConnectorColumnRDD extends RDDKryo[Any] with KryoSerializable
class SmartConnectorRowRDD extends RowFormatScanRDD
trait StatsPredicate extends AnyRef

The type of the generated class used by column stats check for a column batch.
trait StoreCallback extends Serializable

Value Members

object ColumnCompactor extends Logging

Compact column batches, if required, and insert new compacted column batches, or if they are too small then push into row delta buffer.
object ColumnDelta extends Enumeration with Logging
object ColumnFormatEntry

Utility methods for column format storage keys and values.
object ColumnFormatRelation extends Logging with StoreCallback
object StatsFilter extends Predicate[AnyRef] with Serializable
object StoreCallbacksImpl extends StoreCallbacks with Logging with Serializable

package impl

Type Members

abstract class BaseColumnFormatRelation extends JDBCAppendableRelation with PartitionedDataSourceScan with RowInsertableRelation with MutableRelation

abstract class ClusteredColumnIterator extends CloseableIterator[RegionEntry]

final class ColumnDelta extends ColumnFormatValue with Delta

final class ColumnFormatEncoder extends RowEncoder with Logging

final class ColumnFormatIterator extends ClusteredColumnIterator with DiskRegionIterator

final class ColumnFormatKey extends GfxdDataSerializable with ColumnBatchKey with RegionKey with Serializable

class ColumnFormatRelation extends BaseColumnFormatRelation with BulkPutRelation

class ColumnFormatValue extends SerializedDiskBuffer with GfxdSerializable with Sizeable

final class ColumnPartitionResolver extends InternalPartitionResolver[ColumnFormatKey, ColumnFormatValue]

final class ColumnarStorePartitionedRDD extends RDDKryo[Any] with KryoSerializable

case class CompactionResult(batchKey: ColumnFormatKey, bucketId: Int, success: Boolean) extends Product with Serializable

final class DefaultSource extends ExternalSchemaRelationProvider with SchemaRelationProvider with CreatableRelationProvider with DataSourceRegister with Logging

class IndexColumnFormatRelation extends BaseColumnFormatRelation

class JDBCSourceAsColumnarStore extends ExternalStore with KryoSerializable

final class LongObjectHashMapWithState[V] extends Long2ObjectOpenHashMap[V]

final class RemoteEntriesIterator extends ClusteredColumnIterator with Logging

final class SingleColumnBatchIterator extends ColumnBatchIterator

final class SmartConnectorColumnRDD extends RDDKryo[Any] with KryoSerializable

class SmartConnectorRowRDD extends RowFormatScanRDD

trait StatsPredicate extends AnyRef

trait StoreCallback extends Serializable

Value Members

object ColumnCompactor extends Logging

object ColumnDelta extends Enumeration with Logging

object ColumnFormatEntry

object ColumnFormatRelation extends Logging with StoreCallback

object StatsFilter extends Predicate[AnyRef] with Serializable

object StoreCallbacksImpl extends StoreCallbacks with Logging with Serializable

Ungrouped