Get the partitioning columns for the table, if any.
Returns the list of Expressions that this datasource may not be able to handle. By default, this function will return all filters, as it is always safe to double evaluate an Expression.
Alters the table schema by adding or dropping a provided column. The schema of this instance must reflect the updated one after the alter.
Table identifier
True if the column is to be added; false if it is to be dropped
Column to be added or dropped
Any additional clauses accepted by the underlying table storage, such as a DEFAULT value or column constraints
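The alter contract above can be sketched with a minimal in-memory mock; the names here (MockMutableRelation, StructFieldLike, alterTable) are illustrative stand-ins, not the actual API:

```scala
// Hypothetical sketch of an alterTable that adds or drops a column and keeps
// this instance's schema in sync, per the contract described above.
case class StructFieldLike(name: String, dataType: String)

class MockMutableRelation(initial: Seq[StructFieldLike]) {
  private var fields: Seq[StructFieldLike] = initial

  def schema: Seq[StructFieldLike] = fields

  /** Adds the column when isAddColumn is true, else drops it by name.
    * `extensions` stands in for additional clauses (DEFAULT, constraints). */
  def alterTable(isAddColumn: Boolean, column: StructFieldLike,
                 extensions: String = ""): Unit = {
    fields =
      if (isAddColumn) fields :+ column
      else fields.filterNot(_.name == column.name)
  }
}

val rel = new MockMutableRelation(Seq(StructFieldLike("id", "int")))
rel.alterTable(isAddColumn = true, StructFieldLike("name", "string"))
println(rel.schema.map(_.name).mkString(","))  // id,name
rel.alterTable(isAddColumn = false, StructFieldLike("id", "int"))
println(rel.schema.map(_.name).mkString(","))  // name
```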
Create an index on a table.
Index Identifier which goes in the catalog
Table identifier on which the index is created.
Columns on which the index is to be created, with the direction of sorting for each. Direction can be specified as None.
Options for the index, e.g. for a column table index: ("COLOCATE_WITH" -> "CUSTOMER"); for a row table index: ("INDEX_TYPE" -> "GLOBAL HASH") or ("INDEX_TYPE" -> "UNIQUE")
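One plausible way the index columns and options map could be rendered into a CREATE INDEX statement; the rendering below is an illustrative sketch, not the store's actual SQL dialect:

```scala
// Hypothetical sketch: assemble columns (with optional sort direction) and the
// options map into a CREATE INDEX statement. Option keys mirror the examples
// above ("INDEX_TYPE", "COLOCATE_WITH"); the OPTIONS syntax is assumed.
def indexSql(indexName: String, tableName: String,
             columns: Seq[(String, Option[String])],
             options: Map[String, String]): String = {
  // Direction can be None, in which case the store default is used.
  val cols = columns.map {
    case (c, Some(dir)) => s"$c $dir"
    case (c, None) => c
  }.mkString(", ")
  val opts =
    if (options.isEmpty) ""
    else options.map { case (k, v) => s"$k '$v'" }
      .mkString(" OPTIONS (", ", ", ")")
  s"CREATE INDEX $indexName ON $tableName ($cols)$opts"
}

val sql = indexSql("app.idx1", "app.customer",
  Seq("c_id" -> Some("ASC"), "c_name" -> None),
  Map("INDEX_TYPE" -> "UNIQUE"))
println(sql)
// CREATE INDEX app.idx1 ON app.customer (c_id ASC, c_name) OPTIONS (INDEX_TYPE 'UNIQUE')
```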
Delete a set of rows matching given criteria.
SQL WHERE criteria to select rows that will be deleted
number of rows deleted
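The delete contract (select rows by criteria, return the deleted count) can be sketched in memory; a real implementation would push the WHERE string down to the store instead of a Scala predicate:

```scala
// Hypothetical in-memory sketch of delete-by-criteria returning the number of
// rows deleted. The predicate stands in for a SQL WHERE clause.
var rows = Vector(Map("id" -> 1), Map("id" -> 2), Map("id" -> 3))

def delete(predicate: Map[String, Int] => Boolean): Int = {
  val (gone, kept) = rows.partition(predicate)
  rows = kept
  gone.size  // contract above: return the count of deleted rows
}

val deleted = delete(r => r("id") > 1)  // stands in for "WHERE id > 1"
println(deleted)  // 2
```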
Destroy and cleanup this relation. It may include, but is not limited to, dropping the external table that this relation represents.
Drops an index on this table
Index identifier
Table identifier
Drop if exists
Execute a DML SQL and return the number of rows affected.
Get a spark plan to delete rows in the relation. The result of SparkPlan execution should be a count of the number of deleted rows.
Get a spark plan for insert. The result of SparkPlan execution should be a count of the number of inserted rows.
Get the "key" columns for the table that need to be projected out by UPDATE and DELETE operations for affecting the selected rows.
Get primary keys of the row table
Get a spark plan to update rows in the relation. The result of SparkPlan execution should be a count of the number of updated rows.
Insert a sequence of rows into the table represented by this relation.
the rows to be inserted
number of rows inserted
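A minimal sketch of the insert contract, assuming only that the method appends the given rows and returns how many were inserted; the buffer stands in for the external table:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical sketch: insert returns the number of rows inserted, mirroring
// the contract described above.
val table = ArrayBuffer.empty[Seq[Any]]

def insert(newRows: Seq[Seq[Any]]): Int = {
  table ++= newRows
  newRows.size
}

val count = insert(Seq(Seq(1, "a"), Seq(2, "b")))
println(count)  // 2
```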
Whether this relation needs to convert the objects in Row to the internal representation, for example: java.lang.String to UTF8String, java.lang.Decimal to Decimal.
If needConversion is false, buildScan() should return an RDD of InternalRow.
1.4.0
The internal representation is not stable across releases and thus data sources outside of Spark SQL should leave this as true.
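The conversion that needConversion = true triggers can be illustrated with a stand-in: external Row objects hold java.lang.String, while the internal representation stores UTF-8 bytes (Spark's UTF8String). The byte array below is a mock for UTF8String, not the real class:

```scala
import java.nio.charset.StandardCharsets

// Illustrative sketch of external-to-internal conversion: strings become
// UTF-8 bytes (a stand-in for UTF8String); other values pass through.
def toInternal(row: Seq[Any]): Seq[Any] = row.map {
  case s: String => s.getBytes(StandardCharsets.UTF_8) // UTF8String stand-in
  case other => other
}

val internal = toInternal(Seq("abc", 42))
println(internal.head.asInstanceOf[Array[Byte]].length) // 3
```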
Returns an estimated size of this relation in bytes. This information is used by the planner to decide when it is safe to broadcast a relation and can be overridden by sources that know the size ahead of time. By default, the system will assume that tables are too large to broadcast. This method will be called multiple times during query planning and thus should not perform expensive operations for each invocation.
1.3.0
It is always better to overestimate size than underestimate, because underestimation could lead to execution plans that are suboptimal (i.e. broadcasting a very large table).
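A sketch of the sizeInBytes contract under the guidance above: default to a conservative large value so the planner never broadcasts an unknown-size table, and overestimate when the size is known. The names here are illustrative; the real default is derived from the session configuration:

```scala
// Stand-in for the session's default size estimate ("too large to broadcast").
val defaultSizeInBytes: Long = Long.MaxValue

trait SizedRelation {
  def sizeInBytes: Long = defaultSizeInBytes  // assume too large to broadcast
}

// A source that knows its size ahead of time overestimates with a safety
// margin rather than risk broadcasting a very large table.
class KnownSizeRelation(rowCount: Long, avgRowBytes: Long) extends SizedRelation {
  override def sizeInBytes: Long = rowCount * avgRowBytes * 2 // 2x margin
}

println(new KnownSizeRelation(1000, 100).sizeInBytes)  // 200000
```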
Name of this table as stored in catalog.
Truncate the table represented by this relation.
Returns the list of Filters that this datasource may not be able to handle. These returned Filters will be evaluated by Spark SQL after data is output by a scan. By default, this function will return all filters, as it is always safe to double evaluate a Filter. However, specific implementations can override this function to avoid double filtering when they are capable of processing a filter internally.
1.6.0
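The unhandledFilters override described above can be sketched as follows; the tiny Filter ADT is a stand-in for org.apache.spark.sql.sources.Filter, and the choice of which filters the source handles is purely illustrative:

```scala
// Hypothetical sketch: the source claims equality filters (which it can
// evaluate internally, avoiding double filtering) and returns the rest for
// Spark SQL to re-evaluate after the scan.
sealed trait Filter
case class EqualTo(attribute: String, value: Any) extends Filter
case class GreaterThan(attribute: String, value: Any) extends Filter

def unhandledFilters(filters: Array[Filter]): Array[Filter] =
  filters.filterNot {
    case EqualTo(_, _) => true  // handled internally, no double evaluation
    case _ => false             // everything else is re-evaluated by Spark
  }

val remaining = unhandledFilters(Array(EqualTo("id", 1), GreaterThan("age", 30)))
println(remaining.mkString(","))  // GreaterThan(age,30)
```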
Update a set of rows matching given criteria.
SQL WHERE criteria to select rows that will be updated
updated values for the columns being changed; must match updateColumns
the columns to be updated; must match updatedColumns
number of rows affected
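The positional pairing the update parameters require (the column-name list and the new-value list must match) can be sketched in memory; the predicate stands in for the SQL WHERE criteria and all names are illustrative:

```scala
// Hypothetical sketch of the update contract: updateColumns names the columns
// being changed and newColumnValues supplies their values in the same order,
// so the two sequences are zipped positionally. Returns rows affected.
var table = Vector(Map("id" -> 1, "qty" -> 5), Map("id" -> 2, "qty" -> 7))

def update(predicate: Map[String, Int] => Boolean,
           newColumnValues: Seq[Int], updateColumns: Seq[String]): Int = {
  require(newColumnValues.length == updateColumns.length,
    "values must match updateColumns positionally")
  val assignments = updateColumns.zip(newColumnValues)
  var affected = 0
  table = table.map { row =>
    if (predicate(row)) { affected += 1; row ++ assignments } else row
  }
  affected
}

val n = update(r => r("id") == 2, Seq(9), Seq("qty"))  // WHERE id = 2 SET qty = 9
println(n)  // 1
```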
If required, inject the key columns into the original relation.
A LogicalPlan implementation for an external row table whose contents are retrieved using a JDBC URL or DataSource.