Returns the file extension to be used when writing files out.
When writing to a HadoopFsRelation, this method gets called by each task on the executor side to instantiate new OutputWriters.
Path of the file to write.
Schema of the rows to be written. Partition columns are not included in the schema if the relation being written is partitioned.
The Hadoop MapReduce task context.
Returns a new instance of OutputWriter that will write data to the given path.
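The note that partition columns are excluded from the schema can be illustrated with a small sketch. The types below (`Field`, `DataSchemaSketch`) are hypothetical stand-ins and not Spark's `StructType`. The sketch assumes the partition values are carried somewhere other than the written file, so the writer only sees the remaining data columns.

```scala
// Hypothetical simplified field type for illustration; not Spark's StructField.
final case class Field(name: String, dataType: String)

object DataSchemaSketch {
  // Drops the partition columns from the full schema, leaving only the
  // columns that a writer would actually serialize into the file.
  def dataSchema(full: Seq[Field], partitionColumns: Set[String]): Seq[Field] =
    full.filterNot(field => partitionColumns.contains(field.name))

  // Example input: a three-column relation partitioned by "year".
  val fullSchema: Seq[Field] = Seq(
    Field("id", "long"),
    Field("name", "string"),
    Field("year", "int")
  )

  // The schema a writer would receive under the assumption above.
  val writerSchema: Seq[Field] = dataSchema(fullSchema, Set("year"))
}
```

Here `DataSchemaSketch.writerSchema` contains only the `id` and `name` fields; for an unpartitioned relation the partition set is empty and the schema is passed through unchanged.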
This method gets called by each task on the executor to write InternalRows to
format-specific files. Compared to the other newInstance(), this is a newer API that
passes only the path that the writer must write to. The writer must write to the exact path
and not modify it (do not add subdirectories, extensions, etc.). All other
file-format-specific information needed to create the writer must be passed
through the OutputWriterFactory implementation.
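The exact-path requirement can be sketched as follows. `ExactPathWriter` and `ExactPathSketch` are hypothetical names used only for illustration, and the writer takes plain strings rather than InternalRows. The point is that the writer opens the path it was given as-is, so the caller can rely on the file appearing at precisely that location.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

// Hypothetical writer for illustration; not Spark's OutputWriter.
class ExactPathWriter(path: String) {
  // Opens exactly `path`: no extension is appended and no subdirectory is created.
  private val out = Files.newBufferedWriter(Paths.get(path), StandardCharsets.UTF_8)

  def write(line: String): Unit = {
    out.write(line)
    out.write("\n")
  }

  def close(): Unit = out.close()
}

object ExactPathSketch {
  // Writes one line to a caller-chosen file and returns the names of all
  // entries in `dir`, so the caller can check that nothing else was created.
  def writeAndList(dir: Path): List[String] = {
    val target = dir.resolve("part-00000.txt")
    val writer = new ExactPathWriter(target.toString)
    try writer.write("hello") finally writer.close()
    dir.toFile.list().toList
  }
}
```

Called on an empty directory, `writeAndList` should return a single entry, the file name the caller chose. A writer that appended its own extension or nested the file in a subdirectory would break that expectation.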
A factory that produces OutputWriters. A new OutputWriterFactory is created on the driver side for each write job issued when writing to a HadoopFsRelation, and is then serialized to the executor side to create the actual OutputWriters on the fly.
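The driver-to-executor lifecycle described here can be sketched without Spark. Everything below (`SketchWriter`, `SketchWriterFactory`, `DelimitedWriterFactory`, `DriverExecutorSketch`) is a hypothetical, simplified stand-in. The sketch assumes plain Java serialization as the transport and a leading dot in the extension. It shows why the factory should hold only serializable configuration, while open file handles belong to the writers it creates.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path, Paths}

// Hypothetical stand-ins for illustration only; these are not Spark's classes.
trait SketchWriter {
  def write(row: Seq[Any]): Unit
  def close(): Unit
}

// The factory holds only serializable configuration, so it can be shipped to tasks.
abstract class SketchWriterFactory extends java.io.Serializable {
  def getFileExtension: String
  def newInstance(path: String): SketchWriter
}

class DelimitedWriterFactory(delimiter: String) extends SketchWriterFactory {
  // Assumption of this sketch: the extension includes the leading dot.
  override def getFileExtension: String = ".txt"

  // The open file handle lives in the writer, never in the factory.
  override def newInstance(path: String): SketchWriter = new SketchWriter {
    private val out = Files.newBufferedWriter(Paths.get(path), StandardCharsets.UTF_8)
    override def write(row: Seq[Any]): Unit = {
      out.write(row.mkString(delimiter))
      out.write("\n")
    }
    override def close(): Unit = out.close()
  }
}

object DriverExecutorSketch {
  // Simulates shipping an object from the driver to an executor.
  def roundTrip[T <: java.io.Serializable](value: T): T = {
    val buffer = new ByteArrayOutputStream()
    val output = new ObjectOutputStream(buffer)
    try output.writeObject(value) finally output.close()
    val input = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
    try input.readObject().asInstanceOf[T] finally input.close()
  }

  // Creates the factory ("driver"), deserializes a copy ("executor"),
  // writes two rows with it, and returns the resulting file content.
  def run(dir: Path): String = {
    val driverFactory = new DelimitedWriterFactory(",")
    val executorFactory = roundTrip(driverFactory)
    val file = dir.resolve("part-00000" + executorFactory.getFileExtension)
    val writer = executorFactory.newInstance(file.toString)
    try {
      writer.write(Seq(1, "a"))
      writer.write(Seq(2, "b"))
    } finally writer.close()
    new String(Files.readAllBytes(file), StandardCharsets.UTF_8)
  }
}
```

Because the writer is created only after deserialization, nothing non-serializable (such as the buffered file handle) ever needs to cross the driver/executor boundary.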