Index
!¶
! expr - Logical not.
%¶
expr1 % expr2 - Returns the remainder after expr1
/expr2
.
&¶
expr1 & expr2 - Returns the result of bitwise AND of expr1
and expr2
.
*¶
expr1 * expr2 - Returns expr1
*expr2
.
+¶
expr1 + expr2 - Returns expr1
+expr2
.
-¶
expr1 - expr2 - Returns expr1
-expr2
.
/¶
expr1 / expr2 - Returns expr1
/expr2
. It always performs floating point division.
<¶
expr1 < expr2 - Returns true if expr1
is less than expr2
.
<=¶
expr1 <= expr2 - Returns true if expr1
is less than or equal to expr2
.
<=>¶
expr1 <=> expr2 - Returns same result as the EQUAL(=) operator for non-null operands, but returns true if both are null, false if one of the them is null.
=¶
expr1 = expr2 - Returns true if expr1
equals expr2
, or false otherwise.
==¶
expr1 == expr2 - Returns true if expr1
equals expr2
, or false otherwise.
>¶
expr1 > expr2 - Returns true if expr1
is greater than expr2
.
>=¶
expr1 >= expr2 - Returns true if expr1
is greater than or equal to expr2
.
^¶
expr1 ^ expr2 - Returns the result of bitwise exclusive OR of expr1
and expr2
.
abs¶
abs(expr) - Returns the absolute value of the numeric value.
absolute_error¶
absolute_error() - Indicates absolute error present in the estimate (approx answer) calculated using error estimation method (ClosedForm or Bootstrap). [AQP]
Examples:
> SELECT sum(ArrDelay) ArrivalDelay, absolute_error(ArrivalDelay),
Month_ from airline group by Month_ order by Month_ desc with error 0.10;
1117.6, 43.4, Jan
acos¶
acos(expr) - Returns the inverse cosine (a.k.a. arccosine) of expr
if -1<=expr
<=1 or NaN otherwise.
add_months¶
add_months(start_date, num_months) - Returns the date that is num_months
after start_date
.
and¶
expr1 and expr2 - Logical AND.
approx_count_distinct¶
approx_count_distinct(expr[, relativeSD]) - Returns the estimated cardinality by HyperLogLog++. relativeSD
defines the maximum estimation error allowed.
approx_percentile¶
approx_percentile(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column col
at the given percentage. The value of percentage must be between 0.0 and 1.0. The accuracy
parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. Higher value of accuracy
yields better accuracy, 1.0/accuracy
is the relative error of the approximation. When percentage
is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column col
at the given percentage array.
Examples:
> SELECT approx_percentile(10.0, array(0.5, 0.4, 0.1), 100);
[10.0,10.0,10.0]
> SELECT approx_percentile(10.0, 0.5, 100);
10.0
array¶
array(expr, ...) - Returns an array with the given elements.
array_contains¶
array_contains(array, value) - Returns true if the array contains the value.
ascii¶
ascii(str) - Returns the numeric value of the first character of str
.
asin¶
asin(expr) - Returns the inverse sine (a.k.a. arcsine) the arc sin of expr
if -1<=expr
<=1 or NaN otherwise.
assert_true¶
assert_true(expr) - Throws an exception if expr
is not true.
atan¶
atan(expr) - Returns the inverse tangent (a.k.a. arctangent).
atan2¶
atan2(expr1, expr2) - Returns the angle in radians between the positive x-axis of a plane and the point given by the coordinates (expr1
, expr2
).
avg¶
avg(expr) - Returns the mean calculated from values of a group.
base64¶
base64(bin) - Converts the argument from a binary bin
to a base 64 string.
bigint¶
bigint(expr AS type) - Casts the value expr
to the target data type type
.
bin¶
bin(expr) - Returns the string representation of the long value expr
represented in binary.
Examples:
> SELECT bin(13);
1101
> SELECT bin(-13);
1111111111111111111111111111111111111111111111111111111111110011
> SELECT bin(13.3);
1101
binary¶
binary(expr AS type) - Casts the value expr
to the target data type type
.
boolean¶
boolean(expr AS type) - Casts the value expr
to the target data type type
.
bround¶
bround(expr, d) - Returns expr
rounded to d
decimal places using HALF_EVEN rounding mode.
cbrt¶
cbrt(expr) - Returns the cube root of expr
.
ceil¶
ceil(expr) - Returns the smallest integer not smaller than expr
.
ceiling¶
ceiling(expr) - Returns the smallest integer not smaller than expr
.
coalesce¶
coalesce(expr1, expr2, ...) - Returns the first non-null argument if exists. Otherwise, null.
collect_list¶
collect_list(expr) - Collects and returns a list of non-unique elements.
collect_set¶
collect_set(expr) - Collects and returns a set of unique elements.
concat¶
concat(str1, str2, ..., strN) - Returns the concatenation of str1, str2, ..., strN.
concat_ws¶
concat_ws(sep, [str | array(str)]+) - Returns the concatenation of the strings separated by sep
.
conv¶
conv(num, from_base, to_base) - Convert num
from from_base
to to_base
.
corr¶
corr(expr1, expr2) - Returns Pearson coefficient of correlation between a set of number pairs.
cos¶
cos(expr) - Returns the cosine of expr
.
cosh¶
cosh(expr) - Returns the hyperbolic cosine of expr
.
count¶
count(*) - Returns the total number of retrieved rows, including rows containing null.
count(expr) - Returns the number of rows for which the supplied expression is non-null.
count(DISTINCT expr[, expr...]) - Returns the number of rows for which the supplied expression(s) are unique and non-null.
covar_pop¶
covar_pop(expr1, expr2) - Returns the population covariance of a set of number pairs.
covar_samp¶
covar_samp(expr1, expr2) - Returns the sample covariance of a set of number pairs.
crc32¶
crc32(expr) - Returns a cyclic redundancy check value of the expr
as a bigint.
cube¶
cume_dist¶
cume_dist() - Computes the position of a value relative to all values in the partition.
current_database¶
current_database() - Returns the current database.
current_date¶
current_date() - Returns the current date at the start of query evaluation.
current_schema¶
current_schema() - Returns the current database.
current_timestamp¶
current_timestamp() - Returns the current timestamp at the start of query evaluation.
current_user¶
current_user() - Returns the name of the user that owns the session executing the current SQL statement.
current_user_ldap_groups¶
current_user_ldap_groups() - Returns all the ldap groups as an ARRAY to which the user who is executing the current SQL statement belongs.
date¶
date(expr AS type) - Casts the value expr
to the target data type type
.
date_add¶
date_add(start_date, num_days) - Returns the date that is num_days
after start_date
.
date_format¶
date_format(timestamp, fmt) - Converts timestamp
to a value of string in the format specified by the date format fmt
.
date_sub¶
date_sub(start_date, num_days) - Returns the date that is num_days
before start_date
.
datediff¶
datediff(endDate, startDate) - Returns the number of days from startDate
to endDate
.
Examples:
> SELECT datediff('2009-07-31', '2009-07-30');
1
> SELECT datediff('2009-07-30', '2009-07-31');
-1
day¶
day(date) - Returns the day of month of the date/timestamp.
dayofmonth¶
dayofmonth(date) - Returns the day of month of the date/timestamp.
dayofyear¶
dayofyear(date) - Returns the day of year of the date/timestamp.
decimal¶
decimal(expr AS type) - Casts the value expr
to the target data type type
.
decode¶
decode(bin, charset) - Decodes the first argument using the second argument character set.
degrees¶
degrees(expr) - Converts radians to degrees.
dense_rank¶
dense_rank() - Computes the rank of a value in a group of values. The result is one plus the previously assigned rank value. Unlike the function rank, dense_rank will not produce gaps in the ranking sequence.
double¶
double(expr AS type) - Casts the value expr
to the target data type type
.
dsid¶
dsid() - Returns the unique distributed member ID of executor fetching current row.
e¶
e() - Returns Euler's number, e.
elt¶
elt(n, str1, str2, ...) - Returns the n
-th string, e.g., returns str2
when n
is 2.
encode¶
encode(str, charset) - Encodes the first argument using the second argument character set.
exp¶
exp(expr) - Returns e to the power of expr
.
explode¶
explode(expr) - Separates the elements of array expr
into multiple rows, or the elements of map expr
into multiple rows and columns.
expm1¶
expm1(expr) - Returns exp(expr
) - 1.
factorial¶
factorial(expr) - Returns the factorial of expr
. expr
is [0..20]. Otherwise, null.
find_in_set¶
find_in_set(str, str_array) - Returns the index (1-based) of the given string (str
) in the comma-delimited list (str_array
). Returns 0, if the string was not found or if the given string (str
) contains a comma.
first¶
first(expr[, isIgnoreNull]) - Returns the first value of expr
for a group of rows. If isIgnoreNull
is true, returns only non-null values.
first_value¶
first_value(expr[, isIgnoreNull]) - Returns the first value of expr
for a group of rows. If isIgnoreNull
is true, returns only non-null values.
float¶
float(expr AS type) - Casts the value expr
to the target data type type
.
floor¶
floor(expr) - Returns the largest integer not greater than expr
.
format_number¶
format_number(expr1, expr2) - Formats the number expr1
like '#,###,###.##', rounded to expr2
decimal places. If expr2
is 0, the result has no decimal point or fractional part. This is supposed to function like MySQL's FORMAT.
format_string¶
format_string(strfmt, obj, ...) - Returns a formatted string from printf-style format strings.
from_unixtime¶
from_unixtime(unix_time, format) - Returns unix_time
in the specified format
.
from_utc_timestamp¶
from_utc_timestamp(timestamp, timezone) - Given a timestamp, which corresponds to a certain time of day in UTC, returns another timestamp that corresponds to the same time of day in the given timezone.
get_json_object¶
get_json_object(json_txt, path) - Extracts a json object from path
.
greatest¶
greatest(expr, ...) - Returns the greatest value of all parameters, skipping null values.
grouping¶
grouping_id¶
hash¶
hash(expr1, expr2, ...) - Returns a hash value of the arguments.
hex¶
hex(expr) - Converts expr
to hexadecimal.
hour¶
hour(timestamp) - Returns the hour component of the string/timestamp.
hypot¶
hypot(expr1, expr2) - Returns sqrt(expr1
2 + expr2
2).
if¶
if(expr1, expr2, expr3) - If expr1
evaluates to true, then returns expr2
; otherwise returns expr3
.
ifnull¶
ifnull(expr1, expr2) - Returns expr2
if expr1
is null, or expr1
otherwise.
in¶
expr1 in(expr2, expr3, ...) - Returns true if expr
equals to any valN.
initcap¶
initcap(str) - Returns str
with the first letter of each word in uppercase. All other letters are in lowercase. Words are delimited by white space.
inline¶
inline(expr) - Explodes an array of structs into a table.
input_file_name¶
input_file_name() - Returns the name of the current file being read if available.
instr¶
instr(str, substr) - Returns the (1-based) index of the first occurrence of substr
in str
.
int¶
int(expr AS type) - Casts the value expr
to the target data type type
.
isnan¶
isnan(expr) - Returns true if expr
is NaN, or false otherwise.
isnotnull¶
isnotnull(expr) - Returns true if expr
is not null, or false otherwise.
isnull¶
isnull(expr) - Returns true if expr
is null, or false otherwise.
java_method¶
java_method(class, method[, arg1[, arg2 ..]]) - Calls a method with reflection.
Examples:
> SELECT java_method('java.util.UUID', 'randomUUID');
c33fb387-8500-4bfa-81d2-6e0e3e930df2
> SELECT java_method('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2');
a5cf6c42-0c85-418f-af6c-3e4e5b1328f2
json_tuple¶
json_tuple(jsonStr, p1, p2, ..., pn) - Return a tuple like the function get_json_object, but it takes multiple names. All the input parameters and output column types are string.
kurtosis¶
kurtosis(expr) - Returns the kurtosis value calculated from values of a group.
lag¶
lag(input[, offset[, default]]) - Returns the value of input
at the offset
th row before the current row in the window. The default value of offset
is 1 and the default value of default
is null. If the value of input
at the offset
th row is null, null is returned. If there is no such offset row (e.g., when the offset is 1, the first row of the window does not have any previous row), default
is returned.
last¶
last(expr[, isIgnoreNull]) - Returns the last value of expr
for a group of rows. If isIgnoreNull
is true, returns only non-null values.
last_day¶
last_day(date) - Returns the last day of the month which the date belongs to.
last_value¶
last_value(expr[, isIgnoreNull]) - Returns the last value of expr
for a group of rows. If isIgnoreNull
is true, returns only non-null values.
lcase¶
lcase(str) - Returns str
with all characters changed to lowercase.
lead¶
lead(input[, offset[, default]]) - Returns the value of input
at the offset
th row after the current row in the window. The default value of offset
is 1 and the default value of default
is null. If the value of input
at the offset
th row is null, null is returned. If there is no such an offset row (e.g., when the offset is 1, the last row of the window does not have any subsequent row), default
is returned.
least¶
least(expr, ...) - Returns the least value of all parameters, skipping null values.
length¶
length(expr) - Returns the length of expr
or number of bytes in binary data.
levenshtein¶
levenshtein(str1, str2) - Returns the Levenshtein distance between the two given strings.
like¶
str like pattern - Returns true if str matches pattern, null if any arguments are null, false otherwise.
Arguments:
str - a string expression
pattern - a string expression. The pattern is a string which is matched literally, with
exception to the following special symbols:
_ matches any one character in the input (similar to . in posix regular expressions)
% matches zero or more characters in the input (similar to .* in posix regular
expressions)
The escape character is '\'. If an escape character precedes a special symbol or another
escape character, the following character is matched literally. It is invalid to escape
any other character.
Examples:
> SELECT '%SystemDrive%\Users\John' like '\%SystemDrive\%\\Users%'
true
See also:
Use RLIKE to match with standard regular expressions.
ln¶
ln(expr) - Returns the natural logarithm (base e) of expr
.
locate¶
locate(substr, str[, pos]) - Returns the position of the first occurrence of substr
in str
after position pos
. The given pos
and return value are 1-based.
log¶
log(base, expr) - Returns the logarithm of expr
with base
.
log10¶
log10(expr) - Returns the logarithm of expr
with base 10.
log1p¶
log1p(expr) - Returns log(1 + expr
).
log2¶
log2(expr) - Returns the logarithm of expr
with base 2.
lower¶
lower(str) - Returns str
with all characters changed to lowercase.
lower_bound¶
lower_bound() - Lower value of an estimate interval for a given confidence.calculated using error estimation method (ClosedForm or Bootstrap). [AQP]
Examples:
> SELECT sum(ArrDelay) ArrivalDelay, lower_bound(ArrivalDelay),
Month_ from airline group by Month_ order by Month_ desc with error 0.10;
1117.6, 11101.5, Jan
"```
### lpad
lpad(str, len, pad) - Returns `str`, left-padded with `pad` to a length of `len`.
If `str` is longer than `len`, the return value is shortened to `len` characters.
SELECT lpad('hi', 5, '??'); ???hi SELECT lpad('hi', 1, '??'); h
Examples: SELECT ltrim(' SparkSQL'); SparkSQL Examples: SELECT map(1.0, '2', 3.0, '4'); {1.0:"2",3.0:"4"} Examples: SELECT map_keys(map(1, 'a', 2, 'b')); [1,2] Examples: SELECT map_values(map(1, 'a', 2, 'b')); ["a","b"] No example/argument for max. Examples: SELECT md5('Spark'); 8cde774d6f7333752ed72cacddb05126 No example/argument for mean. No example/argument for min. Examples: SELECT minute('2009-07-30 12:58:59'); 58No example/argument for monotonically_increasing_id. Examples: SELECT month('2016-07-30'); 7### monotonically_increasing_id monotonically_increasing_id() - Returns monotonically increasing 64-bit integers. The generated ID is guaranteed to be monotonically increasing and unique, but not consecutive. The current implementation puts the partition ID in the upper 31 bits, and the lower 33 bits represent the record number within each partition. The assumption is that the data frame has less than 1 billion partitions, and each partition has less than 8 billion records.
Examples: SELECT months_between('1997-02-28 10:30:00', '1996-10-30'); 3.94959677### months_between months_between(timestamp1, timestamp2) - Returns number of months between `timestamp1` and `timestamp2`.
Examples: SELECT named_struct("a", 1, "b", 2, "c", 3); {"a":1,"b":2,"c":3} Examples: SELECT nanvl(cast('NaN' as double), 123); 123.0 Examples: SELECT negative(1); -1### named_struct named_struct(name1, val1, name2, val2, ...) - Creates a struct with the given field names and values.
Examples: SELECT next_day('2015-01-14', 'TU'); 2015-01-20 No example/argument for not. No example/argument for now.### next_day next_day(start_date, day_of_week) - Returns the first date which is later than `start_date` and named as indicated.
No example/argument for ntile. Examples: SELECT nullif(2, 2); NULL Examples: SELECT nvl(NULL, array('2')); ["2"] Examples: SELECT nvl2(NULL, 2, 1); 1 No example/argument for or. Examples: SELECT parse_url('http://spark.apache.org/path?query=1', 'HOST') spark.apache.org SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY') query=1 SELECT parse_url('http://spark.apache.org/path?query=1', 'QUERY', 'query') 1 No example/argument for percent_rank.### ntile ntile(n) - Divides the rows for each window partition into `n` buckets ranging from 1 to at most `n`.
No example/argument for percentile.### percentile percentile(col, percentage) - Returns the exact percentile value of numeric column `col` at the given percentage. The value of percentage must be between 0.0 and 1.0. percentile(col, array(percentage1 [, percentage2]...)) - Returns the exact percentile value array of numeric column `col` at the given percentage(s). Each value of the percentage array must be between 0.0 and 1.0.
Examples: SELECT percentile_approx(10.0, array(0.5, 0.4, 0.1), 100); [10.0,10.0,10.0] SELECT percentile_approx(10.0, 0.5, 100); 10.0 Examples: SELECT pi(); 3.141592653589793 Examples: SELECT pmod(10, 3); 1 SELECT pmod(-10, 3); 2### percentile_approx percentile_approx(col, percentage [, accuracy]) - Returns the approximate percentile value of numeric column `col` at the given percentage. The value of percentage must be between 0.0 and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal which controls approximation accuracy at the cost of memory. Higher value of `accuracy` yields better accuracy, `1.0/accuracy` is the relative error of the approximation. When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0. In this case, returns the approximate percentile array of column `col` at the given percentage array.
Examples: SELECT posexplode(array(10,20)); 0 10 1 20 No example/argument for positive. Examples: SELECT pow(2, 3); 8.0 Examples: SELECT power(2, 3); 8.0 Examples: SELECT printf("Hello World %d %s", 100, "days"); Hello World 100 days Examples: SELECT quarter('2016-08-31'); 3 Examples: SELECT radians(180); 3.141592653589793### posexplode posexplode(expr) - Separates the elements of array `expr` into multiple rows with positions, or the elements of map `expr` into multiple rows and columns with positions.
Examples: SELECT rand(); 0.9629742951434543 SELECT rand(0); 0.8446490682263027 SELECT rand(null); 0.8446490682263027### rand rand([seed]) - Returns a random value with independent and identically distributed (i.i.d.) uniformly distributed values in [0, 1).
Examples: SELECT randn(); -0.3254147983080288 SELECT randn(0); 1.1164209726833079 SELECT randn(null); 1.1164209726833079### randn randn([seed]) - Returns a random value with independent and identically distributed (i.i.d.) values drawn from the standard normal distribution.
No example/argument for rank. Examples: SELECT reflect('java.util.UUID', 'randomUUID'); c33fb387-8500-4bfa-81d2-6e0e3e930df2 SELECT reflect('java.util.UUID', 'fromString', 'a5cf6c42-0c85-418f-af6c-3e4e5b1328f2'); a5cf6c42-0c85-418f-af6c-3e4e5b1328f2 Examples: SELECT regexp_extract('100-200', '(\d+)-(\d+)', 1); 100 Examples: SELECT regexp_replace('100-200', '(\d+)', 'num'); num-num Examples: SELECT sum(ArrDelay) ArrivalDelay, relative_error(ArrivalDelay), Month_ from airline group by Month_ order by Month_ desc with error 0.10; 1117.6, 0.3, Jan "```### rank rank() - Computes the rank of a value in a group of values. The result is one plus the number of rows preceding or equal to the current row in the ordering of the partition. The values will produce gaps in the sequence.
repeat¶
repeat(str, n) - Returns the string which repeats the given string value n times.
reverse¶
reverse(str) - Returns the reversed given string.
rint¶
rint(expr) - Returns the double value that is closest in value to the argument and is equal to a mathematical integer.
rlike¶
str rlike regexp - Returns true if str
matches regexp
, or false otherwise.
rollup¶
round¶
round(expr, d) - Returns expr
rounded to d
decimal places using HALF_UP rounding mode.
row_number¶
row_number() - Assigns a unique, sequential number to each row, starting with one, according to the ordering of rows within the window partition.
rpad¶
rpad(str, len, pad) - Returns str
, right-padded with pad
to a length of len
. If str
is longer than len
, the return value is shortened to len
characters.
rtrim¶
rtrim(str) - Removes the trailing space characters from str
.
second¶
second(timestamp) - Returns the second component of the string/timestamp.
sentences¶
sentences(str[, lang, country]) - Splits str
into an array of array of words.
sha¶
sha(expr) - Returns a sha1 hash value as a hex string of the expr
.
sha1¶
sha1(expr) - Returns a sha1 hash value as a hex string of the expr
.
sha2¶
sha2(expr, bitLength) - Returns a checksum of SHA-2 family as a hex string of expr
. SHA-224, SHA-256, SHA-384, and SHA-512 are supported. Bit length of 0 is equivalent to 256.
Examples:
> SELECT sha2('Spark', 256);
529bc3b07127ecb7e53a4dcf1991d9152c24537d919178022b2c42657f79a26b
shiftleft¶
shiftleft(base, expr) - Bitwise left shift.
shiftright¶
shiftright(base, expr) - Bitwise (signed) right shift.
shiftrightunsigned¶
shiftrightunsigned(base, expr) - Bitwise unsigned right shift.
sign¶
sign(expr) - Returns -1.0, 0.0 or 1.0 as expr
is negative, 0 or positive.
signum¶
signum(expr) - Returns -1.0, 0.0 or 1.0 as expr
is negative, 0 or positive.
sin¶
sin(expr) - Returns the sine of expr
.
sinh¶
sinh(expr) - Returns the hyperbolic sine of expr
.
size¶
size(expr) - Returns the size of an array or a map. Returns -1 if null.
skewness¶
skewness(expr) - Returns the skewness value calculated from values of a group.
smallint¶
smallint(expr AS type) - Casts the value expr
to the target data type type
.
sort_array¶
sort_array(array[, ascendingOrder]) - Sorts the input array in ascending or descending order according to the natural ordering of the array elements.
soundex¶
soundex(str) - Returns Soundex code of the string.
space¶
space(n) - Returns a string consisting of n
spaces.
spark_partition_id¶
spark_partition_id() - Returns the current partition id.
split¶
split(str, regex) - Splits str
around occurrences that match regex
.
sqrt¶
sqrt(expr) - Returns the square root of expr
.
stack¶
stack(n, expr1, ..., exprk) - Separates expr1
, ..., exprk
into n
rows.
std¶
std(expr) - Returns the sample standard deviation calculated from values of a group.
stddev¶
stddev(expr) - Returns the sample standard deviation calculated from values of a group.
stddev_pop¶
stddev_pop(expr) - Returns the population standard deviation calculated from values of a group.
stddev_samp¶
stddev_samp(expr) - Returns the sample standard deviation calculated from values of a group.
str_to_map¶
str_to_map(text[, pairDelim[, keyValueDelim]]) - Creates a map after splitting the text into key/value pairs using delimiters. Default delimiters are ',' for pairDelim
and ':' for keyValueDelim
.
Examples:
> SELECT str_to_map('a:1,b:2,c:3', ',', ':');
map("a":"1","b":"2","c":"3")
> SELECT str_to_map('a');
map("a":null)
string¶
string(expr AS type) - Casts the value expr
to the target data type type
.
struct¶
struct(col1, col2, col3, ...) - Creates a struct with the given field values.
substr¶
substr(str, pos[, len]) - Returns the substring of str
that starts at pos
and is of length len
, or the slice of byte array that starts at pos
and is of length len
.
Examples:
> SELECT substr('Spark SQL', 5);
k SQL
> SELECT substr('Spark SQL', -3);
SQL
> SELECT substr('Spark SQL', 5, 1);
k
substring¶
substring(str, pos[, len]) - Returns the substring of str
that starts at pos
and is of length len
, or the slice of byte array that starts at pos
and is of length len
.
Examples:
> SELECT substring('Spark SQL', 5);
k SQL
> SELECT substring('Spark SQL', -3);
SQL
> SELECT substring('Spark SQL', 5, 1);
k
substring_index¶
substring_index(str, delim, count) - Returns the substring from str
before count
occurrences of the delimiter delim
. If count
is positive, everything to the left of the final delimiter (counting from the left) is returned. If count
is negative, everything to the right of the final delimiter (counting from the right) is returned. The function substring_index performs a case-sensitive match when searching for delim
.
sum¶
sum(expr) - Returns the sum calculated from values of a group.
tan¶
tan(expr) - Returns the tangent of expr
.
tanh¶
tanh(expr) - Returns the hyperbolic tangent of expr
.
timestamp¶
timestamp(expr AS type) - Casts the value expr
to the target data type type
.
tinyint¶
tinyint(expr AS type) - Casts the value expr
to the target data type type
.
to_date¶
to_date(expr) - Extracts the date part of the date or timestamp expression expr
.
to_unix_timestamp¶
to_unix_timestamp(expr[, pattern]) - Returns the UNIX timestamp of the give time.
to_utc_timestamp¶
to_utc_timestamp(timestamp, timezone) - Given a timestamp, which corresponds to a certain time of day in the given timezone, returns another timestamp that corresponds to the same time of day in UTC.
translate¶
translate(input, from, to) - Translates the input
string by replacing the characters present in the from
string with the corresponding characters in the to
string.
trim¶
trim(str) - Removes the leading and trailing space characters from str
.
trunc¶
trunc(date, fmt) - Returns date
with the time portion of the day truncated to the unit specified by the format model fmt
.
Examples:
> SELECT trunc('2009-02-12', 'MM');
2009-02-01
> SELECT trunc('2015-10-27', 'YEAR');
2015-01-01
ucase¶
ucase(str) - Returns str
with all characters changed to uppercase.
unbase64¶
unbase64(str) - Converts the argument from a base 64 string str
to a binary.
unhex¶
unhex(expr) - Converts hexadecimal expr
to binary.
unix_timestamp¶
unix_timestamp([expr[, pattern]]) - Returns the UNIX timestamp of current or specified time.
Examples:
> SELECT unix_timestamp();
1476884637
> SELECT unix_timestamp('2016-04-08', 'yyyy-MM-dd');
1460041200
upper¶
upper(str) - Returns str
with all characters changed to uppercase.
upper_bound¶
upper_bound() - Upper value of an estimate interval for a given confidence.calculated using error estimation method (ClosedForm or Bootstrap). [AQP]
Examples:
> SELECT sum(ArrDelay) ArrivalDelay, upper_bound(ArrivalDelay),
Month_ from airline group by Month_ order by Month_ desc with error 0.10;
1117.6, 11135.5, Jan
"```
### var_pop
var_pop(expr) - Returns the population variance calculated from values of a group.
SELECT weekofyear('2008-02-20'); 8
No example/argument for when.### when CASE WHEN expr1 THEN expr2 [WHEN expr3 THEN expr4]* [ELSE expr5] END - When `expr1` = true, returns `expr2`; when `expr3` = true, return `expr4`; else return `expr5`.
Examples: SELECT xpath('b1b2b3### window ### xpath xpath(xml, xpath) - Returns a string array of values within the nodes of xml that match the XPath expression.
c1 c2 ','a/b/text()'); ['b1','b2','b3']Examples: SELECT xpath_boolean('1','a/b'); true### xpath_boolean xpath_boolean(xml, xpath) - Returns true if the XPath expression evaluates to true, or if a matching node is found.
Examples: SELECT xpath_double('12', 'sum(a/b)'); 3.0### xpath_double xpath_double(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples: SELECT xpath_float('12', 'sum(a/b)'); 3.0### xpath_float xpath_float(xml, xpath) - Returns a float value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples: SELECT xpath_int('12', 'sum(a/b)'); 3### xpath_int xpath_int(xml, xpath) - Returns an integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples: SELECT xpath_long('12', 'sum(a/b)'); 3### xpath_long xpath_long(xml, xpath) - Returns a long integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples: SELECT xpath_number('12', 'sum(a/b)'); 3.0### xpath_number xpath_number(xml, xpath) - Returns a double value, the value zero if no match is found, or NaN if a match is found but the value is non-numeric.
Examples: SELECT xpath_short('12', 'sum(a/b)'); 3### xpath_short xpath_short(xml, xpath) - Returns a short integer value, or the value zero if no match is found, or a match is found but the value is non-numeric.
Examples: SELECT xpath_string('b### xpath_string xpath_string(xml, xpath) - Returns the text contents of the first xml node that matches the XPath expression.
cc ','a/c'); cc Examples: SELECT year('2016-07-30'); 2016 Examples: SELECT 3 | 5; 7 Examples: SELECT ~ 0; -1 ```