Creates a new array column from the input columns or column names.
Syntax
from pyspark.sql import functions as sf
sf.array(*cols)
Parameters
| Parameter | Type | Description |
|---|---|---|
| cols | pyspark.sql.Column or str | Column names or Column objects that have the same data type. |
Returns
pyspark.sql.Column: A new Column of array type, where each value is an array containing the corresponding values from the input columns.
Examples
Example 1: Basic usage of array function with column names.
from pyspark.sql import functions as sf
df = spark.createDataFrame([("Alice", "doctor"), ("Bob", "engineer")],
    ("name", "occupation"))
df.select(sf.array('name', 'occupation')).show()
+-----------------------+
|array(name, occupation)|
+-----------------------+
|        [Alice, doctor]|
|        [Bob, engineer]|
+-----------------------+
Example 2: Usage of array function with Column objects.
from pyspark.sql import functions as sf
df = spark.createDataFrame([("Alice", "doctor"), ("Bob", "engineer")],
    ("name", "occupation"))
df.select(sf.array(df.name, df.occupation)).show()
+-----------------------+
|array(name, occupation)|
+-----------------------+
|        [Alice, doctor]|
|        [Bob, engineer]|
+-----------------------+
Example 3: Single argument as list of column names.
from pyspark.sql import functions as sf
df = spark.createDataFrame([("Alice", "doctor"), ("Bob", "engineer")],
    ("name", "occupation"))
df.select(sf.array(['name', 'occupation'])).show()
+-----------------------+
|array(name, occupation)|
+-----------------------+
|        [Alice, doctor]|
|        [Bob, engineer]|
+-----------------------+
Example 4: Usage of array function with columns of different numeric types. The integer values are cast to a common type with the doubles.
from pyspark.sql import functions as sf
df = spark.createDataFrame(
    [("Alice", 2, 22.2), ("Bob", 5, 36.1)],
    ("name", "age", "weight"))
df.select(sf.array(['age', 'weight'])).show()
+------------------+
|array(age, weight)|
+------------------+
|       [2.0, 22.2]|
|       [5.0, 36.1]|
+------------------+
Example 5: array function with a column containing null values.
from pyspark.sql import functions as sf
df = spark.createDataFrame([("Alice", None), ("Bob", "engineer")],
    ("name", "occupation"))
df.select(sf.array('name', 'occupation')).show()
+-----------------------+
|array(name, occupation)|
+-----------------------+
|          [Alice, NULL]|
|        [Bob, engineer]|
+-----------------------+