Loads Parquet files and returns the result as a DataFrame.
Syntax

```python
parquet(*paths, **options)
```
Parameters
| Parameter | Type | Description |
|---|---|---|
| `*paths` | str | One or more file paths to read the Parquet files from. |
| `**options` | — | Additional data source options (for example, `mergeSchema`), passed through to the Parquet reader as keyword arguments. |
Returns
DataFrame
Examples
Write a DataFrame into a Parquet file and read it back.
```python
import tempfile

df = spark.createDataFrame(
    [(10, "Alice"), (15, "Bob"), (20, "Tom")], schema=["age", "name"])
with tempfile.TemporaryDirectory(prefix="parquet") as d:
    # Write a DataFrame into a Parquet file.
    df.write.mode("overwrite").format("parquet").save(d)

    # Read the Parquet file back as a DataFrame.
    spark.read.parquet(d).orderBy("name").show()
# +---+-----+
# |age| name|
# +---+-----+
# | 10|Alice|
# | 15| Bob|
# | 20| Tom|
# +---+-----+
```
Read multiple Parquet files and merge schemas.
```python
import tempfile

df = spark.createDataFrame(
    [(10, "Alice"), (15, "Bob"), (20, "Tom")], schema=["age", "name"])
df2 = spark.createDataFrame(
    [(70, "Alice"), (80, "Bob")], schema=["height", "name"])
with tempfile.TemporaryDirectory(prefix="parquet1") as d1:
    with tempfile.TemporaryDirectory(prefix="parquet2") as d2:
        # Write each DataFrame to its own Parquet directory.
        df.write.mode("overwrite").format("parquet").save(d1)
        df2.write.mode("overwrite").format("parquet").save(d2)

        # Read both directories, merging their schemas.
        spark.read.option(
            "mergeSchema", "true"
        ).parquet(d1, d2).select(
            "name", "age", "height"
        ).orderBy("name", "age").show()
# +-----+----+------+
# | name| age|height|
# +-----+----+------+
# |Alice|NULL|    70|
# |Alice|  10|  NULL|
# |  Bob|NULL|    80|
# |  Bob|  15|  NULL|
# |  Tom|  20|  NULL|
# +-----+----+------+
```