
upper()
The upper() function converts all characters in a string column of a DataFrame to uppercase.
Create Spark Session and sample DataFrame
from pyspark.sql import SparkSession
from pyspark.sql.functions import upper

# Initialize Spark Session
spark = SparkSession.builder.appName("upperExample").getOrCreate()

# Sample DataFrame with a string column
data = [("james bond ",), ("harry porter",), ("sherlock holmes",)]
columns = ["Name"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+---------------+
|           Name|
+---------------+
|    james bond |
|   harry porter|
|sherlock holmes|
+---------------+
Example: Use upper() to convert text to uppercase
- upper(df.Name): converts the text in the Name column of the df DataFrame to uppercase.
- The resulting uppercase text is stored in a new column, Uppercase Name.
uppercase_df = df.withColumn("Uppercase Name", upper(df.Name))
uppercase_df.show(truncate=False)
Output:
+---------------+---------------+
|Name           |Uppercase Name |
+---------------+---------------+
|james bond     |JAMES BOND     |
|harry porter   |HARRY PORTER   |
|sherlock holmes|SHERLOCK HOLMES|
+---------------+---------------+
# Stop the Spark Session
spark.stop()