
lower()
The lower() function converts every character in a string column of a DataFrame to lowercase.
Create Spark Session and sample DataFrame
from pyspark.sql import SparkSession
from pyspark.sql.functions import lower

# Initialize Spark Session
spark = SparkSession.builder.appName("lowerExample").getOrCreate()

# Sample DataFrame with String Column
data = [("James BOND",), ("Harry POTTER",), ("Sherlock HOLMES",)]
columns = ["Name"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+---------------+
|           Name|
+---------------+
|     James BOND|
|   Harry POTTER|
|Sherlock HOLMES|
+---------------+
Example: Use lower() to Convert Text to Lowercase
- lower(df.Name): Converts the text in the "Name" column of the DataFrame df to lowercase.
- The resulting lowercase text is stored in a new column "Lowercase Name".
lowercase_df = df.withColumn("Lowercase Name", lower(df.Name))
lowercase_df.show(truncate=False)
Output:
+---------------+---------------+
|Name           |Lowercase Name |
+---------------+---------------+
|James BOND     |james bond     |
|Harry POTTER   |harry potter   |
|Sherlock HOLMES|sherlock holmes|
+---------------+---------------+
# Stop the Spark Session
spark.stop()