
log()
The log() function calculates the natural logarithm (logarithm to the base e) of each numeric value in a column.
Usage
- log() is applied to a column containing positive numeric values.
- It computes the natural logarithm of each value in the column.
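To illustrate why the input should be positive, here is a minimal sketch in plain Python using the standard math module rather than Spark. With one argument, math.log also computes the natural logarithm. The helper name safe_ln is hypothetical, and returning None for out-of-domain input is an illustrative choice only, not a statement about what Spark returns for such values.

```python
import math

def safe_ln(x):
    """Return ln(x) for x > 0, else None (hypothetical helper, for illustration only)."""
    if x <= 0:
        # The natural logarithm is undefined for zero and negative numbers.
        return None
    return math.log(x)

values = [1.0, 10.0, 100.0, 0.0, -5.0]
results = [safe_ln(v) for v in values]
print(results)
```

The first three inputs match the sample values used on this page; the last two show inputs that fall outside the function's domain.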
Create Spark Session and sample DataFrame
from pyspark.sql import SparkSession
from pyspark.sql.functions import log

# Initialize Spark Session
spark = SparkSession.builder.appName("logExample").getOrCreate()

# Sample DataFrame
data = [(1.0,), (10.0,), (100.0,)]
columns = ["Value"]
df = spark.createDataFrame(data, columns)
df.show()
Output:
+-----+
|Value|
+-----+
|  1.0|
| 10.0|
|100.0|
+-----+
Example: Use log to calculate the natural logarithm
- log("Value"): computes the natural logarithm of each value in the Value column of the DataFrame df.
- .alias("Natural Logarithm"): renames the computed column to Natural Logarithm.
log_df = df.select(df["Value"], log("Value").alias("Natural Logarithm"))
log_df.show()
Output:
+-----+-----------------+
|Value|Natural Logarithm|
+-----+-----------------+
|  1.0|              0.0|
| 10.0|2.302585092994046|
|100.0|4.605170185988092|
+-----+-----------------+
# Stop the Spark Session
spark.stop()
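As a quick sanity check on the values in the output table, the following sketch compares them with Python's standard math.log, which is also a natural logarithm when called with one argument. It does not use Spark. The dictionary name shown_output is an assumption introduced for illustration, and its numbers are copied from the table above.

```python
import math

# Values copied from the Natural Logarithm column in the output above.
shown_output = {1.0: 0.0, 10.0: 2.302585092994046, 100.0: 4.605170185988092}

# Compare each displayed value against math.log within a small tolerance.
matches = {
    value: math.isclose(math.log(value), shown, rel_tol=1e-12, abs_tol=1e-12)
    for value, shown in shown_output.items()
}
print(matches)

# Since 100 = 10 ** 2, ln(100) should be about twice ln(10).
ratio = shown_output[100.0] / shown_output[10.0]
print(ratio)
```

The ratio check relies on the identity ln(a^b) = b * ln(a), which is a handy way to spot-check logarithm columns by eye.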