
How To Find Maximum Value Of A Column In Python Dataframe

I have a data frame in PySpark. In this data frame I have a column called id that is unique. Now I want to find the maximum value of the id column in the data frame.

Solution 1:

If you are using pandas, .max() will work:

>>> df2 = pd.DataFrame({'A': [1, 5, 0], 'B': [3, 5, 6]})
>>> df2['A'].max()
5
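Going one step further, if you also need the row where the maximum occurs, pandas provides idxmax(), which returns the index label of that row. A short sketch using the same hypothetical df2 as above:

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 5, 0], 'B': [3, 5, 6]})

max_value = df2['A'].max()      # the maximum value itself
max_index = df2['A'].idxmax()   # index label of the row holding the max
max_row = df2.loc[max_index]    # the full row at that label

print(max_value)  # 5
print(max_index)  # 1
```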

Otherwise, if it is a Spark dataframe, see:

Best way to get the max value in a Spark dataframe column

Solution 2:

I'm coming from Scala, but I believe this also applies to Python.

val max = df.select(max("id")).first()

but first you have to import the following:

from pyspark.sql.functions import max

Note that this import shadows Python's built-in max, so it is common to alias it, for example: from pyspark.sql.functions import max as spark_max.

Solution 3:

The following can be used in pyspark:

df.select(max("id")).show()

Solution 4:

You can use the aggregate max, as also described in the pyspark documentation linked below:

Link : https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=agg

Code:

row1 = df1.agg({"id": "max"}).collect()[0]
