How To Find Maximum Value Of A Column In Python Dataframe
I have a data frame in pyspark. In this data frame I have a column called id that is unique. Now I want to find the maximum value of the id column in the data frame. I have tried a few things without success.
Solution 1:
If you are using pandas, .max() will work:

>>> import pandas as pd
>>> df2 = pd.DataFrame({'A': [1, 5, 0], 'B': [3, 5, 6]})
>>> df2['A'].max()
5
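If you also want to know where the maximum occurs, or the maximum of every column at once, pandas covers both with DataFrame.max() and Series.idxmax(); a small sketch:

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 5, 0], 'B': [3, 5, 6]})

# Column-wise maxima: DataFrame.max() returns a Series indexed by column name
col_maxes = df2.max()
print(col_maxes['A'], col_maxes['B'])  # 5 6

# Series.idxmax() gives the row label where the maximum occurs
print(df2['A'].idxmax())  # 1
```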
Otherwise, if it's a Spark dataframe, see: Best way to get the max value in a Spark dataframe column
Solution 2:
I'm coming from Scala, but I believe this is also applicable in Python.

val max = df.select(max("id")).first()

In Python you first have to import the function:

from pyspark.sql.functions import max
Solution 3:
The following can be used in pyspark (note the import, so the built-in max is not used by mistake):

from pyspark.sql.functions import max
df.select(max("id")).show()
Solution 4:
You can use the aggregate max, as also described in the pyspark documentation linked below:
Link : https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=agg
Code:
row1 = df1.agg({"id": "max"}).collect()[0]