Unlock the Power of Pandas: Mastering the nunique() Method
Understanding the Syntax
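The nunique() method in Pandas counts distinct values. On a DataFrame it takes two optional arguments, axis and dropna; on a Series it takes only dropna. The axis parameter specifies the axis to count along (0, the default, counts per column; 1 counts per row), while dropna determines whether NaN values are excluded from the count (True, the default) or included (False).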
import pandas as pd

# Example DataFrame
df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 3],
                   'B': [4, 5, 5, 6, 6, 6]})

# Default behavior: count per column, ignoring NaN
print(df.nunique(axis=0, dropna=True))
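To make the return value concrete, here is a minimal sketch that rebuilds the same DataFrame and inspects the result. The variable name counts is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 3],
                   'B': [4, 5, 5, 6, 6, 6]})

# Calling nunique() with no arguments uses axis=0 and dropna=True
counts = df.nunique()

# The result is a Series indexed by column label
print(type(counts).__name__)  # Series
print(counts['A'])            # 3
print(counts['B'])            # 3
```

Because the result is an ordinary Series, individual counts can be looked up by column name or filtered like any other Series.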
Counting Unique Values with Ease
When applied to a Series, the nunique() method returns a single integer representing the number of unique values. For instance, if we have a Series containing exam scores, the nunique() method can quickly give us the number of distinct scores.
scores = pd.Series([90, 80, 90, 70, 80, 90])
print(scores.nunique()) # Output: 3
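If you also want to see which values were counted, the related unique() and value_counts() methods can help. The sketch below assumes the same scores Series; the names distinct and frequencies are made up for illustration.

```python
import pandas as pd

scores = pd.Series([90, 80, 90, 70, 80, 90])

# unique() returns the distinct values in order of first appearance
distinct = scores.unique()

# value_counts() reports how often each distinct value occurs
frequencies = scores.value_counts()

print(distinct)                           # [90 80 70]
print(len(distinct) == scores.nunique())  # True
print(frequencies.loc[90])                # 3
```

When only the count matters, nunique() is the more direct call; unique() and value_counts() are useful when you need the values themselves.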
Including NaN Values in the Count
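By default, the nunique() method excludes NaN values from the count. However, by setting dropna=False, we can include them. All missing values together count as a single extra distinct value, no matter how many there are. This is particularly useful when you need to know whether a column contains missing data in addition to its regular values.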
scores_with_nan = pd.Series([90, 80, 90, None, 80, 90])
print(scores_with_nan.nunique(dropna=False)) # Output: 3 (90, 80, and NaN)
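To compare the two settings side by side, here is a small sketch using a hypothetical Series with two missing entries. The data and variable names are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical scores with two missing entries
scores_missing = pd.Series([90, 80, np.nan, 90, np.nan])

# dropna=True is the default: NaN is ignored
without_nan = scores_missing.nunique()

# dropna=False: all NaN entries together count as one distinct value
with_nan = scores_missing.nunique(dropna=False)

print(without_nan)  # 2
print(with_nan)     # 3
```

The difference between the two results is at most 1, so comparing them is a quick way to tell whether a Series contains any missing values.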
Counting Unique Values Across Rows
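When working with DataFrames, we can set the axis parameter to 1 to count unique values across each row instead of down each column. The result has one entry per row. A count of 1 means every column holds the same value in that row, while a count equal to the number of columns means all of the values differ. In the example DataFrame, every row yields 2, because A and B never share a value.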
print(df.nunique(axis=1)) # Count unique values across rows
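The row-wise count is easier to see on data where the rows differ from one another. The following is a sketch only: the readings DataFrame and its sensor column names are hypothetical.

```python
import pandas as pd

# Hypothetical readings; column names are made up for illustration
readings = pd.DataFrame({
    'sensor_1': [10, 20, 30],
    'sensor_2': [10, 25, 30],
    'sensor_3': [10, 20, 35],
})

# One count per row: how many distinct values appear across the columns
distinct_per_row = readings.nunique(axis=1)
print(distinct_per_row.tolist())     # [1, 2, 2]

# Keep only the rows where every column holds the same value
constant_rows = readings[distinct_per_row == 1]
print(constant_rows.index.tolist())  # [0]
```

Filtering on the row-wise count like this is one way to spot rows where all columns agree, or rows where they disagree.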
Putting it all Together
With the nunique() method, you can quickly measure how many distinct values each column or row of a dataset holds, with or without missing values. By mastering this tool, you’ll be able to:
- Uncover hidden patterns
- Identify trends
- Make more informed decisions