Incompatible pandas column selection syntax in HourlyStats (breaks with recent pandas versions) #127

@ufuk-cakir

Bug Description

Running the current version of Cell2Fire (0.2) with Python 3.11.10 and pandas 2.2.3 on macOS 15.4 results in a ValueError during statistics generation. The error is caused by outdated column selection syntax in cell2fire/utils/Stats.py.

Error message

ValueError: Cannot subset columns with a tuple with more than one element. Use a list instead.

Location in the Code

All places in Stats.py where groupby appears, for example:

SummaryDF = Ah[["NonBurned", "Burned", "Harvested", "Hour"]].groupby('Hour')["NonBurned", "Burned", "Harvested"].mean()

Cause of Bug

The line uses a legacy style of selecting multiple columns after groupby, passing a tuple instead of a list:

.groupby('Hour')["NonBurned", "Burned", "Harvested"]

This syntax is no longer supported in newer versions of pandas. See, for example, this StackOverflow post.

pandas now requires passing a list of column names:

.groupby('Hour')[["NonBurned", "Burned", "Harvested"]]
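
To illustrate the difference, here is a minimal sketch on a toy DataFrame. The frame `Ah` below is a made-up stand-in for the one built in Stats.py (the values are invented for illustration); only the column names come from the line quoted above. As far as I can tell, the tuple form raises on pandas 2.0 and later and only warns on older 1.x releases, so the sketch handles both cases.

```python
import pandas as pd

# Toy stand-in for the Ah frame in Stats.py; the values are invented.
Ah = pd.DataFrame({
    "Hour": [1, 1, 2, 2],
    "NonBurned": [10, 8, 6, 4],
    "Burned": [0, 2, 4, 6],
    "Harvested": [0, 0, 0, 0],
})

cols = ["NonBurned", "Burned", "Harvested"]

# List selection (double brackets): accepted by old and new pandas alike.
SummaryDF = Ah[cols + ["Hour"]].groupby("Hour")[cols].mean()
print(SummaryDF)

# Tuple selection (single brackets): the form currently used in Stats.py.
try:
    Ah.groupby("Hour")["NonBurned", "Burned", "Harvested"].mean()
    print("legacy syntax still accepted by this pandas version")
except ValueError as exc:
    print(f"legacy syntax rejected: {exc}")
```

The list form gives the same per-hour means that the legacy form produced on older pandas, so the change should not alter the generated statistics.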

Suggested Fix

  • Either change the lines where groupby() is used to the double-bracket syntax, like
.groupby('Hour')[["NonBurned", "Burned", "Harvested"]]
  • Or pin pandas to pandas<2.0.0 in requirements.txt
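
For the first option, a rough sketch of how the affected lines could be rewritten mechanically. The regex and the `modernize` helper are hypothetical and not part of Cell2Fire. The pattern is a heuristic: it only handles single-line selections and groupby() calls without nested parentheses, so any result should be reviewed by hand.

```python
import re

# Heuristic: a .groupby(...) call followed by single brackets that contain
# a comma but no nested brackets, i.e. the legacy tuple-style selection.
LEGACY_SELECT = re.compile(r"(\.groupby\([^()]*\))\[([^\[\]]*,[^\[\]]*)\]")


def modernize(source: str) -> str:
    """Wrap tuple-style selections after groupby() in a second pair of brackets."""
    return LEGACY_SELECT.sub(r"\1[[\2]]", source)


line = """SummaryDF = Ah[["NonBurned", "Burned", "Harvested", "Hour"]].groupby('Hour')["NonBurned", "Burned", "Harvested"].mean()"""
fixed = modernize(line)
print(fixed)
```

Lines that already use double brackets are left unchanged, so running the helper twice is harmless.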
