python - Fill missing dates by group in pandas
I need to fill missing dates down within each group. Here is the code to create the data frame. I want to carry each value in the fill column down until the fill column changes, or until the group 'name' changes.
import numpy as np
import pandas as pd

data = {'tdate': [20080815, 20080915, 20081226, 20090110, 20090131, 20080807,
                  20080831, 20080918, 20081023, 20081114, 20081207, 20090117,
                  20090203, 20090219, 20090305, 20090318, 20090501],
        'name': ['a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b', 'b',
                 'b', 'b', 'b', 'b', 'b'],
        'fill': [np.nan, np.nan, 20080915, np.nan, np.nan, np.nan, np.nan,
                 np.nan, np.nan, 20081023, np.nan, np.nan, np.nan, np.nan,
                 20090219, np.nan, np.nan]}
df = pd.DataFrame(data, columns=['tdate', 'name', 'fill'])
df
Current data frame:
       tdate name      fill
0   20080815    a       NaN
1   20080915    a       NaN
2   20081226    a  20080915
3   20090110    a       NaN
4   20090131    a       NaN
5   20080807    b       NaN
6   20080831    b       NaN
7   20080918    b       NaN
8   20081023    b       NaN
9   20081114    b  20081023
10  20081207    b       NaN
11  20090117    b       NaN
12  20090203    b       NaN
13  20090219    b       NaN
14  20090305    b  20090219
15  20090318    b       NaN
16  20090501    b       NaN
Desired output:
       tdate name      fill
0   20080815    a       NaN
1   20080915    a       NaN
2   20081226    a  20080915
3   20090110    a  20080915
4   20090131    a  20080915
5   20080807    b       NaN
6   20080831    b       NaN
7   20080918    b       NaN
8   20081023    b       NaN
9   20081114    b       NaN
10  20081207    b  20081023
11  20090117    b  20081023
12  20090203    b  20081023
13  20090219    b  20081023
14  20090305    b  20081023
15  20090318    b  20090219
16  20090501    b  20090219
Here is my code:
df.groupby(df["name"])["fill"].fill()
You were pretty close; you just need to forward-fill (ffill) rather than fill:
df.groupby('name')["fill"].ffill()
Out[42]:
0          NaN
1          NaN
2     20080915
3     20080915
4     20080915
5          NaN
6          NaN
7          NaN
8          NaN
9     20081023
10    20081023
11    20081023
12    20081023
13    20081023
14    20090219
15    20090219
16    20090219
dtype: float64
Or, equivalently:
df.groupby('name')["fill"].fillna(method='ffill')
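To keep the result, you would typically assign it back to a column. Below is a minimal, self-contained sketch using the data from the question; the column name `fill_ffill` is just an illustrative choice, not something from the original post. It also compares against a plain, ungrouped `ffill()` to show why the `groupby` matters: without it, the last value from group 'a' leaks into the first rows of group 'b'. Note that recent pandas releases deprecate `fillna(method=...)`, so `ffill()` is likely the safer spelling going forward.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "tdate": [20080815, 20080915, 20081226, 20090110, 20090131, 20080807,
              20080831, 20080918, 20081023, 20081114, 20081207, 20090117,
              20090203, 20090219, 20090305, 20090318, 20090501],
    "name": ["a"] * 5 + ["b"] * 12,
    "fill": [nan, nan, 20080915, nan, nan, nan, nan, nan, nan, 20081023,
             nan, nan, nan, nan, 20090219, nan, nan],
})

# Forward-fill within each 'name' group; the result is aligned to df's index.
# 'fill_ffill' is a hypothetical column name chosen for this example.
df["fill_ffill"] = df.groupby("name")["fill"].ffill()

# For comparison only: an ungrouped forward-fill ignores group boundaries,
# so rows 5-8 (start of group 'b') inherit the last value from group 'a'.
ungrouped = df["fill"].ffill()

print(df)
```

With the grouped version, row 5 (the first row of group 'b') stays NaN, whereas the ungrouped version would fill it with 20080915 from group 'a'.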