1. Initialize an empty record (every column set to np.nan)
2. Insert the empty record into the existing DataFrame at the missing index label
3. Within the DataFrame, fill np.nan with the previous record (method='ffill')
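
The three steps can be tried on a toy frame first. This is only a minimal sketch: the frame `df`, its values, and the name `empty_row` are made up for illustration and are not part of the weather data.

```python
import numpy as np
import pandas as pd

# Toy frame indexed by time slot; slot 3 is missing (hypothetical values).
df = pd.DataFrame(
    {"temperature": [1.0, 2.0, 4.0], "PM25": [60.0, 61.0, 63.0]},
    index=pd.Index([1, 2, 4], name="time_slot"),
)

# Step 1: an empty record with one NaN per column.
empty_row = pd.Series(np.nan, index=df.columns)

# Step 2: insert it at the missing index label, then restore index order.
df.loc[3] = empty_row
df = df.sort_index()

# Step 3: fill the NaN row from the record before it.
filled = df.ffill()
print(filled)
```

After the fill, slot 3 carries the values of slot 2.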

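Note: Before filling in missing values with forward fill, the DataFrame should be sorted by its index. Otherwise an empty record is filled from whatever row happens to sit above it, not from the previous time slot.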
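
A small sketch of why the order matters, using a made-up Series `s` in which the empty slot was appended last:

```python
import numpy as np
import pandas as pd

# Slot 2 was appended last, so it sits at the bottom (hypothetical values).
s = pd.Series([10.0, 30.0, np.nan], index=[1, 3, 2])

# Without sorting, slot 2 copies the row physically above it (slot 3).
unsorted_fill = s.ffill()

# After sorting, slot 2 copies the previous slot (slot 1).
sorted_fill = s.sort_index().ffill()

print(unsorted_fill.loc[2], sorted_fill.loc[2])
```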

import pandas as pd
import numpy as np
from transform import transform_trafficData, transform_weatherData, get_y, create_matrix
weatherSlotDf = transform_weatherData('2016-01-26', folder='testing')

The records with index range(46, 146, 12) are missing.
We will append these records first and then fill in their values from the previous record.
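
One way to double-check which of those slots are really absent is to compare them against the index labels of the printed frame. The sketch below copies the labels by hand into `present` (a name used only here), so it runs without the `transform` module.

```python
import pandas as pd

# Index labels copied from the printed weatherSlotDf.
present = pd.Index(
    [43, 44, 45, 67, 69, 79, 80, 81, 91, 92, 93, 104, 105,
     115, 116, 117, 127, 128, 139, 140, 141],
    name="time_slot",
)

# Candidate slots to add: every 12th slot starting at 46.
candidates = pd.Index(range(46, 146, 12))

# Keep only the candidates that are not already in the frame.
missing = candidates.difference(present)
print(list(missing))
```

With the real data, `weatherSlotDf.index` would take the place of `present`.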

weatherSlotDf
                 date  slot  Weather  temperature  PM25
time_slot
43         2016-01-26    43        1         -2.0    60
44         2016-01-26    44        1         -2.0    60
45         2016-01-26    45        1         -2.0    60
67         2016-01-26    67        1          3.0    65
69         2016-01-26    69        1          3.0    65
79         2016-01-26    79        1          5.0    66
80         2016-01-26    80        1          5.0    66
81         2016-01-26    81        1          5.0    66
91         2016-01-26    91        1          7.0    59
92         2016-01-26    92        1          7.0    59
93         2016-01-26    93        1          7.0    59
104        2016-01-26   104        1          6.0    58
105        2016-01-26   105        1          6.0    58
115        2016-01-26   115        2          5.0    65
116        2016-01-26   116        2          5.0    65
117        2016-01-26   117        2          4.0    65
127        2016-01-26   127        9          4.0    89
128        2016-01-26   128        9          4.0    89
139        2016-01-26   139        3          3.0   101
140        2016-01-26   140        3          4.0   101
141        2016-01-26   141        3          4.0   101
# build the empty record as a fresh Series with every column set to np.nan
# (not a slice of weatherSlotDf, so pandas raises no SettingWithCopyWarning)
rowTemp = pd.Series(np.nan, index=weatherSlotDf.columns)

# insert the empty record at each missing slot
for i in range(46, 146, 12):
    weatherSlotDf.loc[i] = rowTemp

# sort by time_slot so each empty record follows the record it will be filled from
weatherSlotDf.sort_index(inplace=True)
weatherSlotDf
                 date   slot  Weather  temperature   PM25
time_slot
43         2016-01-26   43.0      1.0         -2.0   60.0
44         2016-01-26   44.0      1.0         -2.0   60.0
45         2016-01-26   45.0      1.0         -2.0   60.0
46                NaN    NaN      NaN          NaN    NaN
58                NaN    NaN      NaN          NaN    NaN
67         2016-01-26   67.0      1.0          3.0   65.0
69         2016-01-26   69.0      1.0          3.0   65.0
70                NaN    NaN      NaN          NaN    NaN
79         2016-01-26   79.0      1.0          5.0   66.0
80         2016-01-26   80.0      1.0          5.0   66.0
81         2016-01-26   81.0      1.0          5.0   66.0
82                NaN    NaN      NaN          NaN    NaN
91         2016-01-26   91.0      1.0          7.0   59.0
92         2016-01-26   92.0      1.0          7.0   59.0
93         2016-01-26   93.0      1.0          7.0   59.0
94                NaN    NaN      NaN          NaN    NaN
104        2016-01-26  104.0      1.0          6.0   58.0
105        2016-01-26  105.0      1.0          6.0   58.0
106               NaN    NaN      NaN          NaN    NaN
115        2016-01-26  115.0      2.0          5.0   65.0
116        2016-01-26  116.0      2.0          5.0   65.0
117        2016-01-26  117.0      2.0          4.0   65.0
118               NaN    NaN      NaN          NaN    NaN
127        2016-01-26  127.0      9.0          4.0   89.0
128        2016-01-26  128.0      9.0          4.0   89.0
130               NaN    NaN      NaN          NaN    NaN
139        2016-01-26  139.0      3.0          3.0  101.0
140        2016-01-26  140.0      3.0          4.0  101.0
141        2016-01-26  141.0      3.0          4.0  101.0
142               NaN    NaN      NaN          NaN    NaN

Done! The empty records are in place. Now fill them from the previous record:

# forward fill; same result as fillna(method='ffill'), which pandas 2.1+ deprecates
weatherSlotDf.ffill()
                 date   slot  Weather  temperature   PM25
time_slot
43         2016-01-26   43.0      1.0         -2.0   60.0
44         2016-01-26   44.0      1.0         -2.0   60.0
45         2016-01-26   45.0      1.0         -2.0   60.0
46         2016-01-26   45.0      1.0         -2.0   60.0
58         2016-01-26   45.0      1.0         -2.0   60.0
67         2016-01-26   67.0      1.0          3.0   65.0
69         2016-01-26   69.0      1.0          3.0   65.0
70         2016-01-26   69.0      1.0          3.0   65.0
79         2016-01-26   79.0      1.0          5.0   66.0
80         2016-01-26   80.0      1.0          5.0   66.0
81         2016-01-26   81.0      1.0          5.0   66.0
82         2016-01-26   81.0      1.0          5.0   66.0
91         2016-01-26   91.0      1.0          7.0   59.0
92         2016-01-26   92.0      1.0          7.0   59.0
93         2016-01-26   93.0      1.0          7.0   59.0
94         2016-01-26   93.0      1.0          7.0   59.0
104        2016-01-26  104.0      1.0          6.0   58.0
105        2016-01-26  105.0      1.0          6.0   58.0
106        2016-01-26  105.0      1.0          6.0   58.0
115        2016-01-26  115.0      2.0          5.0   65.0
116        2016-01-26  116.0      2.0          5.0   65.0
117        2016-01-26  117.0      2.0          4.0   65.0
118        2016-01-26  117.0      2.0          4.0   65.0
127        2016-01-26  127.0      9.0          4.0   89.0
128        2016-01-26  128.0      9.0          4.0   89.0
130        2016-01-26  128.0      9.0          4.0   89.0
139        2016-01-26  139.0      3.0          3.0  101.0
140        2016-01-26  140.0      3.0          4.0  101.0
141        2016-01-26  141.0      3.0          4.0  101.0
142        2016-01-26  141.0      3.0          4.0  101.0
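
Two things stand out in the filled frame: the slot column of an inserted record repeats the previous record's slot (record 46 shows slot 45.0), and the integer columns have become floats. A possible clean-up is sketched below on a three-row stand-in for the frame. Whether slot should follow the index, and which columns count as integers (`int_cols`), are assumptions made for this example.

```python
import pandas as pd

# Three-row stand-in for the sorted frame; the last record is the empty one.
df = pd.DataFrame(
    {
        "date": ["2016-01-26", "2016-01-26", None],
        "slot": [44.0, 45.0, None],
        "Weather": [1.0, 1.0, None],
        "temperature": [-2.0, -2.0, None],
        "PM25": [60.0, 60.0, None],
    },
    index=pd.Index([44, 45, 46], name="time_slot"),
)

filled = df.ffill()

# Assumption: slot should equal the index label, not the previous record's slot.
filled["slot"] = filled.index

# NaN forced these columns to float; cast them back now that no NaN remains.
int_cols = ["slot", "Weather", "PM25"]
filled = filled.astype({col: int for col in int_cols})
print(filled)
```

Note that `ffill()` returns a new frame, so the result has to be assigned to a name (as with `filled` here) to be kept.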