How to Perform Feature Engineering on DateTime Column

kashish18 · July 7, 2023, 10:10am

Hello community,

I am currently working on a dataset that contains a bunch of

Numerical columns - Float type,
Categorical columns (which I have converted into pandas categorical numeric codes),
A dateTime column and
A target, which is also a category named Class with acceptable values of 0 or 1.

The dateTime column contains a mix of null values and actual dates in the format mm/dd/yyyy.

For continuous columns, imputing with the median or using KNN imputation works just fine. But for columns containing date-times, how should I handle null values?

Also, if there is any other feature engineering recommended for DateTime columns, I’m happy to learn that as well. Thanks!

Ifeanyi · July 8, 2023, 5:33pm

I can’t believe I found you here (@lies_and_stats from twitter).

The course pretty much explains the most essential feature engineering for datetime columns. The add_datepart function does most of it for you.
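For anyone who wants to see roughly what that kind of expansion looks like, here is a minimal plain-pandas sketch. It is not add_datepart itself; the helper name `add_date_parts`, the column name `date`, and the sample rows are made up for illustration, and the mm/dd/yyyy format is taken from the question.

```python
import pandas as pd

def add_date_parts(df, col):
    """Return a copy of df with simple calendar features derived from df[col].

    Hypothetical helper for illustration only.
    """
    out = df.copy()
    # errors="coerce" turns missing or unparseable entries into NaT
    dates = pd.to_datetime(out[col], format="%m/%d/%Y", errors="coerce")
    out[f"{col}_year"] = dates.dt.year
    out[f"{col}_month"] = dates.dt.month
    out[f"{col}_day"] = dates.dt.day
    out[f"{col}_dayofweek"] = dates.dt.dayofweek  # Monday=0 ... Sunday=6
    # Keep track of which rows had no date, so a model can use that signal
    out[f"{col}_missing"] = dates.isna()
    return out

# Made-up sample data
df = pd.DataFrame({"date": ["07/07/2023", None, "12/31/2022"]})
features = add_date_parts(df, "date")
```

Rows with a null date get NaN in the derived numeric columns, so those still need imputing (or a model that handles NaN), but the `_missing` flag preserves the fact that the date was absent.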

You can also create lag and rolling-mean features. marktenenholtz also has a good thread on the topic.
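A rough sketch of lag and rolling-mean features in pandas, assuming the rows can be ordered by date and there is some numeric column to lag. The column names `date` and `value`, the window size, and the data are all invented for the example.

```python
import pandas as pd

# Made-up data: one numeric observation per day
df = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-03", "2023-01-04"]),
    "value": [10.0, 20.0, 30.0, 40.0],
}).sort_values("date")

# Lag feature: the previous row's value
df["value_lag1"] = df["value"].shift(1)

# Rolling mean over the two previous rows; shifting first keeps the
# current row's own value out of its feature
df["value_roll_mean2"] = df["value"].shift(1).rolling(window=2).mean()
```

The first rows come out as NaN because there is not enough history yet, so they need the same null handling as everything else.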

If the rows are sorted by datetime, you can use bfill or ffill to deal with the nulls. If not, any of the usual ways of dealing with null values still applies.
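Something like the following shows both fills side by side, assuming the rows are already in a meaningful order. The column names and sample dates are made up; the missing-value flag is an extra suggestion, not something the fill requires.

```python
import pandas as pd

# Made-up data in mm/dd/yyyy format with some nulls
df = pd.DataFrame({"date": ["01/05/2023", None, "01/09/2023", None]})
df["date"] = pd.to_datetime(df["date"], format="%m/%d/%Y", errors="coerce")

# Record which rows were originally null before filling them
df["date_missing"] = df["date"].isna()

# Forward fill: carry the last seen date forward
df["date_ffill"] = df["date"].ffill()

# Backward fill: take the next seen date instead
df["date_bfill"] = df["date"].bfill()
```

Note that bfill leaves a trailing null unfilled (and ffill a leading one), since there is no later (or earlier) value to copy from.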
