How to Perform Feature Engineering on DateTime Column

Hello community,

I am currently working on a dataset that contains a bunch of

  1. Numerical columns - Float type,
  2. Categorical columns (which I have converted into pandas categorical numeric codes),
  3. A dateTime column and
  4. A target, which is also a categorical column, named Class, with 0 or 1 as its accepted values.

The dateTime column contains a mix of null values and actual dates in the format mm/dd/yyyy.

For continuous columns, imputing with the median or performing KNN imputation works just fine. But for columns containing date-times, I was wondering how I should handle null values?

Also, if there is any other feature engineering recommended for DateTime columns, I’m happy to learn that as well. Thanks!

I can’t believe I found you here :sweat_smile: (@lies_and_stats from Twitter).

The course pretty much explains the most essential feature engineering for datetime columns. The add_datepart function does most of it for you.
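To give an idea of what that kind of date-part expansion looks like, here is a minimal sketch in plain pandas rather than the course's own helper. The column name `event_date` and the sample values are made up for illustration, and the set of parts extracted here is only a small assumed subset of what a helper like add_datepart would produce.

```python
import pandas as pd

# Hypothetical sample: dates as mm/dd/yyyy strings with one null.
df = pd.DataFrame({"event_date": ["01/15/2021", None, "03/31/2021", "12/01/2020"]})

# Parse the strings; anything unparseable or missing becomes NaT.
df["event_date"] = pd.to_datetime(df["event_date"], format="%m/%d/%Y", errors="coerce")

# Keep a flag for missingness so the model can use that signal.
df["event_date_missing"] = df["event_date"].isna().astype(int)

# Basic calendar parts (these are NaN wherever the date is NaT).
df["event_year"] = df["event_date"].dt.year
df["event_month"] = df["event_date"].dt.month
df["event_day"] = df["event_date"].dt.day
df["event_dayofweek"] = df["event_date"].dt.dayofweek  # Monday=0 ... Sunday=6
df["event_is_month_end"] = df["event_date"].dt.is_month_end

print(df)
```

The missing-value flag is a design choice, not a requirement: it lets you impute the date parts afterwards without losing the fact that the date was absent.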

You can also create lag and rolling-mean features. marktenenholtz also has a good thread on the topic.
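A rough sketch of lag and rolling-mean features, assuming a numeric column you want to look back on. The names `event_date` and `amount` and the window size are hypothetical. The rolling mean is computed on the shifted series so that each row only sees earlier rows.

```python
import pandas as pd

# Hypothetical sample, deliberately out of order.
df = pd.DataFrame({
    "event_date": pd.to_datetime(["2021-01-03", "2021-01-01", "2021-01-02", "2021-01-04"]),
    "amount": [30.0, 10.0, 20.0, 40.0],
})

# Lags and rolling windows only make sense in chronological order.
df = df.sort_values("event_date").reset_index(drop=True)

# Value from the previous row.
df["amount_lag1"] = df["amount"].shift(1)

# Mean of the two previous rows (shift first, so the current row is excluded).
df["amount_roll_mean2"] = df["amount"].shift(1).rolling(window=2).mean()

print(df)
```

The first rows come out as NaN because there is not enough history yet; those can be imputed or dropped like any other missing values.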

If the datetime column is sorted, you can use bfill or ffill to deal with the nulls. If not, any of the usual ways of dealing with null values still apply.
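Here is a small sketch of the ffill/bfill idea, assuming the rows are already in chronological order so that a neighbouring date is a reasonable stand-in. The sample values are made up.

```python
import pandas as pd

# Hypothetical sample in row order, with nulls in the middle and at the end.
raw = pd.Series(["01/01/2021", None, "01/03/2021", None])
dates = pd.to_datetime(raw, format="%m/%d/%Y", errors="coerce")

# Record which rows were missing before filling.
was_missing = dates.isna()

# Forward fill: copy the last known date down into the gap.
filled_forward = dates.ffill()

# Backward fill: copy the next known date up into the gap.
filled_backward = dates.bfill()

# Forward fill first, then backward fill whatever is still missing at the start.
filled = dates.ffill().bfill()

print(pd.DataFrame({"raw": dates, "ffill": filled_forward, "bfill": filled_backward}))
```

Note that bfill leaves a trailing null untouched and ffill leaves a leading one, which is why the two are often chained.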