Encode longitudinal data visually or pursue tabular approach?

I have a dataset of monthly temperature and precipitation data with which I would like to predict my classes of interest. I could encode these visually, with time on the x axis, and temp and precipitation on y1 and y2 axes, similar to a spectrogram. These pseudo-spectrogram data could be used to train a vision learner. Or I could pursue an approach in which I represent these data tabularly, train with a tabular learner.

My intuition tells me that visual encoding would obviate the need to conduct feature engineering, which might be needed for the tabular approach. Does anyone have advice for longitudinal datasets such as these?