Need some ideas about a dataset

Hello wonderful people, I have a dataset of ships and ports. The dataset contains ship features (flag, weight of the ship, etc.), port features (country), and the technical inspection date and result. Inspection results are “error codes”. Basically, a ship goes to a port, drops off containers, and gets inspected by experts. A ship can be given multiple error codes. I want a model that can predict the inspection error codes for a given date, port, and ship. I’m thinking of logistic regression or maybe content-based recommendation algorithms.

What do you think? Any ideas will be appreciated.

Not an expert, so this is just to give an idea.

We want to predict the error:

(w, port, dates) → Model → err/nerr

I am not sure, but maybe the port can be passed as a latitude/longitude pair? So something like error (0 or 1), port (lat/long), date (seconds) could be interesting to try. The question is how to turn that into something we can feed to a model.
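To make the idea concrete, here is a rough sketch of turning one inspection record into plain numbers. Everything in it is an assumption for illustration: the `PORT_COORDS` table, the port names and coordinates, the `encode_inspection` helper, and the `YYYY-MM-DD` date format are all made up, not taken from the actual dataset.

```python
from datetime import datetime, timezone

# Hypothetical lookup table: port name -> (latitude, longitude).
# Names and coordinates are placeholders, not real data.
PORT_COORDS = {
    "PORT_A": (10.0, 20.0),
    "PORT_B": (-35.5, 140.25),
}


def encode_inspection(weight, port, date_str):
    """Turn one inspection record into a flat list of floats."""
    lat, lon = PORT_COORDS[port]
    # Parse the date and express it as seconds since the Unix epoch (UTC).
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return [float(weight), lat, lon, dt.timestamp()]


features = encode_inspection(52000, "PORT_A", "2020-01-01")
print(features)
```

Raw seconds are a very large number compared to the other inputs, so in practice it would probably need rescaling, or splitting into year/month/day-of-week, before a model can use it well.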


I’ve also checked the first chapter, where there is a model for CSV/tabular data, but that one predicts a single column based on the others, so it may not be possible to predict many error types with it.
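One possible way around the single-column limit, as far as I understand it, is to give every error code its own 0/1 target column, so each inspection becomes a "multi-hot" vector. A minimal sketch of that encoding follows; the error code labels and the helper names `build_vocab` and `multi_hot` are invented for the example.

```python
def build_vocab(records):
    """Sorted list of every error code that appears in the data."""
    return sorted({code for codes in records for code in codes})


def multi_hot(codes, vocab):
    """One 0/1 entry per known error code."""
    present = set(codes)
    return [1 if code in present else 0 for code in vocab]


# Hypothetical error codes per inspection (made-up labels).
inspections = [["E01", "E07"], [], ["E07"], ["E03", "E01"]]

vocab = build_vocab(inspections)
targets = [multi_hot(codes, vocab) for codes in inspections]

print(vocab)
for row in targets:
    print(row)
```

With targets in this shape, the problem becomes one yes/no prediction per error code, which could be handled by a separate logistic regression per column or by a single model with one output per code.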

Thank you for the answer, @mrnobody. There are more than 300 different boats. Ports also have different inspection regimes; for example, some ports give certain error codes more often than others. There is an obvious pattern, but I wonder if it’s beyond the fastai tools.
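A simple way to check that per-port pattern before training anything is to count how often each port hands out each code. This is only a counting baseline, not a model, and the port names, error codes, and the `port_code_rates` helper below are placeholders I made up.

```python
from collections import Counter, defaultdict


def port_code_rates(inspections):
    """inspections: iterable of (port, [error codes]) pairs.

    Returns {port: {code: fraction of that port's inspections with the code}}.
    """
    visits = Counter()
    hits = defaultdict(Counter)
    for port, codes in inspections:
        visits[port] += 1
        for code in set(codes):  # count each code once per inspection
            hits[port][code] += 1
    return {p: {c: n / visits[p] for c, n in hits[p].items()} for p in visits}


# Made-up sample records: (port, error codes found in that inspection).
data = [
    ("PORT_A", ["E01"]),
    ("PORT_A", ["E01", "E07"]),
    ("PORT_B", []),
    ("PORT_B", ["E07"]),
]

rates = port_code_rates(data)
print(rates)
```

If a learned model cannot beat these per-port frequencies, the ship and date features are probably not adding much yet.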

I think it’s pretty standard, but as it says in the intro of the book, we need to find a way to feed meaningful data to the function (which, of course, means numbers).

If you upload the data somewhere, we could try to work something out. It seems to be a standard classification problem, but the input data needs some treatment first.