A different kind of multi-label classification

I’m trying to train a model to recognize chess positions from images like this.

_______________p_____Pkbp_pp____P__P____n______pR___K__N_______b

One way would be to use computer vision to slice the board into 64 squares and do vanilla image recognition on each square. This makes a lot of sense, but I’m curious if I can get a model to work on the whole board.

To do this, each board needs 64 labels, one per square. This differs from the usual multi-label classification: instead of a single output layer whose activations are thresholded at some value, it needs a much bigger output layer. The square recognizer would have an output layer of length 13 (6 white pieces, 6 black pieces, empty), whereas the board recognizer would have one of length 13 x 64 = 832.
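
A minimal sketch of the 13 x 64 labelling described above. The class vocabulary (`_` for empty, uppercase for white, lowercase for black) and the helper names are assumptions for illustration, not something fastai provides.

```python
# Assumed class vocabulary (not from the original post): '_' is an empty
# square, uppercase letters are white pieces, lowercase letters are black.
CLASSES = "_PNBRQKpnbrqk"
CLASS_TO_INDEX = {ch: i for i, ch in enumerate(CLASSES)}


def encode_board(label):
    """Turn a 64-character board string into a list of 64 class indices."""
    if len(label) != 64:
        raise ValueError(f"expected 64 squares, got {len(label)}")
    return [CLASS_TO_INDEX[ch] for ch in label]


def decode_board(indices):
    """Inverse of encode_board: 64 class indices back to a board string."""
    return "".join(CLASSES[i] for i in indices)


# Illustrative label: the standard starting position, listed row by row.
start = "rnbqkbnr" + "pppppppp" + "_" * 32 + "PPPPPPPP" + "RNBQKBNR"
targets = encode_board(start)

# One 13-way choice per square, so a flat output layer has 13 * 64 units.
output_units = len(CLASSES) * len(targets)
```

One option is to reshape the 832 outputs to 64 x 13 and apply an ordinary 13-way classification loss to each square, rather than thresholding every unit independently.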

Is there a way I can set this up in fastai? Or is the whole idea too stupid to contemplate and I should just slice it into 64 squares? :smiley:

I think slicing would be easier, but to do it with fastai, one approach would be to generate a bunch of images with all the different positions and feed them into a classifier.

Hi Nate. This looks very much like an image segmentation problem, just with an 8x8 output image and 13 classes. I bet the standard approach could be adapted.
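
As a rough sketch of what the segmentation view means at prediction time: the model would emit 13 scores for each cell of an 8x8 grid, and the board is read off by taking the highest-scoring class per cell. The class string and the `decode_grid` helper are assumed names, and the scores below are toy values rather than real model output.

```python
# Assumed class order: '_' empty, uppercase white, lowercase black.
CLASSES = "_PNBRQKpnbrqk"


def decode_grid(scores):
    """scores[row][col] holds 13 class scores; return 8 strings of 8 characters."""
    rows = []
    for row in scores:
        chars = []
        for cell in row:
            # Per-cell argmax over the 13 class scores.
            best = max(range(len(CLASSES)), key=lambda i: cell[i])
            chars.append(CLASSES[best])
        rows.append("".join(chars))
    return rows


# Toy scores: every cell favours "empty" except one cell that favours 'K'.
scores = [[[1.0] + [0.0] * 12 for _ in range(8)] for _ in range(8)]
scores[7][4] = [0.0] * 13
scores[7][4][CLASSES.index("K")] = 5.0

board_rows = decode_grid(scores)
```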

You probably don’t need a full neural network for this, just a single conv layer. I would use 12 different convolution filters, where each filter contains the image of a chess piece (if I counted correctly there are 12 different pieces?). Then slide each over the 8x8 cells. For each cell, take the piece with the highest activation. (You could encode black as -1 and white as 1, and nothing as 0.)
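
A toy sketch of that filter idea in plain Python, assuming each square has already been cropped to the same size as the templates. The 2x2 "templates", the `empty_threshold` parameter and the function names are all made up for illustration; real templates would be images of the pieces, ideally zero-mean so that bright squares don’t win by default.

```python
def correlate(patch, template):
    """Sum of elementwise products: the response of one filter at one position."""
    return sum(
        p * t
        for patch_row, template_row in zip(patch, template)
        for p, t in zip(patch_row, template_row)
    )


def classify_square(patch, templates, empty_threshold):
    """Pick the template with the highest response, or '_' if none is strong enough."""
    best_name, best_score = max(
        ((name, correlate(patch, tmpl)) for name, tmpl in templates.items()),
        key=lambda item: item[1],
    )
    return best_name if best_score >= empty_threshold else "_"


# Hypothetical 2x2 templates with pixel values in [-1, 1].
templates = {
    "K": [[1, -1], [-1, 1]],
    "q": [[-1, 1], [1, -1]],
}

king = classify_square([[1, -1], [-1, 1]], templates, empty_threshold=2)
queen = classify_square([[-1, 1], [1, -1]], templates, empty_threshold=2)
empty = classify_square([[0, 0], [0, 0]], templates, empty_threshold=2)
```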

Nate, I’d second Malcom’s thought that you have an image segmentation problem. In fastai, the main dataset for this task is called CamVid, so you can search for previous lessons on it.

Another approach is to identify the top left and bottom right corners of the board using Image -> Point(s) tasks (headpose dataset in fastai). Then you can slice up into 8x8 and do traditional image classification.
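
Once the two corners are known, the slicing step is simple arithmetic. The sketch below assumes an axis-aligned board with no perspective distortion; `square_boxes` is a hypothetical helper name.

```python
def square_boxes(left, top, right, bottom):
    """Return 64 (x0, y0, x1, y1) pixel boxes, row by row from the top-left corner."""
    boxes = []
    for row in range(8):
        for col in range(8):
            # Integer division keeps the boxes contiguous even when the
            # board size is not an exact multiple of 8.
            x0 = left + (right - left) * col // 8
            x1 = left + (right - left) * (col + 1) // 8
            y0 = top + (bottom - top) * row // 8
            y1 = top + (bottom - top) * (row + 1) // 8
            boxes.append((x0, y0, x1, y1))
    return boxes


boxes = square_boxes(0, 0, 800, 800)
```

Each tuple is in (left, upper, right, lower) order, which is the box format Pillow’s `Image.crop` accepts.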

I’m also working on (real life) chess pieces for my project: https://github.com/sutt/fastai2-dev/tree/master/chess-classification-hw . Let me know if you have a way we can collaborate.

I’d even go a step further and do full ImagePoints. As long as you fix which piece each point in your row of returned values stands for, and set the points for pieces that aren’t on the board to (-1, -1), you can work out all of them and you’re set.

Oh, cool! We actually worked on something like this at my previous job, but we didn’t manage to get it completely where we wanted it. When you try to apply it to a whole chess game it gets surprisingly hard because of things like occlusion (piece is blocked by another piece or a player’s hand). Excited to see what you come up with, and yes, would love to collaborate!

This approach makes a lot of sense. The only issue I see is that if I want it to recognize different chess sets (each chess site has a slightly different design), then an NN might come in handy.

OpenCV has a function called findChessboardCorners that might be useful if you go with the “slice up the board” approach. It’s meant for camera calibration, but I think you could repurpose it for this use case: https://docs.opencv.org/2.4/modules/calib3d/doc/camera_calibration_and_3d_reconstruction.html#findchessboardcorners

Just want to add a demo video to this, because seeing is believing with this algo: https://www.youtube.com/watch?v=Xi2jMe8oDes

I actually tried this (couldn’t believe my eyes when I found OpenCV had a built-in find-chessboard function)! Unfortunately, in my testing it didn’t work reliably when there were pieces on the board.

Have you considered using object detection? It would allow for different board/piece types and you could also detect visually blocked pieces…

I’m gonna start with the uniform 2d board diagram and if I can get that to work go from there. Looks like Will is already pretty far along on the 3d version (see above), which does seem a lot like object detection. If your end goal is to get the position on the board, you need to not only detect the object (piece) but also figure out which square it’s on.

You can get the position from the object coordinates prediction I guess…
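
A sketch of that mapping, assuming the board’s top-left corner and side length in pixels are already known and the board is seen straight on. Which point of a detected box to use as the piece’s location is left open here; `square_from_point` and its parameters are hypothetical names.

```python
import math

FILES = "abcdefgh"


def square_from_point(x, y, left, top, side, white_at_bottom=True):
    """Map a pixel coordinate to a square name such as 'e4', or None if off the board."""
    col = math.floor((x - left) * 8 / side)
    row = math.floor((y - top) * 8 / side)
    if not (0 <= col < 8 and 0 <= row < 8):
        return None
    if white_at_bottom:
        # Top row of the image is rank 8, leftmost column is the a-file.
        return f"{FILES[col]}{8 - row}"
    # Board flipped: top row is rank 1, leftmost column is the h-file.
    return f"{FILES[7 - col]}{row + 1}"


bottom_left = square_from_point(50, 750, left=0, top=0, side=800)
top_middle = square_from_point(450, 50, left=0, top=0, side=800)
```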

Perhaps CoordConv layers with multi-label classes (i.e. position, piece type, and piece color for each piece in each image) could be useful. The way CoordConv preserves spatial information in their examples seems relevant to your project.

Well, it took me a while, but I got this working! I made a Voila app that you can try out if you like. It’s been tested with the standard boards from lichess, chess.com, and Chessbase. The image needs to be cropped so it includes only the board and nothing else.

As a next step, I’d like to use object detection to find a board in a larger image, so the user doesn’t need to crop the board out manually. This task is easier than what’s typically meant by object detection in a few ways.

  • I’m only looking for one category of object, not many
  • I know there will be exactly one board in the image
  • I know it’s a square

For these reasons the standard object detection models like YOLO and SSD seem like overkill. Any suggestions for an appropriate architecture to use for this relatively simple object detection problem?
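
To make the simplification concrete: with exactly one square board per image, the whole target can be written as three numbers (left, top, side), so one possible framing is plain regression on those three values. The normalisation below is an assumed convention and the function names are hypothetical; this is a sketch of the target format, not a recommendation of a particular architecture.

```python
def encode_square(left, top, side, image_width, image_height):
    """Normalise a square box to three numbers in [0, 1] (assumed target format)."""
    longest = max(image_width, image_height)
    return (left / image_width, top / image_height, side / longest)


def decode_square(target, image_width, image_height):
    """Turn a predicted (x, y, s) triple back into a (left, top, right, bottom) box."""
    x, y, s = target
    side = s * max(image_width, image_height)
    left = x * image_width
    top = y * image_height
    return (left, top, left + side, top + side)


# Round trip on an illustrative 1024 x 512 image.
target = encode_square(128, 64, 256, image_width=1024, image_height=512)
box = decode_square(target, image_width=1024, image_height=512)
```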

I would have used cv2 template matching for this task. Here is someone using it on a chessboard: https://stackoverflow.com/questions/61779288/how-to-template-match-a-simple-2d-shape-in-opencv

If it has to be fast, very fast, I’d suggest finding some sort of ‘fingerprint’ for each piece. Like maybe only the White Queen is black at (100,100) and white at (50,50), or whatever.
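
A toy version of the fingerprint idea, assuming every square is rendered identically so that a handful of probe pixels is enough to tell the pieces apart. The probe coordinates and pixel values are invented for illustration, and `match_fingerprint` is a hypothetical helper.

```python
BLACK, WHITE = 0, 255

# Hypothetical fingerprints: piece -> {(x, y): expected pixel value}.
FINGERPRINTS = {
    "Q": {(1, 1): BLACK, (0, 0): WHITE},
    "k": {(1, 1): WHITE, (0, 0): BLACK},
}


def match_fingerprint(square, fingerprints):
    """Return the first piece whose probe pixels all match, else None."""
    for name, probes in fingerprints.items():
        # square is indexed as square[y][x], i.e. row first.
        if all(square[y][x] == value for (x, y), value in probes.items()):
            return name
    return None


white_queen = match_fingerprint([[WHITE, 128], [128, BLACK]], FINGERPRINTS)
black_king = match_fingerprint([[BLACK, 128], [128, WHITE]], FINGERPRINTS)
no_match = match_fingerprint([[128, 128], [128, 128]], FINGERPRINTS)
```

Only a few pixels are read per square, which is where the speed comes from; the price is that any change in piece design or scaling breaks the probes.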

Also, either of the approaches above can be used to find where the board is in a screenshot. Here, the needle in your haystack is the one-pixel-high top row of the chessboard.
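
The needle-in-a-haystack search can be sketched in a few lines, assuming an exact pixel match (no scaling or compression artifacts) and images stored as lists of rows. `find_row` is a hypothetical name and the pixel values are toy data.

```python
def find_row(haystack, needle):
    """Return (x, y) of the first exact occurrence of `needle` within a row, else None."""
    width = len(needle)
    for y, row in enumerate(haystack):
        for x in range(len(row) - width + 1):
            if row[x:x + width] == needle:
                return (x, y)
    return None


# Toy screenshot: the board's top row [7, 8, 7] starts at x=1, y=1.
screenshot = [
    [0, 0, 0, 0, 0],
    [0, 7, 8, 7, 0],
    [0, 1, 1, 1, 0],
]
top_left = find_row(screenshot, [7, 8, 7])
```

The returned coordinate is the board’s top-left corner; the needle’s width gives the side length, since the board is square.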

Neural nets are awesome but if the task doesn’t need to be generalisable and if you can get an accurate result faster with features, use features.

It does need to be somewhat generalizable though because I want it to recognize boards from different sources. Currently it supports lichess, chess.com, and Chessbase, which all use different board and piece designs, and I’m planning to add support for other sites as well. Ultimately, I’d like it to be able to even recognize designs it hasn’t seen yet, as long as they’re recognizable as chess pieces.