I want to generate text descriptions from an electronic circuit diagram. I’ve been struggling with some circuits in my side projects, and I think it’d be a fun project to try. So far, this is the structure I have in mind:
1. Segment the image into components (and figure out how they are connected)
2. Use a CV model to recognize each of the components
3. Generate a text representation of the circuit
4. Feed the representation to an LLM
Parts 2, 3 and 4 seem fairly straightforward. I might struggle to build the classifier in step 2, but I know how to tackle that problem. The text format for step 3 could be something like the Mermaid format:
A[Battery] -- Current --> B[(Resistor)]
B -- Current --> C[(Capacitor)]
C -- Current --> D[(Inductor)]
D -- Current --> A
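To make step 3 a bit more concrete, here is a rough sketch of how the text could be generated once the components and their connections are known. The `Component` class, the `to_mermaid` function, and the hard-coded example circuit are all made up for illustration. In the real pipeline they would be filled in from the segmentation and classification steps. It also uses plain `[Label]` nodes and adds a `graph LR` header, which is an assumption about how the text would be rendered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    node_id: str  # short Mermaid node ID, e.g. "A" (hypothetical naming scheme)
    label: str    # class name predicted by the recognizer, e.g. "Resistor"


def to_mermaid(components, connections, edge_label="Current"):
    """Render components and directed connections as Mermaid flowchart text."""
    labels = {c.node_id: c.label for c in components}
    lines = ["graph LR"]
    declared = set()

    def ref(node_id):
        # Attach the label only the first time a node appears;
        # afterwards the bare ID is enough for Mermaid.
        if node_id in declared:
            return node_id
        declared.add(node_id)
        return f"{node_id}[{labels[node_id]}]"

    for src, dst in connections:
        lines.append(f"    {ref(src)} -- {edge_label} --> {ref(dst)}")
    return "\n".join(lines)


# Example circuit, hard-coded here in place of real detector output.
components = [
    Component("A", "Battery"),
    Component("B", "Resistor"),
    Component("C", "Capacitor"),
    Component("D", "Inductor"),
]
connections = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

print(to_mermaid(components, connections))
```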
However, I am not sure how to approach the first step, segmenting the image. I have tried some frameworks like YOLO, but with little success. Any ideas on how to do this?