AGI idea. Need help from experts

So I personally have very limited experience with actually building my own deep learning architectures, otherwise I would’ve done this myself.

But here’s what I’m thinking:

The model will have 3 parts:

  1. Entry blocks
  2. Central block (this is shared among all entry and exit blocks)
  3. Exit blocks

As mentioned, the central block is shared among all, and probably is going to be a massive architecture.

And for entry blocks you have architectures for a few types of inputs.

For example, the IMAGE ENTRY BLOCK will accept 500x500 images, pass them through multiple layers, and then connect to the shared CENTRAL BLOCK.

Or, the SPEECH ENTRY BLOCK will accept audio, with multiple layers connected to this input layer, and at the end connect to the shared CENTRAL BLOCK.

Same for the TEXT ENTRY BLOCK.

Then at the end you have your EXIT BLOCKS, which give you a summary as an image and as text (which is the same as speech now that I think about it… Unless you don’t turn it into speech and let it blabber out nonsense until it figures itself out… Hmm)

I think the idea of swappable ENTRY and EXIT BLOCKS could be useful, but considering I have next to no experience with building model architectures, I will refrain from going too SciFi on y’all.


What you are suggesting has been explored in the deep learning literature before; for example, Gato by DeepMind, MultiModel by Google Brain, and Pathways, a work-in-progress architecture, also by Google Brain, that has witnessed great success in natural language processing. These models are straightforward to understand, with no particularly knotty bits, and I would recommend you give the papers a read if this topic has garnered your attention.
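
To make the three-part layout concrete, here is a minimal sketch in plain Python. It is not how Gato, MultiModel, or Pathways are implemented. The `linear` and `stack` helpers, the layer sizes, and the tiny toy inputs (a flattened 4x4 "image" rather than 500x500) are all assumptions made up for illustration, and the weights are random and untrained. The only point is the wiring: every entry block ends at the same width, so one central block can serve all of them.

```python
import math
import random

random.seed(0)

HIDDEN = 8  # width of the shared representation (arbitrary, for illustration)


def linear(n_in, n_out):
    """Build a layer computing tanh(W x + b) with fixed random weights."""
    w = [[random.uniform(-1, 1) / math.sqrt(n_in) for _ in range(n_in)]
         for _ in range(n_out)]
    b = [0.0] * n_out

    def forward(x):
        assert len(x) == n_in, f"expected {n_in} values, got {len(x)}"
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
                for row, bi in zip(w, b)]

    return forward


def stack(*layers):
    """Chain layers so the output of one feeds the next."""
    def forward(x):
        for layer in layers:
            x = layer(x)
        return x
    return forward


# Entry blocks: one per input type, each ending at the shared width.
entry_blocks = {
    "image": stack(linear(16, 12), linear(12, HIDDEN)),  # toy 4x4 image, flattened
    "speech": stack(linear(10, HIDDEN)),                 # toy 10-sample audio frame
    "text": stack(linear(6, HIDDEN)),                    # toy 6-value text encoding
}

# Central block: a single instance shared by every entry and exit block.
central_block = stack(linear(HIDDEN, HIDDEN), linear(HIDDEN, HIDDEN))

# Exit blocks: map the shared representation back to an output type.
exit_blocks = {
    "image": linear(HIDDEN, 16),
    "text": linear(HIDDEN, 6),
}


def run(modality_in, x, modality_out):
    """Route an input through its entry block, the shared core, and an exit block."""
    h = entry_blocks[modality_in](x)
    h = central_block(h)
    return exit_blocks[modality_out](h)


toy_image = [random.random() for _ in range(16)]
toy_text = [random.random() for _ in range(6)]

caption_like = run("image", toy_image, "text")   # image in, text-shaped vector out
picture_like = run("text", toy_text, "image")    # text in, image-shaped vector out
print(len(caption_like), len(picture_like))      # 6 16
```

Because the blocks live in dictionaries keyed by input or output type, "swapping" one amounts to adding or replacing an entry, as long as it respects the shared width. A real system would of course need trained weights and far richer blocks than these toy layers.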

On a related note, models like these have spurred polarizing conversations about whether AGI is nigh or not. Upon Gato’s release, for instance, some claimed that AGI is essentially here, and it is only a matter of time, better accelerators, scale, etc. until true AGI is attained, whereas others replied that DeepMind’s model, although impressive, is far, far from AGI.

It is important to bear in mind what AGI refers to when engaging in such discussions - is it being able to simultaneously play different Atari games, conduct image classification, and write prose? Is it equivalent to human intelligence? Perhaps AGI surpasses human intelligence? In that case, how can humans comprehend something that is, by definition, outside the scope of their understanding? Etc.

Is that helpful?


This was very helpful, thank you. I’ll take a look at those papers for sure.
