What's changed in the world of ML and deep learning since Part 1 2022 came out?


I was just reading the paragraph in chapter 1 where Jeremy talks about how everyone’s ignored foundation (“pre-trained”) models even though they’re the fastest way to get results.

Today, in 2024, foundation models are EVERYWHERE. From Hugging Face to the built-in pre-trained models offered in services like AWS Bedrock, this is definitely something the book is out of date on (it’s only been about 2 years, but here we are!)

I’m not complaining - I’m genuinely curious. What else has drastically changed since the book came out?

Yes, I’d also like to know more about it… Do we play with fine-tuning or work on architectures?

Lots of things have changed and keep changing, indeed.
To get a more stable view of things, I find the paper “On the Opportunities and Risks of Foundation Models”, by Bommasani et al., rather enlightening about why and how things have changed so much and what is probably coming.

I would say the models. Modern models have billions of parameters, which means they have high computation costs. Does your brain require billions of parameters when first learning to crawl? So although the new models are fantastic (on a good day), I think they are only a stepping stone. I suspect Chomsky might be right. Semantics takes more than a billion parameters.
Regards Conwyn
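
To put rough numbers on the cost of billions of parameters, here is a back-of-the-envelope sketch of the memory needed just to store a model's weights. The 7-billion-parameter size and the function name are hypothetical, chosen only for illustration, and the estimate ignores activations, optimizer state, and any runtime overhead.

```python
# Bytes needed to store one parameter at common numeric precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gb(n_params: int, dtype: str = "fp16") -> float:
    """Approximate memory (in GB, with 1 GB = 1e9 bytes) to hold the weights alone."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    n_params = 7_000_000_000  # hypothetical model size, chosen for illustration
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(n_params, dtype):.0f} GB")
```

Even at half precision, the weights alone in this example would not fit in the memory of many consumer GPUs, which is part of why hosted models and APIs are so common.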

So would you say, for data scientists, the skill they need to master is the use of foundation models?

What about for product focused people? Do they have to learn the benefits of foundation models to apply them to their business?

Similarly for ML/MLops engineers - should they focus on learning to integrate foundation models via their APIs?

I think the problem is the name Data Scientist. A Data Scientist should know traditional statistics, have presentation skills, and understand the new AI tools. Once they become nothing more than Foundation Model technicians (and the money might be good), they stop being Data Scientists. A model is an approximation of reality. It has no understanding, only the appearance of understanding.

People who work in real businesses, where data is vague (on a good day), will use models, which I assume will pop up in Excel eventually.

ML/MLOps engineers just need to use the models. That is why we have APIs (input - magic - output).

You can say there are a lot of general foundation models available. However, if I remember correctly, chapter 1 specifically mentioned that there is a lack of foundation models in specific or specialized fields/domains. It may be worth looking at it from this angle.

Since Part 1 2022, there have been several notable advancements and changes in the world of machine learning (ML) and deep learning:

  1. Advancements in Models: There have been significant advancements in large language models (LLMs), such as GPT-4 and newer versions of BERT, which have achieved better performance in various natural language processing (NLP) tasks.
  2. Attention Mechanisms: Attention mechanisms have become more prevalent in deep learning models, improving their ability to focus on relevant parts of input data and enhancing performance in tasks like image recognition and language translation.
  3. Efficient Architectures: There has been a continued focus on more efficient deep learning architectures, in the line of earlier families such as MobileNet and EfficientNet (both of which predate 2022), to reduce model size and computational requirements while maintaining or even improving performance.
  4. AutoML and Model Compression: The development of AutoML tools and techniques for model compression has democratized machine learning by making it more accessible to developers with limited expertise and resources.
  5. Applications in Healthcare: Deep learning has seen increased adoption in healthcare, with applications ranging from medical image analysis for diagnosis to drug discovery and personalized treatment plans.
  6. Ethical Considerations: There is growing awareness of the ethical implications of AI and deep learning, leading to discussions and initiatives around responsible AI development, fairness, transparency, and accountability.
  7. Climate Change and Sustainability: Researchers are exploring ways to make deep learning more sustainable by developing energy-efficient models and reducing the environmental impact of training large-scale models.
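
For anyone wondering what item 2 means in practice, here is a minimal sketch of scaled dot-product attention (the form used in Transformers) in plain Python. The toy vectors and function names are made up for illustration. Real models use learned projections, multiple heads, and batched tensor operations rather than Python lists.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention weights and the weighted average of `values`.
    """
    d = len(query)
    # Similarity of the query to each key, scaled by sqrt(d).
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    dim_v = len(values[0])
    # Each output component is a weighted average over the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim_v)]
    return weights, output


if __name__ == "__main__":
    # Toy inputs, made up for illustration: the query points along the second key.
    keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    values = [[10.0], [20.0], [30.0]]
    weights, output = attention([0.0, 4.0], keys, values)
    print("weights:", weights)
    print("output:", output)
```

The second key matches the query best, so it gets the largest weight and the output is pulled toward its value. That is the "focus on relevant parts of the input" idea in its simplest form.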