
AI Evolution, Combining Models and Workflows: The Next Paradigm Shift? (3.21.24)

AI Agent Workflows, Foundational Models, Gemini 1.5 Pro, Open Data

Friends, today's stories are amazing. From Sakana AI's novel approach to model development through evolutionary techniques to the UN's landmark resolution on AI regulation, today's news captures the tension in our quest for progress: machines learning to evolve while global policies strive to keep pace with their rapid advancement.

Thanks for joining us. If you appreciate these stories, we hope you’ll join others in sharing them with friends and co-workers. A warm welcome to the hundreds of you who’ve joined us so far this month.

Now here’s today’s AI news,

Marshall Kirkpatrick, Editor

First impacted: AI model developers, AI researchers
Time to impact: Medium

Sakana AI, one of whose founders co-authored the legendary "Attention Is All You Need" paper, has shared a report on what it calls Evolutionary Model Merge, a method it says automates the creation of foundation models. It's "...a general method that uses evolutionary techniques to efficiently discover the best ways to combine different models from the vast ocean of different open-source models with diverse capabilities. As of writing, Hugging Face has over 500k models in dozens of different modalities that, in principle, could be combined to form new models with new capabilities!" The company tested the method by creating a Japanese Large Language Model (LLM) and a Japanese Vision-Language Model (VLM), claiming it can generate new foundation models without gradient-based training (the dominant approach, akin to gradually adjusting the knobs on a machine until it makes the fewest mistakes), thus using fewer computing resources. If you are researching AGI and the evolutionary pathways of foundation models, we highly recommend reading this. [Evolving New Foundation Models: Unleashing the Power of Automating Model Development] Explore more of our coverage of: Sakana AI, Evolutionary Model Merge, Foundation Models. Share this story by email
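For readers who want intuition for how evolutionary search differs from gradient descent, here is a toy sketch. This is purely our illustration, not Sakana's actual algorithm (which operates on real model weights and data-flow configurations): we "merge" two pretend models by evolving a single interpolation weight, scoring candidates by how close the merged parameters come to a target behavior, with no gradients involved.

```python
import random

# Two pretend "models": parameter vectors that each solve part of a task.
model_a = [1.0, 0.0, 0.5]
model_b = [0.0, 1.0, 0.5]
target = [0.6, 0.4, 0.5]  # behavior we want the merged model to exhibit

def merge(alpha):
    """Linearly interpolate parameters: alpha*A + (1 - alpha)*B."""
    return [alpha * a + (1 - alpha) * b for a, b in zip(model_a, model_b)]

def fitness(alpha):
    """Lower is better: squared error of merged params vs. the target."""
    return sum((m - t) ** 2 for m, t in zip(merge(alpha), target))

def evolve(generations=50, pop_size=20, seed=0):
    """Selection + mutation over merge weights; no gradients anywhere."""
    rng = random.Random(seed)
    pop = [rng.random() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)                      # rank by fitness
        survivors = pop[: pop_size // 2]           # keep the best half
        children = [min(1.0, max(0.0, p + rng.gauss(0, 0.05)))
                    for p in survivors]            # mutate survivors
        pop = survivors + children
    return min(pop, key=fitness)

best = evolve()  # converges near 0.6, the weight that reproduces the target
```

Sakana's method searches a far richer space (layer-wise merge recipes across many real models), but the core idea is the same: propose, score, select, mutate.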

First impacted: AI researchers, AI developers
Time to impact: Medium

Andrew Ng, one of the most influential figures in AI, has released a post outlining how AI agent workflows are anticipated to drive substantial AI progress this year, potentially surpassing the impact of next-generation foundation models. The use of these workflows with LLMs, the proliferation of open-source agent tools, and academic research on agents are all contributing to the development of more design patterns for building agents. Ng says his team regularly uses reflection, tool use, planning, and multi-agent collaboration. [DeepLearning.ai The Batch Issue 241] Explore more of our coverage of: AI Agent Workflows, Foundation Models, Open Source Tools. Share this story by email
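Reflection, the first pattern Ng mentions, simply means having the model critique and then revise its own output before returning it. A minimal sketch of that loop, with a canned stub standing in for the real LLM call (the stub, function names, and canned strings are ours for illustration, not from Ng's post):

```python
def llm(prompt: str) -> str:
    """Stub for a real LLM API call; returns canned text for illustration."""
    if "Critique" in prompt:
        return "The draft is missing error handling."
    return "revised draft with error handling"

def reflect_and_revise(task: str, rounds: int = 2) -> str:
    """Agent loop: draft, critique the draft, revise; repeat."""
    draft = llm(f"Write a solution for: {task}")
    for _ in range(rounds):
        critique = llm(f"Critique this draft:\n{draft}")
        draft = llm(f"Revise the draft to address: {critique}\n\nDraft:\n{draft}")
    return draft

result = reflect_and_revise("parse a CSV file")
```

With a real model in place of the stub, each pass through the loop is a fresh LLM call, which is why agentic workflows trade extra tokens for higher-quality output.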

First impacted: Developers, Financial Analysts
Time to impact: Short

Google has launched API support for Gemini 1.5 Pro, a model with a context window of up to 1 million tokens via the API (Google has demonstrated up to 10M tokens in research). The company plans to start with developers and gradually expand the number of users with API access. That context length is enough to process an entire movie's worth of information. You can test it out in a chat interface here; unfortunately, when we gave it a four-minute video talking through an outline and asked it to capture that outline, the output was on an entirely different topic. [Google AI Studio] Explore more of our coverage of: Google API, Gemini 1.5 Pro, Code Analysis. Share this story by email
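For developers who want to try the API, Gemini models are exposed through a REST generateContent endpoint. A minimal sketch of the request shape (model name, endpoint path, and field names reflect Google's public docs as of this writing, but check the current documentation before relying on them; the prompt text and config values are our placeholders):

```python
import json

API_KEY = "YOUR_API_KEY"  # placeholder; get a key from Google AI Studio
MODEL = "gemini-1.5-pro-latest"
url = (f"https://generativelanguage.googleapis.com/v1beta/"
       f"models/{MODEL}:generateContent?key={API_KEY}")

# Request body: a list of conversation turns, each with text "parts".
payload = {
    "contents": [
        {"role": "user",
         "parts": [{"text": "Summarize the plot of this screenplay: ..."}]}
    ],
    "generationConfig": {"temperature": 0.2, "maxOutputTokens": 1024},
}
body = json.dumps(payload)
# POST `body` to `url` with any HTTP client; in the JSON response,
# candidates[0].content.parts[0].text holds the generated output.
```

The long-context capability doesn't change the request shape; it just means the "parts" list can carry far more material per call.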

First impacted: Data scientists, AI researchers
Time to impact: Short

Hugging Face has released a post outlining the creation of Cosmopedia, the largest open synthetic dataset to date, with over 30 million files and 25 billion tokens of synthetic textbooks, blog posts, stories, and WikiHow articles. To create Cosmopedia, Hugging Face combined curated sources and web data to generate over 30 million diverse prompts, keeping duplicate content under 1% and covering a wide range of topics. The team also released cosmo-1b, a model trained on Cosmopedia, which they claim outperforms TinyLlama 1.1B on several benchmarks and matches Qwen-1.5-1B on others. [Cosmopedia: how to create large-scale synthetic data for pre-training] Explore more of our coverage of: Hugging Face, Synthetic Dataset, AI Benchmarking. Share this story by email
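Keeping duplication under 1% across 30 million prompts is a real engineering problem in its own right. One common approach to near-duplicate detection (a conceptual illustration only; we're not claiming this matches Hugging Face's exact pipeline) is to compare the character n-gram overlap of texts and drop anything too similar to what's already kept:

```python
def shingles(text: str, n: int = 3) -> set:
    """Character n-grams of a whitespace-normalized, lowercased string."""
    t = " ".join(text.lower().split())
    return {t[i:i + n] for i in range(len(t) - n + 1)}

def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts' shingle sets (1.0 = identical)."""
    sa, sb = shingles(a), shingles(b)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def dedupe(prompts, threshold=0.8):
    """Keep a prompt only if it isn't too similar to any already kept.
    This is O(n^2); real pipelines use MinHash/LSH to scale to millions."""
    kept = []
    for p in prompts:
        if all(jaccard(p, q) < threshold for q in kept):
            kept.append(p)
    return kept

prompts = [
    "Write a textbook chapter about photosynthesis for middle school.",
    "Write a textbook chapter about photosynthesis for middle schoolers.",
    "Write a WikiHow article on repairing a bicycle tire.",
]
unique = dedupe(prompts)  # drops the near-duplicate second prompt
```

The threshold of 0.8 is our arbitrary choice; tightening or loosening it trades diversity against redundancy in the resulting dataset.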

First impacted: Robotics Researchers, AI Developers
Time to impact: Short

DROID, a diverse robotic dataset, has been launched, featuring 76k episodes collected across 13 institutions, 52 buildings, and 564 scenes. Sergey Levine (Associate Professor at UC Berkeley) announced that the dataset's development was led by researchers Sasha Khazatsky and Karl Pertsch. It is designed to prepare robots for a wide range of real-world environments and comes with a website, dataset visualizer, dataset Colab, and developer documentation. [via @svlevine] Explore more of our coverage of: Robotic Dataset, Real-World Environments, AI Development. Share this story by email

First impacted: AI system developers, Policy makers in AI technology
Time to impact: Long

The UN General Assembly, supported by over 120 member countries, has passed a resolution focused on the creation of "safe, secure and trustworthy" AI systems. The resolution, a first in this domain, emphasizes the importance of human rights in AI design and use, and promotes the development of regulations for safe AI use, while also advising stakeholders “to refrain from or cease the use of artificial intelligence systems that are impossible to operate in compliance with international human rights law or that pose undue risks to the enjoyment of human rights.” [General Assembly adopts landmark resolution on artificial intelligence] Explore more of our coverage of: UN General Assembly, AI Regulation, Human Rights in AI. Share this story by email