AI: LLaVA Claims to Top Gemini Pro, and more (1.31.24)

Together Inference, Argilla DPO, AI Audits

Friends, four stories rose to the top of the AI community conversation today. We’ve got short summaries below. From a multi-modal challenger to Gemini Pro to an audit of audits (they didn’t fare well), I hope these stories are interesting and useful to you.

-Marshall Kirkpatrick, Editor

First impacted: Artificial intelligence developers, Chinese-language multimodal users
Time to impact: Short

The LLaVA team (with developers from Microsoft Research, TikTok, the University of Wisconsin-Madison, and UC Berkeley) has released LLaVA-1.6, an upgraded model with enhanced reasoning and OCR capabilities, broader world knowledge, and the capacity to process images with four times more pixels across three aspect ratios. According to the team, the model was trained on 32 A100 GPUs in approximately one day, and it outperforms Gemini Pro on several benchmarks. [LLaVA-1.6: Improved reasoning, OCR, and world knowledge] Explore more of our coverage of: AI Development, OCR Capabilities, Chinese Multimodal Scenarios. Share this story by email

First impacted: Data Scientists, Software Developers
Time to impact: Short

Together Inference has launched JSON Mode and Function Calling for its language models, meaning its customers can wire third-party functionality into models like Mixtral or CodeLlama and have the output delivered in a precisely specified JSON structure. Those are powerful capabilities. [Announcing function calling and JSON mode] Explore more of our coverage of: Language Models, JSON Mode, Function Calling. Share this story by email
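
To make that concrete, here is a minimal JSON Mode sketch, assuming Together's OpenAI-compatible chat completions endpoint; the model id and the shape of the response_format parameter follow OpenAI-style conventions and are illustrative, not confirmed details of Together's API:

    # Minimal JSON Mode sketch, assuming Together's OpenAI-compatible endpoint.
    # The base_url, model id, and response_format shape are assumptions here.
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["TOGETHER_API_KEY"],
        base_url="https://api.together.xyz/v1",
    )

    response = client.chat.completions.create(
        model="mistralai/Mixtral-8x7B-Instruct-v0.1",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "Reply in JSON with keys 'city' and 'temperature_c'."},
            {"role": "user", "content": "What's a typical January temperature in Oslo?"},
        ],
    )
    print(response.choices[0].message.content)  # e.g. {"city": "Oslo", "temperature_c": -5}

Function calling works the same way in spirit: the caller passes tool definitions, the model returns structured arguments, and the caller executes the function and feeds the result back.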

First impacted: AI model developers, Data scientists, QA
Time to impact: Short

Spanish startup Argilla has released an open direct preference optimization (DPO) dataset called 'distilabel capybara-dpo'. The dataset helps fine-tune AI chat models to output answers closer to what a human prefers; DPO works without explicit reward modeling or reinforcement learning, and it has been identified as an important area for research in 2024 by people like Andrew Ng. [via @argilla_io] Explore more of our coverage of: Argilla, AI Chat Models, DPO Dataset. Share this story by email
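
For a rough picture of how a preference dataset like this gets used, here is a minimal DPO fine-tuning sketch with TRL's DPOTrainer; the dataset id, base model, and 'prompt'/'chosen'/'rejected' column layout are assumptions for illustration, not details from Argilla's announcement:

    # Minimal DPO fine-tuning sketch using TRL; the dataset id, base model,
    # and column layout are assumptions, not confirmed by the announcement.
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
    from trl import DPOTrainer

    model_name = "mistralai/Mistral-7B-Instruct-v0.2"  # illustrative base model
    model = AutoModelForCausalLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token  # Mistral ships without a pad token

    # Hypothetical Hugging Face id for the dataset described in the story;
    # DPOTrainer expects string columns named "prompt", "chosen", "rejected".
    dataset = load_dataset("argilla/distilabel-capybara-dpo", split="train")

    trainer = DPOTrainer(
        model,
        ref_model=None,   # TRL clones a frozen reference model when None
        beta=0.1,         # strength of the preference penalty
        train_dataset=dataset,
        tokenizer=tokenizer,
        args=TrainingArguments(output_dir="capybara-dpo", per_device_train_batch_size=1),
    )
    trainer.train()

The appeal of DPO, and the reason a dataset like this is enough on its own, is that the chosen/rejected pairs stand in for a learned reward model: the trainer directly raises the likelihood of preferred answers relative to rejected ones.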

First impacted: AI auditors, AI researchers, Alignment researchers, QA
Time to impact: Medium

The final story for today may not be seen as a 'riveting' field of focus (for now), but alignment and AI accountability are important topics for the proliferation of AI in all walks of life: they help ensure that AI models stay safe and reliable. Modern audit frameworks, however, lack many of the mechanisms needed to appropriately evaluate and monitor these qualities. This research paper outlines potential failure points in the execution of AI audits and, by extension, in accountability. According to the findings, only a subset of AI audit methods effectively achieve the desired accountability outcomes, underscoring the importance of audit design and the institutional context in which audits are conducted. [AI auditing: The Broken Bus on the Road to AI Accountability] Explore more of our coverage of: AI Accountability, AI Audits, Audit Design. Share this story by email

Read more of today's top AI news stories like this at [https://aitimetoimpact.com/].

That’s it! More AI news tomorrow!