DAWN OF LMMs 🔥 Microsoft puts GPT Vision to test... Final AI Agents Puzzle Piece?
Get on my daily AI newsletter 🔥
https://natural20.beehiiv.com/subscribe
[News, Research and Tutorials on AI]
See more at:
https://natural20.com/
The Paper:
https://arxiv.org/abs/2309.17421
My AI Playlist:
https://www.youtube.com/playlist?list=PLb1th0f6y4XROkUAwkYhcHb7OY9yoGGZH
[TIMELINE]
[00:00] Intro
[02:22] Abstract
[03:53] Accounting
[04:44] Attention to Detail
[06:23] Image Recognition Across Domains
[08:53] Medical Reasoning
[11:23] Making Coffee + Embodied Agents
[12:54] Industry, Manufacturing and QA
[17:11] Graphical User Interface Navigation
[26:24] Understanding Video, Emotions and Aesthetics
[29:10] Analyzing Dash Cam Footage
[30:48] Improving AI Image Prompts
[32:42] Visual Pointing
[37:51] Charts, Languages, Memes and Clues
[51:23] Final Points
2023-10-08T00:42:03Z1280
https://www.youtube.com/embed/-aV9Ud-pYxQ