RT-X and the Dawn of Large Multimodal Models: Google Breakthrough and 160-page Report Highlights
Microsoft has released a huge new insider report on GPT Vision, and in just the last few hours Google has dropped the RT-X series in robotics. I will not only break down the 160+ page report on what GPT-4V can and can't do, including new use cases, prompting techniques and failure modes, I'll also go through the full RT-2-X and RT-1-X demo, which I am calling the GPT-2 moment for robotics. Plus, the huge new open-source Open X-Embodiment dataset.
https://www.patreon.com/AIExplained
RT-X Series: Paper: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/scaling-up-learning-across-many-different-robot-types/Open_X_Embodiment__Robotic_Learning_Datasets_and_RT_X_Models.pdf
Blog: https://www.deepmind.com/blog/scaling-up-learning-across-many-different-robot-types?utm_source=twitter&utm_medium=social&utm_campaign=RT-X
Github: https://robotics-transformer-x.github.io/
The Dawn of LMMs: https://arxiv.org/abs/2309.17421
GPT-4 vs PaLI 17B: https://openai.com/research/gpt-4
GPT V Inception: https://twitter.com/markchen90/status/1708867434491380057/photo/1
PaLI-X: https://arxiv.org/pdf/2305.18565.pdf
PromptBreeder: https://arxiv.org/pdf/2309.16797.pdf
Wozniak Definition: https://en.wikipedia.org/wiki/Artificial_general_intelligence
Gobi from The Information: https://www.theinformation.com/articles/openai-hustles-to-beat-google-to-launch-multimodal-llm?rc=sy0ihq
2023-10-06T13:49:26Z1280
https://www.youtube.com/embed/GZdytTKeGYM