David Shapiro, Max Tegmark, MLLM -

Papers referenced:
- Language Models Represent Space and Time
- Beyond Surface Statistics: Scene Representations in a Latent Diffusion Model
- Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
- Unveiling Theory of Mind in Large Language Models: A Parallel to Single Neurons in the Human Brain
- Boosting Theory-of-Mind Performance in Large Language Models via Prompting
- Tiny and Efficient Model for the Edge Detection Generalization
- ART: Automatic multi-step reasoning and tool-use for large language models

Not referenced, but hilarious because Max Tegmark has the best paper titles: Omnigrok: Grokking Beyond Algorithmic Data

AGI, AI, Artificial Cognition, MLLM -

Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models
Chenfei Wu, Shengming Yin, Weizhen Qi, Xiaodong Wang, Zecheng Tang, Nan Duan

ChatGPT is attracting a cross-field interest as it provides a language interface with remarkable conversational competency and reasoning capabilities across many domains. However, since ChatGPT is trained with languages, it is currently not capable of processing or generating images from the visual world. At the same time, Visual Foundation Models, such as Visual Transformers or Stable Diffusion, although showing great visual understanding and generation capabilities, they are only experts on specific tasks with one-round fixed inputs and outputs....

AGI, Artificial Cognition, MLLM, Sebastien Bubeck -

CSAIL lectures with Sebastien Bubeck (April 6, 2023)
The new wave of AI systems, ChatGPT and its more powerful successors, exhibit extraordinary capabilities across a broad swath of domains. In light of this, we discuss whether artificial INTELLIGENCE has arrived.

AI, AI Models, Ilya Sutskever, MLLM -

Asked Ilya Sutskever (Chief Scientist of OpenAI) about:
- time to AGI
- leaks and spies
- what's after generative models
- post AGI futures
- working with MSFT and competing with Google
- difficulty of aligning superhuman AI

Timestamps
00:00 Time to AGI
05:57 What’s after generative models?
10:57 Data, models, and research
15:27 Alignment
20:53 Post AGI Future
26:56 New ideas are overrated
36:22 Is progress inevitable?
41:27 Future Breakthroughs

MLLM, Multimodal Large Language Model -

This article is part of Demystifying AI, a series of posts that (try to) disambiguate the jargon and myths surrounding AI. OpenAI has released GPT-4, the latest edition of its flagship large language model (LLM). And though few details are available, what we do know is that it will be a “multimodal” LLM, according to a Microsoft executive who spoke at a company event last week. Basically, multimodal LLMs combine text with other kinds of information, such as images, videos, audio, and other sensory data. Multimodality can solve some of the problems of the current generation of LLMs. Multimodal language models will also unlock new...

