AI agent + Vision = Incredible
A step-by-step tutorial on how to build a vision-powered AI agent with AutoGen + LLaVA + Stable Diffusion, plus a breakdown of the 160-page analysis of GPT-4V capabilities
🤘 Get 15% off on sceneXplain via my code AIJASON : https://go.jina.ai/scenexplainjason
🔗 Links
- Follow me on twitter: https://twitter.com/jasonzhou1993
- Join my AI email list: https://www.ai-jason.com/
- My discord: https://discord.gg/eZXprSaCDE
- sceneXplain: https://go.jina.ai/scenexplainjason
- Vision-agent Github: https://github.com/JayZeeDesign/vision-agent-with-llava
⏱️ Timestamps
0:00 Intro
1:15 What is a multi-modal model
2:12 GPT-4V ability breakdown
4:34 sceneXplain
6:00 Visual prompt techniques
10:53 Use cases
13:00 Build vision agent #1 - Setup
14:20 Build vision agent #2 - Use the LLaVA model
15:58 Build vision agent #3 - Use Stable Diffusion
16:52 Build vision agent #4 - Set up the agent system via AutoGen
18:53 Build vision agent #5 - Demo
👋🏻 About Me
My name is Jason Zhou. I'm a product designer who shares interesting AI experiments & products. Email me if you need help building AI apps: ask@ai-jason.com
#gpt4 #autogen #autogpt #ai #artificialintelligence #tutorial #stepbystep #openai #llm #chatgpt #largelanguagemodels #largelanguagemodel #bestaiagent #agentgpt #agent #babyagi #llava #stablediffusion
2023-10-26T14:37:58Z1280
https://www.youtube.com/embed/JgVb8A6OJwM