GPT-4 is coming next week – and it will be multimodal



GPT-4 is coming next week – and it will be multimodal, says Microsoft Germany

The release of GPT-4 is imminent, as Microsoft Germany CTO Andreas Braun mentioned at an AI kickoff event on March 9, 2023.

At an approximately one-hour hybrid information event entitled "AI in Focus - Digital Kickoff" on 9 March 2023, four Microsoft Germany employees presented large language models (LLMs) such as the GPT series as a disruptive force for companies and described their Azure OpenAI offering in detail. The kickoff event was held in German, and news outlet Heise was present. Rather casually, Andreas Braun, CTO Microsoft Germany and Lead Data & AI STU, mentioned what he said was the imminent release of GPT-4. That Microsoft is working on multimodality with OpenAI should no longer have been a secret since the release of Kosmos-1 at the beginning of March.




Dr. Andreas Braun, CTO Microsoft Germany and Lead Data & AI STU, at the Microsoft Digital Kickoff "AI in Focus" ("AI in Focus", Screenshot)

(Image: Microsoft)


"We will introduce GPT-4 next week, there we will have multimodal models that will offer completely different possibilities – for example videos," Braun said. The CTO called LLMs a "game changer" because they teach machines to understand natural language: the machines then grasp, in a statistical way, what was previously readable and understandable only by humans. In the meantime, the technology has come so far that it basically "works in all languages": you can ask a question in German and get an answer in Italian. With multimodality, Microsoft(-OpenAI) wants to "make the models comprehensive".

Braun was joined by the CEO of Microsoft Germany, Marianne Janik, who spoke more generally about disruption through AI in companies. Janik emphasized the value creation potential of artificial intelligence and spoke of a turning point – the current AI development and ChatGPT were "an iPhone moment". It's not about replacing jobs, she said, but about doing repetitive tasks in a different way than before. One point that is often forgotten in the public discussion is that "we in Germany still have a lot of legacy in our companies" and "keep old treasures alive for years".

Disruption does not necessarily mean job losses. It will take "many experts to make the use of AI value-adding", Janik emphasized. Traditional job descriptions are now changing and exciting new professions are emerging as a result of the enrichment with the new possibilities. She recommends that companies form internal "competence centres" that can train employees in the use of AI and bundle ideas for projects. In doing so, "the migration of old darlings should be considered".




Disruption in the German Industry: Keynote by Dr. Marianne Janik, CEO Microsoft Germany at "AI in Focus" (Screenshot)

(Image: Microsoft)


In addition, the CEO emphasized that Microsoft does not use customers' data to train models (which, however, according to its ChatGPT policy, does not apply, or at least did not apply, to its research partner OpenAI). Janik spoke of a "democratisation" – by which she admittedly meant only the immediate usability of the models within the Microsoft product range, in particular their broad availability through the integration of AI into the Azure platform, Outlook and Teams.

Clemens Siebler (Senior AI Specialist) and Holger Kenn (Chief Technologist Business Development AI & Emerging Technologies), both of Microsoft Germany, provided insights into practical AI use and concrete use cases that their teams are currently working on, as well as into the technical background. Kenn explained what multimodal AI is about: it can translate text not only into images, but also into music and video. In addition to the GPT-3.5 model class, he talked about embeddings, which are used for the internal representation of text in the model. According to Kenn, responsible AI is already built into Microsoft products, and "millions of queries can be mapped into the APIs" via the cloud. Most of the audience probably agreed with his basic assessment that now is the time to get started. Especially in the programming area,
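To make the idea of embeddings more concrete, here is a minimal sketch rather than anything shown at the event. It assumes that a text is represented as a vector of numbers and that texts with related meaning end up with similar vectors. The three-dimensional vectors and the words are invented for illustration; real models use far larger vectors that are learned, not hand-written.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy embeddings: the values are made up for this example
# and do not come from any real model.
toy_embeddings = {
    "invoice": [0.9, 0.1, 0.0],
    "bill": [0.8, 0.2, 0.1],
    "holiday": [0.0, 0.2, 0.9],
}

sim_related = cosine_similarity(toy_embeddings["invoice"], toy_embeddings["bill"])
sim_unrelated = cosine_similarity(toy_embeddings["invoice"], toy_embeddings["holiday"])

print(f"invoice vs bill:    {sim_related:.2f}")
print(f"invoice vs holiday: {sim_unrelated:.2f}")
```

With these invented values, the related pair scores much higher than the unrelated pair, which is the property that search over company documents typically relies on.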




Dr. Holger Kenn, Chief Technologist Business Development AI & Emerging Technologies, vividly explains how multimodality works ("AI in Focus", Screenshot)

(Image: Microsoft)


Clemens Siebler illustrated with use cases what is already possible today. For example, telephone calls could be recorded and transcribed with speech-to-text, so that call center agents would no longer have to summarize and type in the content manually. According to Siebler, this could save 500 working hours a day for a large Microsoft customer in the Netherlands that receives 30,000 calls a day. The prototype for the project was created within two hours, and a single developer implemented the project in a fortnight (plus further time for final implementation). According to him, the three most common use cases are answering questions about company knowledge that is accessible only to employees, AI-assisted document processing, and semi-automation by processing spoken language in call and response centres.
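The two figures Siebler quoted can be put in relation to each other with a quick calculation. The sketch below uses only the numbers from the talk; the assumption that the saving is spread evenly across all calls is ours.

```python
# Figures quoted by Siebler in the talk.
calls_per_day = 30_000
hours_saved_per_day = 500

# Convert the daily saving into minutes and spread it evenly over all calls
# (the even spread is an assumption made for this illustration).
minutes_saved_per_call = hours_saved_per_day * 60 / calls_per_day

print(f"{minutes_saved_per_call:.1f} minutes saved per call")
```

Under that assumption, the saving works out to about one minute of manual summarizing and typing per call.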




Digital Kick-off Event "AI in Focus" at Microsoft Germany, 9 March 2023 ("AI in Focus - Digital Kickoff", Screenshot)

(Image: Microsoft)


When asked about operational reliability and factual accuracy, Siebler said that the AI will not always answer correctly, so its output needs to be validated. Microsoft is currently creating confidence metrics to address this issue.

Customers often use AI support only on their own data sets, primarily for reading comprehension and querying inventory data, where the models are already quite accurate. However, the text generated by the model remains generative and is therefore not easily verifiable. 

"We build a feedback loop around it with thumbs up and thumbs down," Siebler said – this is an iterative process. Interestingly, none of the four Microsoft employees commented on AI integration in the company's own search engine, "the new Bing". The final panel was not open to audience questions,

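What a thumbs-up and thumbs-down feedback loop could look like in its simplest form is sketched below. This is not Microsoft's implementation, which was not described in detail. The class `FeedbackLog`, its methods, the answer identifiers and the 0.5 review threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class FeedbackLog:
    """Hypothetical in-memory store for thumbs-up / thumbs-down votes."""

    def __init__(self):
        # Maps an answer id to its vote counts.
        self._votes = defaultdict(lambda: {"up": 0, "down": 0})

    def record(self, answer_id, thumbs_up):
        """Store one vote for the given answer."""
        key = "up" if thumbs_up else "down"
        self._votes[answer_id][key] += 1

    def approval_rate(self, answer_id):
        """Return the share of positive votes, or None if there are no votes."""
        votes = self._votes.get(answer_id)
        if votes is None:
            return None
        total = votes["up"] + votes["down"]
        return votes["up"] / total if total else None

    def needs_review(self, answer_id, threshold=0.5):
        """Flag an answer whose approval rate falls below the threshold."""
        rate = self.approval_rate(answer_id)
        return rate is not None and rate < threshold


log = FeedbackLog()
log.record("answer-1", thumbs_up=True)
log.record("answer-1", thumbs_up=False)
log.record("answer-1", thumbs_up=False)

print(log.approval_rate("answer-1"))
print(log.needs_review("answer-1"))
```

Answers flagged this way could then be checked by a person and the prompts or source data adjusted, which is one way to read the iterative process Siebler described.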

