ChatGPT vs Gemini: The Battle of Visual Abilities
AND: News from Microsoft, Samsung and DeepMind
Welcome back, Cyberman!
Today we're comparing the visual capabilities of GPT-4V and Google's Gemini Pro. Both models are impressive, but there is still room for improvement. The papers conclude that the road to multimodal general-purpose AI is still a long one.
In today’s menu:
ChatGPT-4V and Google's Gemini Pro compete in visual capabilities
Microsoft: Today begins the era of the AI PC
Samsung: Galaxy AI is coming
AI Tutorials: 26 effective prompting principles
3 Trending short takes
Read time: 4 minutes!
LATEST NEWS
ChatGPT-4V and Google's Gemini Pro compete in visual capabilities
Two recent papers from Tencent Youtu Lab, the University of Hong Kong, and several other universities and institutes provide a comprehensive comparison of the visual capabilities of Gemini Pro and GPT-4V, which are currently the most advanced multimodal large language models (MLLMs).
In certain tasks, both models perform equally well, although GPT-4V is rated slightly more powerful overall.
The evaluation covered various areas such as image recognition, text recognition in images, image and text understanding, object localization, and multilingual capabilities.
You can read more about the study here.
NEWS FROM MICROSOFT
Today begins the era of the AI PC
Microsoft is making 2024 the "year of the AI PC" by introducing a new Copilot key on laptops and PCs.
This key provides quick access to Microsoft's AI-powered Windows Copilot experience. It's the first major change to the Windows PC keyboard layout in almost 30 years.
According to Yusuf Mehdi, executive vice president and consumer chief marketing officer at Microsoft, this marks a transformative moment where Copilot becomes the gateway to AI on the PC.
NEWS FROM SAMSUNG
Galaxy AI is coming
Samsung just released a teaser video on YouTube announcing its Galaxy launch event on Jan. 17.
According to Samsung, “A revolutionary mobile experience is coming.” The new Galaxy S series promises to transform everyday life with an intelligent mobile experience powered by AI.
The event will be streamed live on various Samsung platforms at 10 a.m. PST, 1 p.m. EST, 6 p.m. GMT, and 7 p.m. CET.
AI TUTORIAL
How to effectively communicate with ChatGPT
A new research paper presents 26 principles for communicating effectively with a language model like ChatGPT. The paper suggests being direct, tailoring prompts to the audience's knowledge, using simple and affirmative language, and ensuring clarity by asking for concepts to be explained as if to a beginner.
Here is the link to the full paper.
SHORT TAKES
Researchers at DeepMind have introduced ALOHA: A Low-cost Open-source Hardware System for Bimanual Teleoperation. This robot can autonomously complete complex mobile manipulation tasks, such as:
Cooking and serving shrimp
Calling and taking an elevator
Storing a 3 lb pot in a two-door cabinet, and more
A Reddit user saved a life using ChatGPT! A day at the office turns into an unexpected wildlife rescue! Discover how ChatGPT helps a receptionist save a distressed baby sparrow, guiding them through a heartwarming journey of care and reunion.
Read the full story here.
OpenAI’s custom GPT store opens up next week! OpenAI's GPT Store, a platform for users to sell and share customized AI agents, is launching next week. Although the launch was previously delayed, the platform is now nearing readiness.
In an email addressed to GPT Builders, OpenAI reminded them to comply with brand guidelines and make their GPTs public.
FEEDBACK
How would you rate it? Your feedback is invaluable in helping me create better emails for you!
Thanks for reading!
Feel free to share any specific feedback or interesting insights by replying to this email. I’m all ears!