Elon Musk's AI company, xAI, has introduced an upgraded version of its Grok 1.5 model, known as Grok 1.5 Vision. This enhanced model now incorporates computer vision capabilities, enabling it to process visual content and respond to related queries.

Updated on: Apr 16 2024, 08:27 IST
Elon Musk's AI venture, xAI, has introduced an upgraded version of its Grok 1.5 model, called Grok 1.5 Vision. The new model integrates computer vision capabilities, allowing it to interpret visual content and answer questions about images. The release comes shortly after OpenAI presented its GPT-4 with Vision model, which also offers computer vision features.

xAI announced the upgrade through its official X (formerly Twitter) account and described the model's capabilities in a blog post. The core features of Grok 1.5 carry over unchanged to the updated version, while the added vision capabilities are meant to extend how the model interacts with the real world.

Benchmark Scores and Performance

xAI ran benchmark tests to measure Grok 1.5 Vision's performance, including on the company's own RealWorldQA benchmark, which evaluates a model's "real-world spatial understanding." The model was also assessed on other tests such as MMMU and ChartQA. On RealWorldQA, Grok surpassed OpenAI's GPT-4 with Vision and Google's Gemini 1.5 Pro, although it lagged behind in other tests.

Understanding Computer Vision

Computer vision is an exciting field in computer science focused on enabling computers, including AI models, to recognize and interpret real-world objects through images and videos. Essentially, it aims to empower machines with human-like vision capabilities.

Several leading tech companies are investing heavily in developing vision-centric AI models. Google's Gemini 1.5 Pro and OpenAI's GPT-4 with Vision are notable competitors in this space.

The potential applications for computer vision are vast and transformative. For instance, Healthify, an Indian platform for calorie tracking and nutrition, recently integrated a feature named 'Snap'. Here, users can photograph food items, and the AI suggests healthier recipe modifications and exercise regimens to offset calorie intake. Beyond this, computer vision holds promise for medical diagnostics, autonomous vehicles, and more.

First Published Date: 15 Apr, 18:52 IST