Oleksandr Kuzmenko · AI Eng
11 April 2025, 13:28
Google's Gemini AI is better at tracking and fixing bugs on iOS than on Android
Instabug created a tool called SmartResolve that uses AI models to find the causes (and potential fixes) of bugs in apps on Apple’s iOS and Google’s Android. It turns out that all AI models, including Gemini, do a better job on iOS. Why?
The study, reported by Business Insider, used leading artificial intelligence models to automate the process of detecting application failures, diagnosing problems, and generating useful code fixes.
The researchers used models from OpenAI, Anthropic, Google, and Meta on a database of real-world app crashes. Each fix was scored on correctness, similarity to human fixes, depth of root cause analysis, relevance, and overall consistency.
The AI models were found to consistently perform better on iOS than Android. The bug fixes found by SmartResolve on Apple’s platform were more accurate, consistent, and well-structured in almost every model tested.
OpenAI’s GPT-4o model scored 60% on iOS versus 49% on Android. With the o1 model, the difference was even more pronounced: it reached 62% on iOS but dropped to 26% on Android, where it often failed to respond to tests at all.
Other models showed a similar pattern. Anthropic’s Claude Sonnet 3.5 V1 scored 58% on iOS and 56% on Android, a smaller gap, but iOS still came out ahead.
Even Google’s Gemini 1.5 Pro performed worse on Android (51%) than on iOS (59%). Instabug found that this model also produced more hallucinations.
This difference may be due to the openness and fragmentation of the Android ecosystem. Compared to iOS, which offers a more unified environment, the wider range of devices and crash types on Android can make it difficult for AI models to generalize fixes.
«The higher performance on iOS is partly due to the structure of native iOS languages like Swift and Objective-C. Their syntax is more predictable and strongly typed, making it easier for LLMs to generate accurate fixes,» says Kenny Johnston, CPO at Instabug.
He added that Android’s programming languages (Java and Kotlin), along with the variability of its bug formats, make fixes more complex.