
Seeing apps like a user

Chris Navrides

Robot looking at a computer

Web pages have lots of elements

A modern web application can have hundreds or thousands of elements on each page. Most of these elements exist only for styling, or because the framework added them automatically. This complicates the process of finding and interacting with the right element. It can also slow down test runs, because all of these elements have to be filtered to find the one you need.
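To make the ratio concrete, here is a minimal sketch that counts how many elements in a page are ones a user could actually act on. It uses only Python's standard `html.parser`. The `INTERACTIVE_TAGS` set and the tiny `SAMPLE_PAGE` are assumptions made up for illustration, not a definition of what counts as interactive in a real app.

```python
from collections import Counter
from html.parser import HTMLParser

# Assumption: these tags roughly approximate "elements a user can act on".
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class ElementCounter(HTMLParser):
    """Counts every start tag seen while parsing a page."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        self.tag_counts[tag] += 1


def summarize(html):
    """Return (total elements, interactive elements) for an HTML string."""
    counter = ElementCounter()
    counter.feed(html)
    counter.close()
    total = sum(counter.tag_counts.values())
    interactive = sum(counter.tag_counts[t] for t in INTERACTIVE_TAGS)
    return total, interactive


# Hypothetical page fragment: three wrapper divs and a styling span
# surround a single button.
SAMPLE_PAGE = """
<div class="wrapper"><div class="row"><div class="col">
  <span class="icon"></span>
  <button id="search">Search</button>
</div></div></div>
"""

total, interactive = summarize(SAMPLE_PAGE)
print(f"{interactive} of {total} elements are interactive")
```

Even in this toy fragment, four of the five elements are there only for layout and styling.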

YOLO (You Only Look Once)

With advances in computer vision and object detection model architectures, you can now find objects in an image quickly. At Dev Tools we take AI models, like YOLO, and train them specifically on web and mobile apps to find elements. Today we are happy to share that the results are looking amazing!

Amazon.com

View of Amazon

NYTimes.com

View of NYTimes
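Object detectors in the YOLO family typically propose many overlapping candidate boxes, which are then reduced to one box per object. The sketch below shows that common post-processing step (a confidence threshold followed by non-maximum suppression) in plain Python. It is not our actual pipeline: the function names, thresholds, and sample boxes are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(detections, min_score=0.5, max_iou=0.5):
    """Keep confident boxes, dropping any that overlap a higher-scoring one.

    detections: list of (box, score) pairs.
    """
    # Drop low-confidence boxes, then visit the rest from best to worst.
    candidates = sorted(
        (d for d in detections if d[1] >= min_score),
        key=lambda d: d[1],
        reverse=True,
    )
    kept = []
    for box, score in candidates:
        # Keep a box only if it does not heavily overlap one already kept.
        if all(iou(box, kept_box) <= max_iou for kept_box, _ in kept):
            kept.append((box, score))
    return kept


# Hypothetical raw detections from a screenshot.
raw = [
    ((10, 10, 110, 50), 0.92),     # a button
    ((12, 12, 112, 52), 0.80),     # near-duplicate of the same button
    ((200, 10, 240, 50), 0.75),    # a separate icon
    ((300, 300, 320, 320), 0.20),  # low-confidence noise
]
for box, score in filter_detections(raw):
    print(box, score)
```

Of the four raw boxes, only the button and the separate icon survive: the near-duplicate is suppressed by its overlap with the higher-scoring button box, and the noise falls below the score threshold.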

Next Steps

As a next step, we are working on training the AI not just to detect elements, but to understand what the elements are. Imagine the possibilities of seeing objects on the screen not as boxes, but as search icons and shopping carts :)

Icon Understanding
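As a rough idea of what labeled detections could enable, here is a minimal sketch in which a test looks up an element by what it is rather than by a selector. Everything in it is hypothetical: the `LabeledElement` type, the label names, and the coordinates are assumptions for illustration, not the output format of any real model.

```python
from dataclasses import dataclass


@dataclass
class LabeledElement:
    label: str    # hypothetical class name, e.g. "shopping_cart"
    box: tuple    # (x1, y1, x2, y2) in screen pixels
    score: float  # detector confidence between 0 and 1

    def center(self):
        """Point a test could click to hit the middle of the element."""
        x1, y1, x2, y2 = self.box
        return ((x1 + x2) // 2, (y1 + y2) // 2)


def find_by_label(elements, label):
    """Return the highest-scoring element with the given label, or None."""
    matches = [e for e in elements if e.label == label]
    return max(matches, key=lambda e: e.score, default=None)


# Hypothetical detections for one screenshot.
screen = [
    LabeledElement("search_icon", (20, 10, 52, 42), 0.91),
    LabeledElement("shopping_cart", (900, 10, 940, 50), 0.88),
    LabeledElement("shopping_cart", (400, 600, 430, 630), 0.55),
]

cart = find_by_label(screen, "shopping_cart")
print(cart.center())
```

With labels like these, a test step could read as "click the shopping cart" instead of depending on a selector tied to the page's markup.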