How ‘Object Recognition’ Is Helping the Blind

Vanessa Martinez
3 min read · Mar 3, 2020

Woman uses Amazon Echo Show’s “Show and Tell” feature

Digital health technology has improved the way people manage their conditions. If you take prescription medication, you can use dose-tracking apps. If you have heart disease, you can use wearables like Fitbit to help maintain a healthy lifestyle. That’s great, but what about underserved populations, like people who are blind or visually impaired?

Technologies with object recognition are proving quite useful. For instance, Amazon’s Echo Show smart display has a “Show and Tell” feature that uses object recognition to identify household pantry items for people who are blind or visually impaired. It works like this:

1) A person holds up a common pantry item in front of the camera.

2) They say, “Alexa, what am I holding?”

3) Alexa tells them the name of the object.

It sounds simple enough, but there’s way more to it than that.

Object recognition is the ability to identify and label the objects in images and videos. It’s made possible by computer vision and machine learning, a subset of artificial intelligence. Computer vision lets a computer “see” and interpret visual content; machine learning is how it learns what it’s looking at.
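To make this a bit more concrete, here’s a minimal sketch of object recognition using a pretrained image classifier in Python. It’s purely illustrative: Amazon hasn’t published how “Show and Tell” works, and the model, labels, and file name here are generic placeholders (it assumes torchvision 0.13 or newer).

```python
# Minimal object recognition sketch with a pretrained classifier.
# Illustrative only; not how Amazon's "Show and Tell" is implemented.
import torch
from torchvision import models
from PIL import Image

# Load a classifier pretrained on ImageNet (1,000 everyday object classes).
weights = models.ResNet50_Weights.DEFAULT
model = models.resnet50(weights=weights)
model.eval()

# Preprocess the photo the same way the model's training images were.
preprocess = weights.transforms()
image = Image.open("pantry_item.jpg").convert("RGB")  # hypothetical photo
batch = preprocess(image).unsqueeze(0)                # add a batch dimension

with torch.no_grad():
    scores = model(batch)

# Map the highest-scoring class index back to a human-readable label.
label = weights.meta["categories"][scores.argmax().item()]
print(f"This looks like: {label}")
```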

With traditional machine learning, a computer uses models and algorithms to learn from examples. Features are first pulled out of each image (manual feature extraction), and a model then learns to group those features into a specific class.

Image by MathWorks

As you can see, the computer is able to recognize different versions of one thing and categorize it correctly. Based on what it has learned from data, it can recognize patterns and ultimately make decisions on its own. Here’s another example:

Image by MathWorks

The two different types of cats are both labeled “cat” and used to train the model. The more cats the computer sees, the better it becomes at recognizing one.
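Here’s a minimal sketch of that classic approach in Python: hand-crafted HOG features plus a simple classifier, built with scikit-image and scikit-learn. The image files and labels are hypothetical placeholders; a real training set would be much larger.

```python
# Classic machine learning: manual feature extraction + a classifier.
import numpy as np
from skimage.feature import hog
from skimage.io import imread
from skimage.transform import resize
from sklearn.svm import LinearSVC

def extract_features(path):
    """Manual feature extraction: grayscale, resize, then HOG features."""
    image = resize(imread(path, as_gray=True), (128, 128))
    return hog(image, pixels_per_cell=(16, 16), cells_per_block=(2, 2))

# Hypothetical training photos, each labeled by hand.
paths = ["cat1.jpg", "cat2.jpg", "dog1.jpg", "dog2.jpg"]
labels = ["cat", "cat", "dog", "dog"]
X = np.array([extract_features(p) for p in paths])

# The classifier learns which feature patterns belong to which class.
classifier = LinearSVC()
classifier.fit(X, labels)

# Classify a new, unseen photo.
print(classifier.predict([extract_features("mystery_pet.jpg")]))
```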

Deep Learning vs. Machine Learning

Deep learning, a subset of machine learning, does this on a massive scale. Given enough data, it can be far more accurate than the traditional approach. That’s because deep learning involves showing a computer thousands (or millions) of images of cats and dogs until the system learns to distinguish the two automatically, without manual feature extraction. Models such as convolutional neural networks (CNNs), loosely inspired by the brain’s networks of neurons, make this possible.

Image by MathWorks
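As a rough illustration, here’s what a small CNN for the cat-vs-dog example might look like in Keras. The layer sizes are arbitrary and untuned, and the training arrays (train_images, train_labels) are assumed to exist already.

```python
# A small convolutional neural network (CNN) for cat-vs-dog images.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(128, 128, 3)),
    # Convolutional layers learn visual features (edges, textures, shapes)
    # on their own, instead of relying on manual feature extraction.
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    # One output: the probability that the image is a dog rather than a cat.
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Training requires a large labeled dataset, e.g.:
# model.fit(train_images, train_labels, epochs=10, validation_split=0.1)
```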

Deep learning can take enormous quantities of data to work well. So, if you don’t have a large collection of training images, it may be best to stick with traditional machine learning.

Just remember, no matter how you get there, if your goal is to achieve object recognition, you may be doing more than creating a cool feature. You could also be helping someone see.

References:

https://www.mathworks.com/solutions/image-video-processing/object-recognition.html



Written by Vanessa Martinez

Freelance Software Engineer. Experience in JavaScript, React.js, and Ruby on Rails.
