Ellick Chan Helps his Clients and Students Harness Deep Learning Technology
From speech recognition to self-driving cars, artificial intelligence (AI) has gone from science fiction to a significant part of our lives in less than a decade. A growing number of state-of-the-art AI systems are powered by Deep Learning, a form of machine learning loosely inspired by the human brain that solves challenging problems in computer vision, speech recognition and language processing. Deep Learning is rapidly becoming one of the most effective tools for developing intelligent phones, cars and other smart devices such as thermostats and appliances.
Siebel Scholar Ellick Chan (UIUC CS ’04) discovered Deep Learning as a post-doc at Stanford, and now he’s committed to teaching the technique to teachers, musicians, doctors and others who wouldn’t normally have the engineering or computer science background to pursue it. Chan recently delivered a tutorial on Deep Learning to industry partners at Northwestern’s analytics exchange.
An adjunct faculty member at Northwestern University’s McCormick School of Engineering, he also teaches graduate students who he hopes will apply their newly acquired knowledge of Deep Learning and data analytics to many facets of industry. Chan’s courses are offered through Northwestern’s MSiA data science program, which Forbes has ranked among the top six data science master’s degree programs in the U.S.
Chan, who holds a PhD in computer science and an MBA, hopes to democratize the use of Deep Learning and spread the benefits of artificial intelligence beyond the usual suspects in Silicon Valley.
Chan spoke with the Siebel program about the various uses of Deep Learning and ways to make it available beyond the companies that are currently using it.
Q: Let’s start with the fundamentals. How do you define Deep Learning?
Deep Learning was originally loosely modeled on the human brain, with layers of interconnected neurons that work together.
Neuroscientists recognize that even the best scientific models only begin to scratch the surface when it comes to the complexity of the human brain. One simple way to think about Deep Learning is that there are many perceptual tasks that are relatively easy for humans to do, such as recognizing images. You could show a person dozens of pictures of cats in different poses. You could show cats running, sleeping, up close, in black and white or in color and humans can recognize them without a problem. Recognizing images with lots of variations in lighting, camera angle and poses was super easy for humans and super hard for computers, until recently.
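To make the "layers of interconnected neurons" idea concrete, here is a minimal toy sketch in Python. The weights are set by hand rather than learned, and the task (the XOR function) is far simpler than image recognition, but it shows the core mechanism: stacking layers of simple neurons lets a network compute something no single neuron can.

```python
import numpy as np

def step(x):
    # Simple threshold activation: a neuron "fires" (1) when its input is positive.
    return (x > 0).astype(float)

# Four input patterns and their XOR targets (0, 1, 1, 0).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

# Hidden layer: two neurons, one computing OR, one computing AND.
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])
b1 = np.array([-0.5, -1.5])

# Output layer: fires when OR is true but AND is not, i.e. XOR.
W2 = np.array([[1.0], [-1.0]])
b2 = np.array([-0.5])

hidden = step(X @ W1 + b1)       # first layer of neurons
output = step(hidden @ W2 + b2)  # second layer combines their outputs

print(output.ravel())  # [0. 1. 1. 0.]
```

In a real Deep Learning system the weights are learned from many labeled examples rather than written by hand, and the networks have many more layers and neurons, but the layered structure is the same.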
The same goes for understanding language, like gauging the sentiment of a movie review. If someone tweets, “The acting was great but the plot was unoriginal,” a computer previously couldn’t easily tell whether the reviewer thought highly of the movie.
Deep Learning is what allows computers to do these things. It is what makes speech recognition in systems such as Siri, Cortana and Google Now work, and it powers the perception systems of self-driving cars. The technology, datasets and computing power necessary for effective Deep Learning have largely developed over the last couple of years. The technology is now rapidly graduating from academia to industry, and there’s broad interest across many economic sectors, from the automotive industry to medicine.
We’re currently approaching a point where many industries are interested in trying Deep Learning, but some open problems need to be solved before human-level performance can be attained reliably. Deep Learning currently seems to be very good at learning from lots of varied examples through rote memorization and computational power, rather than developing a genuine intuition about why objects are similar, a task humans seem to have a much easier time with. Researchers are working on ways to get computers to recognize objects from a small number of examples rather than the hundreds currently needed. At the same time, it’s difficult to understand why a model makes the decisions it does, and understanding these models is a topic we’ll have to tackle to build reliable and trustworthy AI.
Q: What are some of the problems you can solve with this type of machine learning?
One of the first projects I worked on at Exponent experimented with Deep Learning to help detect buried explosive hazards (IEDs) in Afghanistan by automatically scanning radar images of the ground.
At the time, US troops had to drive very cautiously so they’d have adequate time to slam on the brakes if they saw an IED on the radar screen. Over time, the troops would get fatigued and they’d have to go even slower to avoid making deadly mistakes.
Exponent had built a state-of-the-art IED detection system using Ground Penetrating Radar (GPR), and it relied on traditional computer vision techniques. You can think of it like an airport security scanner, but instead of putting luggage through the scanner, you take the scanner and point it at the ground. One problem is that it’s really hard to see through all the sand and rocks to find the hidden IEDs.
When I joined Exponent two and a half years ago, I helped build a prototype GPR system based on Deep Learning. It’s not perfect. We weren’t able to tune the system to achieve the same level of accuracy as the best humans, but the computer never gets tired, so it’s a great companion for already fatigued troops.
We got the whole system running on a truck and it has shown great potential toward improving the detection rate of IEDs and saving lives in the process. It’s one of the projects I enjoyed the most.
Q: What are some of the other projects you work on as a consultant?
Exponent is a multi-disciplinary consulting company that serves over 90 different scientific and engineering disciplines. Over half of our consultants have a PhD or MD degree, and I’m constantly finding new uses for analytics and Deep Learning in areas that aren’t traditionally a hotbed for machine learning. People often come to me with an image recognition problem that they don’t yet realize can be solved with computer vision algorithms.
I joined two and a half years ago because I wanted to get into different areas where advanced technology is just starting to be adopted. Deep Learning is an up-and-coming field with lots of potential applications in domains where the domain experts usually don’t have deep machine learning expertise, and the machine learning experts don’t usually have deep domain expertise.
Q: How did you become interested in Deep Learning?
I earned three degrees in computer science, a bachelor’s, master’s and a PhD, at the University of Illinois at Urbana-Champaign. I jumped out to go to business school at the same university because I felt too confined in traditional computer science and engineering. Most of the world outside of Silicon Valley didn’t understand the work I did and how it directly applied to their everyday lives.
In pure academia, I felt backed into a corner in the sense that few people outside of computer science research could understand the work we did. To broaden my horizons, I joined Stanford in 2011 to do a post doc on a medical data privacy project. At the tail end of the post doc in 2013, Deep Learning started getting popular and I knew that I wanted to pursue more work in the area. Industry was just starting to pick up the technology, and it was a great time to get involved because of all the breakthroughs in computer vision, natural language processing and speech recognition powered by deep learning.
After my post doc at Stanford, I wanted to get closer to other industries, where people are working in diverse areas such as the power grid, consumer electronics and automotive. I see lots of potential for Deep Learning in many industries, and I’m taking a two-pronged approach to accelerating its adoption.
Q: One prong is your work as a consultant. Is teaching the other?
Yes, I teach Deep Learning at Northwestern University in the Masters of Science in Analytics (MSiA) program. Many of my students have already worked in consulting, and many of them will work in non-traditional areas for Deep Learning, such as healthcare, insurance, finance, or operations. There are a lot of other industries in need of machine learning and artificial intelligence, but the people who are domain experts in those areas usually don’t have machine learning expertise, and machine learning experts don’t usually have domain expertise.
Graduates of medical or automotive programs don’t usually study the analytics side deeply, so my goal is to make it accessible to people who wouldn’t otherwise have the math background to do this stuff. Most of my students are not engineers or computer scientists. Some come from economics, some from accounting, two were musicians, and the vast majority do not have a traditional engineering or mathematical background.
Last year, my students produced eight projects, and one group placed in the top ten percent of an international data competition. I was especially proud of them. I only teach in the spring quarter, and honestly, it’s probably the most enjoyable part of my year. I love working with the students. They’re open-minded and very passionate about the work they do and how it can be applied to practical problems.
Q: What types of projects do they work on?
They were given only ten weeks to do their projects, and they came up with some really respectable results in detecting distracted driving, classifying Yelp food images, generating song lyrics and finding people in a crowd.
Q: Did you always plan to teach?
No, not at all. If you asked me five years ago, I wouldn’t have imagined doing so. In graduate school, I didn’t care to teach or TA at all because I was too focused on my work, and I got through school without ever doing either. It’s been deeply gratifying to give back to the university and share my knowledge and enthusiasm in an area that I’m very passionate about.
Q: What are the limitations of Deep Learning?
Have you heard the saying, “Don’t practice until you get it right. Practice until you can’t get it wrong”?
Machine learning still gets things wrong too often for many use cases. Even if a computer can reach 80 percent accuracy, that’s not always good enough for an airplane or for a driverless car.
To get to 99.999%, we have to understand how the machine makes its decisions, where it is right, and where it fails. Many artificial intelligence algorithms have long been considered black boxes that we cannot meaningfully inspect. How do we understand and visualize how an algorithm makes decisions?
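One simple way to peek inside a black box is perturbation: change one input at a time and watch how the prediction moves. The sketch below illustrates this generic occlusion idea (not any specific published method) using a hypothetical toy linear scorer as the "model"; for a real network you would substitute its prediction function.

```python
import numpy as np

def model(x):
    # Stand-in for a trained black-box model; weights are hypothetical,
    # chosen only for illustration. Returns a single score.
    w = np.array([2.0, -1.0, 0.0, 3.0])
    return float(x @ w)

def occlusion_importance(predict, x, baseline=0.0):
    """Measure the score drop when each feature is replaced by a baseline.

    A larger absolute drop means that feature mattered more to this
    particular prediction.
    """
    base_score = predict(x)
    drops = []
    for i in range(len(x)):
        x_masked = x.copy()
        x_masked[i] = baseline          # "occlude" one feature
        drops.append(base_score - predict(x_masked))
    return np.array(drops)

x = np.array([1.0, 1.0, 1.0, 1.0])
print(occlusion_importance(model, x))   # [ 2. -1.  0.  3.]
```

For this toy scorer the importances recover the weights exactly; for a deep model, the same probe gives a local, approximate picture of which inputs drove a single decision, which is one small step toward the inspectability the safety-critical applications below require.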
One paper that inspired me to investigate this route was from fellow Siebel Scholar Carlos Guestrin (Stanford CS ’02) about how AI models make decisions. I’m working to integrate some of his work into my class this year to help students better understand and debug their models so that they can build explainable and dependable models – two key characteristics needed for AI to proliferate into safety-critical applications.
I think that being able to test and understand models will allow us to make better and safer models, and to understand their limitations. We’re still in the early days of working out how existing safety standards in the automotive industry apply to AI models, and I think the effort to understand and test models is one step in the right direction.