You’ve heard the name over and over, and for most of you it probably settles into the same category as Harry Potter’s “levitation charm” as far as whether you need to understand it. That’s cool, most people will never need to know this stuff, in the same fashion as you don’t need to know the specific chemical reactions that go on when gasoline burns inside your engine cylinders. You just want to turn the key and go!
When the term “Deep Learning” started getting used, the media clamped onto it because it was a pretty sexy marketing term, and it sold a lot of print and drew a lot of eyeballs. This is because we have a somewhat intuitive sense of what we conceive “Deep Learning” to be – it conjures up images of professors doing serious research and coming up with great discoveries, etc. Because of this, a lot of intrinsic value is being assigned to the topic of Deep Learning, so much so that it seems a startup can automatically generate a B-round or further simply by stating that “we use Deep Learning to make the best decisions on what goes into today’s lunch soup”. That’s a bit of a stretch, but not by much.
So What Is It?
We need to start by looking at the name: Deep Learning. These two words, just like pretty much everything in computerese, have very specific meanings which often don’t attach well to the real world. In this case, they do, which is fortunate for me because it’ll help keep this article brief.
Let’s tackle each word separately.
Learning. A DL system quite often has a multitude of modules or units, each of which is responsible for Learning something. If it’s a good object-oriented model, each unit is responsible for learning one thing, and one thing only, and becoming very good at that one thing. It will have inputs, and it will have outputs, and what it does internally might involve multiple stages of analysis against the inputs that help it craft an output that has “meaning” to the other parts of the system that use it.
What would be a good example of a learning unit? How about something in image identification? Within a satellite photo, parts of the image get pulled as possible missiles, or tanks. These parts have attributes of their own, such as the ratio of length to width, possibly their height (if the image was taken from an angle not directly overhead it will be possible to calculate height), maybe differences in shading top to bottom (the turret of a tank will produce shadows), etc.
A unit that is given candidate images might build a statistical history based on its prior attempts to classify, both its positive and negative results, and may (if it is particularly advanced) also ‘tune’ its results based on things like geographic location and so on. In its operation, it will come back with a statistical probability that object X which it was given is a tank or a missile. Over time it will be ‘trained’ to be better at identifying candidate objects, either by itself or by humans inputting images to it with flags saying “tank” or “no tank” (when humans give it cues like this it is called “supervised learning”).
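To make the idea of a supervised learning unit a little more concrete, here is a minimal sketch in Python. It is not how any real imagery system works: the class name TankUnit, the two features (length-to-width ratio and shadow contrast), and every number in the training data are made up for illustration, and the learning method (a small logistic regression trained by gradient descent) is simply one common way to turn labeled examples into a probability.

```python
import math

def sigmoid(z):
    """Squash any number into the range 0..1 without overflowing."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

class TankUnit:
    """Hypothetical learning unit: features in, probability of 'tank' out."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return sigmoid(z)

    def train(self, examples, epochs=2000):
        # Supervised learning: each example carries a human-supplied label
        # (1 = "tank", 0 = "no tank"), and the unit nudges its weights to
        # shrink the gap between its guess and that label.
        for _ in range(epochs):
            for features, label in examples:
                error = self.predict(features) - label
                self.weights = [w - self.learning_rate * error * x
                                for w, x in zip(self.weights, features)]
                self.bias -= self.learning_rate * error

# Invented features: (length/width ratio, shadow contrast)
labeled = [
    ((1.8, 0.9), 1), ((2.0, 0.8), 1), ((1.6, 0.7), 1),   # flagged "tank"
    ((6.0, 0.2), 0), ((5.5, 0.1), 0), ((7.0, 0.3), 0),   # flagged "no tank"
]

unit = TankUnit(n_features=2)
unit.train(labeled)

p_tank = unit.predict((1.9, 0.85))    # squat object with a strong shadow
p_other = unit.predict((6.5, 0.25))   # long, thin object with little shadow
print(f"tank-like object: {p_tank:.0%} tank")
print(f"long thin object: {p_other:.0%} tank")
```

The unit never gets told a rule like “tanks are squat”. It only sees flagged examples and adjusts itself until its probabilities line up with the flags.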
Deep. The ability of a computer system using Deep Learning is generally amplified by how many different things it can learn. When the units that learn each of these things are placed in line with one another (or if you go for the visual, “stacked up”), they produce depth. A system that can stack many layers is considered Deep.
The example above, where a system was given a photo object, might be part of a system which takes a single large photograph, and multiple different units act on it. Let’s walk through the hypothetical layers:
The first might look for straight lines. It trains itself to be better and better at examining colors of pixels in images and finding ones that line up as straight, given the resolution of the image.
The second looks for corners, where straight lines meet. It takes the identified straight lines from unit one, and then will seek places where their ends meet. It will train itself to avoid “T” bits, and might decide that “rounded” corners are acceptable within a certain threshold. It outputs its data to…
The third, which takes corners and tries to find ‘boxes’, places where multiple corners might form an enclosed space. It must train itself to avoid opposite-facing corners, etc. It then sends its candidate ‘boxes’ to:
The fourth, which takes boxes and begins to look for color shading gradients which can describe sides, tops, fins, etc. It compares these values to its own historical knowledge and ‘learns’ to classify these boxes as specific objects – and outputs things like “75% missile” or “80% tank”, etc. Particularly sophisticated versions might even be able to compare signatures finely enough as to identify the type of missile or tank.
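The stacking itself can be sketched in a few lines of Python. Treat this as a cartoon under loud assumptions: the layer names are only labels borrowed from the walkthrough above, the weights are placeholder numbers picked by hand (a real system would learn them), and three input values stand in for an entire photograph. The only point it makes is structural: each layer sees nothing but the output of the layer before it, and the number of layers is the “depth”.

```python
import math

def dense(inputs, weights, biases):
    """One layer: weighted sums of the inputs, with negatives clipped to zero."""
    return [max(0.0, b + sum(w * x for w, x in zip(row, inputs)))
            for row, b in zip(weights, biases)]

def softmax(scores):
    """Turn raw scores into percentages that add up to 100%."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# (name, weights, biases) for each layer. All numbers are placeholders,
# not trained values, and the names are purely decorative.
LAYERS = [
    ("lines",   [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.0]),
    ("corners", [[1.0, 1.0], [1.0, -1.0]],            [0.0, 0.0]),
    ("boxes",   [[0.5, 0.5], [1.0, -0.5]],            [0.1, 0.0]),
    ("objects", [[2.0, -1.0], [-1.0, 2.0]],           [0.0, 0.0]),
]
LABELS = ["missile", "tank"]

def run_stack(pixels):
    signal = pixels
    for name, weights, biases in LAYERS:
        # Each layer consumes only what the previous layer produced.
        signal = dense(signal, weights, biases)
    return dict(zip(LABELS, softmax(signal)))

scores = run_stack([0.9, 0.2, 0.1])
depth = len(LAYERS)
print(f"depth: {depth} layers")
for label, share in scores.items():
    print(f"{share:.0%} {label}")
```

Adding another entry to LAYERS makes the stack deeper without changing run_stack at all, which is a big part of why the “stack” picture is so popular.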
Each of these units described above might be composed of multiple sub-units, which share connections with one another (which, by the way, is a “neural network” – different discussion, might write about that some time too). These sub-units would be hidden from anything outside the unit they belong to.
So between the two, we have a Deep system, that Learns to do its job better over time.
And before you ask, yes, there is also “Shallow Learning” – Shallow simply refers to a “stack” that doesn’t have many layers. There’s no set boundary between “Shallow” and “Deep”.
How Good Is It?
As with pretty much every computer system ever invented, the answer to the question “How good is it?” is: GIGO. Garbage in, garbage out. The system is only as good as its training. In the example above, if insufficient valid positive and negative images are given to the system to train on, it can suffer from muddied “perception” and never get better.
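GIGO is easy to demonstrate with a toy. The sketch below is deliberately not a deep system: it swaps in about the simplest learner there is (it averages the length-to-width ratio seen for each label and then picks the nearest average), and all of the data is invented. The same learner is trained twice, once on correctly flagged examples and once on a set where half the flags are deliberately wrong.

```python
def train_centroids(examples):
    """Average the feature value seen for each label."""
    sums, counts = {}, {}
    for value, label in examples:
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def classify(centroids, value):
    """Pick the label whose average is closest to this value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def accuracy(centroids, examples):
    hits = sum(classify(centroids, value) == label for value, label in examples)
    return hits / len(examples)

# Invented length/width ratios with correct flags
clean = [(1.6, "tank"), (1.8, "tank"), (2.0, "tank"), (2.2, "tank"),
         (5.5, "missile"), (6.0, "missile"), (6.5, "missile"), (7.0, "missile")]

# Same objects, but half of the flags are deliberately wrong (the "garbage")
garbage = [(1.6, "tank"), (1.8, "missile"), (2.0, "tank"), (2.2, "missile"),
           (5.5, "missile"), (6.0, "tank"), (6.5, "missile"), (7.0, "tank")]

# Fresh, correctly flagged objects neither version has seen
unseen = [(1.7, "tank"), (2.1, "tank"), (5.8, "missile"), (6.8, "missile")]

clean_score = accuracy(train_centroids(clean), unseen)
garbage_score = accuracy(train_centroids(garbage), unseen)
print(f"trained on clean flags:   {clean_score:.0%} correct")
print(f"trained on garbage flags: {garbage_score:.0%} correct")
```

Nothing about the learner changed between the two runs. Only the quality of what it was fed did.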
However, DL is powerful. By that, I mean that when compared to a human, it can reach or exceed our capabilities within its specialized tasking in a very short (comparatively) period of time.
For example, I have a set of rules I built in Outlook over the course of probably fifteen years or more, and these rules successfully catch 99.5% of the spam that lands in my inbox (today’s example: 450+ messages are in my ‘Deleted’ folder, about 5 landed in my inbox that had to be deleted). Occasionally, I get a ‘false positive’ and a good mail will get deleted, but it’s pretty rare. These rules have taken over a decade to produce, and act on a host of subject triggers, address triggers, body triggers, etc. A DL system can establish a similar ‘hit rate’ to my rules in only a few days, perhaps as fast as a few hours.
But these factors depend on how well the system is built, and how good its learning data is.
What Does It Mean To Me?
Well, now that’s the catch. Depending on your life, DL systems may not have a direct impact on you at all. Plumbers, for example, are unlikely to care. Insurers, actuaries and other folks whose livelihoods depend on statistical analysis, however, had better sit up and take notice. The initial “green field” territories where statistics are the primary function have already been broadly affected by Deep Learning. Today, areas such as FinTech and advertising sales are steadily moving to use DL in certain aspects of their business. Self-driving vehicles are a perfect example of another “Deep Learning” application. What do you think those first few highly-publicized autonomous vehicle voyages were a few years back? Supervised training. They were teaching the vehicles how not to get into wrecks.
We’re just beginning to see learning systems entering healthcare and other more ‘soft’ sectors.
And here is where the warning bells sound. Not because SkyNet is going to set off the rise of the machines (though there is some legitimate reason to be concerned in that regard, particularly when you see robot chassis and drones armed with weapons). No, the concern should presently be directed at how these tools get used. As I mentioned, these are powerful systems that can be used to great benefit – and can also be used to do great harm.
For example, one of the innocuous sentences I’ve seen with regard to the application of a learning system to healthcare was: “Given the patient’s past history, and their medical claims, are you able to predict the cost for the next year?” (Healthcare IT News). Okay, in context, that question was raised with the intent of predicting utilization, how much hospital care might be needed across a population.
But what if the question is “Find the best combination of interest rates to keep people paying their credit card bills without completely bankrupting them, and to maintain their indebtedness for the longest period”? In that case, Deep Learning can be used to figuratively enslave them.
What if that question was asked by an insurance executive in the USA, wanting to see where the profit line cuts and using that data to kick people off their insurance who would negatively impact the company’s margin? In that case, Deep Learning can be used quite literally to kill people.
The tools will only be used within the ethical boundaries set by the persons who use them. In the United States and several other countries, there are certain political parties who feel that ethics have no place in business – that might makes right. Just as with dangerous vehicles, dangerous weapons, and other hazards, we as members of our societies must make our voices heard – through the voting booth, in our investment choices, in journalistic endeavors – and ensure that these tools are used to benefit, not harm, the public.
It might even be worth considering that from a software engineer’s perspective, perhaps it is time to establish something similar to the medical professional’s Hippocratic Oath:
First, do no harm.