Posts

Showing posts from January, 2018

Software 2.0

I sometimes see people refer to neural networks as just “another tool in your machine learning toolbox”. They have some pros and cons, they work here or there, and sometimes you can use them to win Kaggle competitions. Unfortunately, this interpretation completely misses the forest for the trees. Neural networks are not just another classifier, they represent the beginning of a fundamental shift in how we write software. They are Software 2.0. The “classical stack” of Software 1.0 is what we’re all familiar with — it is written in languages such as Python, C++, etc. It consists of explicit instructions to the computer written by a programmer. By writing each line of code, the programmer is identifying a specific point in program space with some desirable behavior. In contrast, Software 2.0 is written in neural network weights. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights is kind of …
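The contrast can be made concrete with a toy sketch (my example, not Karpathy's): in Software 1.0 the programmer writes the rule explicitly, while in a Software 2.0 style the "program" is a set of weights found by optimizing against examples. Here a single weight stands in for the millions in a real network.

```python
# Software 1.0: the programmer writes the rule explicitly.
def double_v1(x):
    return 2.0 * x

# Software 2.0 (toy version): the "program" is a weight, and we search
# program space by optimizing against examples instead of writing code.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

w = 0.0   # a one-weight "network"
lr = 0.05
for _ in range(200):
    # gradient of mean squared error with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
    w -= lr * grad

def double_v2(x):
    return w * x

print(round(w, 3))  # converges toward 2.0
```

The two functions end up behaviorally equivalent, but nobody wrote `2.0` into the second one; it was identified by search, which is the shift the post is describing.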

Machine Learning in my way

I’ve seen machine learning overrated in a few ways, both by people with little experience and, more perniciously, by people deeply invested in the field. The most common belief is that machine learning is more general and more powerful than it really is. Machine learning is good at things machine learning is good at and, of course, it’s bad at everything else. If you listen to some people, though, you’d believe you could throw a neural net at any problem and get a solid solution. I mostly chalk this up to inexperience and misplaced enthusiasm, but it’s also a result of aggressive hype by people who should know better. Karpathy’s recent viral post Software 2.0 is a great example: he makes some interesting points but leaves you with the impression that deep learning is the future of programming. The article somehow elides how problems outside a few niches (vision, speech, NLP, robotics) aren’t clearly amenable to this approach. It’s not just systems software; even most areas of business l…

Neural Networks vs. Deep Learning

What’s the difference between deep learning and a regular neural network? The simple answer is that deep learning is larger in scale. Before we get into what that means, let’s talk about how a neural network functions. To make sense of observational data (like photos or audio), neural networks pass data through interconnected layers of nodes. When information passes through a layer, each node in that layer performs simple operations on the data and selectively passes the results to other nodes. Each subsequent layer focuses on a higher-level feature than the last, until the network produces an output. In between the input layer and the output layer are hidden layers. And here’s where practitioners typically differentiate between neural nets and deep learning: a basic neural network might have one or two hidden layers, while a deep learning network might have dozens or even hundreds. For example, a simple neural network with a few hidden layers can solve a common classification problem. But to id…