
Posts

Showing posts from January, 2018

Software 2.0

I sometimes see people refer to neural networks as just “another tool in your machine learning toolbox”. They have some pros and cons, they work here or there, and sometimes you can use them to win Kaggle competitions. Unfortunately, this interpretation completely misses the forest for the trees. Neural networks are not just another classifier; they represent the beginning of a fundamental shift in how we write software. They are Software 2.0. The “classical stack” of Software 1.0 is what we’re all familiar with — it is written in languages such as Python, C++, etc. It consists of explicit instructions to the computer written by a programmer. By writing each line of code, the programmer is identifying a specific point in program space with some desirable behavior. In contrast, Software 2.0 is written in neural network weights. No human is involved in writing this code because there are a lot of weights (typical networks might have millions), and coding directly in weights…
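The contrast the excerpt draws — a programmer picking a point in program space by hand versus optimization finding one via weights — can be sketched in a few lines. This is a toy illustration, not Karpathy's example: the data, threshold, and one-weight logistic model are all made up for demonstration.

```python
import math
import random

# Software 1.0: a human writes the decision rule explicitly.
def is_positive_1_0(x):
    # explicit instruction: a programmer chose this threshold by hand
    return x > 0.5

# Software 2.0: the "program" is a set of weights found by optimization.
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, steps=2000, lr=0.5):
    w, b = 0.0, 0.0  # the learned "source code": just two numbers here
    for _ in range(steps):
        x, y = random.choice(data)
        p = sigmoid(w * x + b)
        # log-loss gradient nudges the weights toward desirable behavior
        w -= lr * (p - y) * x
        b -= lr * (p - y)
    return w, b

# hypothetical labeled examples standing in for a dataset
data = [(0.1, 0), (0.2, 0), (0.4, 0), (0.6, 1), (0.8, 1), (0.9, 1)]
random.seed(0)  # make the toy training run repeatable
w, b = train(data)

def is_positive_2_0(x):
    # the same behavior, but nobody wrote the threshold — it was learned
    return sigmoid(w * x + b) > 0.5
```

Both functions end up with similar behavior, but the second one's "code" (`w`, `b`) was specified indirectly through data and an objective — which is the shift the post is describing, scaled up to millions of weights.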

Machine Learning in my way

I’ve seen machine learning overrated in a few ways, both by people with little experience and, more perniciously, by people deeply invested in the field. The most common belief is that machine learning is more general and more powerful than it really is. Machine learning is good at things machine learning is good at and, of course, it’s bad at everything else. If you listen to some people, though, you’d believe you could throw a neural net at any problem and get a solid solution. I mostly chalk this up to inexperience and misplaced enthusiasm, but it’s also a result of aggressive hype by people who should know better. Karpathy’s recent viral post Software 2.0 is a great example: he makes some interesting points but leaves you with the impression that deep learning is the future of programming. The article somehow elides how problems outside a few niches (vision, speech, NLP, robotics) aren’t clearly amenable to this approach. It’s not just systems software; even most areas of…

Neural Networks vs. Deep Learning

What’s the difference between deep learning and a regular neural network? The simple answer is that deep learning is larger in scale. Before we get into what that means, let’s talk about how a neural network functions. To make sense of observational data (like photos or audio), neural networks pass data through interconnected layers of nodes. When information passes through a layer, each node in that layer performs simple operations on the data and selectively passes the results to other nodes. Each subsequent layer focuses on a higher-level feature than the last, until the network creates an output. In between the input layer and the output layer are hidden layers. And here’s where users typically differentiate between neural nets and deep learning: a basic neural network might have one or two hidden layers, while a deep learning network might have dozens or even hundreds. For example, a simple neural network with a few hidden layers can solve a common classification problem…
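The forward pass described above — weighted sums at each node, a simple nonlinearity, layer feeding into layer — can be sketched with a single hidden layer. The weights below are hand-picked to compute XOR (a classic problem a network with no hidden layer cannot solve); in practice a network would learn such weights from data.

```python
# A tiny feedforward network: input -> one hidden layer -> output.
# Each node computes a weighted sum of its inputs followed by a ReLU.
def relu(z):
    return max(0.0, z)

def forward(x1, x2):
    # hidden layer: two nodes extracting intermediate features
    h1 = relu(1.0 * x1 + 1.0 * x2)        # roughly "x1 OR x2"
    h2 = relu(1.0 * x1 + 1.0 * x2 - 1.0)  # roughly "x1 AND x2"
    # output layer combines the hidden features into the final answer
    return 1.0 * h1 - 2.0 * h2

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, forward(a, b))  # XOR truth table: 0, 1, 1, 0
```

A "deep" network is this same structure with many more hidden layers stacked between input and output, each building on the features the previous layer extracted.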