Machine Learning in my way

I’ve seen machine learning overrated in a few ways, both by people with little experience and, more perniciously, by people deeply invested in the field.

The most common belief is that machine learning is more general and more powerful than it really is. Machine learning is good at things machine learning is good at and, of course, it’s bad at everything else. If you listen to some people though, you’d believe you could throw a neural net at any problem and get a solid solution.

I mostly chalk this up to inexperience and misplaced enthusiasm, but it’s also a result of aggressive hype by people who should know better. Karpathy’s recent viral post Software 2.0 is a great example: he makes some interesting points but leaves you with the impression that deep learning is the future of programming. The article somehow elides how problems outside a few niches (vision, speech, NLP, robotics) aren’t clearly amenable to this approach. It’s not just systems software; even most areas of business logic are still better solved by somebody experienced writing a few hundred lines of code than by machine learning.

If anything gets to be “software 2.0” it’s garbage collection and high-level languages, and deep learning isn’t even “software 3.0”¹. Neural nets are “just ‘another tool in your machine learning toolbox’” and, more importantly, machine learning is just another tool in your programming toolbox!

This has real consequences. I see people putting massive resources into machine learning-based systems when simpler solutions would be both faster to build and more effective. Take the problem of forecasting demand for items at a store. You could try doing this as a pure machine learning system, but the system would struggle—and ultimately fail—to extract all the structure you need from your data. There are a lot of factors that matter, and some aren’t going to show up in data you can realistically work with. We’d be far better off modeling a bunch of things explicitly (demand elasticity based on pricing and promotions) and relying on human experience for others (changes in consumer fashion).

The ideal system for solving a lot of hard problems has to be a hybrid: some machine learning based on data, some explicit modeling and some interactive ways to take advantage of experts. But too many people don’t design systems like this because they see machine learning as a panacea and see building a black box that operates solely on data as a goal.
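To make the hybrid idea concrete, here is a minimal sketch in Python of a demand forecast built from three parts: a data-driven baseline, an explicit pricing model, and an adjustment supplied by a human expert. Everything in it is an assumption made for illustration. The names (`ForecastInputs`, `forecast_demand` and so on) are hypothetical, the "learned" component is just an average of recent sales standing in for whatever model you would actually train, and the pricing model assumes constant elasticity.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ForecastInputs:
    recent_weekly_sales: list[float]  # units sold per week at the reference price
    reference_price: float            # price the sales history was observed at
    planned_price: float              # price planned for the week being forecast
    price_elasticity: float           # assumed constant elasticity, e.g. -1.5
    expert_adjustment: float = 1.0    # multiplier entered by a human planner


def data_driven_baseline(sales: list[float]) -> float:
    """Stand-in for the learned component: a plain average of recent sales."""
    return mean(sales)


def price_effect(reference_price: float, planned_price: float, elasticity: float) -> float:
    """Explicit model: demand scales with (planned / reference) ** elasticity."""
    return (planned_price / reference_price) ** elasticity


def forecast_demand(inputs: ForecastInputs) -> float:
    """Combine the data-driven baseline, the explicit model and the expert input."""
    baseline = data_driven_baseline(inputs.recent_weekly_sales)
    multiplier = price_effect(
        inputs.reference_price, inputs.planned_price, inputs.price_elasticity
    )
    return baseline * multiplier * inputs.expert_adjustment
```

The point of the structure is that each part can be replaced on its own: the baseline can become a real trained model, the elasticity can be estimated rather than assumed, and the expert adjustment gives a planner a way to encode something like a change in fashion that the data has not caught up with yet.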

I’m not surprised by this state of events. The open secret of research, academic or industrial, is that only the things that work see the light of day. How many problems have numerous teams tried to solve with machine learning and failed? You don’t hear about many of them except through back channels if you chat with active researchers in the field. (One example: I know a bunch of people have tried and failed to apply deep learning to various problems in program synthesis, but only because I heard it through a research grapevine.)

A related problem is that people overstate the impact of machine learning in a product. A lot of consumer products now feature machine learning at their core—think of Quora and Facebook’s feed. Since machine learning is the new hotness and deeply technical, the products’ success must be due to machine learning!

Thing is, I bet the impact of machine learning is marginal at best: most of the effect is explained by the social design of the tools. What really matters is that Quora has a feed and lets you follow people and topics. I would not be surprised if a much rougher algorithm (perhaps a heuristic-based rules engine) could produce a feed as good as, if not better than, Quora’s black magic! I use other products that have a similar design to Quora without any “machine learning” (like Reddit) and, frankly, my Reddit front page does a better job surfacing things I care about than Quora does! (The rest of Quora’s design, the non-machine-learny bits, fits me a lot better than Reddit’s.)
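For a sense of how little a "heuristic-based rules engine" for a feed might need, here is a rough sketch in Python. It is not how Quora or Reddit rank anything. The `Post` fields, the rules and every weight are made up for illustration, and a real system would tune them by hand over time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    title: str
    author: str
    topic: str
    upvotes: int
    age_hours: float


def score(post: Post, followed_authors: set[str], followed_topics: set[str]) -> float:
    """Apply a handful of hand-written rules; all weights are illustrative guesses."""
    points = 0.0
    if post.author in followed_authors:
        points += 10.0                       # posts from people you follow matter most
    if post.topic in followed_topics:
        points += 5.0                        # then posts in topics you follow
    points += min(post.upvotes, 100) / 20.0  # popularity helps, capped at 5 points
    points -= post.age_hours / 24.0          # older posts lose one point per day
    return points


def build_feed(
    posts: list[Post],
    followed_authors: set[str],
    followed_topics: set[str],
    limit: int = 10,
) -> list[Post]:
    """Return the highest-scoring posts first."""
    ranked = sorted(
        posts,
        key=lambda post: score(post, followed_authors, followed_topics),
        reverse=True,
    )
    return ranked[:limit]
```

Note that the follow graph does most of the work here: the two largest weights come straight from who and what the reader chose to follow, which is the social design rather than anything learned from data.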

One thing I find illuminating is how many quantitative trading shops have not embraced machine learning whole-heartedly. Some have, of course, but a bunch of others continue making obscene amounts of money with relatively straightforward hand-tuned algorithms. Again: a rules engine filled with expert-written heuristics works eerily well! Some strategies were developed or discovered with machine learning techniques (also known as “statistics”) but others are created more thanks to deep domain expertise.

My point is not that machine learning is useless for trading: it clearly has its place. Rather, the point is that it very much does not rule the roost, contrary to what you might expect just from the hype.

The final way I see machine learning being overrated will be painfully familiar to anyone who’s tried implementing machine learning systems in production: machine learning is way more fiddly than it seems.

You might think you can just apply some machine learning algorithm you’ve heard about to your problem, but chances are it won’t work nearly as well as the blog post or paper you got it from. A lot of details never make it to papers; they exist solely as institutional knowledge among professionals in the field. You’ll need to spend a lot of time configuring the algorithm for your problem, even if your problem is almost identical to the original you’re working from. You’ll need to tune hyperparameters, find the right architecture, preprocess your data in weird ways, maybe even restate parts of your problem… You can’t just throw your problem at an existing algorithm; you’ll either need extensive experience or a lot of trial and error.
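As a toy illustration of what "tuning hyperparameters" means in practice, here is a sketch in Python that grid-searches a single knob: the smoothing factor `alpha` of simple exponential smoothing, scored by one-step-ahead mean absolute error. The function names are hypothetical and the setup is deliberately tiny. Real tuning usually involves many interacting knobs and a proper held-out validation set, which is where the trial and error comes from.

```python
def one_step_errors(series: list[float], alpha: float) -> list[float]:
    """Absolute one-step-ahead forecast errors for simple exponential smoothing."""
    level = series[0]
    errors = []
    for actual in series[1:]:
        errors.append(abs(actual - level))            # the forecast is the current level
        level = alpha * actual + (1 - alpha) * level  # then update with the new value
    return errors


def mean_absolute_error(series: list[float], alpha: float) -> float:
    """Average the one-step-ahead errors for a given smoothing factor."""
    errors = one_step_errors(series, alpha)
    return sum(errors) / len(errors)


def grid_search_alpha(series: list[float], candidates: list[float]) -> float:
    """Try every candidate smoothing factor and keep the one with the lowest error."""
    return min(candidates, key=lambda alpha: mean_absolute_error(series, alpha))
```

Even here the best value depends entirely on the data: a steadily trending series favors a high `alpha`, while a noisy, flat one favors a low value, so a setting copied from someone else's problem is unlikely to be right for yours.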

Machine learning is a powerful, useful set of techniques and has allowed us to solve problems we couldn’t have handled before. The supply chain optimization system I’m working on today, for example, will benefit from adding some machine learning systems on top of the classical operations research foundation we have now.

But, all that said, machine learning is nowhere near as general, powerful or impactful as people seem to believe!

footnotes
¹ If I had to make a bet on what future technology will be worth calling “software 3.0”, I’d say it’s interactive development with tools backed by program synthesis. But that might just be wishful thinking and it is a long way out!

Reference: Quora
Mr. Tikhon Jelvis
