For Now, AI Tinkering Has Outpaced The Cloud
For what seemed like a decade, tech observers, when explaining the startup phenomenon in lectures or blog posts, would recite a narrative about the ever-decreasing cost of building software applications and making them available to the public. Physical servers were once required, the tale goes, but now anybody can spin up an instance in the cloud for nearly no money. What once took $20,000 in hardware and a deep pool of server-admin knowledge can now be handled with an AWS account and 30 minutes of browsing Stack Overflow.
With the cost of servers approaching zero (at least for applications with low traffic and smaller data loads), an age of tinkerers was born. Students, night-time coders, anybody who could string a script together was able to flit from project to project, giving the world a critical mass of new and clever companies that seemingly popped up overnight.
The same phenomenon is playing out within the world of AI computing right now. This is a subset of the startup movement overall, to be sure, but what's happening is the germination of another large crop of tinkerers, this one trading in machine learning and deep learning models. College freshmen and sophomores now regularly plunge into the deep end of a pool that was limited to academia just a few years ago.
One of the interesting side effects of this rush to AI is the corresponding bump in engineers building their own computers for the express purpose of training deep learning models, work that requires massive parallel processing. It's a departure from the now-standard use of cloud-based servers that form the backbone of most startups' applications.
It's not that the resources to build these models aren't available in the cloud, but the required processing power can get quite expensive, especially on a tinkerer's budget, and especially when the model may not be constructed correctly or as efficiently as possible, which can mean extra processing time. Training a neural network is most efficiently done on a GPU, which contains thousands of cores, as opposed to a CPU, which has only a handful.
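To make that concrete, here's a minimal sketch, assuming only a machine with TensorFlow installed (and an Nvidia card with CUDA drivers if a GPU is present), of steering one of those parallel computations onto a GPU. The matrix sizes are arbitrary.

```python
# A minimal sketch (assuming TensorFlow is installed) of placing a
# computation on a GPU when one is available.
import tensorflow as tf

# A machine with an Nvidia card and CUDA drivers will report one or
# more GPUs here; otherwise the list is empty and we fall back to CPU.
gpus = tf.config.list_physical_devices("GPU")
device = "/GPU:0" if gpus else "/CPU:0"

with tf.device(device):
    # A large matrix multiplication: the kind of embarrassingly
    # parallel arithmetic that dominates neural-net training.
    a = tf.random.normal((4096, 4096))
    b = tf.random.normal((4096, 4096))
    c = tf.matmul(a, b)

print(f"Ran on {device}; result shape: {c.shape}")
```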
Harnessing enough GPU power to build a model through AWS or Azure can cost between $8 and $15 an hour. Some models can take five or more days to build at a cost of $1,000 in cloud computing. For somebody building multiple models—and making mistakes and alterations while doing it—using cloud-based processors becomes incredibly expensive.
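The back-of-the-envelope math, using only the figures above, is easy to check:

```python
# Cloud cost of a single five-day training run at the hourly rates
# cited above ($8-$15 per GPU-hour).
hours = 5 * 24                 # five days of continuous training
low_rate, high_rate = 8, 15    # dollars per GPU-hour
print(f"${hours * low_rate:,} to ${hours * high_rate:,}")  # $960 to $1,800
```

At the low end, a single run lands right around the $1,000 figure; a few botched runs quickly multiply it.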
Enter the tinkerer's alternative: building an AI box from scratch.
The required parts (these are merely an example):
• 2-4 Nvidia GTX 1080 Ti cards: $699 each
• 1 big motherboard: $379
• 64 GB of memory: $600
• 128 GB SSD for the OS: $100
• 500 GB SSD for data: $200
• 1 Intel i7 CPU: $315
• 1,600-watt power supply: $299 (don't plug this into old wiring!)
• 1 big box to hold it all: $169
Tinkerers can spend more or less, depending on their needs and budget; a rough total for this example build is sketched below. More information on AI hardware can be found on Tim Dettmers' blog, a fount of knowledge on deep learning.
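For a sense of the totals, here's a back-of-the-envelope sketch that sums the example list above and converts the result into cloud GPU-hours at the $8-$15 rates cited earlier. The reading of $699 as the per-card price is my assumption.

```python
# Totaling the example build above and translating the price into
# hours of rented cloud GPU time at $8-$15 per hour. The per-card
# reading of the $699 GPU price is an assumption.
gpu_price = 699
other_parts = 379 + 600 + 100 + 200 + 315 + 299 + 169  # everything but the GPUs

for n_gpus in (2, 4):
    total = other_parts + n_gpus * gpu_price
    print(f"{n_gpus} GPUs: ${total:,} "
          f"buys roughly {total // 15:,}-{total // 8:,} cloud GPU-hours")
```

By this rough math, the box pays for itself once it displaces a few hundred hours of rented GPU time, which is just a handful of five-day training runs.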
With its own box, an early startup working on AI solutions has a distinct advantage over competitors that don't have one. With such a tool at their disposal, AI tinkerers can work on ideas and models without fear of running up a big bill. A mistake no longer costs hundreds or thousands of dollars; it merely ties up the hardware, a cost all its own, but one most early-stage startups and their founders will happily bear.
Like everything in computing, the age of the homebuilt AI box will pass
It's arguable that no company has benefited more from the AI explosion than Nvidia. The company whose cards were once primarily the province of high-end machines that needed superior graphics and quick response rates has found itself at the heart of the AI movement. Nvidia's GPU cards, with their ability to run so many computations in parallel, are the workhorses of AI engineers. Running complicated neural nets is almost impossible without them.
Nvidia knows this, of course, and it's seizing the moment. The company recently unveiled what it calls GPU Cloud, which will allow anybody building deep learning networks in the cloud to package their logic into specially designed containers, backed by Nvidia GPUs in the datacenters of the standard cloud providers: AWS, Azure, Google and others.
Nvidia announced its network at its annual GPU conference, an event that, I've been told by people who attended, included roughly four times as many VCs as it did last year. So if there were any lingering doubt that the lid is off AI, let there be none. This is further evinced by a handy set of resources pushed out by Andreessen Horowitz, which the firm calls the AI Playbook. It contains information and links that can help people begin to understand AI, and even become practitioners.
Nvidia hasn't rolled GPU Cloud out to the public just yet. The company is likely wondering what to make of the recent news from Google, which announced the second generation of its Tensor Processing Unit, a chip made specifically for the toil of machine learning. Google will use the chips in its datacenters to provide power to those building models via its cloud network.
Not Hotdog
True AI methods (not the fictional kind behind HAL 9000 or Tony Stark's digital assistant) have entered the modern entertainment lexicon. Yes, it was in a show about tech, but a very popular one.
Those who watch HBO's Silicon Valley may have recently downloaded an iOS app dubbed Not Hotdog, which was featured on the show and built in real life, with real AI, in a gambit of clever marketing. The app is an image-analyzing piece of software that, using a local neural net model built with Google's open-source TensorFlow library, gives users an answer to that most vital of questions: is this thing I've just snapped a picture of a hot dog? The neural net dispatches this task quite well.
There's a thread on Hacker News about the app; the engineer behind it checks in at the top and explains how they made it. Creating the model required 150,000 images—3,000 of them containing hot dogs, 147,000 of them without hot dogs—and 80 hours of processing time on an Nvidia GPU.
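For a flavor of the general technique, though emphatically not the app's actual architecture (the engineer details that in the thread), here's a minimal binary image classifier in Keras on TensorFlow. The layer sizes, hyperparameters and the images/ directory layout are all hypothetical.

```python
# A minimal sketch of a binary "hot dog / not hot dog" image
# classifier in Keras. NOT the app's actual design; the architecture,
# hyperparameters and directory names here are hypothetical.
import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Rescaling(1.0 / 255, input_shape=(128, 128, 3)),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(1, activation="sigmoid"),  # probability of class 1
])

model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

# Hypothetical layout: images/hotdog/*.jpg and images/not_hotdog/*.jpg.
# Labels are assigned alphabetically: hotdog -> 0, not_hotdog -> 1.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "images", image_size=(128, 128), batch_size=32, label_mode="binary"
)
model.fit(train_ds, epochs=10)
```

With two labeled folders of photos, the same scaffolding answers any this-or-that visual question; the hard part, as the 80-GPU-hour figure suggests, is gathering the data and paying for the training time.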
Data wanted
The interest in AI methods is so sharp at the moment that many would-be tinkerers are picking random projects to work on, things they have no real interest in, just because free data is available. Rich sets of data, usually a minimum of 10,000 entries, sometimes with hundreds of fields, have always been a prerequisite to carrying out high-end AI.
There's more data available now than ever, but some engineers' good ideas remain on hold because the required data isn't available. Still more developers may not have that great idea yet, but they want to be proficient in the methods that will soon be a standard part of the technology stack.
Again, the trends on HN have been instructive here. A 156-comment thread asks, "What do you use machine learning for?" The thread has proven popular, undoubtedly because of a latent demand within the general developer ranks to tinker and build something, anything, with machine learning. The question, as always: what?
There is a sense within the engineering world that, by not knowing how to employ some set of AI methods, a developer may be left behind.
In some ways it's still early
Some of the best methods, outside of getting basic frameworks set up, aren't yet found on Stack Overflow or GitHub, but are buried in dissertations and academic papers. That requires more hunting and detective work than many engineers might be used to, but it's where the best work can be found at the moment. There's little doubt that the richness of information available on the web for building deep learning applications will grow quickly. But for those who don't mind doing some sifting, solutions can be found now.
Good Times
As I plod away on my own projects in Ruby and JavaScript, I find myself captivated by the work going on around me, much of it within feet of my desk at Northwestern. Other Entrepreneurs in Residence, as well as 19- and 20-year-old students, are building applications on top of AI that do increasingly impressive things. TensorFlow, CUDA, epochs: I'm learning by osmosis, and by asking small questions that, I hope, won't be too annoying.
The result is a little more understanding, and a lot more awe at what's possible.
Christopher Steiner is the founder of ZRankings and Aisle50 (YC S11), which was acquired by Groupon in 2015.