I Tried to Get an AI to Write This Story: Paul Ford
What a time for artificial intelligence! Google announced a new AI-powered set of products and services at its I/O conference for developers, including one called Duplex that makes phone calls for you and sounds just like a real person, which freaked everyone right out. The Trump administration held some sort of AI summit with representatives from Amazon, Facebook, Microsoft, Nvidia, and buttermaker Land O’Lakes, presumably because the White House has so much churn. Add to this the public reveal that the musician Grimes and Elon Musk are dating, after the two shared a joke about AI.
And yet when people ask what the software company I run is doing with machine learning, I say, calmly, “Nothing.” Because at some level there’s just nothing to do.
The hotness of the moment is machine learning, a subfield of AI. In machine learning you take regular old data (pictures, emails, songs) and run it all through some specialized software. That software builds up a “model.” Since the model encodes what came before, it’s predictive: you can feed the model incomplete data and it will suggest ways to complete it. A trivial example: Anyone, including you and me, can feed the alphabet to a “recurrent neural network,” or RNN. That makes a model of the alphabet. Now you execute that model (maybe by running a script) and give it the letters “ABC.” If your specially trained neural network is having a good day, it’ll say “D.”
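To make the alphabet example concrete, here is a minimal sketch in Python. It is not an RNN: as a stand-in, it simply counts which character followed each short run of characters in the training text, which is enough to show the "train a model, then ask it to complete something" loop. The function names `train` and `predict` and the `order` parameter are made up for this illustration.

```python
from collections import Counter, defaultdict


def train(text, order=3):
    """Count which character follows each context of 1..order characters."""
    model = defaultdict(Counter)
    for i in range(1, len(text)):
        for n in range(1, order + 1):
            if i - n >= 0:
                context = text[i - n:i]
                model[context][text[i]] += 1
    return model


def predict(model, prompt, order=3):
    """Return the most frequently seen next character, or None if unknown.

    Tries the longest context first and backs off to shorter ones.
    """
    for n in range(min(order, len(prompt)), 0, -1):
        context = prompt[-n:]
        if context in model:
            return model[context].most_common(1)[0][0]
    return None


# "Feed the alphabet" to the model, then ask it to complete "ABC".
model = train("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
print(predict(model, "ABC"))  # D
```

A real neural network learns weights rather than a lookup table, so it can generalize to contexts it has never seen. This toy can only repeat what it counted.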
Go up a level: Feed your neural network a million pictures with captions, then feed it a picture without a caption and ask it to fill in the missing caption. Feed it countless emails with replies, then show it one without a reply and ask it what to say.
Since we use software all the time, we create an unbelievable amount of data. You can’t hire enough humans to weed through it, so we turn to computers, which lack discretion but make up for it in vigor. The biggest data holders—Google, Apple, Facebook, Microsoft, Amazon, financial companies, and, sure, Big Butter—are into AI for lots of reasons. But most important is that they’ve got all that data and not enough programmers to make sense of it. Machine learning is an enormous shortcut, a path to new products and big savings.
So out of curiosity and a deeply optimistic laziness, I set out to learn enough about machine learning that I could feed a neural network everything I’ve ever written and have it write an article, or even just a paragraph, that sounded like me. The first wall I hit is that, even for a nerd who’s used to befuddlement, machine learning is opaque. Reading up on it means relearning many words, absorbing acronyms such as RNN or LSTM (long short-term memory). People talk about the temperature parameter and cooling functions and simulated annealing. I am a veteran of jargon, and trust me, this is one big epistemological hootenanny.
Even worse, when you look under the rock at all the machine learning, you see a horrible nest of mathematics: Squiggling brackets and functions and matrices scatter. Software FAQs, PDFs, Medium posts all spiral into equations. Do I need to understand the difference between a sigmoid function and tanh? Can’t I just turn a dial somewhere?
It all reminds me of Linux and the web in the 1990s: a sense of wonderful possibility if you could just scale the wall of jargon. And of course it’s worth learning, because it works.
It works because what machine learning does is write software for you. You feed data to a program and it spits out a new program for classifying data. The big software people often don’t even know what’s happening inside the model. This should give us pause, but asking Silicon Valley to pause for reflection is like asking a puppy to drop its squeaky toy.
Here’s more good news: Machine learning is amazingly slow. We’re so used to computers being ridiculously fast, doing thousands of things at once—showing you a movie and connecting to dozens of Wikipedia pages while you chat in one window, write in a word processor, and tweet all the while (admittedly I might have a problem). But when I tried to feed a machine-learning toolkit all my writing in the hope of making the computer write some paragraphs for me, my laptop just shook its head. It was going to take at least a night, maybe days, to make a model of my prose. At least for now, it’s faster for me to write the paragraphs myself.
But I’d already read tutorials and didn’t want to give up. I’d downloaded and installed TensorFlow, a large machine-learning programming environment produced by Google and released as open source software. Fishing around, I decided to download my Google calendar and feed all my meetings to TensorFlow to see if it could generate new, realistic-sounding meetings. Just what the world needs: a meeting generator.
Unfortunately, my meetings are an enormous pile of events with names like “Staffing,” “Pipeline,” “John x Paul,” and “Office happy hour.” I ran a script once to load the data, then ran another script to spit out calendar invites. However, on that trial run I set the wrong “beam” (God only knows what that is) and the RNN just produced the word “pipeline” over and over again. To which I must say, fair. Sales = my life.
The thing is, that might look like failure. But I’d fed my machine learner a few thousand lines of text (tiny by machine learning standards), and it had learned one word. I was almost as proud as when I thought my infant son said “cat.” I was back to the seminal 1950 paper by Alan Turing in which he proposed simulating a child via computer. “Presumably the child brain is something like a notebook as one buys it from the stationer’s,” he wrote. “Rather little mechanism, and lots of blank sheets.”
Change the settings, try again. After 50 “epochs” (when the program reads in all of your data one time, that’s an epoch—training a network requires beaucoup epochs) I had it generating meetings with titles like “BOOK,” “Sanananing broces,” and “Talking Upgepteeelrent,” even though I’ve never talked Upgepteeelrent with anyone. After a hundred epochs, I had meetings like “Broam Shappery” and “DONKER STAR E5K.”
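For the curious, an epoch can be sketched in a few lines of Python. This is a toy under loud assumptions: instead of a neural network it fits a single number `w` so that `w * x` matches `y`, and the data, learning rate, and function name are invented for illustration. The shape of the loop is the point: the outer loop counts epochs, and the inner loop reads every example once.

```python
def train(data, epochs, learning_rate=0.01):
    """Fit y = w * x by gradient descent.

    Returns the learned w and the mean squared error seen in each epoch.
    """
    w = 0.0
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:  # one full pass over the data is one epoch
            error = w * x - y
            total += error ** 2
            w -= learning_rate * 2 * error * x  # nudge w to shrink the error
        losses.append(total / len(data))
    return w, losses


# Made-up data where the right answer is w = 2.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w, losses = train(data, epochs=50)
print(round(w, 3), losses[0] > losses[-1])
```

Each pass leaves the model a little less wrong than the one before, which is why the gibberish slowly turns into something meeting-shaped as the epoch count climbs.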
Many hours passed. I was so engrossed in simulating meetings that I missed a real sales pipeline meeting. So I went home, where I have a faster computer with a graphics processing unit, or GPU. GPUs have turned out to be the secret weapon of Bitcoin miners and machine learners. That’s because they’re good at carrying out massive numbers of calculations at the same time. A regular microprocessor is sort of a logic-powered sausage maker; you feed it meat (instructions) and it processes the meat and produces sausage (output) all day long. A GPU is like thousands of sausage grinders grinding at once. What kinds of problems can be decomposed into little tasks that can all run at once? Calculating the lighting in a 3D scene. Mining Bitcoins. And machine learning. These things can be sped up dozens, even hundreds of times.
Sadly, even though I followed the instructions, I couldn’t get Linux to recognize my graphics card, which after 20 years of using Linux feels more like a familiar feature than a bug. Of course, all would not be lost: I could jump online and rent a TPU, or Tensor Processing Unit, from Google (a tensor is a math thing where things connect to other things) using its cloud services. Microsoft Corp. has cloud machine learning for $50 a month for “100 Managed Models,” and Amazon.com Inc. has “Elastic GPUs” for 5¢ an hour. Google will also rent you a computer for about that. But if you want to rent a Google TPU and blast through a ton of machine learning tasks, it’ll cost $6.50 an hour, billed by the second. Is it worth 130 times more money to use a TPU to mess with tensors? If you’re looking at tons of satellite imagery or MRIs—probably.
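The 130 figure is just the ratio of the two hourly prices quoted above. A quick check in Python, using only those two numbers; the eight-hour overnight run is a hypothetical added for scale, and prices are kept in cents to avoid rounding surprises.

```python
GPU_CENTS_PER_HOUR = 5    # Amazon "Elastic GPUs," as quoted above
TPU_CENTS_PER_HOUR = 650  # Google TPU at $6.50 an hour, as quoted above


def cost_dollars(cents_per_hour, hours):
    """Total cost in dollars for a run of the given length."""
    return cents_per_hour * hours / 100


ratio = TPU_CENTS_PER_HOUR / GPU_CENTS_PER_HOUR
print(ratio)  # 130.0

# Hypothetical overnight training run of eight hours.
HOURS = 8
print(cost_dollars(GPU_CENTS_PER_HOUR, HOURS))
print(cost_dollars(TPU_CENTS_PER_HOUR, HOURS))
```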
I went back to my work laptop and applied a skill that’s fundamental to programming: cheating. I switched from “character”-based neural networks to training against “words”—and since my pet neural network was no longer learning the alphabet but merely looking at “tokens,” my meetings got much more plausible in a hurry.
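Why the switch helps can be sketched without any neural network at all. The Python below is an illustration under stated assumptions: it swaps in a simple Markov chain (pick the next token at random from whatever followed the current one in the training data), and the helper names and the `<s>` and `</s>` start and end markers are invented for the example. The four titles are the ones quoted earlier.

```python
import random
from collections import defaultdict

titles = ["Staffing", "Pipeline", "John x Paul", "Office happy hour"]


def tokenize(title, level):
    """Split a title into characters or into whitespace-separated words."""
    return list(title) if level == "char" else title.split()


def build_chain(titles, level):
    """Map each token to the list of tokens that followed it."""
    chain = defaultdict(list)
    for title in titles:
        tokens = ["<s>"] + tokenize(title, level) + ["</s>"]
        for current, following in zip(tokens, tokens[1:]):
            chain[current].append(following)
    return chain


def generate(chain, level, rng, max_tokens=40):
    """Walk the chain from the start marker until the end marker appears."""
    out = []
    token = "<s>"
    for _ in range(max_tokens):
        token = rng.choice(chain[token])
        if token == "</s>":
            break
        out.append(token)
    separator = "" if level == "char" else " "
    return separator.join(out)


rng = random.Random(0)
print(tokenize("John x Paul", "char"))
print(tokenize("John x Paul", "word"))
print(generate(build_chain(titles, "char"), "char", rng))
print(generate(build_chain(titles, "word"), "word", rng))
```

At the character level the generator has to rediscover spelling and can wander into non-words. At the word level every token it emits is already a real word from the calendar, so the only thing left to get wrong is the order.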
After 2,000 epochs, it got to some relatively good meetings: “Paul and Paul!,” “Sarony Hears,” and the dreaded “Check-in,” but it was still mostly producing stuff like “Sit (Contench: Proposal/Gina Mcconk.” I started to feel why everyone is so excited: There is always, always one more knob to turn, one other thing to tweak that could make the computer more thoughtful-seeming. Or, as then-Ph.D. student Andrej Karpathy wrote in a 2015 essay, The Unreasonable Effectiveness of Recurrent Neural Networks: “I’m training RNNs all the time and I’ve witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.” He’s currently director of AI at Tesla Inc. His neural network must have been more than just amusing.
Messing with machine learning scratches a nerd itch to understand the world and master it a little, too: to reduce reality to inputs and outputs, and remix it. I wanted to forget my family and my company and just fall backward into this world of cloud TPUs, feeding it ever more data and letting it create ever more surprising models that I would explore and filter. As you train the model, it keeps getting smarter. Watching a machine-learning model train itself is like watching a movie montage. At the end, a robotic Rocky runs up the stairs of the Philly art museum and raises his robot arms in the air. It’s too bad that Robot Rocky was trained on a data set of hockey films instead of boxing, but it’ll still be fascinating to watch him enter the ring and try to score a goal.
Finally, I just let ’er crank for 20,000 epochs and went home, but the results weren’t any better in the morning. Or worse. They included: “Knight Days,” “Happy Sales,” “Company and home catchup,” “Chit Planning personal bus. Pitch Lunch: Wendy no get,” and “Tyler chat Deck.” I don’t know what it says of my life that all these could be real invites.
I’d tapped out the limit of what I could do without learning more. I’d learned that machine learning is very slow unless you use special equipment, and that my life, at least by the meetings I attend, is pretty boring. I accept both of those things. The reality is that my corpus wasn’t big enough; I need millions, billions of meetings to build a good predictive model. But imagine what I could do! I have no idea! Get me a whiteboard!
I work in software, and machine learning is the big new thing, but I’m not worried, nor are we retooling our company. Machine learning is important, but it’s not ready for civilians (although check out lobe.ai to see how things might look in the future). As with all software, machine-learning tools still need people to come along to make them look good and teach them how to behave. At least for now, computers need people as much as we need them.
Also, why bother? The amount of lock-in the big players have is ridiculous. They have the data, the software, and the engineers. Don’t want to give Google your money? You can jump onto Amazon’s SageMaker platform and get yourself a machine with 8 GPUs and 616 gigabytes of memory across all its processors for $24.48 an hour. Today, training models is slow; tomorrow, your dishwasher will be training a neural network on your dishes, the better to clean them.
In the meantime, for the biggest tech companies, there’s almost unlimited upside. And for none more than Google, an online advertising company with a sideline in search. It didn’t set out to be an ad company, but it is, and its market value is around $750 billion, so it will have to accept that. It has a ton of data. And machine learning is really effective at productizing (a real word) big data.
So if I’m Google, the absolute, most horrible, worst-case outcome is that I will be able to use what machine learning gives me and apply it to my enormous suite of advertising products and make them smarter and better and more useful, and do smarter and better search across the enormous swaths of culture where I charge a toll, which includes YouTube, all world geography, and (practically) the web itself. Plus, I can make it easier to use Android phones, which I also indirectly control.
Simultaneously I, Google, will release TensorFlow, and that will bring a huge group of expensive-to-recruit engineers up to speed on the tools we use internally, creating in them a great desire to come and do machine learning at our massive scale, where they can have all the TPU hours they want. And that will add up to tens of billions of dollars over the years.
But—still channeling the Google in my heart—in my wildest dreams I will open up entirely new product lines around machine vision, translation, automatic trading services, and generate many hundreds of billions of dollars in value, all before machine learning succumbs to the inevitable downward pressure and gets too cheap and easy.
I mean, even if TPUs shrink and everyone in the world can do machine learning, I’ll have the data. The beautiful, expensive-to-acquire data. I will have turned all my maps into self-driving cars, all my conversations into phones that have conversations for you, all my emails into automated replies. And I will be providing the cloud infrastructure for a whole machine learning world—clawing back what’s rightfully mine from those mere booksellers at Amazon—because my tools will be the standard, and our data will be the biggest, and the applications the most immense. Some of it will be problematic. The cops can search for people who might become criminals, the credit agencies can predict people who will have bad credit, the homeland security offices of many nations can filter through their populations and make lists of questionable value. We will be the infrastructure for the whole thing.
At the worst, I, Google, will merely succeed wildly. At the best, I will be the foundational technology for a bold new digital modernity in which the computer is deeply embedded in human life in ways we can only glimpse today.