RNNs & LSTM: Deep Learning Ransom Notes

Practical Futures is a series where members of the development team at Uncorked Studios bring realistic expectations to experiments with emerging technology. Next up in the series: deep learning and attempted imitation. Read the first installment: Chars74k, GANs, and Why Deep Learning Training Sets Matter.

When you speak a sentence, you’re following the rules of whatever language you’re speaking. You’re using explicit and implicit standards to convey an idea, and you’re trying to lose as little information as possible in the process. Most of the words you use (or at least a large percentage of them, statistically) leave your mouth based on the probability that a given word will come after or before another word you’re thinking of using. That probability can be reverse-engineered from a corpus of things you say or write. Deep learning theory and practice have shown us that any regular, well-structured data can be expressed as the probability of one token appearing given another token, but in this article I’m sticking to text.
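If “the probability of one token given another” sounds abstract, here is roughly what estimating it looks like on a toy corpus. This is not the code behind this project, just a minimal counting sketch to make the idea concrete:

```python
from collections import Counter, defaultdict

# Toy corpus standing in for a real body of someone's writing.
corpus = (
    "my cat pictures are the cutest cat pictures . "
    "the curious cat will nap after a good scratch . "
    "my cat will nap ."
)
words = corpus.split()

# Count how often each word follows each other word (bigram counts).
following = defaultdict(Counter)
for prev, nxt in zip(words, words[1:]):
    following[prev][nxt] += 1

# Turn the counts into conditional probabilities P(next word | "cat").
cat_counts = following["cat"]
total = sum(cat_counts.values())
for nxt, count in cat_counts.most_common():
    print(f"P({nxt!r} | 'cat') = {count / total:.2f}")
```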

As an experience of the business that couldn’t be able to can be sure it a lot of people who were putting their people at the pare that are a sort of crowds and content than they are a lot. To the advices and personal comment for a compliancy to context their artists and some people are seen as advertising process to a price as interactively cloud.

Whether you’re reading this for fun, or because I sent you this link and you’d feel bad if you didn’t read at least some of this: there’s a mathematical formula that does a pretty good job of explaining how one speaks or writes. It is your “voice”: it’s what makes a Sedaris book a Sedaris book and not a Fitzgerald book. There’s a distinct way a person communicates, and it is the fingerprint for how they get their point across.

Even you, Scatman John. Even you.

The really interesting part of this: sit down and pick any word at all, let’s say “cat.” Is it easy to think of what word is most likely to come after it?

“Scratch?”

“Nap?” Maybe. Seems like it makes sense.

“Pictures.” Also good.

Ok, you’ve got this “what comes next” thing going pretty well, so let’s change it up. What about before “cat”?

“The?”

“A?”

“Curious?”

“My?”

What if I let you pick the word that comes after “cat” first… let’s say “cat pictures.” What comes before it?

“My” or “the” definitely seem like they’d be the most likely from the above choices.

It turns out that language and words and syntax all follow this kind of rule: it’s easier to know what comes before something once you know what comes after it. This is an unexpected side effect of attempting to convey a completely abstract thought (“In my brain, there is an abstract concept of a picture of my cat.”) as something consumable by another person (“My cat picture is the cutest.”). I can’t cram the idea into someone else’s brain, so I have to follow the rules for getting the idea out of my head and into theirs. Write enough language down according to the rules, and convey enough ideas (which is what writers do), and the specific pattern you use to accomplish this will emerge.

These patterns (like what we saw in the datasets in the previous post) can be learned and simulated mathematically by a neural network. They’re unnerving. Not creepy, exactly, but unnerving, like a parrot talking on a cell phone. When we think of communication, we lean on an abstract idea: it’s a thing that only a human mind could come up with. What these neural networks (Recurrent Neural Networks, or RNNs) do is keep links to the future and the past so they can modify both around a kernel at the same time. Once a word is chosen, that choice influences the past and the future through something like short-term memory. These networks give the appearance of abstract idea generation because we give them the map that we as communicators use to convert ideas to words. By showing a network enough rules, we give it turn-by-turn directions to a destination without it needing to understand much else.
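To make that “short-term memory” concrete: the post doesn’t say which framework or exact architecture was used, so treat the sketch below (PyTorch, an LSTM, the WordLSTM name, the layer sizes) as assumptions for illustration. The part to notice is the state the LSTM passes from word to word.

```python
import torch
import torch.nn as nn

class WordLSTM(nn.Module):
    """Toy next-word predictor: embed each word index, update the LSTM's
    hidden state (the "short-term memory"), and score every word in the
    vocabulary as a candidate for what comes next."""

    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.to_vocab = nn.Linear(hidden_dim, vocab_size)

    def forward(self, word_ids, state=None):
        # word_ids: (batch, sequence_length) integer indices for each word.
        embedded = self.embed(word_ids)
        # `state` carries context from everything seen so far.
        output, state = self.lstm(embedded, state)
        return self.to_vocab(output), state  # next-word scores at every step

# One forward pass over a made-up three-word context and a 1,000-word vocabulary.
model = WordLSTM(vocab_size=1000)
context = torch.tensor([[11, 42, 7]])   # pretend these are "my", "cat", "picture"
scores, state = model(context)
print(scores.shape)  # torch.Size([1, 3, 1000])
```

Feeding the returned state into the next call is what lets each chosen word color everything that comes after it.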

So, going back to our initial analogy, the word “picture” comes after “cat,” which comes after “My,” but in code.

The power can be seen that the power of technical, and are the librarian, and it’s a posting technology that they are not constantly completely coming at the party process and products and putting the campaign that are also completed as a complacence.

When Nick Parish joined Uncorked Studios, I realized that he’s a person who has produced a nontrivial amount of well-structured, diverse, self-generated text. This has come in the form of years of articles about the advertising industry and sports, as well as his book on tech culture, Cool Code, Bro. All of this meant it should theoretically be possible for me to take everything he had written, run it through an RNN, and—voila!—no need for Nick.

So we went to work. Nick’s website has a trove of links to his previous works—he also graciously volunteered works in progress and draft copies. Writerly ego is stronger than all that “machines will take your job” stuff, it seems.

Implementations of RNNs, when dealing with text, generally follow a pattern whereby every unique word (or, in some cases, character) is replaced by a number; the network maps each number to a learned vector known as an embedding, operates on those, and swaps the numbers back into words at the end. Think of it as a very large decoder ring, but instead of letters, it’s every single unique word you’ve ever written down.

First, you run through the corpus and figure out what those number mappings are, convert everything Nick has ever written into numbers, and then get the network to start generating strings of words that reasonably resemble the patterns and content Nick uses.
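Here’s a minimal sketch of that flow. The decoder-ring mapping is the real idea; sample_next is a hypothetical stand-in (it just picks a word id at random) marking where the trained network would plug in:

```python
import random

corpus = "the company ran a campaign and the company loved the campaign".split()

# The "decoder ring": every unique word gets a number, and back again.
word_to_id = {word: i for i, word in enumerate(sorted(set(corpus)))}
id_to_word = {i: word for word, i in word_to_id.items()}

# Everything ever written becomes a sequence of numbers the network trains on.
encoded = [word_to_id[word] for word in corpus]
print(encoded)

def sample_next(context_ids):
    """Hypothetical stand-in for the trained RNN. A real model would score
    every word id given the context; here we just pick one at random."""
    return random.choice(list(id_to_word))

# Generate ten more word ids, then swap the numbers back into words.
generated = [word_to_id["the"]]
for _ in range(10):
    generated.append(sample_next(generated))
print(" ".join(id_to_word[i] for i in generated))
```

Replace the random choice with the network’s predicted next-word probabilities and you have, in outline, the kind of script that produced the pull quotes in this article.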

With an RNN, it’s the structure of the dataset (like articles, books, speeches, and captions) that it’s learning how to simulate, not the content, and not the ideas behind the content. It can reasonably ape what it’s been looking at and appear to create new content from those rules. “Content yes, ideas no” is the distinction that’s missing from many of the headlines around RNNs.

You’ve no doubt seen the clickbait headlines: “Computer Learns to Make Recipes,” “Computer to Replace Your Writing Job Because Look At What It Can Do By Reading Hardy Boys Books,” “Computer Writes Hardy Boys Book,” etc. The thing is: these networks can learn how to follow the rules of those books, the mechanics of sentence structure, character names, colors, and antiquated vocab. They can also be made advanced enough to pull in newer concepts and other styles that make the networks feel a little more “alive.” But the fact of the matter is they’re not writing Hardy Boys books. They’re doing the computer equivalent of a dog barking “I Love You.”

Computers still need someone with an abstract idea to turn that abstract idea into a rules-following, structured collection of information that can be inspected, reverse-engineered, and stored. Neural networks may appear to be “dreaming,” as the Philip K. Dick title asked, but in reality they are going through the (very detailed) motions of something they’ve seen.

The company is the company is strangers and public to them. Instead of technology is trading that they can be able to defend their people to have to dig them theyre complexed to the campaign, and in the companys primicy today, and the social network around to a people that can save in anded to see a programming program and purchases and private content.

I hope by now you’ve noticed that the pull quotes in this article don’t make sense, but also sort of do. The very TV-quality hacker image in this post is me grabbing pull quotes generated by a script that learned about writing about technology from a (very small) dataset of Nick Parish articles, Tweets, Facebook shares, and about 100 blindly scraped TechCrunch articles. Some interesting observations:

  • I had about 2MB of Nick’s writing, but the average document length (given that so many were Tweets) was painfully short, and the RNN wasn’t really learning sentence structure very well. This is why I beefed it up with a bunch of articles from various authors on TC, primarily to give a slight bias toward the baseline grammar and structure (see the sketch after this list).

  • Nick’s total dataset was larger than the TC dataset by about 10 to 1 when it comes to raw bytes.

  • Nick (having worked in the advertising and technology industry for so long) writes “company” and “campaign” a lot in this corpus.
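For the curious, the “beefing up” in the first bullet is, at its core, just concatenating corpora before training. The post doesn’t describe the exact mechanics, so the file names below are made up; the sketch is only meant to show how little ceremony that step involves:

```python
from pathlib import Path

# Hypothetical file names: the real corpus was assembled from scraped articles,
# Tweets, Facebook shares, and drafts rather than two tidy text files.
nick_text = Path("nick_parish.txt").read_text(encoding="utf-8")
tc_text = Path("techcrunch_articles.txt").read_text(encoding="utf-8")

print(f"Nick corpus: {len(nick_text):,} bytes")
print(f"TechCrunch corpus: {len(tc_text):,} bytes")

# Plain concatenation: the smaller TechCrunch slice nudges the network toward
# baseline grammar and structure without drowning out Nick's voice.
# (Repeating tc_text here would strengthen that bias, at Nick's expense.)
Path("training_corpus.txt").write_text(nick_text + "\n" + tc_text, encoding="utf-8")
```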

Overall, Synthetic Nick is OK at generating something you can skim and feel smart about. The interesting bit here is that even with such a small dataset, the network can generate text that “feels” real. Under scrutiny, it reads like a co-worker or a parent might sound in a dream or nightmare: somewhat familiar and hitting all the right notes but largely incoherent. Without Nick somewhere creating the voice, no generator can do much beyond regurgitating words it has seen before, along the rails of the rules in the text it’s been shown. Want tweets? Teach a computer that all language exists in 140 characters. It won’t dream up an idea, but it will definitely follow that rule.

My overall feeling here is that, as we saw in the first post, deep learning is very sensitive to its dataset, even when that data is just text. You can’t write a program to be Nick Parish, but you can write a program to reasonably imitate him. If you’ve got enough data for it to look at, it might even fool a reader into thinking it’s him. The creator can’t be cut out of the process, and no matter how much of Nick’s writing a neural network learns from, it won’t be able to imagine the next great story or the next perspective on what happens in a given industry. It won’t have a perspective on topics it wasn’t handed, either on purpose or subliminally, by whoever aggregated its training set.

It’s all amazing and mechanical and in mere infancy — and it’s only just begun being taught how to learn.
