Text Generation
Natural language generation (NLG) can tell a story – much as a human analyst would – by writing the sentences and paragraphs for you. It can also summarize reports.
“Conversations with systems that have access to data about our world will allow us to understand the status of our jobs, our businesses, our health, our homes, our families, our devices, and our neighborhoods — all through the power of NLG. It will be the difference between getting a report and having a conversation. The information is the same but the interaction will be more natural.”
Algorithms
Text Generation with Markov Chains
A Markov chain is a stochastic process in which the next event in a sequence depends only on the previous event. In our case the state is the previous word (unigram), the previous two words (bigram), or the previous three words (trigram). These are known more generally as n-grams, since we use the last n words to generate the next possible word in the sequence. A Markov chain usually picks the next state via probabilistic weighting, but here that would produce text that is too deterministic in structure and word choice. You could play with the weighting of the probabilities, but a random choice helps make the generated text feel original.
Corpus: The dog jumped over the moon. The dog is funny.
Language model:
(The, dog) -> [jumped, is]
(dog, jumped) -> [over]
(jumped, over) -> [the]
(over, the) -> [moon]
(the, moon) -> [#END#]
(dog, is) -> [funny]
(is, funny) -> [#END#]
```python
import random
import string


class MarkovModel:
    def __init__(self):
        self.model = None

    def learn(self, tokens, n=2):
        """Map each n-gram to the list of tokens observed to follow it."""
        model = {}
        for i in range(0, len(tokens) - n):
            gram = tuple(tokens[i:i + n])
            token = tokens[i + n]
            if gram in model:
                model[gram].append(token)
            else:
                model[gram] = [token]
        # mark the final n-gram as a possible end of the sequence
        final_gram = tuple(tokens[len(tokens) - n:])
        if final_gram in model:
            model[final_gram].append(None)
        else:
            model[final_gram] = [None]
        self.model = model
        return model

    def generate(self, n=2, seed=None, max_tokens=100):
        if seed is None:
            seed = random.choice(list(self.model.keys()))
        output = list(seed)
        output[0] = output[0].capitalize()
        current = tuple(seed)
        for _ in range(n, max_tokens):
            # get the next possible set of words for the current n-gram
            if current in self.model:
                possible_transitions = self.model[current]
                choice = random.choice(possible_transitions)
                if choice is None:
                    break
                # if the choice is a period, attach it to the previous token
                if choice == '.':
                    output[-1] = output[-1] + choice
                else:
                    output.append(choice)
                current = tuple(output[-n:])
            else:
                # dead end: close the sentence with a period and stop
                if output[-1][-1] not in string.punctuation:
                    output[-1] = output[-1] + '.'
                break
        return output
```
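A quick usage sketch on the toy corpus above, assuming naive whitespace tokenization with the sentence periods kept as separate tokens:

```python
# hypothetical usage of the MarkovModel above on the toy corpus
corpus = "The dog jumped over the moon . The dog is funny ."
tokens = corpus.split()  # naive whitespace tokenization (assumption)

model = MarkovModel()
model.learn(tokens, n=2)

# seed with a bigram that actually occurs in the corpus
generated = model.generate(n=2, seed=('The', 'dog'), max_tokens=20)
print(' '.join(generated))  # e.g. "The dog jumped over the moon." or "The dog is funny."
```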
- Natural Language Generation Part 1: Back to Basics
- MaskGAN - Fill in the blank technique
- RankGAN
- LeakGAN
- BART
- CTRL: A Conditional Transformer Language Model for Controllable Generation
- Neural Assistant: Joint Action Prediction, Response Generation, and Latent Knowledge Reasoning
Important Papers
The survey: Text generation models in deep learning [2020]
Survey of the State of the Art in Natural Language Generation: Core tasks, applications [2017]
Neural Text Generation: A Practical Guide [2017]
Neural Text Generation: Past, Present and Beyond [2018]
Experiments
- Do Massively Pretrained Language Models Make Better Storytellers?
- Build your own WhatsApp text generator
- A beginner’s guide to training and generating text using GPT2
- How to Build a Twitter Text-Generating AI Bot With GPT-2
- Tensorflow guide on Text generation with an RNN
- Generating Text with TensorFlow 2.0
- Accelerated Text
- Texygen: A text generation benchmarking platform
- How to create a poet / writer using Deep Learning (Text Generation using Python)
- Text generation with LSTM
- Title Generation using Recurrent Neural Networks
- Pun Generation with Surprise
- Neural text generation: How to generate text using conditional language models
- Encode, Tag and Realize: A Controllable and Efficient Approach for Text Generation
Text Generation with char-RNNs
The Unreasonable Effectiveness of Recurrent Neural Networks
- How to Quickly Train a Text-Generating Neural Network for Free
- Natural Language Generation using LSTM-Keras
References
BoredHumans.com - Fun AI Programs You Can Use Online
ChenChengKuan/awesome-text-generation
Tianwei-She/awesome-natural-language-generation
Papers with Code - Text Generation
Eulring/Text-Generation-Papers
Decoding techniques - Greedy search, Beam search, Top-K sampling and Top-p sampling with Transformer
How to generate text: using different decoding methods for language generation with Transformers
Controlling Text Generation with Plug and Play Language Models (PPLM)
PPLM lets users combine small attribute models with an LM to steer its generation. Attribute models can be 100,000 times smaller than the LM and still be effective in steering it, like a mouse sitting atop our wooly mammoth friend and telling it where to go. The mouse tells the mammoth where to go using gradients.
Controlling Text Generation with Plug and Play Language Models
Plug and Play Language Models: A Simple Approach to Controlled Text Generation
GPT-2 Fine Tuning
Autoregressive Language Generation
It is based on the assumption that the probability distribution of a word sequence can be decomposed into the product of conditional next-word distributions. In practice, after each token is produced, that token is appended to the input sequence, and the new sequence becomes the model's input at the next step.
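A rough sketch of that loop, assuming GPT-2 via the Hugging Face transformers library and greedy decoding (the model name and prompt are just examples):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")    # example model choice
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

# start from an example prompt
input_ids = tokenizer.encode("The dog jumped over", return_tensors="pt")

for _ in range(20):
    with torch.no_grad():
        logits = model(input_ids).logits             # (batch, seq_len, vocab_size)
    # greedy decoding: pick the most likely next token given everything so far
    next_token = logits[:, -1, :].argmax(dim=-1, keepdim=True)
    # feed the chosen token back in so it conditions the next step
    input_ids = torch.cat([input_ids, next_token], dim=-1)

print(tokenizer.decode(input_ids[0]))
```

In practice you would swap the greedy argmax for beam search, top-k or top-p sampling (the decoding methods linked above), or simply call model.generate, which implements them.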
Word-Level Generation vs Character-Level Generation
In general, word-level language models tend to achieve higher accuracy than character-level language models, because they form shorter representations of sentences and preserve the context between words more easily. However, large corpora are needed to train word-level models sufficiently, and one-hot encoding isn't very feasible at the word level. In contrast, character-level language models are often quicker to train, requiring less memory and offering faster inference, because the "vocabulary" (the number of training features) is much smaller: hundreds of characters rather than hundreds of thousands of words.
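A tiny, purely illustrative sketch of that vocabulary-size difference, reusing the toy corpus from earlier:

```python
text = "The dog jumped over the moon. The dog is funny."

# word-level features: every distinct word type becomes a vocabulary entry
word_vocab = sorted(set(text.lower().replace('.', ' .').split()))

# character-level features: only the distinct characters that appear
char_vocab = sorted(set(text))

print(len(word_vocab), word_vocab)  # a handful of word types here; hundreds of thousands on a real corpus
print(len(char_vocab), char_vocab)  # bounded by the character set, roughly a hundred symbols in practice
```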