Skip to content

Every Novel Generator #120

Open
Open
@cpressey

Description

In the spirit of all those "every X" bots on that bird site, this is a novel generator generator; it generates every novel generator, one by one, in sequence. It also runs these generators, interleaving their outputs, so it is itself also a novel generator, generating a novel consisting of every possible novel generated in every possible way.

Code

This novel generator generator is written in 204 lines of Python 2.7 code. I can't make you read them, but I can at least make you scroll down over them to get to the novel.

class Collector(object):
    def __init__(self, title, stream, limit=None):
        self.stream = stream
        self.count = 0
        self.limit = limit
        self.closed = False
        self.write('<html><head><title>{}</title></head><body><h1>{}</h1>'.format(title, title))

    def write(self, s):
        if self.closed:
            return
        self.stream.write(s)
        self.stream.flush()

    def recv(self, word, sender=None):
        if sender:
            self.write('<span title="{}">{}</span>\n'.format(sender.replace('<', '&lt;').replace('>', '&gt;'), word))
        else:
            self.write('{} '.format(word))
        self.count += 1
        if self.limit is not None and self.count >= self.limit:
            self.close()

    def close(self):
        self.write('</body></html>')
        self.closed = True

class Buffer(object):
    def __init__(self, title, collector, printable):
        self.title = title
        self.collector = collector
        self.printable = printable
        self.word = ''

    def flush(self):
        if self.word:
            self.output()
            self.word = ''

    def accum(self, chars):
        for char in chars:
            if char in self.printable:
                self.word += char
            else:
                self.flush()

    def output(self):
        self.collector.recv(self.word, sender=self.title)

class Alphabet(object):
    def __init__(self, chars):
        self.chars = list(chars)
        self.succ_map = dict([(c, chars[i + 1] if i < len(chars) - 1 else chars[0]) for i, c in enumerate(chars)])
        self.pred_map = dict([(c, chars[i - 1] if i > 0 else chars[-1]) for i, c in enumerate(chars)])

    def __contains__(self, c):
        return c in self.chars

    def first(self):
        return self.chars[0]

    def last(self):
        return self.chars[-1]

    def succ(self, c):
        return self.succ_map[c]

    def pred(self, c):
        return self.pred_map[c]

class IncrementableString(object):
    def __init__(self, alphabet, value):
        self.alphabet = alphabet
        self.value = list(value)

    def __str__(self):
        return ''.join(self.value)

    def zero(self):
        return self.alphabet.first()

    def succ_value(self, value):
        if not value:
            return [self.zero()]
        if value[0] == self.alphabet.last():
            return [self.zero()] + self.succ_value(value[1:])
        else:
            return [self.alphabet.succ(value[0])] + value[1:]

    def incr(self):
        self.value = self.succ_value(self.value)

class Interpreter(object):
    def __init__(self, program, buffer_, alphabet):
        self.program = program
        self.buffer = buffer_
        self.alphabet = alphabet
        self.pc = 0
        self.tape = {}
        self.head = 0L
        self.halted = False

    def read_tape(self):
        return self.tape.get(self.head, self.alphabet.first())

    def write_tape(self, symbol):
        self.tape[self.head] = symbol

    def step(self):
        instruction = self.program[self.pc]

        if instruction == '<':
            self.head -= 1L
        elif instruction == '>':
            self.head += 1L
        elif instruction == '+':
            self.write_tape(self.alphabet.succ(self.read_tape()))
        elif instruction == '-':
            self.write_tape(self.alphabet.pred(self.read_tape()))
        elif instruction == '.':
            self.buffer.accum(self.read_tape())
        elif instruction == '[':
            if self.read_tape() == self.alphabet.first():
                depth = 0
                while True:
                    if self.program[self.pc] == '[':
                        depth += 1
                    if self.program[self.pc] == ']':
                        depth -= 1
                        if depth == 0:
                            break
                    self.pc += 1
        elif instruction == ']':
            depth = 0
            while True:
                if self.program[self.pc] == '[':
                    depth -= 1
                if self.program[self.pc] == ']':
                    depth += 1
                self.pc -= 1
                if depth == 0:
                    break

        self.pc += 1
        if self.pc >= len(self.program):
            self.buffer.flush()
            self.halted = True

class ProgramGenerator(object):
    def __init__(self, start):
        self.source = IncrementableString(Alphabet('+-<>[].'), start)

    def next(self):
        program = str(self.source)
        self.source.incr()
        while not self.is_balanced(program):
            program = str(self.source)
            self.source.incr()

        return program

    def is_balanced(self, s):
        level = 0
        for c in s:
            if c == '[':
                level += 1
            elif c == ']':
                level -= 1
                if level < 0:
                    return False
        return level == 0

class Orchestrator(object):
    def __init__(self, collector, printable, starting_at):
        self.collector = collector
        self.printable = Alphabet(printable)
        self.alphabet = Alphabet(" " + printable)
        self.generator = ProgramGenerator(starting_at)
        self.interpreters = {}
        self.halted = False

    def step(self):
        reap = set()
        for program, interpreter in self.interpreters.iteritems():
            interpreter.step()
            if interpreter.halted:
                reap.add(program)
        for program in reap:
            del self.interpreters[program]

        program = self.generator.next()
        buffer_ = Buffer(program, self.collector, self.printable)
        self.interpreters[program] = Interpreter(program, buffer_, self.alphabet)

    def run(self):
        while not self.collector.closed:
            self.step()

if __name__ == '__main__':
    import string
    import sys
    title = "The Collected Works of Every Novel Generator Ever (Abridged Version)"
    collector = Collector(title, sys.stdout, limit=50000)
    Orchestrator(collector, string.lowercase + """.,!:;'"-""" + string.uppercase, starting_at='+').run()

Result

The first 50,000 words of output have been collected into a single document. This is, itself, a novel, to which I have given the title "The Collected Works of Every Novel Generator Ever (Abridged Version)". You can view or download it here:

The Collected Works of Every Novel Generator Ever (Abridged Version)

Note that, as a convenience, hovering over any word shows which novel generator produced it.

Discussion

Does it really generate every novel generator?

Yes, if given sufficient resources, and if by "novel generator" you mean "program in a minor variant of Everybody's Favourite Esolang".

Since it runs the novel generators that it generates, is this novel generator generator also a novel generator?

Yes.

Since this novel generator generator is also a novel generator, will it eventually generate itself?

Yes, given sufficient resources.

Since some novel generators run on and on forever, how does this run them all?

It keeps track of a set of generators that have been generated thus far, runs each generator one step, then adds a new generator to the set, and does this over and over again. The generators that never halt just keep "wasting cycles in the background", but this doesn't stop the overall process.

As a point of trivia, this is also the method used in the standard proof that a non-deterministic Turing machine is no more powerful than a deterministic one. It's worth studying the code if you're curious about how it works in detail.

Since it only generates programs in Everybody's Favourite Esolang, it won't generate my generator.

Everybody's Favourite Esolang is Turing-complete, so if your novel generator is written in some other language which is no more than Turing-complete (which is, y'know, all known programming languages), this will generate its equivalent in Everybody's Favourite Esolang.

So what it produces is only an equivalent of my generator in Everybody's Favourite Esolang. It's not my generator.

Well, actually, given enough resources, it will eventually generate e.g. a Python interpreter (in Everybody's Favourite Esolang) and an initial sequence of code that writes your-generator-written-in-Python to the tape and then runs the Python interpreter on it. So, yes. It will eventually produce your generator.

But my generator reads an input corpus from text files.

Given sufficient resources, this generator will eventually generate a program in Everybody's Favourite Esolang that, before running your novel generator, writes the content of those text files to the tape, where they can be read by your novel generator.

But my generator reads random files on the Internet to generate its result.

Given sufficient resources, this generator will eventually generate a program in Everybody's Favourite Esolang that, before running your novel generator, writes THE ENTIRE CONTENT OF THE INTERNET to the tape, where it can be read by your novel generator.

And actually it generates an infinite number of such generators, each of which writes a different content of the Internet to the tape.

But my generator writes a novel with interrobangs! This doesn't output interrobangs!

Indeed, these generators do not even output digits. I chose the character set that I did for aesthetic purposes (it makes the "early" novels look slightly more sensible); still, I am comforted that many well-known novels could in fact be rendered fully in it.

I'm sure with a little thought you will see the lack of interrobangs here is either a basically insurmountable problem (can we not always define "text" to include symbols which currently no computer can represent?) or a trivially surmountable problem (define an encoding between strings in the character set you have and the character set you want -- just like Unicode already does with bits, really).

Will this eventually generate a generator which generates all the novels of Ernest Hemingway?

Given sufficient resources, yes. Of course, you realize that such a generator could be just a big print statement.

Will this eventually generate Ernest Hemingway?

This is a philosophical question. If you are David Deutsch, or someone else who firmly believes the universe is just a big ol' Turing machine (quantum or otherwise), then presumably you believe it will. Including writing, not the entire contents of the Internet, but THE ENTIRE CONTENTS OF HEMINGWAY'S LIVED EXPERIENCE, to the tape at the beginning.

But not everyone shares Deutsch's views.

You keep saying "sufficient resources". What does that mean?

Well, there are an infinite number of novel generators, so if you actually wanted to generate them all you'd need infinite time and infinite storage. Whether that's even possible is another philosophical question. But it's safe to say that few people claim to have access to these.

The time and memory required to generate any particular generator might be difficult to estimate, but certain bounds can be drawn, e.g. a generator that writes the entire contents of the Internet to its tape will require a tape at least as big as the Internet, and will take at least as many steps as that simply to write it out.

Metadata

Assignees

No one assigned

    Labels

    completedFor completed novels!

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions