Turing Completeness, Go, Ethereum and Bitcoin

I'd read a lot about Turing machines, a language being Turing complete, and in general about Turing completeness, but never actually looked into it in detail. I've spent the past couple of weeks learning more about it, and it's importance, both in literature and in practice. Here is a log for myself to look back upon my findings.

What is Turing Completeness?

As per Wiki, In computability theory, a system of data-manipulation rules (such as a computer's instruction set, a programming language, or a cellular automaton) is said to be Turing-complete or computationally universal if it can be used to simulate any Turing machine.

But what does this mean in layman terms? The question was how to verify if a mathematical task is computable. To answer this question, Turing came up with an abstract idea of a machine (Turing machine) which can simulate any problem if it can be coded (or logically constructed). It doesn't take into account how much time, or resources (CPU, memory) it will take to solve the problem, but is certain that for a given input, the machine will produce an output.

via GIFER

So, if somebody says "x is Turing Complete", this means in principle that it could be used to solve any computation problem. Turing machines are similar to finite automata/finite state machines but are unbounded by memory or runtime constraints.


Turing Machine, in detail

A Turing machine consists of an infinite tape (as the memory), a tape head (a pointer to the currently inspected cell of memory), and a state transition table (to govern the behavior of the machine). Each cell of the tape can have one of a predetermined finite set of symbols, one of which is the blank symbol.

A Turing machine, like an algorithm, executes on an input string of bits. At the beginning, the input string is written on the tape, the tape head points to the first cell of the string, and all other cells are blank.

During operation, the tape head is in a certain state. Every step, it consults the table (the state transition table), based on only the state it's in and the symbol underneath the head, to obtain its next choice: either halt (end the operation), or resume by writing a symbol to the current cell, changing its state, and moving to the left or the right. This way, A Turing machine can simulate the fact that a program is made of many lines and thus it depends on what line a program is executing, and it can also simulate the fact that a program can react differently with different data in memory.

A Turing machine can thus either halt at some point or run forever. If it halts, the contents of the tape are the output.

Example

There are many Turing machine simulators online, one such is this simulator. Here is a simple example of a machine which checks if Binary numbers are divisible by 3

Here is the state transition table for the same.


Are Programming Languages Turing Complete?

The Church-Turing thesis claims that any computable problem can be computed by a Turing machine. This means that a computer more powerful than a Turing machine is not necessary to solve computable problems. The idea of Turing completeness is closely related to this. A system is Turing complete if it can compute every Turing computable function. A programming language that is Turing complete is theoretically capable of expressing all tasks accomplishable by computers; nearly all programming languages are Turing complete.

In general, for an imperative language to be Turing-complete, it needs:

  1. A form of conditional repetition or conditional jump (e.g., loops while, conditionals if+goto)
  2. A way to read and write some form of storage (e.g., variables, tape)

To check to see if something is Turing complete, see if you can implement a Turing machine inside it. In other words, check for the following conditions:

1) Can make decisions - The 'language' that only supports +, -, *, and / on integers is not Turing complete because it can't make a choice based on its input, but a Turing machine can.

2) Can run forever - Turing proved that you cannot predict whether a program will terminate by simulating it on a computer. In simple terms, we cannot predict the path of a program without running it. Turing-complete systems can run in "infinite loops," a term used (in oversimplification) to describe a program that does not terminate.

If we took Java, Javascript, or Python and removed the ability to do any sort of loop, GOTO, or function call, it wouldn't be Turing complete because it can't perform an arbitrary computation that never finishes. Coq is a theorem prover that can't express programs that don't terminate, so it's not Turing complete.

3) Can use infinite memory - A language that was exactly like Java but would terminate once it used more than 4 Gigabytes of memory wouldn't be Turing complete, because a Turing machine can use infinite memory. This is why we can't actually build a Turing machine, but Java is still a Turing complete language because the Java language has no restriction preventing it from using infinite memory. This is one reason regular expressions aren't Turing complete.

4) Can read and write data to RAM- A language that only lets you work with memory through push and pop operations to a stack wouldn't be Turing complete. If I have a 'language' that reads a string once and can only use memory by pushing and popping from a stack, it can tell me whether every ( in the string has its own ) later on by pushing when it sees ( and popping when it sees ). However, it can't tell me if every ( has its own ) later on and every [ has its own ] later on (note that ([)] meets this criteria but ([]] does not). A Turing machine can use its random access memory to track ()'s and []'s separately, but this language with only a stack cannot.

5) Can simulate any other Turing machine - A Turing machine, when given an appropriate 'program', can take another Turing machine's 'program' and simulate it on arbitrary input. If you had a language that was forbidden from implementing a Python interpreter, it wouldn't be Turing complete.

In short, if a language has infinite RAM, conditional execution, and some form of repeated execution, it's probably Turing complete.

Most modern programming languages (e.g. Go, Python, Java, JavaScript, Perl, etc.) are all Turing complete because they each implement all the features mentioned above. Then there are “computing environments” that you would not expect to be Turing Complete, but really are.

  • For example, Excel spreadsheets are Turing complete. That’s not entirely a surprise, since they include a full-fledged programming language for writing macros/extensions. But even without using that, the formula language of Excel can be argued to be Turing complete
  • The redstones in the game Minecraft define a language that is Turing-complete.
  • Conway’s Game of Life (with a demo here ), a much more amusing form of automaton than the ones we have focused on in this course, is also Turing complete

What is the advantage of a Turing complete machine?

Let's take the example of Ethereum. Ethereum’s ability to execute a stored program, in a state machine called the Ethereum Virtual Machine, while reading and writing data to memory makes it a Turing-complete system and therefore a UTM. Ethereum can compute any algorithm that can be computed by any Turing machine, given the limitations of finite memory.

Now, knowing that a language is Turing complete assures that if a function (or a script is run), for a given input, the desired output can be achieved. This is important for a language like Solidity, because you can understand the impact of Turing Completeness on Smart Contracts.

Turing completeness is very easy to achieve; in fact, the simplest Turing-complete state machine known has 4 states and uses 6 symbols, with a state definition that is only 22 instructions long. Indeed, sometimes systems are found to be "accidentally Turing complete." A fun reference of such systems can be found at http://bit.ly/2Og1VgX.

However, Turing completeness is very dangerous, particularly in open access systems like public blockchains, because of the halting problem. For example, modern printers are Turing complete and can be given files to print that send them into a frozen state. The fact that Ethereum is Turing complete means that any program of any complexity can be computed by Ethereum. But that flexibility brings some thorny security and resource management problems. An unresponsive printer can be turned off and turned back on again. That is not possible with a public blockchain.


Implications of Turing Completeness


Turing proved that you cannot predict whether a program will terminate by simulating it on a computer. In simple terms, we cannot predict the path of a program without running it. Turing-complete systems can run in "infinite loops," a term used (in oversimplification) to describe a program that does not terminate. It is trivial to create a program that runs a loop that never ends. But unintended never-ending loops can arise without warning, due to complex interactions between the starting conditions and the code. In Ethereum, this poses a challenge: every participating node (client) must validate every transaction, running any smart contracts it calls. But as Turing proved, Ethereum can’t predict if a smart contract will terminate, or how long it will run, without actually running it (possibly running forever). Whether by accident or on purpose, a smart contract can be created such that it runs forever when a node attempts to validate it. This is effectively a DoS attack. And of course, between a program that takes a millisecond to validate and one that runs forever are an infinite range of nasty, resource-hogging, memory-bloating, CPU-overheating programs that simply waste resources. In a world computer, a program that abuses resources gets to abuse the world’s resources.


Is Ethereum Turing Complete?

Not exactly. Solidity is, EVM is not. As Gavin Wood mentions, the Ethereum Virtual Machine is a quasi–Turing-complete state machine; "quasi" because all execution processes are limited to a finite number of computational steps by the amount of gas available for any given smart contract execution.

Ethereum can’t predict if a smart contract will terminate, or how long it will run, without actually running it (possibly running forever). Ethereum introduces a metering mechanism called gas. As the EVM executes a smart contract, it carefully accounts for every instruction (computation, data access, etc.). Each instruction has a predetermined cost in units of gas. When a transaction triggers the execution of a smart contract, it must include an amount of gas that sets the upper limit of what can be consumed running the smart contract. The EVM will terminate execution if the amount of gas consumed by computation exceeds the gas available in the transaction. Gas is the mechanism Ethereum uses to allow Turing-complete computation while limiting the resources that any program can consume.

So there are 2 limits imposed on any I/O on the Ethereum blockchain. These are gas cost and block gas limit.


Is Bitcoin Turing Complete?

Bitcoin scripts currently do not enable loops. Therefore, they are commonly considered to be not Turing Complete. This limits the types of algorithms the Bitcoin scripts can execute to linear or tree-like instructions.


Does a system have to be Turing Complete to be useful?

Not really. ADAM VARTANIAN writes a good case of how using a Turing-incomplete DSL can have a host of advantages—from predictable resource usage to improved analysis. Let's take the example of Vyper, a contract-oriented, pythonic programming language that targets the Ethereum Virtual Machine (EVM). Vyper doesn’t support the following features, which makes it Turing incomplete:

1. Modifiers
2. Inline assembly
3. Function overloading
4. Operator overloading
5. Recursive calling
6. Infinite-length loops
7. Binary fixed point

But the tradeoff of this DSL is that instead it provides added security. Fewer features, more auditability, more security, more predictable resource usage.


Resources

Farhan Aly

Farhan Aly