Byzantine Fault Tolerance

Consider the simple (!) problem of getting a value of a variable x from a distributed (containing multiple servers) storage system that uses replication to ensure correctness.

Basic problem in such a system would be inaccessible servers. Assuming that there would be f inaccessible servers, as long as there are f + 1 servers in the system, value of x queried from this system would be correct.

However, things get uglier if servers can behave the worst possible way: lie about the value of a variable.

If in addition to f inaccessible servers, f servers can lie about a value of a variable, the number of servers we would need to provide correctness would be 3f + 1 due to the fact that it is not possible to differentiate between right and wrong answers. The solution is to use majority (f + 1 > f \rightarrow |Right| > |Wrong|).

Systems tolerant to lying parts are called “Byzantine fault-tolerant” systems. The term was coined in  this paper and the name comes from a representation of the problem at hand in a war scenario: Byzantine generals trying to get to a consensus for a decision on “attack” or “retreat”.

Leave a comment

search previous next tag category expand menu location phone mail time cart zoom edit close