eip-1153 reentrancy protections.md

Do you write smart contracts? Want them to be safe and efficient? Read on!

Smart contract languages have historically lacked constructs that steer programmers toward safe code, and their compilers have optimized poorly. As a result, programmers often drop down to lower-level code riddled with footguns in pursuit of gas savings. What if safety and efficiency weren’t at odds?

Here’s how we can eliminate an entire class of bugs without spending an exorbitant amount of gas on safety checks thanks to EIP-1153!

For example, take the following smart contract (Figure 1), which exhibits “read-only reentrancy”. Currently, nothing prevents the following call sequence from succeeding, even though the value returned by DataRace.price during call sequence X is ambiguous.

Callstack [DataRace.withdraw, msg.sender, X, token.transfer]

If any call in sequence X calls DataRace.price, it will receive an incorrect value: the ETH has already left the contract, but the token balance has not yet been updated. A contract that relies on this intermediate value may be exploited if the user can manipulate it in their favor.

X = [..., DataRace.price, …]

interface Token {
    function balanceOf(address) external view returns (uint);
    function transfer(address, uint) external returns (bool);
}


contract DataRace {
    Token token;


    function price() external view returns (uint) {
        return address(this).balance / token.balanceOf(address(this));
    }


    function withdraw(uint amount) external {
        msg.sender.call{value: amount}("");
        token.transfer(msg.sender, amount);
    }
}

Figure 1: Example of read-only reentrancy

Using EIP-1153 together with static analysis would allow a smart contract language to define this behavior and prevent it at runtime. If the compiler cannot statically prove that no data race exists, it would require the programmer to annotate the function (data_race_safe) or to explicitly opt out of the check (data_race_safe=false).

Assuming we have source code, we can infer that Token.transfer writes to balances and Token.balanceOf reads from balances. This overlap is a read-write contention, for which we create a key: BALANCE_MUTEX. We want to ensure that our contract does not use token.balanceOf in a calculation such as price while a withdrawal is still in flight, i.e. before the token transfer has been performed. Thus, we remove the possibility of a data race by inserting additional safeguards at compile time.

interface Token {
    function balanceOf(address) external view returns (uint);
    function transfer(address, uint) external returns (bool);
}


contract DataRace {
    Token token;


    function price() external view data_race_safe returns (uint) {
        // if tload(BALANCE_MUTEX) revert();
        return address(this).balance / token.balanceOf(address(this));
    }


    function withdraw(uint amount) external {
        // tstore(BALANCE_MUTEX, 1)
        msg.sender.call{value: amount}("");
        token.transfer(msg.sender, amount);
        // tstore(BALANCE_MUTEX, 0) once balances are consistent again
}

Figure 2: Read-only reentrancy automatically prevented with EIP-1153
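
Until a compiler inserts these checks automatically, the guard in Figure 2 can be approximated by hand. The following is a minimal sketch, assuming Solidity 0.8.24 or later and an EVM version that supports EIP-1153 (Cancun). The contract name GuardedDataRace, the way BALANCE_MUTEX is derived, the error names, and the constructor are illustrative assumptions and not part of the proposal. The sketch releases the lock before withdraw returns so that later calls in the same transaction can read price again.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24; // assumption: compiled with evmVersion "cancun" or later

interface Token {
    function balanceOf(address) external view returns (uint256);
    function transfer(address, uint256) external returns (bool);
}

contract GuardedDataRace {
    // Hypothetical key; a compiler could derive it from the contended variable.
    bytes32 private constant BALANCE_MUTEX = keccak256("BALANCE_MUTEX");

    error DataRace();
    error TransferFailed();

    Token public immutable token;

    constructor(Token token_) {
        token = token_;
    }

    receive() external payable {}

    function price() external view returns (uint256) {
        // Revert instead of returning a value computed from half-updated balances.
        if (_isLocked()) revert DataRace();
        return address(this).balance / token.balanceOf(address(this));
    }

    function withdraw(uint256 amount) external {
        if (_isLocked()) revert DataRace();
        _setLock(1);
        (bool ok, ) = msg.sender.call{value: amount}("");
        if (!ok) revert TransferFailed();
        if (!token.transfer(msg.sender, amount)) revert TransferFailed();
        _setLock(0); // balances are consistent again
    }

    function _isLocked() private view returns (bool) {
        bytes32 key = BALANCE_MUTEX;
        uint256 held;
        assembly ("memory-safe") {
            held := tload(key)
        }
        return held != 0;
    }

    function _setLock(uint256 value) private {
        bytes32 key = BALANCE_MUTEX;
        assembly ("memory-safe") {
            tstore(key, value)
        }
    }
}
```

If the recipient of the ETH calls price (directly or through another contract) while withdraw is still running, the call reverts with DataRace rather than returning the intermediate value.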

Another example of a unique type of reentrancy was dubbed “destructive write” reentrancy in the SAILFISH paper. The following example demonstrates another type of read-write contention: splitFunds may read a non-deterministic value from splits, depending on whether a call in sequence Y calls updateSplit.

Callstack [DestructiveWrite.splitFunds, a.call, Y, b.transfer]

If any call in sequence Y reenters updateSplit and changes the proportion of funds sent to address b, funds may be stolen.

Y = [..., DestructiveWrite.updateSplit(id, 0), …]

contract DestructiveWrite {
    mapping(uint => uint) splits;
    mapping(uint => uint) deposits;
    mapping(uint => address payable) payee1;
    mapping(uint => address payable) payee2;
    uint lock;

    modifier nonReentrant() {
        require(lock == 0);
        lock = 1;
        _;
        lock = 0;
    }

    // [Step 1]: Set split of 'a' (id = 0) to 100(%)
    // [Step 4]: Set split of 'a' (id = 0) to 0(%)
    function updateSplit(uint id, uint split) public {
        require(split <= 100);
        splits[id] = split;
    }

    function splitFunds(uint id) public nonReentrant {
        address payable a = payee1[id];
        address payable b = payee2[id];
        uint depo = deposits[id];
        deposits[id] = 0;
        // [Step 2]: Transfer 100% fund to 'a'
        // [Step 3]: Reenter updateSplit
        a.call{value: (depo * splits[id] / 100)}("");
        // [Step 5]: Transfer 100% fund to 'b'
        b.transfer(depo * (100 - splits[id]) / 100);
    }
}

Figure 3: Example of destructive write reentrancy

Once again, a tx-level mutex can be inserted to enforce that no writes to splits are performed while splitFunds is executing.

contract DestructiveWrite {
    [...]
    function updateSplit(uint id, uint split) public data_race_safe {
        // if tload(SPLITS_MUTEX) revert();
        require(split <= 100);
        splits[id] = split;
    }
    function splitFunds(uint id) public nonReentrant {
        // tstore(SPLITS_MUTEX, 1)
        address payable a = payee1[id];
        address payable b = payee2[id];
        uint depo = deposits[id];
        deposits[id] = 0;
        a.call{value: (depo * splits[id] / 100)}("");
        b.transfer(depo * (100 - splits[id]) / 100);
        // tstore(SPLITS_MUTEX, 0) once both transfers are done
    }
}

Figure 4: Destructive write automatically prevented with EIP-1153
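
The same guard can be written by hand for this contract. The sketch below assumes Solidity 0.8.24 or later with EIP-1153 available. GuardedDestructiveWrite, the SPLITS_MUTEX derivation, and the error names are hypothetical, and, like the original figure, it omits the functions that would populate deposits and payees. Because the transient lock also blocks reentry into splitFunds itself, the storage-based nonReentrant modifier is dropped here.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24; // assumption: compiled with evmVersion "cancun" or later

contract GuardedDestructiveWrite {
    // Hypothetical key standing in for the compiler-chosen SPLITS_MUTEX.
    bytes32 private constant SPLITS_MUTEX = keccak256("SPLITS_MUTEX");

    error SplitsLocked();
    error PaymentFailed();

    mapping(uint256 => uint256) splits;
    mapping(uint256 => uint256) deposits;
    mapping(uint256 => address payable) payee1;
    mapping(uint256 => address payable) payee2;

    function updateSplit(uint256 id, uint256 split) public {
        // Writes to splits are rejected while splitFunds is executing.
        if (_lock() != 0) revert SplitsLocked();
        require(split <= 100);
        splits[id] = split;
    }

    function splitFunds(uint256 id) public {
        if (_lock() != 0) revert SplitsLocked();
        _setLock(1);
        address payable a = payee1[id];
        address payable b = payee2[id];
        uint256 depo = deposits[id];
        deposits[id] = 0;
        // If 'a' reenters updateSplit here, that call reverts with SplitsLocked.
        (bool ok, ) = a.call{value: depo * splits[id] / 100}("");
        if (!ok) revert PaymentFailed();
        b.transfer(depo * (100 - splits[id]) / 100);
        _setLock(0);
    }

    function _lock() private view returns (uint256 held) {
        bytes32 key = SPLITS_MUTEX;
        assembly ("memory-safe") {
            held := tload(key)
        }
    }

    function _setLock(uint256 value) private {
        bytes32 key = SPLITS_MUTEX;
        assembly ("memory-safe") {
            tstore(key, value)
        }
    }
}
```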

Limitations:

EVM languages transfer execution to arbitrary code whose behavior cannot be statically guaranteed. Thus, any sound analysis must assume the worst case, and the resulting false positives may annoy programmers.
Untrusted contracts may not enforce the absence of data races and may themselves return inconsistent values, i.e. your guarantees are only as strong as the least trusted contract.

Initial Thoughts

I think transient mutexes will also allow for selective granularity in reentrancy guards. For example, OpenZeppelin's ReentrancyGuard abstract contract sets a lock at the contract level. Vyper's @nonreentrant decorator takes a string as an argument, allowing for key-based mutexes at the function level (or at the contract level, provided all functions use the same reentrancy key), as showcased in Snekmate's batch distributor.

However, variable-level mutexes may also create security guarantees based on the storage slot itself, rather than requiring a check of every code path that accesses the slot.
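
To make the key-based idea concrete, here is a rough Solidity approximation of a string-keyed lock held in transient storage. It is only a sketch: KeyedMutex, the nonreentrant modifier name, and the Distributor functions are hypothetical, and it assumes Solidity 0.8.24 or later with EIP-1153 available. It is not how Vyper or Snekmate implement their guards.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24; // assumption: compiled with evmVersion "cancun" or later

abstract contract KeyedMutex {
    error Locked(bytes32 slot);

    // Functions that pass the same key share one lock; different keys are independent.
    modifier nonreentrant(string memory key) {
        bytes32 slot = keccak256(bytes(key));
        uint256 held;
        assembly ("memory-safe") {
            held := tload(slot)
        }
        if (held != 0) revert Locked(slot);
        assembly ("memory-safe") {
            tstore(slot, 1)
        }
        _;
        assembly ("memory-safe") {
            tstore(slot, 0)
        }
    }
}

contract Distributor is KeyedMutex {
    // Both functions use the key "lock", so neither can be entered while the other runs.
    function distributeEther() external payable nonreentrant("lock") {}

    function distributeToken() external nonreentrant("lock") {}
}
```

Using one key across every function gives a contract-level lock, while giving each function its own key gives function-level locks.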

Granular Mutex Design

A possible design of such a system may be as follows.

Note: All reference implementations will use the following library implicitly:

library t {
    function load(uint256 key) internal view returns (uint256 value) {
        assembly { value := tload(key) }
    }

    function store(uint256 key, uint256 value) internal {
        assembly { tstore(key, value) }
    }
}

Contract Level Mutex

The slot of keccak256(bytes("contract")) + 1 can serve as a slot to write in transient storage to set a contract-level lock.

abstract contract Mutex {
    error Locked();

    modifier contractMutex {
        uint256 slot = uint256(keccak256(bytes("contract"))) + 1;
        if (t.load(slot) != 0) revert Locked();
        t.store(slot, 1);
        _;
        t.store(slot, 0);
    }
}

Function Level Mutex

The slot of keccak256(abi.encodePacked(msg.sig)) + 1 can serve as a slot to write in transient storage to set a function-level lock.

abstract contract Mutex {
    error Locked();

    modifier functionMutex() {
        uint256 slot = uint256(keccak256(abi.encodePacked(msg.sig))) + 1;
        if (t.load(slot) != 0) revert Locked();
        t.store(slot, 1);
        _;
        t.store(slot, 0);
    }
}

Variable Level Mutex

Variables may be assigned a transient storage layout in the same way that they are laid out in storage.

abstract contract Mutex {
    error Locked();

    uint256 item;

    modifier variableMutex() {
        uint256 slot;
        assembly { slot := item.slot }
        if (t.load(slot) != 0) revert Locked();
        t.store(slot, 1);
        _;
        t.store(slot, 0);
    }
}
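
As a usage sketch for the variable-level mutex, the contract below guards every function that touches item, so reentering through any of them while item is in use reverts. The library and modifier are restated so the example is self-contained (with load marked view). The Counter contract and the IHook interface are hypothetical and exist only to hand control to untrusted code. It assumes Solidity 0.8.24 or later with EIP-1153 available.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.24; // assumption: compiled with evmVersion "cancun" or later

library t {
    function load(uint256 key) internal view returns (uint256 value) {
        assembly { value := tload(key) }
    }

    function store(uint256 key, uint256 value) internal {
        assembly { tstore(key, value) }
    }
}

abstract contract Mutex {
    error Locked();

    uint256 item;

    // The transient slot mirrors the storage slot of `item`.
    modifier variableMutex() {
        uint256 slot;
        assembly { slot := item.slot }
        if (t.load(slot) != 0) revert Locked();
        t.store(slot, 1);
        _;
        t.store(slot, 0);
    }
}

// Hypothetical callback interface, used only to call out to untrusted code.
interface IHook {
    function onBump() external;
}

contract Counter is Mutex {
    function bump(IHook hook) external variableMutex {
        item += 1;
        hook.onBump(); // untrusted call made while `item` is locked
    }

    function reset() external variableMutex {
        item = 0;
    }

    function current() external view returns (uint256) {
        return item;
    }
}
```

If the hook calls reset (or bump) during onBump, that inner call reverts with Locked, because both functions share the lock tied to item's slot.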

Thoughts?

0xalpharush/eip-1153 reentrancy protections.md

jtriley-eth commented Apr 28, 2023