Obsess over every detail. Ask why it works. Ask why it isn’t built another way.

Bit Shifting in Assembly

We’re going to talk about bit shifting!!

1
2
3
4
5
6
7
8
9
int main()
{
    unsigned int a, b, c;
    a = 0x5;
    b = a << 4;
    c = b >> 3;

    return c;
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
main:
    push    rbp {__saved_rbp}
    mov     rbp, rsp {__saved_rbp}
    sub     rsp, 0x30
    call    __main
    mov     dword [rbp-0x4 {var_c}], 0x5
    mov     eax, dword [rbp-0x4]
    shl     eax, 4  {0x50}                           ; new instruction (SHL)
    mov     dword [rbp-0x8 {var_10}], eax  {0x50}
    mov     eax, dword [rbp-0x8]
    shr     eax, 3  {0xa}                            ; new instruction (SHR)
    mov     dword [rbp-0xc {var_14}], eax  {0xa}
    mov     eax, dword [rbp-0xc]  {0xa}
    add     rsp, 0x30
    pop     rbp {__saved_rbp}
    retn    {__return_addr}

SHL - Shift logical left

C uses « operator when doing this, the first operand (source and destination) is an r/mX, second operand is either cl (lowest byte of rcx), or a 1 byte immediate. The 2nd operand is the number of places to shift. It multiplies the register by 2 for each place the value is shifted. It’s more efficient than a multiply instruction. Bits shifted off the left hand side are shifted into (set) the carry flag. For purposes of determining if the CF is set at the end, think of it as n independent 1 bit shifts. When you’re going to the left the LSB are filled in with zeros.

shl bl, 2

1
2
3
        00110011b (bl - 0x33)

RESULT  11001100b (bl - 0xCC) CF = 0

Confused how it worked? Let’s start with the basics. Let’s say we have the decimal 5.

So in binary it looks like this and each position is a power of 2

1
2
3
4
5
+---------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |     |
+---------------------------------------------+
| 0   | 0  | 0  | 0  |  0 |  1 | 0  | 1 | = 5 |
+---------------------------------------------+

So if we did left shift in C:

1
2
int x = 5;
x = x << 1;

Visually it looks like this before:

+---------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |     |
+---------------------------------------------+
| 0   | 0  | 0  | 0  |  0 |  1 | 0  | 1 | = 5 |
+---------------------------------------------+

Now shift left by 1:

+----------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |      |
+----------------------------------------------+
| 0   | 0  | 0  | 0  |  1 |  0 | 1  | 0 | = 10 |
+----------------------------------------------+

As you can see here we indeed literally shifted it left by 1!

SHR - Shift Logical right

C uses » operator, first operand (source and destination) is an r/mX. Second operand is either cl (lowest byte of rcx), or a 1 byte immediate. The 2nd operand is the number of places to shift. It divides the register by 2 for each place the value is shifted. More efficient than a divide instruction. Bits shifted off the right hand side are shifted into (set) the carry flag (CF). For purposes of determining if the CF is set at the end think of it as n independent 1 bit shifts. When you’re shifting to the right the MSB are filled with zeros.

Let’s say we have 00110011b (bl - 0x33) and we shift it right by 2!

Before:
+----------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |      |
+----------------------------------------------+
| 0   | 0  | 1  | 1  |  0 |  0 | 1  | 1 | = 51 |
+----------------------------------------------+

After (shift right by 2):
+----------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |      |
+----------------------------------------------+
| 0   | 0  | 0  | 0  |  1 |  1 | 0  | 0 | = 12 |
+----------------------------------------------+

And the CF is 1! because the last bit that fell off the right edge was a 1.

Unsigned shifting

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
#include <stdlib.h>

int main ( int argc, char **argv )
{
    unsigned int a, b, c;

    a = atoi ( argv[1] );
    b = a * 8;
    c = b / 16;

    return c;
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
main:
    push    rbp {__saved_rbp}
    mov     rbp, rsp {__saved_rbp}
    sub     rsp, 0x30
    mov     dword [rbp+0x10 {argc_1}], ecx
    mov     qword [rbp+0x18 {arg_10}], rdx
    call    __main
    mov     rax, qword [rbp+0x18 {arg_10}]
    add     rax, 0x8
    mov     rax, qword [rax]
    mov     rcx, rax
    call    atoi
    mov     dword [rbp-0x4 {var_c}], eax
    mov     eax, dword [rbp-0x4 {var_c}]
    shl     eax, 0x3
    mov     dword [rbp-0x8 {var_10}], eax
    mov     eax, dword [rbp-0x8 {var_10}]
    shr     eax, 0x4
    mov     dword [rbp-0xc {var_14}], eax
    mov     eax, dword [rbp-0xc {var_14}]
    add     rsp, 0x30
    pop     rbp {__saved_rbp}
    retn    {__return_addr}

Where are the multiply and divide? :0

You may see shifts in asm when no one wrote any shifts in C. If a multiply by power of 2 or a divide by power of 2 exist in C, a compiler may just turn it into a shift left or a shift right. This particular assembly was compiled with optimization.

Differentiation between signed and unsigned in shifting

1
2
3
4
5
// unsigned
unsigned int a, b, c;

// signed
         int a, b, c;

The sign can cause integer overflows btw.

Let’s take a look at an example (this one is signed)

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
    push    rbp {__saved_rbp}
    mov     rbp, rsp {__saved_rbp}
    sub     rsp, 0x30
    mov     dword [rbp+0x10 {argc_1}], ecx
    mov     qword [rbp+0x18 {arg_10}], rdx
    call    __main
    mov     rax, qword [rbp+0x18 {arg_10}]
    mov     rax, qword [rax]
    mov     rcx, rax
    call    atoi
    mov     dword [rbp-0x4 {var_c}], eax
    mov     eax, dword [rbp-0x4 {var_c}]
    shl     eax, 0x3
    mov     dword [rbp-0x8 {var_10}], eax
    mov     eax, dword [rbp-0x8 {var_10}]
    lea     edx, [rax+0xf]
    test    eax, eax
    cmovs   eax, edx
    sar     eax, 0x4  ; sar? interesting! if this is unsigned it'll be shr
    mov     dword [rbp-0xc {var_14}], eax
    mov     eax, dword [rbp-0xc {var_14}]
    add     rsp, 0x30
    pop     rbp {__saved_rbp}
    retn    {__return_addr}

SAR - Shift arithmetic right

Can be used with » operator too, if operands are signed. First operand (source and destination) is an r/mX, second operand is either cl (lowest byte of rcx) or a 1 byte immediate. The 2nd operand is the number of places to shift. It divides the register by 2 for each place the value is shifted. More efficient than divide instruction. Each bit shifted off the right side is placed in CF.

sar bl, 1

+-----------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |       |
+-----------------------------------------------+
| 1   | 0  | 1  | 1  |  0 |  0 | 1  | 1 | = 179 |
+-----------------------------------------------+

So if we shift right by 1:

+-----------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |       |
+-----------------------------------------------+
| 1   | 1  | 0  | 1  |  1 |  0 | 0  | 1 |       |
+-----------------------------------------------+
 (   MSB   )  The MSB must be filled in with whatever the sign bit is

Why should the MSB be filled in with whatever the sign bit is? Well, to preserve the negativeness or positiveness of the value that you’re shifting.

SAL - Shift Arithmetic left

Behaves exactly the same as SHL. First operand (source and destination) is an r/mX, second operand is either cl (lowest byte of rcx), or a 1 byte immediate. The 2nd operand is the number of places to shift. It multiplies the register by 2 for each place the value is shifted. More efficient than a multiply instruction. Each bit shifted off the left side is placed in CF.

+-----------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |       |
+-----------------------------------------------+
| 1   | 0  | 1  | 1  |  0 |  0 | 1  | 1 |       |
+-----------------------------------------------+

So if we shifted this left by 1:

+-----------------------------------------------+
| 128 | 64 | 32 | 16 |  8 |  4 | 2  | 1 |       |
+-----------------------------------------------+
| 0   | 1  | 1  | 0  |  0 |  1 | 1  | 0 |       |
+-----------------------------------------------+