# Effective Assembly: Bitwise Shifts

Most people, when first starting assembly, still carry over a lot of high level constructs in their assembly programs. A common pattern is to multiply and divide when a bit shift would suffice.

For example, a lot of people would write a program to write out the binary representation of an integer using the divide and modulo operations. This is rather inefficient compared to using shifts. For example, the divide by 2 can be replaced with a right shift by 1, and modulo 2 can be replaced by a bitwise AND with 1.

*Aside: interestingly, taking any number modulo a power of two m is equivalent to doing a bitwise AND with m-1. The proof of this is left as an exercise for the reader.*

This post will address the basics you need to know about shifts to get up to speed on writing good assembly.

## The Types of Shifts

There are three types of shifts in most processors. These are:

- Left shift: shifting all the bits in an integer to the left, dropping the highest-ordered bits, and fill the lowest-order bits with
`0`

. - Logical right shift: shifting all the bits in an integer to the right, dropping the lowest-ordered bits, and fill the highest-order bits with
`0`

. - Arithmetic right shift: like a logical right shift, but the highest-order bits are filled with copies of the most significant bit, or the sign bit.

In general, a left shift by *x* is equivalent to multiplication by *2 ^{x}*, regardless of sign. Right shifts are equivalent to floor division by

*2*. You would use logical right shift to process unsigned integers, and arithmetic right shifts to process signed integers.

^{x}## Example

In x86 assembly, the instructions `shl`

, `shr`

, `sar`

performs left shift, logical right shift, and arithmetic right shift, respectively.

Consider the bit pattern `1101 1110 1010 1101 1011 1110 1110 1111`

stored in the register `eax`

.

## Applications

Bit shifts manifest itself in many places. For example, in memory addressing instructions. Many processors provide some instruction to index into an array where the elements are a power-of-two in size, with a base address, and perhaps also letting you add a constant offset.

The x86 instruction looks like `mov eax, [ebx+4*ecx]`

. While this looks like a multiplication, the fact that you can only multiply by powers-of-two is a dead giveaway: it is actually a shift. The multiplication by 4 is encoded as a left shift of 2.

The ARM processor is much less subtle. The equivalent instruction would be: `LDR R0, [R1, R2, LSL #2]`

. There is no abstraction to hide the shift in a power-of-two multiplication on this platform.

As an example, consider the following C code:

`a = b[c]`

is equivalent to `a = *(b + c)`

. And assuming `sizeof(int) == 4`

, then this effectively does `a = *(int*)((char*)b + 4 * c)`

. In assembly, pointer arithmetic is always done on by byte level, hence we need to consider casting to `char*`

for an effective demonstration.

Now, consider that `a`

is mapped to `eax`

(or `R0`

for the ARM example), `b`

is mapped to `ebx`

(`R1`

), and `c`

to `ecx`

(`R2`

). Then, `a = b[c]`

would be compiled to `mov eax, [ebx+4*ecx]`

(or `LDR R0, [R1, R2, LSL #2]`

).

## Fancy Arithmetic

On the x86, there is a very interesting instruction called `lea`

. It has a similar syntax to the aforementioned version of the `mov`

instruction, which adds two registers with an optional left shift on one of them, and a constant offset. However, instead of loading the value at that memory address into the destination register, it instead loads the *address* itself into the destination register.

More interestingly, the address that is loaded doesn’t even have to be valid. So, in effect, this is a fancy arithmetic instruction.

Consider this code:

Using the register assignment `a -> eax, b -> ebx, c -> ecx`

, then this can be done in one single instruction: `lea a, [b + 8*c + 10]`

.

The takeaway is: effective use of bitwise shifts and instructions based on bitwise shifts can help you optimize your handwritten assembly code dramatically. Use them if possible.