Boxing / Unboxing and passing by 'ref'.

I recently heard a question stating something along the lines of:

 "What is Boxing?"

My initial thought was the classic ArrayList example, where a value type (int etc.) is 'boxed' into an object so that the ArrayList can store it.
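For anyone who hasn't seen that classic case, here is a minimal sketch of it. The class and method names (ArrayListBoxingDemo, RoundTrip) are made up for illustration; only ArrayList itself comes from the framework.

```csharp
using System;
using System.Collections;

static class ArrayListBoxingDemo
{
    // Hypothetical helper: stores an int in an ArrayList and reads it back.
    public static int RoundTrip(int value)
    {
        ArrayList list = new ArrayList();
        list.Add(value);          // Add takes an object, so the int is boxed here
        object boxed = list[0];   // the element comes back as an object reference
        return (int)boxed;        // the cast unboxes it back into an int
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip(10));
    }
}
```

The box happens at the call to Add, because its parameter type is object; the unbox happens at the explicit cast.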

But the question was followed up with:

 "Ok, so is this how you think passing a value by ref to a method works?"

Personally, I hadn't thought about that. I've used 'ref' on numerous occasions, but hadn't really considered how the CLR does this.

So, how does it do it?

Let's start with a simple class:

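  class BoxUnbox {
    public int PassByValue(int theInt) {
      theInt = theInt + 1;  // increments the method's own copy
      return theInt;
    }
    public int PassByRef(ref int theInt) {
      theInt = theInt + 1;  // increments the caller's variable
      return theInt;
    }
  }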

**(Yes, I am fully aware of the pointless nature of returning the int from the PassByRef method)**

and something to call it:

  static void Main() {
    BoxUnbox b = new BoxUnbox();
    int theInt = 10;
    int valResult = b.PassByValue( theInt );
    Console.WriteLine( "Val: " + valResult );
    Console.WriteLine( "theInt: " + theInt );
    b.PassByRef( ref theInt );
    Console.WriteLine( "theInt: " + theInt );
  }

So, when run, we get:

  Val: 11
  theInt: 10
  theInt: 11

Which is exactly as we'd expect.
Now that we know the code is working, let's dig into it a bit and see what's there... bring out the ILDasm!
   ildasm /adv BoxUnbox.exe

First, let's compare the methods themselves. We'll start with the PassByValue method.

.method public hidebysig instance int32  PassByValue(int32 theInt) cil managed
{
  // Code size       12 (0xc)
  .maxstack  2
  .locals init ([0] int32 CS$1$0000)
  IL_0000:  nop 
  // Load argument 1 onto the stack ('theInt')
  IL_0001:  ldarg.1 
  // Pushes '1' onto the stack
  IL_0002:  ldc.i4.1
  // Add IL_0001 and IL_0002 together (theInt + 1)
  IL_0003:  add     
  // Pops value from stack, places it in the argument: 'theInt'
  IL_0004:  starg.s    theInt   
  // Load argument 1 onto the stack
  IL_0006:  ldarg.1  
  // Popped from stack, stored in local var
  IL_0007:  stloc.0  
  // Transfer control to next instruction
  IL_0008:  br.s       IL_000a  
  // Push local var 0 back onto the stack
  IL_000a:  ldloc.0
  // Return top value of stack. 
  IL_000b:  ret    
} // end of method BoxUnboxTests::PassByValue

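Easy stuff: get the argument, add 1 to it, return it. I admit the 'br.s' confused me slightly. 'br' acts as a 'jump' or 'goto', but in this case we're just going to the next instruction, so it seems a little arbitrary. As far as I can tell it is an artefact of an unoptimised (debug) build, where the compiler routes every 'return' through a single exit point; the leading 'nop' and the compiler-generated local point the same way.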

Anyhews, onto 'PassByRef'. Straight off, we can see that the code is 2 bytes longer (code size 14 versus 12), and the argument, instead of being 'int32', is now 'int32&': an address! (Right, at this moment we can pretty much scratch the 'boxing' principle we're investigating: PassByRef isn't using an 'object'. You could stop reading now and move along... :))

.method public hidebysig instance int32  PassByRef(int32& theInt) cil managed
{
  // Code size       14 (0xe)
  .maxstack  3
  .locals init ([0] int32 CS$1$0000)
  IL_0000:  nop     
  // Load argument 1 onto the stack.     
  IL_0001:  ldarg.1 
  // Load argument 1 onto the stack again. This puzzled me at first: the copy
  // pushed at IL_0001 is the address that stind.i4 will write to, and this
  // copy is the address that ldind.i4 is about to read from.
  IL_0002:  ldarg.1
  // Pops the address and pushes the int32 stored there (i4 = Int32, i8 = Int64 etc)
  IL_0003:  ldind.i4 
  // Pushes '1' onto the stack.
  IL_0004:  ldc.i4.1 
  // Adds the top two members of the stack (IL_0003 and IL_0004).
  IL_0005:  add 
  // Pops the sum and the address pushed at IL_0001, and stores the sum at that address
  IL_0006:  stind.i4 
  // Load argument 1 onto the stack
  IL_0007:  ldarg.1 
  // Pops that address and pushes the int32 stored there
  IL_0008:  ldind.i4
  // Popped off of stack, stored into local var 0.
  IL_0009:  stloc.0 
  // Transferring control to next instruction.
  IL_000a:  br.s       IL_000c    
  // Push local var 0 back onto stack
  IL_000c:  ldloc.0 
  // Return top value of stack.
  IL_000d:  ret      
} // end of method BoxUnboxTests::PassByRef

Ooook, it's pretty much the same as before, with the extra calls to 'ldind.i4', which reads the actual value of the int from its address, and 'stind.i4' (the opposite of ldind.i4), which writes the result back to that address. The two calls to ldarg.1 puzzled me at first, but they make sense once you see both instructions: stind.i4 needs the destination address underneath the value it stores, so the address is pushed once for the store at IL_0006 and once more for the load at IL_0003.
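One way to see that the method really is reading and writing the caller's storage is to pass a reference to a static field and then read that field directly from inside the method. This is only a sketch; the names (RefAliasDemo, counter, IncrementAndPeek, Run) are invented for the illustration.

```csharp
using System;

static class RefAliasDemo
{
    static int counter = 10;

    // Hypothetical helper: writes through the reference, then reads the static
    // field directly to show that both names refer to the same storage.
    public static int IncrementAndPeek(ref int value)
    {
        value = value + 1;   // read through the address, add 1, write back through it
        return counter;      // read the field itself, not the parameter
    }

    public static int Run()
    {
        counter = 10;
        return IncrementAndPeek(ref counter);
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

The method never assigns to counter by name, yet the value it reads from counter has already changed, which is what you would expect if 'value' is just the address of that field.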

As it stands, we can see that boxing isn't what occurs when passing by 'ref' to a method, but to further prove that, let's peruse the 'Main' method that actually calls these things!

I won't show the whole method, as there are some repetitious bits that we don't need to cover. Firstly, though, we declare 3 local variables:

  //Create local vars
  .locals init ([0] class BoxUnboxRefValueTypes.Program/BoxUnbox b,
           [1] int32 theInt,
           [2] int32 valResult)
So when we load local '0' onto the evaluation stack (ldloc.0), we're actually referencing the 'BoxUnbox' instance 'b', and so on.

  //int theInt = 10;
  IL_0007:  ldc.i4.s   10
  IL_0009:  stloc.1

Here we've set 'theInt' to be 10.

  //Load 'b'
  IL_000a:  ldloc.0
  //Load 'theInt'
  IL_000b:  ldloc.1

We've now loaded our stack with 'b' and 'theInt' ready for callin' :)

  //Call PassByValue with top of stack - theInt
  IL_000c:  callvirt   instance int32 BoxUnboxRefValueTypes.Program/BoxUnbox::PassByValue(int32)
  //Store top of stack (result of method) to 'valResult'
  IL_0011:  stloc.2

Ok, at this point we've got 'valResult' set to the result of the call to PassByValue. Next we output the data to the screen. This step occurs at 3 points in the code, so I'll only show the first time....

  //Writing 'Val: 11' to screen...
  IL_0012:  ldstr      "Val: " // Add 'Val: ' to the stack.
  IL_0017:  ldloc.2            // Add the 'valResult' variable to the stack.
  // BOX valResult... (ah HA! A Box!!)
  IL_0018:  box        [mscorlib]System.Int32  
  //Concat top two stack entries: 'Val: ' and boxed valResult, store result into top of stack
  IL_001d:  call       string [mscorlib]System.String::Concat(object, object)
  //Pop stack, and write val to screen..
  IL_0022:  call       void [mscorlib]System.Console::WriteLine(string)
Right, so, yes, we have a boxing situation! This is because the 'Concat' method takes 2 objects. The box instruction pops the value of 'valResult' from the stack, copies it into a newly allocated object, and pushes a reference to that object back onto the stack. The [mscorlib]System.Int32 operand is a type token that tells the runtime which value type is being boxed. Note that this box comes from the string concatenation, not from anything to do with 'ref'.
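To make the 'copies it into a newly allocated object' part concrete, here is a small sketch (all names are illustrative, not from the original program). It shows that the boxed object holds its own copy of the value, and that calling ToString() yourself hands Concat two strings, so that call site has nothing to box.

```csharp
using System;

static class BoxCopyDemo
{
    // Hypothetical helper: boxes a local, changes the local, then unboxes.
    public static int BoxedValueAfterChange()
    {
        int valResult = 11;
        object boxed = valResult;  // box: the value is copied into a new object
        valResult = 99;            // changes only the local variable
        return (int)boxed;         // unbox: the copy is unaffected
    }

    // Hypothetical helper: builds the same message without passing the int as an object.
    public static string Describe(int valResult)
    {
        return "Val: " + valResult.ToString();
    }

    static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange());
        Console.WriteLine(Describe(11));
    }
}
```

Newer compilers may already avoid the box in a plain string concatenation, so treat the ToString() variant as an illustration of the idea rather than a necessary optimisation.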

  IL_003e:  ldloc.0 //Load 'b'
  IL_003f:  ldloca.s   theInt  //Load the *address* of the 'theInt' var to the stack
  IL_0041:  callvirt   instance int32 BoxUnboxRefValueTypes.Program/BoxUnbox::PassByRef(int32&)
  //Just pop off the result as we're not using it.
  IL_0046:  pop

And now we've called the PassByRef method. The key difference between the PassByRef and PassByValue calls is actually in the lines *above* the call. For PassByValue:

  IL_000b:  ldloc.1

and for PassByRef:

  IL_003f:  ldloca.s   theInt

The PassByValue 'loading' of the 'theInt' variable loads the actual value. The PassByRef loads the *address* of the 'theInt' variable.

So there you have it.

Passing a 'ref' parameter of a value type (such as int) causes the *address* of the int to be passed in. The variable isn't boxed in any way.
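As a final sanity check, here is a sketch contrasting the two mechanisms side by side. The method names are hypothetical. If 'ref' worked by boxing, both calls would leave the caller's int untouched; only the boxed one does.

```csharp
using System;

static class RefVersusBoxDemo
{
    // Receives a reference to a boxed copy; reassigning the parameter
    // cannot reach the caller's int.
    public static void IncrementBoxed(object boxed)
    {
        boxed = (int)boxed + 1;
    }

    // Receives the address of the caller's int and writes through it.
    public static void IncrementByRef(ref int value)
    {
        value = value + 1;
    }

    public static int AfterBoxedCall()
    {
        int theInt = 10;
        IncrementBoxed(theInt);      // implicit box at the call site
        return theInt;
    }

    public static int AfterRefCall()
    {
        int theInt = 10;
        IncrementByRef(ref theInt);  // passes the address of theInt
        return theInt;
    }

    static void Main()
    {
        Console.WriteLine("boxed: " + AfterBoxedCall());
        Console.WriteLine("ref: " + AfterRefCall());
    }
}
```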

Posted @ Wednesday, September 24, 2008 12:30 PM
