A Few Things About Structs (1)

  • 2020-04-10

Some interesting struct behavior in C#: temporary local variables

In C#, a structure (struct) sometimes causes the compiler to create a temporary local variable. Consider the following code:

Code 1
  static void Main(string[] args)
  {      
      var s1 = 1.ToString();
      Console.WriteLine(s1);
      var s2 = 1.ToString();
      Console.WriteLine(s2);
  }    

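On the surface, Code 1 has only two local variables, s1 and s2. During compilation, however, the compiler adds a third variable to hold the int literal 1. Here is the IL code compiled from Code 1: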

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       36 (0x24)
  .maxstack  1
  .locals init ([0] string s1,
           [1] string s2,
           [2] int32 V_2)
  IL_0000:  nop
  IL_0001:  ldc.i4.1
  IL_0002:  stloc.2
  IL_0003:  ldloca.s   V_2
  IL_0005:  call       instance string [mscorlib]System.Int32::ToString()
  IL_000a:  stloc.0
  IL_000b:  ldloc.0
  IL_000c:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0011:  nop
  IL_0012:  ldc.i4.1
  IL_0013:  stloc.2
  IL_0014:  ldloca.s   V_2
  IL_0016:  call       instance string [mscorlib]System.Int32::ToString()
  IL_001b:  stloc.1
  IL_001c:  ldloc.1
  IL_001d:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0022:  nop
  IL_0023:  ret
} // end of method Program::Main

The .locals init directive lists an extra local variable of type int32, V_2. This V_2 is the temporary local variable. Written back as C#, the flow of this IL would look like this:

Code 2
 static void Main(string[] args)
 {
     int V_2;
     V_2 = 1;
     var s1 = V_2.ToString();
     Console.WriteLine(s1);
     V_2 = 1;
     var s2 = V_2.ToString();
     Console.WriteLine(s2);
 }

V_2 is clearly assigned the same value, 1, twice. In other words, if we write the code as follows, one of the assignments goes away:

Code 3
 static void Main(string[] args)
 {
     int V_2;
     V_2 = 1;
     var s1 = V_2.ToString();
     Console.WriteLine(s1);
     var s2 = V_2.ToString();
     Console.WriteLine(s2);
 }

Section 7.5.5, Function member invocation, of the C# Language Specification 5.0 gives the rule behind this behavior. An excerpt:

If M is an instance function member declared in a value-type:

  If E is not classified as a variable, then a temporary local variable of E’s type is created and the value of E is assigned to that variable. E is then reclassified as a reference to that temporary local variable. The temporary variable is accessible as this within M, but not in any other way. Thus, only when E is a true variable is it possible for the caller to observe the changes that M makes to this.

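Roughly speaking:

Suppose M is an instance method declared in a value type.

When E is not classified as a variable, the compiler creates a temporary local variable of E's type and assigns the value of E to it. E is then treated as a reference to that temporary. Inside M the temporary is reachable as this, and it cannot be reached in any other way. So the caller can observe the changes M makes to this only when E is a real variable.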
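The last sentence of the rule is easier to see with a struct that mutates itself. The following is a minimal sketch, not taken from the specification: Counter, Holder, and TempLocalDemo are hypothetical names invented for this illustration. A field access on a class instance is a variable, while the result of a property getter is a value, so only the first call leaves a visible change.

```csharp
using System;

// Hypothetical mutable struct, invented for this illustration.
struct Counter
{
    public int Value;
    public void Increment() { Value++; }
}

class Holder
{
    public Counter Field;                  // a field access on a class instance is a variable
    public Counter Property { get; set; }  // the getter returns a value, not a variable
}

static class TempLocalDemo
{
    public static int ViaField()
    {
        var holder = new Holder();
        holder.Field.Increment();     // E is a variable: the field itself changes
        return holder.Field.Value;
    }

    public static int ViaProperty()
    {
        var holder = new Holder();
        holder.Property.Increment();  // E is a value: only a hidden temporary changes
        return holder.Property.Value;
    }

    static void Main()
    {
        Console.WriteLine(ViaField());     // prints 1
        Console.WriteLine(ViaProperty());  // prints 0
    }
}
```

The call through the property compiles without error, which is what makes the lost update easy to miss.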

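The rule from 7.5.5 is exactly what happens in Code 1. Of course, we rarely write code like that, and the case is easy to spot. Are there situations where the same thing happens without our noticing? Consider the following code: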

Code 4
  class MyClass
  {
      private readonly int x;
      public MyClass(int a)
      {
          x = a;
      }

      public void Request()
      {
          Console.WriteLine(x.ToString());
      }
  }

  struct MyStruct
  {
      private readonly int x;
      public MyStruct(int a)
      {
          x = a;
      }

      public void Request()
      {
          Console.WriteLine(x.ToString());
      }
  }
IL code of the MyClass constructor:
.method public hidebysig specialname rtspecialname 
        instance void  .ctor(int32 a) cil managed
{
  // Code size       16 (0x10)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  call       instance void [mscorlib]System.Object::.ctor()
  IL_0006:  nop
  IL_0007:  nop
  IL_0008:  ldarg.0
  IL_0009:  ldarg.1
  IL_000a:  stfld      int32 ConsoleApp2.Program/MyClass::x
  IL_000f:  ret
} // end of method MyClass::.ctor
IL code of the MyClass.Request method:
.method public hidebysig instance void  Request() cil managed
{
  // Code size       22 (0x16)
  .maxstack  1
  .locals init ([0] int32 V_0)
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  ldfld      int32 ConsoleApp2.Program/MyClass::x
  IL_0007:  stloc.0
  IL_0008:  ldloca.s   V_0
  IL_000a:  call       instance string [mscorlib]System.Int32::ToString()
  IL_000f:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0014:  nop
  IL_0015:  ret
} // end of method MyClass::Request

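The constructor creates no temporary local variable, but the Request method gains an extra V_0. MyStruct behaves the same way: its Request method also produces a temporary local variable. This behavior is called a defensive copy.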

Back in the C# Language Specification 5.0, section 7.6.4, Member access, describes this behavior. A partial excerpt:

o    If T is a class-type and I identifies an instance field of that class-type:
     •    If the value of E is null, then a System.NullReferenceException is thrown.
     •    Otherwise, if the field is readonly and the reference occurs outside an instance constructor of the class in which the field is declared, then the result is a value, namely the value of the field I in the object referenced by E.
     •    Otherwise, the result is a variable, namely the field I in the object referenced by E.

o    If T is a struct-type and I identifies an instance field of that struct-type:
     •    If E is a value, or if the field is readonly and the reference occurs outside an instance constructor of the struct in which the field is declared, then the result is a value, namely the value of the field I in the struct instance given by E.
     •    Otherwise, the result is a variable, namely the field I in the struct instance given by E.

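For readonly fields, the gist is this: whether the readonly field is declared in a class-type or a struct-type, a reference to it outside that type's instance constructors is treated as a value (that is, it is not classified as a variable).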

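Combined with the rule in 7.5.5, if the readonly field is itself a value type, calling an instance method on it creates a temporary local variable and copies the value. The compiled IL code of MyClass.Request shows exactly this.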
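With an int field the copy is invisible, because ToString() does not change anything. A struct that mutates itself makes the two rules observable. The sketch below is an illustration only: Counter, ReadonlyHolder, MutableHolder, and DefensiveCopyDemo are hypothetical names, not types from the code above.

```csharp
using System;

// Hypothetical mutable struct, invented for this illustration.
struct Counter
{
    public int Value;
    public void Increment() { Value++; }
}

class ReadonlyHolder
{
    private readonly Counter counter;  // outside the constructor this is a value

    public ReadonlyHolder()
    {
        counter = new Counter();       // inside the constructor it is still a variable
    }

    public int Bump()
    {
        counter.Increment();           // runs on a temporary copy of the field
        return counter.Value;          // the field itself never changes
    }
}

class MutableHolder
{
    private Counter counter;           // not readonly: always a variable

    public int Bump()
    {
        counter.Increment();           // runs on the field itself
        return counter.Value;
    }
}

static class DefensiveCopyDemo
{
    static void Main()
    {
        var r = new ReadonlyHolder();
        var m = new MutableHolder();
        r.Bump();
        m.Bump();
        Console.WriteLine(r.Bump());   // prints 0
        Console.WriteLine(m.Bump());   // prints 2
    }
}
```

The two Bump methods are identical in source; only the readonly modifier on the field decides whether Increment() sees the field or a copy of it.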

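Now the reverse experiment: if we remove the readonly modifier from field x in MyClass, the Request method should, in theory, no longer produce a temporary variable. The IL code after removing readonly is shown below (the class is named MyClass2 in this listing), and indeed no temporary variable appears: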

.method public hidebysig instance void  Request() cil managed
{
  // Code size       19 (0x13)
  .maxstack  8
  IL_0000:  nop
  IL_0001:  ldarg.0
  IL_0002:  ldflda     int32 ConsoleApp2.Program/MyClass2::x
  IL_0007:  call       instance string [mscorlib]System.Int32::ToString()
  IL_000c:  call       void [mscorlib]System.Console::WriteLine(string)
  IL_0011:  nop
  IL_0012:  ret
} // end of method MyClass2::Request
Conclusion

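In most cases this behavior has little impact. But if a readonly field is a large struct-type, every instance method call on it copies the whole struct, and the performance cost can become significant.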
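One way to avoid the copy without giving up the readonly field, assuming C# 7.2 or later is available, is to declare the struct itself as readonly. A readonly struct promises that no instance member modifies this, so the compiler no longer needs a defensive copy before calling its methods. The sketch below only shows the shape of such a type; BigValue and Owner are hypothetical names, and no measurement is claimed here.

```csharp
using System;

// Hypothetical "large" value type; the readonly modifier on a struct needs C# 7.2 or later.
readonly struct BigValue
{
    private readonly long a, b, c, d;

    public BigValue(long a, long b, long c, long d)
    {
        this.a = a;
        this.b = b;
        this.c = c;
        this.d = d;
    }

    // Cannot modify this, because the whole struct is readonly.
    public long Sum() { return a + b + c + d; }
}

class Owner
{
    private readonly BigValue big = new BigValue(1, 2, 3, 4);

    // Because BigValue is a readonly struct, the compiler does not need
    // a temporary copy of the field before calling Sum().
    public long Total() { return big.Sum(); }
}

static class ReadonlyStructDemo
{
    static void Main()
    {
        Console.WriteLine(new Owner().Total());  // prints 10
    }
}
```

Whether the saving matters depends on the size of the struct and how often the method is called, so it is worth inspecting the IL or profiling before changing existing types.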

Reference: Micro-optimization: the surprising inefficiency of readonly fields