EntityFramework 基本效能簡介與調校

本篇筆記記錄如何使用Include的方式避開EntityFramework所造成的效能地雷問題,並用SQL server profiler 觀察產生的SQL

初學者在使用EntityFramekwork的人通常會遇到陷能地雷陷井

在EntityFramework在關聯式資料庫非常好用

又尤其是當你的資料表有做正規化的時候且有做Forign key時,你會愛上它

而什麼是效能地雷陷井呢?

假設我有三個table,ExpenseDetail是記錄我的消費記錄,Users記錄了所有的的User 

CurrencyMapping則記錄了幣別的對應關係(ex: NTD => 台幣)

Detail 則分別有FK: CurrencyCode, UserID

我們若是想要針對目前所有的消費記錄抓出姓名跟幣別的話,下面則展示了兩種做法

 using (var db = new NorthwindEntities())
            {
                Stopwatch sw = new Stopwatch();
                sw.Start();

                var expense = db.ExpenseDetail
                    .Select(m => new
                    {
                        user = m.Users.userName,
                        itemDetail = m.description,
                        currency = m.CurrencyMapping.CurrentDescription
                    }).ToList();
                sw.Stop();
                Console.WriteLine("method 1 takes " + sw.ElapsedMilliseconds);

                sw.Reset();
                sw.Start();
                var expense_usingInclude = db.ExpenseDetail.Include("Users").Include("CurrencyMapping")
                    .Select(m => new
                    {
                        user = m.Users.userName,
                        itemDetail = m.description,
                        currency = m.CurrencyMapping.CurrentDescription
                    }).ToList();
                sw.Stop();
                Console.WriteLine("method 2 takes " + sw.ElapsedMilliseconds);
            }

這兩個做法僅差在db.ExpenseDetail.Include("Users").Include("CurrencyMapping")

這邊用的是1000筆的消費記錄,兩個結果都是一樣,但在筆者的測試結果效能居然差36倍

若是今天筆數更大,相信效能一定差的更多

這就是所謂的效能陷井

第一種做法是直接將ExpenseDetail先撈出來,等到我們要抓User的資料或是Currency的資料時,才去一筆一筆再抓一次

為求證據,我們使用SQL Server Profiler即可知道運作方式

                var mehtod1 = db.ExpenseDetail.Where(m => m.seq <= 3).ToList();

                var result1 = mehtod1.Select(m => new
                    {
                        user = m.Users.userName,
                        itemDetail = m.description,
                        currency = m.CurrencyMapping.CurrentDescription
                    }).ToList();

的確一開始只抓ExpenseDetail

慘不忍睹的事情要發生啦,先抓User Table再抓CurrencyMapping

證明了這樣的效能是非常之差的啊,換言之就是你若是要抓N筆,以這個例子來說,我要存取的是兩個FK的table

總共就會有1(抓ExpenseDetail) + N*2 (FK) 次的SQL被產生

總結: 若是你用這種寫法,將需要產生 1+ N(rows of records in the table)*M ( number of FK you are going to access)次SQL

 

第二種做法則是我在抓ExpenseDetail的時候,我就把User跟Currency的資料一併帶出來

                var mehtod2 = db.ExpenseDetail.Where(m => m.seq <= 3).Include("Users").Include("CurrencyMapping").ToList();

                var result2 = mehtod2.Select(m => new
                   {
                       user = m.Users.userName,
                       itemDetail = m.description,
                       currency = m.CurrencyMapping.CurrentDescription
                   }).ToList();

而這樣的效能又會如何呢?

各位觀眾,只有一段啊!!! 這就是我們希望EntityFramework幫我們產生的SQL不是嗎?

另外再多介紹Include的巧用方式

因為需求改變,我們需要在消費記錄中增加每一筆消費都要有發票檔案,一筆的消費可能會有兩個以上發票檔案

輸入格式跟之前的一樣,這項需求應該不難,你依然可以用Include避開效能問題,那問題會是什麼?

首先我們需要從ExpenseDetailAttachment串回去ExpenseDetail 再串回去User, CurrencyMapping表

從ExpenseDetailAttachment到User, CurrencyMapping並沒有直接的Link關係,若是我們要用Include的話會發生什麼事?

               var dbResult = db.ExpenseDetailAttachment
                    .Where(m => m.ExpenseDetail.currencyCode == "US")
                    .Include("ExpenseDetail")
                    .Include("CurrencyMapping")
                    .Include("Users").ToList();


                var result = dbResult
                    .Select(m => new
                   {
                       user = m.ExpenseDetail.Users.userID,
                       itemDetail = m.ExpenseDetail.description,
                       currency = m.ExpenseDetail.CurrencyMapping.CurrentDescription
                   }).ToList();

A specified Include path is not valid. The EntityType 'NorthwindModel.ExpenseDetailAttachment' does not declare a navigation property with the name 'CurrencyMapping'.

這時候可以發現CurrencyMapping 並不是ExpenseDetailAttachment的FK,在include時會錯

所以我們也許會改成這樣,把非FK的註解掉,如果這麼做,我們又陷入效能陷井了

                var dbResult = db.ExpenseDetailAttachment
                   .Include("ExpenseDetail")
                   .Where(m => m.ExpenseDetail.currencyCode == "US")
                   .ToList();
                   //.Include("CurrencyMapping")
                   //.Include("Users").ToList();


                var result = dbResult
                    .Select(m => new
                    {
                        user = m.ExpenseDetail.Users.userID,
                        itemDetail = m.ExpenseDetail.description,
                        currency = m.ExpenseDetail.CurrencyMapping.CurrentDescription
                    }).ToList();

一樣來看產生出來的SQL

我們已經使用Include了避開效能陷井了,但卻又因為要存取到關聯式資料表的另一個連結的資料,而又掉入了陷井

所以我突發奇想改成這樣的寫法避開效能陷井

                var dbResult = db.ExpenseDetailAttachment
                    .Where(m => m.ExpenseDetail.currencyCode == "US")
                    .Select(m => m.ExpenseDetail)
                    .Include("CurrencyMapping")
                    .Include("Users")
                    .SelectMany(m => m.ExpenseDetailAttachment)
                    .ToList();


                var result = dbResult
                    .Select(m => new
                   {
                       user = m.ExpenseDetail.Users.userID,
                       itemDetail = m.ExpenseDetail.description,
                       currency = m.ExpenseDetail.CurrencyMapping.CurrentDescription
                   }).ToList();

我們先繞回去ExpenseDetail 把CurrencyMapping 與 User帶入

再串回去ExpenseDetailAttachment 就可以解決掉這個問題

看產生出來的SQL卻並沒有做到我想要的事情,並沒有在一開始就把Users, CurrencyMapping帶進來

 

所以最後改成用Linq的方式讓IQueryalbe的東西處理掉,即可解決

                var result = (from a in db.ExpenseDetailAttachment
                              join d in db.ExpenseDetail.Include("CurrencyMapping").Include("Users") on a.Seq equals d.seq
                              where d.currencyCode == "US"
                              select new
                              {
                                  user = d.Users.userName,
                                  itemDetail = a.fileName,
                                  currency = d.CurrencyMapping.CurrentDescription
                              }).ToList();

最後可以看一下SQL長成什麼樣子

 

但實務上其實直接採用ExpenseDetail 再include 這三個table就行了

在兩個為關節點角色的主表(ExpenseDetail, ExpenseDetailAttachment)裡,就適合用這樣的方式來處理,如下圖

供大家參考