本篇筆記記錄如何使用Include的方式避開EntityFramework所造成的效能地雷問題,並用SQL server profiler 觀察產生的SQL
初學者在使用EntityFramekwork的人通常會遇到陷能地雷陷井
在EntityFramework在關聯式資料庫非常好用
又尤其是當你的資料表有做正規化的時候且有做Forign key時,你會愛上它
而什麼是效能地雷陷井呢?
假設我有三個table,ExpenseDetail是記錄我的消費記錄,Users記錄了所有的的User
CurrencyMapping則記錄了幣別的對應關係(ex: NTD => 台幣)
Detail 則分別有FK: CurrencyCode, UserID
我們若是想要針對目前所有的消費記錄抓出姓名跟幣別的話,下面則展示了兩種做法
using (var db = new NorthwindEntities())
{
Stopwatch sw = new Stopwatch();
sw.Start();
var expense = db.ExpenseDetail
.Select(m => new
{
user = m.Users.userName,
itemDetail = m.description,
currency = m.CurrencyMapping.CurrentDescription
}).ToList();
sw.Stop();
Console.WriteLine("method 1 takes " + sw.ElapsedMilliseconds);
sw.Reset();
sw.Start();
var expense_usingInclude = db.ExpenseDetail.Include("Users").Include("CurrencyMapping")
.Select(m => new
{
user = m.Users.userName,
itemDetail = m.description,
currency = m.CurrencyMapping.CurrentDescription
}).ToList();
sw.Stop();
Console.WriteLine("method 2 takes " + sw.ElapsedMilliseconds);
}
這兩個做法僅差在db.ExpenseDetail.Include("Users").Include("CurrencyMapping")
這邊用的是1000筆的消費記錄,兩個結果都是一樣,但在筆者的測試結果效能居然差36倍
若是今天筆數更大,相信效能一定差的更多
這就是所謂的效能陷井
第一種做法是直接將ExpenseDetail先撈出來,等到我們要抓User的資料或是Currency的資料時,才去一筆一筆再抓一次
為求證據,我們使用SQL Server Profiler即可知道運作方式
var mehtod1 = db.ExpenseDetail.Where(m => m.seq <= 3).ToList();
var result1 = mehtod1.Select(m => new
{
user = m.Users.userName,
itemDetail = m.description,
currency = m.CurrencyMapping.CurrentDescription
}).ToList();
的確一開始只抓ExpenseDetail
慘不忍睹的事情要發生啦,先抓User Table再抓CurrencyMapping
證明了這樣的效能是非常之差的啊,換言之就是你若是要抓N筆,以這個例子來說,我要存取的是兩個FK的table
總共就會有1(抓ExpenseDetail) + N*2 (FK) 次的SQL被產生
總結: 若是你用這種寫法,將需要產生 1+ N(rows of records in the table)*M ( number of FK you are going to access)次SQL
第二種做法則是我在抓ExpenseDetail的時候,我就把User跟Currency的資料一併帶出來
var mehtod2 = db.ExpenseDetail.Where(m => m.seq <= 3).Include("Users").Include("CurrencyMapping").ToList();
var result2 = mehtod2.Select(m => new
{
user = m.Users.userName,
itemDetail = m.description,
currency = m.CurrencyMapping.CurrentDescription
}).ToList();
而這樣的效能又會如何呢?
各位觀眾,只有一段啊!!! 這就是我們希望EntityFramework幫我們產生的SQL不是嗎?
另外再多介紹Include的巧用方式
因為需求改變,我們需要在消費記錄中增加每一筆消費都要有發票檔案,一筆的消費可能會有兩個以上發票檔案
輸入格式跟之前的一樣,這項需求應該不難,你依然可以用Include避開效能問題,那問題會是什麼?
首先我們需要從ExpenseDetailAttachment串回去ExpenseDetail 再串回去User, CurrencyMapping表
從ExpenseDetailAttachment到User, CurrencyMapping並沒有直接的Link關係,若是我們要用Include的話會發生什麼事?
var dbResult = db.ExpenseDetailAttachment
.Where(m => m.ExpenseDetail.currencyCode == "US")
.Include("ExpenseDetail")
.Include("CurrencyMapping")
.Include("Users").ToList();
var result = dbResult
.Select(m => new
{
user = m.ExpenseDetail.Users.userID,
itemDetail = m.ExpenseDetail.description,
currency = m.ExpenseDetail.CurrencyMapping.CurrentDescription
}).ToList();
A specified Include path is not valid. The EntityType 'NorthwindModel.ExpenseDetailAttachment' does not declare a navigation property with the name 'CurrencyMapping'.
這時候可以發現CurrencyMapping 並不是ExpenseDetailAttachment的FK,在include時會錯
所以我們也許會改成這樣,把非FK的註解掉,如果這麼做,我們又陷入效能陷井了
var dbResult = db.ExpenseDetailAttachment
.Include("ExpenseDetail")
.Where(m => m.ExpenseDetail.currencyCode == "US")
.ToList();
//.Include("CurrencyMapping")
//.Include("Users").ToList();
var result = dbResult
.Select(m => new
{
user = m.ExpenseDetail.Users.userID,
itemDetail = m.ExpenseDetail.description,
currency = m.ExpenseDetail.CurrencyMapping.CurrentDescription
}).ToList();
一樣來看產生出來的SQL
我們已經使用Include了避開效能陷井了,但卻又因為要存取到關聯式資料表的另一個連結的資料,而又掉入了陷井
所以我突發奇想改成這樣的寫法避開效能陷井
var dbResult = db.ExpenseDetailAttachment
.Where(m => m.ExpenseDetail.currencyCode == "US")
.Select(m => m.ExpenseDetail)
.Include("CurrencyMapping")
.Include("Users")
.SelectMany(m => m.ExpenseDetailAttachment)
.ToList();
var result = dbResult
.Select(m => new
{
user = m.ExpenseDetail.Users.userID,
itemDetail = m.ExpenseDetail.description,
currency = m.ExpenseDetail.CurrencyMapping.CurrentDescription
}).ToList();
我們先繞回去ExpenseDetail 把CurrencyMapping 與 User帶入
再串回去ExpenseDetailAttachment 就可以解決掉這個問題
看產生出來的SQL卻並沒有做到我想要的事情,並沒有在一開始就把Users, CurrencyMapping帶進來
所以最後改成用Linq的方式讓IQueryalbe的東西處理掉,即可解決
var result = (from a in db.ExpenseDetailAttachment
join d in db.ExpenseDetail.Include("CurrencyMapping").Include("Users") on a.Seq equals d.seq
where d.currencyCode == "US"
select new
{
user = d.Users.userName,
itemDetail = a.fileName,
currency = d.CurrencyMapping.CurrentDescription
}).ToList();
最後可以看一下SQL長成什麼樣子
但實務上其實直接採用ExpenseDetail 再include 這三個table就行了
在兩個為關節點角色的主表(ExpenseDetail, ExpenseDetailAttachment)裡,就適合用這樣的方式來處理,如下圖
供大家參考