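Summary: Using the * operator in Lucene.Net
Lucene.Net supports both single-character and multiple-character wildcard searches.
Use the ? symbol for a single-character wildcard and the * symbol for a multiple-character wildcard.
An example follows: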
Step 1 : Build RAMDirectory
static RAMDirectory dir = new RAMDirectory();
Step 2 : Build Index
private void BuildIndex()
{
    // create = true: start a fresh index in the shared RAMDirectory on every call
    IndexWriter iw = new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30), true, IndexWriter.MaxFieldLength.UNLIMITED);
    // One Document and its two Fields are reused; only the values change per iteration
    Document doc = new Document();
    doc.Add(new Field("PROD_ID", "", Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));
    doc.Add(new Field("PROD_Name", "", Field.Store.YES, Field.Index.ANALYZED, Field.TermVector.NO));
    // Index ten products named Lucene.Net1 through Lucene.Net10
    for (int i = 1; i <= 10; i++)
    {
        doc.GetField("PROD_ID").SetValue(Guid.NewGuid().ToString());
        doc.GetField("PROD_Name").SetValue("Lucene.Net" + i.ToString());
        iw.AddDocument(doc);
    }
    iw.Optimize();
    iw.Commit();
    iw.Close();
}
Step 3 : Search
private void Search(string KeyWord)
{
    // readOnly = true: open the index for searching only
    IndexSearcher search = new IndexSearcher(dir, true);
    QueryParser parser = new QueryParser(Version.LUCENE_30, "PROD_Name", new StandardAnalyzer(Version.LUCENE_30));
    Query query = parser.Parse(KeyWord);
    // Ask for up to MaxDoc hits so that every matching document is returned
    var hits = search.Search(query, null, search.MaxDoc).ScoreDocs;
    foreach (var res in hits)
    {
        // "<br />" is assumed here; the separator string was lost from the original listing
        Response.Write(string.Format("PROD_ID:{0} / PROD_Name{1}"
            , search.Doc(res.Doc).Get("PROD_ID")
            , search.Doc(res.Doc).Get("PROD_Name") + "<br />"));
    }
}
Result :
When we use the * operator:
BuildIndex();
Search("Lucene.Net*");
The result is as follows: every product whose name starts with Lucene.Net appears in the list
PROD_ID:8049da84-4525-4c62-8956-8be89e28d3a3 / PROD_NameLucene.Net1
PROD_ID:7ec7f72a-f526-43eb-b18b-f565b57d44dc / PROD_NameLucene.Net2
PROD_ID:7f666428-5616-4e32-8a2a-a3b91a81c460 / PROD_NameLucene.Net3
PROD_ID:54619b70-70d1-46ed-a3cc-27b77acfee13 / PROD_NameLucene.Net4
PROD_ID:684e8a52-4122-48e3-be86-45cbf2c1c583 / PROD_NameLucene.Net5
PROD_ID:b1d8c8ad-bc7e-4072-92c9-7d808a4cd4dd / PROD_NameLucene.Net6
PROD_ID:3a625e79-04a0-408e-acdb-c2c1af58108b / PROD_NameLucene.Net7
PROD_ID:c9e25f50-9d1e-48d5-bb61-7f4c1bb51173 / PROD_NameLucene.Net8
PROD_ID:493327fc-a52c-4dd7-a4c5-85917014bbfa / PROD_NameLucene.Net9
PROD_ID:92a6703a-14cf-41be-988d-514922a4115a / PROD_NameLucene.Net10
With the ? operator, Lucene.Net10 is missing, because ? matches exactly one character:
BuildIndex();
Search("Lucene.Net?");
The result is as follows:
PROD_ID:79ed77df-a03f-439c-9287-2f2a13187513 / PROD_NameLucene.Net1
PROD_ID:264327aa-f8a4-4928-93ab-68fdad7fca66 / PROD_NameLucene.Net2
PROD_ID:1ca1911a-3014-45c9-9690-ac4000cd6e3c / PROD_NameLucene.Net3
PROD_ID:876a9d5a-f8d4-44aa-936e-49aee291e39f / PROD_NameLucene.Net4
PROD_ID:20ec797e-aa5d-49b3-9cb2-a8c91e7e9713 / PROD_NameLucene.Net5
PROD_ID:9a5c57e9-cac4-4719-be39-bf5f38137c39 / PROD_NameLucene.Net6
PROD_ID:1e9f68b8-16c3-4a76-956b-232461bc3fbc / PROD_NameLucene.Net7
PROD_ID:c1b79968-2b38-4b43-9838-67a0618a27d9 / PROD_NameLucene.Net8
PROD_ID:5db4455d-602c-4fb3-89ac-6f3e86e42765 / PROD_NameLucene.Net9
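The same two searches can also be expressed without QueryParser by building a WildcardQuery directly. The following is only a minimal sketch and was not part of the original example: the method name SearchWithWildcardQuery is hypothetical, and it assumes the static dir from Step 1 has already been filled by BuildIndex(). A WildcardQuery does not pass its term through the analyzer, so the sketch lower-cases the pattern on the assumption that StandardAnalyzer lower-cased the product names at index time.

```csharp
// Hypothetical helper (not in the original post).
// Assumed usings at the top of the file:
//   using System.Collections.Generic;
//   using Lucene.Net.Index;   // Term
//   using Lucene.Net.Search;  // IndexSearcher, Query, WildcardQuery
private List<string> SearchWithWildcardQuery(string pattern)
{
    var names = new List<string>();
    using (IndexSearcher searcher = new IndexSearcher(dir, true))
    {
        // The term is used as-is (no analysis), so lower-case it ourselves
        Query query = new WildcardQuery(new Term("PROD_Name", pattern.ToLowerInvariant()));
        foreach (var hit in searcher.Search(query, null, searcher.MaxDoc).ScoreDocs)
        {
            names.Add(searcher.Doc(hit.Doc).Get("PROD_Name"));
        }
    }
    return names;
}
```

Calling SearchWithWildcardQuery("Lucene.Net?") and SearchWithWildcardQuery("Lucene.Net*") should then mirror the two QueryParser searches above, provided the lower-casing assumption holds for your analyzer.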
However, if either the * or the ? operator is placed in front of Lucene.Net, an exception is thrown:
'*' or '?' not allowed as first character in WildcardQuery
In that case, use the following code in the Search step:
QueryParser parser = new QueryParser(Version.LUCENE_30, "PROD_Name", new StandardAnalyzer(Version.LUCENE_30));
parser.AllowLeadingWildcard = true;
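With this setting the exception no longer occurs, but it usually comes at a performance cost: a pattern with a leading wildcard cannot be narrowed by a prefix, so it has to be checked against every term in the field.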
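Since leading wildcards are expensive, it may make sense to enable them only when a caller really needs them, and to treat a rejected pattern as a normal outcome rather than a crash. The sketch below is one possible way to do that and is not from the original example: the method name TryParse and its allowLeadingWildcard parameter are hypothetical. It relies on QueryParser.Parse throwing a ParseException when a pattern is rejected.

```csharp
// Hypothetical helper (not in the original post).
// Assumed usings at the top of the file:
//   using Lucene.Net.Analysis.Standard;  // StandardAnalyzer
//   using Lucene.Net.QueryParsers;       // QueryParser, ParseException
//   using Lucene.Net.Search;             // Query
private Query TryParse(string keyWord, bool allowLeadingWildcard)
{
    QueryParser parser = new QueryParser(
        Lucene.Net.Util.Version.LUCENE_30,
        "PROD_Name",
        new StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30));

    // Off by default in QueryParser; turn it on only when explicitly requested
    parser.AllowLeadingWildcard = allowLeadingWildcard;

    try
    {
        return parser.Parse(keyWord);
    }
    catch (ParseException)
    {
        // Pattern rejected (for example a leading * or ? while not allowed)
        return null;
    }
}
```

With this helper, TryParse("*Net1", false) returns null instead of throwing, while TryParse("*Net1", true) returns a query that can be passed to IndexSearcher.Search as in Step 3.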