My Reading...

Tuesday, March 14, 2006

Vertical Search Engine

A vertical search engine is sometimes called a local search engine. Such engines are designed to search through structured or semi-structured databases and return the results to the user, for example a search for a local Pizza Hut or nearby restaurants.

The main challenge for a vertical search engine is that the query itself may be unstructured. The good thing about local search is that the data is structured or semi-structured, and hence built-in ontologies can be applied to improve search performance.

Even structured data may contain unstructured elements. For example, in a company table, the company name is typically less well structured than the company code.

The difficulties in searching unstructured data elements:
1) Synonyms: a term or phrase has a meaning similar to that of another term.
2) Traditional search technology cannot capture the relationships between entities.
3) Co-occurring terms: the user might simply omit terms that normally co-occur with the query terms. This can be addressed by a query expansion mechanism, which adds co-occurring terms, collected manually or statistically, to the original query. However, the expanded query might return irrelevant results and lower search precision.
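
As a rough illustration of the statistical variant of query expansion, here is a minimal Python sketch. The toy corpus, the function names, and the `min_count` threshold are all assumptions made up for this example; a real system would use a much larger corpus and a better association measure than raw counts.

```python
from collections import Counter
from itertools import combinations

# Toy corpus; the documents are invented for illustration only.
DOCS = [
    "pizza hut delivery restaurant",
    "pizza restaurant italian",
    "pizza delivery coupon",
    "car repair garage",
]

def build_cooccurrence(docs):
    """Count how often each ordered pair of distinct terms shares a document."""
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.split()))
        for a, b in combinations(terms, 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts

def expand_query(query, counts, min_count=2):
    """Append terms that co-occur with a query term at least min_count times."""
    terms = query.split()
    extra = []
    for (a, b), n in sorted(counts.items()):
        if a in terms and b not in terms and b not in extra and n >= min_count:
            extra.append(b)
    return terms + extra

counts = build_cooccurrence(DOCS)
expanded = expand_query("pizza", counts)
print(expanded)
```

Lowering `min_count` pulls in more terms (here "coupon", "hut", and "italian"), which is exactly how expansion can start to hurt precision.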

A possible cure for these problems is to apply a domain-dependent conceptual model. Ontologies are an example of such conceptual models. In this context, an ontology is a collection of concepts (similar to classes in the object-oriented world) and their interrelationships, which collectively provide an abstract view of an application domain.

To some extent, an ontology is very similar to a database schema. A schema captures mainly parent-child relationships, while an ontology is meant to capture any possible type of relationship.
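
To make the comparison concrete, here is a small Python sketch of an ontology stored as (subject, relation, object) triples. The `Ontology` class, the concept names, and the relation names (`is_a`, `located_in`, `serves`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

class Ontology:
    """Concepts linked by named relationships, kept as (subject, relation) -> objects."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj):
        self._edges[(subject, relation)].add(obj)

    def related(self, subject, relation):
        """Return the concepts reached from subject through the given relation."""
        return sorted(self._edges.get((subject, relation), set()))

    def ancestors(self, concept):
        """Follow 'is_a' links upward: the parent-child part a schema also captures."""
        seen = []
        frontier = [concept]
        while frontier:
            current = frontier.pop()
            for parent in self.related(current, "is_a"):
                if parent not in seen:
                    seen.append(parent)
                    frontier.append(parent)
        return seen

onto = Ontology()
onto.add("PizzaRestaurant", "is_a", "Restaurant")
onto.add("Restaurant", "is_a", "Business")
# Relationships beyond parent-child, which is where an ontology goes further.
onto.add("Restaurant", "located_in", "City")
onto.add("Restaurant", "serves", "Cuisine")

print(onto.ancestors("PizzaRestaurant"))
print(onto.related("Restaurant", "serves"))
```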

However, the difficulty here is to "understand", "parse", or "tag" the input query string with the pre-defined ontology, that is, ontology-based information extraction. In the IE (information extraction) world, this is called named entity recognition.

There are several ways to recognize a named entity.
1) Maintain a list of string pairs. For example,
2) Recognize a named entity by pattern matching. For example, Dr. SomeOne is a person; the pattern is Dr [.] String.
3) Combine the first and second methods. In Chinese, family names are usually drawn from a small, distinctive set of characters, and a full personal name is typically three characters long. By storing the set of family-name characters, the rest of the name can be inferred.
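
The three approaches above can be sketched in a few lines of Python. Everything here is an assumption for illustration: the gazetteer entries, the three family-name characters, and the rule "family name plus two more characters" are toy choices, and the naive substring and regex matching would overmatch on real text.

```python
import re

# 1) List lookup: surface string -> entity type (entries are invented examples).
GAZETTEER = {"pizza hut": "ORGANIZATION", "ibm": "ORGANIZATION"}

# 2) Pattern: the title "Dr" with an optional period, then a capitalized word.
DR_PATTERN = re.compile(r"\bDr\.?\s+([A-Z][A-Za-z]+)")

# 3) Hybrid: a stored set of family-name characters plus a length rule
#    (one family-name character followed by two more Chinese characters).
SURNAMES = ["王", "李", "张"]
CN_NAME_PATTERN = re.compile("(?:" + "|".join(SURNAMES) + ")[\u4e00-\u9fff]{2}")

def recognize(text):
    """Return a list of (entity_text, entity_type) pairs found in text."""
    found = []
    lowered = text.lower()
    for name, etype in GAZETTEER.items():
        if name in lowered:
            found.append((name, etype))
    for match in DR_PATTERN.finditer(text):
        found.append((match.group(0), "PERSON"))
    for match in CN_NAME_PATTERN.finditer(text):
        found.append((match.group(0), "PERSON"))
    return found

print(recognize("Dr. Smith ate at Pizza Hut"))
print(recognize("演员李小龙很有名"))
```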

Vertical search usually needs to deal with organization names, which are not well structured. In particular, the query string is very likely to use a short form of the organization name. In this case, there is no other option but to keep the short forms in a list.

How do ontologies help vertical search?

Providing Context to Web Searches:
The Use of Ontologies to Enhance Search Engine's Accuracy


Co-occurring words can be added to the original query; this is the query expansion process. Ontologies can be used to control the process of query expansion.
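
One way this control might work, sketched in Python: a candidate expansion term is kept only if its concept is directly related in the ontology to a concept of some query term. The term-to-concept mapping, the relation table, and the function name are hypothetical; the referenced paper may use a different mechanism.

```python
# Hypothetical mapping from terms to ontology concepts.
TERM_CONCEPT = {
    "pizza": "Food",
    "restaurant": "Business",
    "delivery": "Service",
    "coupon": "Promotion",
}

# Hypothetical ontology edges: concept -> directly related concepts.
RELATED = {
    "Food": {"Business", "Service"},
    "Business": {"Food", "Service"},
}

def controlled_expand(query_terms, candidates):
    """Keep only candidates whose concept is related to a query term's concept."""
    query_concepts = {TERM_CONCEPT.get(t) for t in query_terms} - {None}
    allowed = []
    for term in candidates:
        concept = TERM_CONCEPT.get(term)
        if concept is None or term in query_terms:
            continue  # unknown to the ontology, or already part of the query
        if any(concept in RELATED.get(qc, set()) for qc in query_concepts):
            allowed.append(term)
    return list(query_terms) + allowed

print(controlled_expand(["pizza"], ["coupon", "delivery", "weather", "restaurant"]))
```

Here "coupon" and "weather" are rejected because the toy ontology does not relate them to "Food", so the expansion stays within the domain.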

On the other hand, from the ontology's perspective, the entities captured in the system are linked together. The PageRank algorithm used by Google could be applied here: it is an iterative algorithm that determines an entity's "importance" based on the importance of its related entities.
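
A minimal Python sketch of that idea follows, using the standard power-iteration form of PageRank over a toy entity graph. The entity names and links are invented, and the damping factor of 0.85 and the fixed 50 iterations are common defaults assumed here, not values taken from any particular system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each entity to the list of entities it points to."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every entity starts each round with the same small base share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Pass this entity's importance on to the entities it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling entity: spread its rank evenly over all entities.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Toy entity graph (invented): two restaurants linked to a shared concept.
ENTITY_LINKS = {
    "Pizza Hut": ["Restaurant"],
    "Local Diner": ["Restaurant"],
    "Restaurant": ["City"],
    "City": ["Restaurant"],
}
ranks = pagerank(ENTITY_LINKS)
for entity, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{entity}: {score:.3f}")
```

In this graph "Restaurant" ends up with the highest score because every other entity links to it, while the two entities with no incoming links keep only the base share.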




