Changes between Version 13 and Version 14 of adeiSEARCH

Author:
csa (IP: 141.52.232.84)
Timestamp:
09/14/09 03:54:40 (15 years ago)
Comment:

--

  • adeiSEARCH

    v13 v14  
    2020  * ''title'' - short title describing the result item 
    2121  * ''description'' - the longer description of the result item, an HTML content is allowed 
    22   * ''props'' - the associative array with standard ADEI properties describing the item 
     22  * ''rating'' - a value between ''0'' and ''1'' indicating the result quality; the higher the value, the better the match 
     23  * ''props'' - the associative array with [wiki:adeiDG standard ADEI properties] describing the item. For example, for channel searches the props array will contain ''db_server'', ''db_name'', ''db_group'', and ''db_mask'' properties. For interval searches, it would be just property ''window'' 
    2324  * ''certain'' - this option indicates that the search module is completely certain that this record is what the user is actually looking for 
    2425  * Arbitrary number of other properties which are used by the search engine internally (for record matching, for example) 
    3940 
    4041== Default Implementation == 
    41  
    42  
    43 The search modules are implemented using [wiki:adeiClassSEARCHEngine SEARCHEngines]. Each SEARCHEngine could provide one or more search modules. The !SEARCHEngines are placed in the ''classes/search'' folder of the ADEI source tree. They should implement a ''Search'' function which accepts four parameters (module, search string, search filter, global options) and returns the [wiki:adeiClassSEARCHResults SEARCHResults] object with results or ''false'' if nothing is found. However, standard modules can reuse the default ''Search'' function implemented in the base class ''classes/searchengine.php''. The following procedure is executed in this case: 
    44  * ''Search'' function of ''Search Engine'' is executed with four parameters: module, search string, search filter, global options. 
    45  * ''GetList'' function is called to get complete associative list of elements. In this list the key is element identificator and value contains an associative array with terms to check against the search terms. Besides  
     42 The ''Search'' function of a search engine may implement the search in an arbitrary way. However, there is a standard procedure defined for searching textual data. It is intended to standardize the handling of search modifiers and to simplify the coding of new search engines for many standard cases. 
     43The default implementation of the engine ''Search'' function uses the following procedure: 
     44 * ''GetList'' function is called to get a complete associative list of available elements. In this list the key is an element identifier and the value contains the associative array presented above, describing the item. Besides the described standard properties, this array should contain  
     45Of course, the ''rating'' and ''certain'' fields are not set yet. So, we have the following list of properties: 
     46  * ''title'' - title  
     47  * ''description'' - description 
     48  * ''props'' - ADEI properties 
    4649  * ''uid'' - record unique identifier if any (used for matching) 
    4750  * ''name'' - record short name (used for matching) 
    48   * ''title'' - title to use to present this record in the results 
    49   * ''description'' - longer description (html content is allowed) 
    50   * ''props'' - an associative array containing [wiki:adeiDG standard ADEI properties] fully describing this record. For example, for a found data item the props array will contain: ''db_server'', ''db_name'', ''db_group'', and ''db_mask'' properties. For a found interval, it would be just the property ''window''. 
    51  * ''CheckString'' function is called on each element of the list, the elements for which the non-zero rating is returned are checked against filters and added to the search results 
    52  * To prevent duplicating results, the ''SEARCHResults::Accept'' function is used. The results are compared using '''GetCmpFunction'''. 
     51  * arbitrary number of other properties used by custom matching functions 
     52 * Each item of the received list is matched against the search terms using the ''CheckString'' function. This function returns the match rating (''0'' means there is no match) 
     53 * The items for which a non-zero rating is returned are checked against filters and added to the search results 
     54 * Finally, duplicate results are filtered out using the ''Accept'' function of the ''SEARCHResults'' object. The comparisons between results are carried out using the compare function returned by the ''GetCmpFunction'' function of the search engine. The default function just compares the ''title'' members of the associative arrays describing the compared items. 
    5355 
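The default procedure described above can be sketched roughly as follows. This is a minimal stand-alone illustration, not the ADEI implementation: the function name `default_search_sketch`, the callable-based rating function and the closure-style filters are all assumptions made for the example, and duplicates are compared by ''title'' only, as the default compare function is described to do.

```php
<?php
// Sketch of the default search procedure; every name here is hypothetical.
function default_search_sketch(array $list, callable $check_string, array $filters = array()) {
    $results = array();
    foreach ($list as $id => $info) {
        // Step 1: rate the item; 0 means there is no match.
        $rating = $check_string($info);
        if (!$rating) continue;

        // Step 2: a filter that returns true rejects the item.
        foreach ($filters as $filter) {
            if ($filter($info, $rating)) continue 2;
        }

        // Step 3: drop duplicates, comparing titles only.
        foreach ($results as $accepted) {
            if ($accepted['title'] === $info['title']) continue 2;
        }

        $info['rating'] = $rating;
        $results[] = $info;
    }
    // Mirror the convention of returning false when nothing is found.
    return $results ? $results : false;
}

$list = array(
    'a' => array('title' => 'Sinus', 'name' => 'sinus'),
    'b' => array('title' => 'Sinus', 'name' => 'sinus copy'),
    'c' => array('title' => 'Cosinus', 'name' => 'cosinus'),
);
// Toy rating function: 1 if the name starts with "sinus", 0 otherwise.
$check = function ($info) { return strpos($info['name'], 'sinus') === 0 ? 1 : 0; };
$found = default_search_sketch($list, $check);
```

Here item ''b'' is dropped as a duplicate of ''a'' (same title) and item ''c'' is dropped because its rating is zero.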
    54 The ''CheckString'' works in the following way: 
    55  * The search string is split into phrases and for each phrase the ''CheckPhrase'' function is called. 
    56  * Depending on the used module, the ''CheckPhrase'' function selects a single string value from the associative array describing the record and passes it to the ''CheckTitlePhrase'' function. 
    57  * ''CheckTitlePhrase'' checks whether the passed string fits the current search phrase and returns the rating. The matching is performed in one of 4 supported modes depending on the match modifiers and global options 
    58   * ''default'' - The beginning of any word should match the search phrase. The string '''word sinus cosinusfff''' matches the phrase '''sinus cosinus''', but '''xsinus cosinus''' does not. 
     56The default ''CheckString'' function works in the following way: 
     57 * The search string is split into phrases and for each phrase the ''CheckPhrase'' function is called (see [wiki:adeiSEARCH/String] for the splitting algorithm) 
     58 * Depending on the used module, the ''CheckPhrase'' function is expected to construct from the associative array a string value to match against the search terms. The default behavior is to use the ''name'' member of the associative array. If the ''name'' member is empty or missing, the match fails. 
     59 * The constructed string value is passed to the ''CheckTitlePhrase'' function. 
     60 * ''CheckTitlePhrase'' function checks whether the passed string fits the current search phrase and returns the rating.  
     61 * Finally, the ratings computed for all search phrases are reconciled into an overall rating using the rules described in [wiki:adeiSEARCH/String] 
     62 
     63The matching is performed in one of 4 supported modes depending on the match modifiers and global options specified in the search string (see [wiki:adeiSEARCH/String]) 
     64  * ''standard match'' - The beginning of any word should match the search phrase. The string '''word sinus cosinusfff''' matches the phrase '''sinus cosinus''', but '''xsinus cosinus''' does not. 
    5965  * ''word match'' - The words should match completely. The string '''word sinus cosinus fff''' matches, and '''sinus cosinusfff''' does not. 
    6066  * ''fuzzy match'' - Word boundaries are not important and even '''xsinus cosinusx''' matches the '''sinus cosinus''' search phrase. 
    6167  * ''regex match'' - In this mode the search phrase is treated as a regular expression and this regular expression is matched against the passed string 
    62  * Finally, the ratings computed for all search phrases are reconciled into an overall rating using the rules described in the section above. 
    6368 
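To make the four matching modes concrete, here is a small sketch that reproduces the examples given in the list above. It is an assumption-laden illustration rather than the actual ''CheckTitlePhrase'' code: the function name `match_phrase_sketch` and the `MATCH_*` constants are hypothetical, matching is case-sensitive by assumption, and the rating is reduced to 1 (match) or 0 (no match).

```php
<?php
// Hypothetical mode names; ADEI's own constants are not reproduced here.
const MATCH_STANDARD = 'standard';
const MATCH_WORD = 'word';
const MATCH_FUZZY = 'fuzzy';
const MATCH_REGEX = 'regex';

// Returns 1 if $string matches $phrase in the given mode, 0 otherwise.
function match_phrase_sketch($string, $phrase, $mode = MATCH_STANDARD) {
    $quoted = preg_quote($phrase, '/');
    switch ($mode) {
        case MATCH_WORD:
            // The phrase must start and end on word boundaries.
            $matched = preg_match('/\b' . $quoted . '\b/', $string) === 1;
            break;
        case MATCH_FUZZY:
            // Plain substring test; word boundaries are ignored.
            $matched = strpos($string, $phrase) !== false;
            break;
        case MATCH_REGEX:
            // The phrase itself is the pattern; only the delimiter is escaped.
            $matched = preg_match('/' . str_replace('/', '\/', $phrase) . '/', $string) === 1;
            break;
        default:
            // Standard match: the phrase must begin at the start of a word.
            $matched = preg_match('/\b' . $quoted . '/', $string) === 1;
    }
    return $matched ? 1 : 0;
}
```

With this sketch, `match_phrase_sketch('word sinus cosinusfff', 'sinus cosinus')` matches in standard mode, while the same string fails in `MATCH_WORD` mode because ''cosinusfff'' is not a complete word match.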
    64 == Search Filters == 
    65 The filters are used to reject part of the search results as well as to add/modify information associated with found record. The filters are specified at the search string as follows: 
    66 {{{ 
    67    interval:June 2005 
    68 }}} 
     69== Search Engine == 
     70 So, the search modules are provided by classes implementing the [wiki:adeiClassSEARCHEngine SEARCHEngine] interface. Each class could provide one or more search modules. To implement a new search engine the following actions should be taken: 
     71 * A class implementing the ''!SEARCHEngine'' interface and extending the ''!SEARCHEngine'' base class should be created 
     72 * This class should provide a list of supported modules in the ''modules'' member of the class. It is an associative array where the key is the module id and the value is the module title. 
     73 * It should either define a custom ''Search'' function or provide a ''GetList'' function to be used in conjunction with the default approach described above. 
     74 * The implemented class should be stored under the ''classes/search'' directory (the file name should be the lower-cased class name with the ''.php'' extension). 
     75 * The search engine should be enabled in the configuration. A new element should be appended to the $SEARCH_ENGINE associative array. The key is the class name and the value is the default initialization parameters. 
    6976 
    70 If such filter is found, the ''INTERVALSearchFilter'' object (from ''classes/search/intervalfilter.php'') is constructed. This object will get the filter value (''June 2005'') as a single parameter to its constructor. And it should implement a single function: ''FilterResult'' which should return ''true'' if the current record should be filtered out or ''false'' otherwise. The ''FilterResult'' receives two parameters: 
    71  * associative array with information on current record 
    72  * a number between 0 and 1 with the rating of match 
    73 Both these parameters can be altered by ''FilterResult'' function. 
    74  
    75 '''Example'''. Let's consider the standard ''item'' search used in conjunction with the ''interval'' filter. The search will provide multiple records describing found items (i.e. the associative array with information will contain the standard properties: ''db_server'', ''db_name'', ''db_group'', and ''db_mask''). The ''interval'' filter is intended to limit the display interval. Therefore, when the ''FilterResult'' function is called, it will add the ''window'' property to the associative array, limiting the display window to ''June 2005''. 
    76  
    77 If multiple filters are specified, they are executed sequentially until one of them rejects the current record. 
    78   
    79 == New Search Engine ==  
    80  * The search engine should provide a list of supported modules in the ''modules'' member of class. It is associative array where the key is module id and the value is module title. 
    81  * It should define either special ''Search'' function or provide at least the ''GetList'' function to be used in conjunction with the approach described above. 
    82  
    83 ''GetList'' function should return array containing the records. Each record is represented by associative array with following members: 
     77=== Standard Engine === 
     78''GetList'' function should return an array containing descriptions of all available items (through which we would search). Each description is an associative array as described in the sections above. It should contain the following members: 
    8479 * ''title'' - the title used to describe record in the search results 
    8580 * ''description'' - the longer description of the record, HTML content is allowed 
    8681 * ''props'' - the associative array with standard ADEI properties describing the record 
    87  * ''certain'' - this option indicates that the search module is completely certain that this record is what the user is actually looking for 
    88  * Arbitrary properties used by the search engine for record matching 
     82 * ''name'' or arbitrary number of other properties used by the search engine for record matching  
    8983 
    90 Example
     84For example, the following array could be returned by the ''GetList'' function
    9185{{{ 
    9286 array( 
    9892     'description' => false, 
    9993     'certain' => true 
    100   ) 
     94  ), 
     95  ... 
    10196) 
    10297}}} 
    10398 
    104 Besides ''GetList'' function it is highly desirable to provide ''CheckPhrase'' function which will check the record info against the search phrase and return the match rating, from ''0'' (not matched) to ''1'' (fully matched). The ''CheckPhrase'' function accepts the following parameters 
    105  * The associative array with information described above 
    106  * The phrase to match 
    107  * Type of match: ''SEARCH::WORD_MATCH'', ''SEARCH::FUZZY_MATCH'', ''SEARCH::REGEX_MATCH'', ''false'' (default) 
    108  * The search module 
    109  * The global options 
     99Of course, all other functions described in the ''Default Implementation'' section could be overridden as well. For example, if the engine is intended to support more than one search module, it needs to override ''CheckPhrase'' function. The default implementation constructs the match string just by taking ''name'' member of associative array. This string is, then, matched against search phrases. However, if multiple modules are used, for each module the algorithm for construction of match string should be specified. The ''CheckPhrase'' function in this case should construct a match string depending on the search module specified and pass it to the default ''CheckTitlePhrase'' function (or perform string matching by itself). 
    110100 
    111 The special search engines intended to return custom XHTML content should use following approach in the ''Search'' function: 
     101=== Custom Engine === 
     102The custom search engine needs to provide ''Search'' function. It should create the ''SEARCHResults'' object and fill it with results using ''Append'' function call. Then return the object or ''false'' if nothing is found.  
     103Just a simple example: 
     104{{{ 
     105function Search($search_string, $module, SEARCHFilter $filter = NULL, $opts = false) {  
     106   $res = new SEARCHResults($filter, $this, $module); 
     107   $res->Append(array( 
     108       'title' => 'January 2005', 
     109       'props' => array( 
     110         'window' => "1104537600-1107216000" 
     111       ), 
     112       'description' => false 
     113     ) 
     114   ); 
     115   if ($res->HaveResults()) return $res; 
     116   return false; 
     117}
     118}}} 
     119 
     120The special search engines intended to return custom XHTML content should use the following approach instead (XHTML strings are passed to the ''Append'' function instead of the associative arrays describing found items): 
    112121{{{ 
    113122  $result = new SEARCHResults(NULL, $this, $module, ""); 
    118127The <?xml?> should not be included into the content. 
    119128 
     129== Search Filters == 
     130The filters are used to reject part of the search results as well as to add or modify information associated with found items. The filters are specified in the search string as follows: 
     131{{{ 
     132   interval:June 2005 
     133}}} 
    120134 
    121 == INTERVALSearch Engine == 
     135If such a filter is found, the ''INTERVALSearchFilter'' object (defined in ''classes/search/intervalfilter.php'') is constructed. This object gets the filter value (''June 2005'') as the single parameter of its constructor. It should implement a single function, ''FilterResult'', which should return ''true'' if the current record should be filtered out or ''false'' otherwise. ''FilterResult'' receives two parameters: 
     136 * the associative array describing the current item 
     137 * the current rating of the match 
     138Both these parameters can be altered by ''FilterResult'' function. 
     139 
     140'''Example'''. Let's consider the standard ''item'' search used in conjunction with the ''interval'' filter. The search will provide multiple records describing found items (i.e. the associative array with information will contain the standard properties: ''db_server'', ''db_name'', ''db_group'', and ''db_mask''). The ''interval'' filter is intended to limit the display interval. Therefore, when the ''FilterResult'' function is called, it will add the ''window'' property to the associative array, limiting the display window to ''June 2005''. 
     141 
     142If multiple filters are specified, they are executed sequentially until one of them rejects the current item by returning ''true'' from its ''FilterResult'' function. 
     143  
     144To implement a new filter it is necessary to: 
     145 * Choose an unused name, for example: ''site'' 
     146 * Implement the ''SITESearchFilter'' class extending ''BASESearchFilter'' (the filter name is capitalized for the class name) 
     147 * Implement the ''FilterResult(&$info, &$rating)'' function, which returns ''true'' if the item should be filtered out or ''false'' otherwise. The filter value is accessible using ''$this->value''. 
     148 * Place the implemented class in ''classes/search/sitefilter.php'' (the lowercase filter name is used for the file name) 
     149 
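The steps above could look roughly like the following sketch. The `BASESearchFilter` class shown here is only a stand-in stub so that the example runs on its own (the real base class lives in ADEI and may differ), and the filtering rule itself, keeping only items whose ''db_server'' property equals the filter value, is a purely hypothetical meaning for a ''site'' filter.

```php
<?php
// Stand-in for ADEI's BASESearchFilter, reduced to what this sketch needs (assumption).
class BASESearchFilter {
    public $value;
    public function __construct($value) { $this->value = $value; }
}

// Would be stored as classes/search/sitefilter.php and used as "site:<value>".
class SITESearchFilter extends BASESearchFilter {
    // Returns true when the item should be filtered out, false to keep it.
    public function FilterResult(&$info, &$rating) {
        $server = isset($info['props']['db_server']) ? $info['props']['db_server'] : false;
        // Hypothetical rule: reject items that belong to another server.
        return $server !== $this->value;
    }
}

$filter = new SITESearchFilter('katrin');
$rating = 0.5;
$kept = array('title' => 'T1', 'props' => array('db_server' => 'katrin'));
$dropped = array('title' => 'T2', 'props' => array('db_server' => 'other'));
$kept_rejected = $filter->FilterResult($kept, $rating);       // false: item stays
$dropped_rejected = $filter->FilterResult($dropped, $rating); // true: item is removed
```

Because both arguments are passed by reference, a filter may also modify the item or its rating, as the ''interval'' filter does when it adds the ''window'' property.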
     150== Format of the search string == 
     151[wiki:adeiSEARCH/String] 
     152 
     153== Implemented Engines == 
     154 
     155=== INTERVALSearch Engine === 
    122156'''Provided Modules''': 
    123157 * ''interval'' - Tries to parse the time interval from the textual representation given in the search string. The only property, ''window'', is returned with an interval of UNIX timestamps. 
    126160 * ''interval'' - allows finding the intersection of two intervals 
    127161 
    128 == ITEMSearch Engine == 
     162=== ITEMSearch Engine === 
    129163'''Provided Modules''': 
    130164 * ''channel'' - Searches items by uid only 
    139173 * ''interval'' - adds the window property to the item specification 
    140174 
    141 == PROXYSearch Engine == 
     175=== PROXYSearch Engine === 
    142176'''Provided Modules''': 
    143177 * ''proxy'' - downloads an XML document from the specified location and applies an XSLT stylesheet to convert it into XHTML. Accepts several parameters: 
    158192At the moment performed by the ''DetectModule'' function defined in classes/search.php. Should be extended by search engines claiming the search string. 
    159193 
    160 [wiki:adeiSEARCH/String] 
     194