ToDo

also look at the thinktank

urgent features

  1. *:DB:FI needs a more elegant retrieval than a regex match !! consider special treatment of this pattern !!
  2. only allow search terms longer than 2 chars : must not apply to numerical values and comparisons ... this is more complex than I thought !!
  3. wildcard : only allowed together with more characters : don't count the db- and field-specifier as characters !!
  4. phrase search including “:” -> the “:” is interpreted as db/fieldspec separator in ml_apply() and in the wildcard count !!!
  5. escape special chars like + . / \ [ ] ( ) | ^ $ { }
  6. wildcards should be normal chars in phrase search
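
Items 3 to 6 interact, so a small sketch may help to pin down the intended behaviour. It is written in Python for illustration only and is not the engine's code: split_spec, to_regex and MIN_CHARS are hypothetical names, the threshold of 3 is borrowed from the 2005-01-17 entry below, and anchoring the pattern with ^ and $ assumes that index terms are matched as a whole.

```python
import re

MIN_CHARS = 3  # assumed threshold, not taken from the real config


def split_spec(raw):
    """Return (term, is_phrase, specs); a quoted phrase may contain ':'."""
    raw = raw.strip()
    if raw.startswith('"'):
        end = raw.find('"', 1)
        if end == -1:
            raise ValueError("unterminated phrase")
        term, rest, is_phrase = raw[1:end], raw[end + 1:], True
    else:
        term, sep, rest = raw.partition(":")
        rest, is_phrase = sep + rest, False
    # Whatever follows the term is treated as db/field specifiers.
    specs = [s for s in rest.split(":") if s]
    return term, is_phrase, specs


def to_regex(raw):
    """Build an anchored regex for the term and return it with the specs."""
    term, is_phrase, specs = split_spec(raw)
    if is_phrase:
        # Item 6: inside a phrase, * and ? are ordinary characters.
        pattern = re.escape(term)
    else:
        # Item 3: count only the term itself, never the db/field specifier.
        literal = term.replace("*", "").replace("?", "")
        if literal != term and len(literal) < MIN_CHARS:
            raise ValueError("wildcard needs more literal characters")
        # Item 5: escape everything except the wildcards themselves.
        parts = re.split(r"([*?])", term)
        pattern = "".join(
            ".*" if p == "*" else "." if p == "?" else re.escape(p)
            for p in parts
        )
    return "^" + pattern + "$", specs
```

With this split, a query like abc*:sd:st passes the wildcard check with 3 counted characters, while ab*:sd:st is rejected although the raw string is longer.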

geek-playground

  1. only collect matched_terms if they are needed !!! establish new parameter for this !!!
  2. cache results. caching on each node ?
  3. $searchconfig->{oc} to specify additional configs
  4. $searchconfig->{table} to specify alternative views
  5. $si stores indexvalues uncompressed - compressing could save memory !!
  6. process_s_tree : check the operators for speed and convergence
  7. process_s_tree : NOT is ALL if second is zero
  8. process_s_tree : NOT is zero if first term is zero
  9. process_s_tree : AND is zero if one of the terms is zero
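
The three shortcuts in items 7 to 9 could look roughly like this. This is a Python sketch for illustration, not the real process_s_tree: the tree layout (nested tuples), the lookup callback and the name process_tree are assumptions, and NOT is read as a binary "first without second", so that "NOT is ALL" means "all results of the first term".

```python
def process_tree(node, lookup):
    """Evaluate a search tree.

    A leaf is a term (str); an inner node is a tuple (op, left, right).
    lookup(term) is assumed to return the set of matching record ids.
    """
    if isinstance(node, str):
        return lookup(node)
    op, left, right = node
    a = process_tree(left, lookup)
    if op in ("AND", "NOT") and not a:
        # Items 8 and 9: an empty first operand makes the result empty,
        # so the right subtree is never evaluated.
        return set()
    b = process_tree(right, lookup)
    if op == "AND":
        return a & b  # empty if b is empty (item 9)
    if op == "NOT":
        # Item 7: nothing to subtract, so keep the first operand as it is.
        return a if not b else a - b
    if op == "OR":
        return a | b
    raise ValueError("unknown operator: %r" % (op,))
```

The main saving is in the early return: for AND and NOT the second subtree is not looked up at all when the first one has no hits.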

finished

  • 2005-03-25 added to cron.day rebuild-index every night and reboot server
  • 2005-03-24 adapted get_val_dbi to deal with labels: for labeled database-entry : also store the whole term (including whitespace) in searchindex and query this term in phrasesearch to be able to look for “vhs mit sonderfunktion”:sd:st
  • 2005-03-20 date search: comparison operator for dates !! (convert to epoch and cmp then??)
  • 2005-02-17 how to deal with " : phrase-search, " are escaped in the url
  • 2005-02-17 numeric search: how to deal with integer?
  • 2005-02-17 phrase search: search for longer terms “stefan vater”
  • 2005-02-17 just saw, that this is not true any more: matched_term list has duplicates
  • 2005-02-12 structure data: only “current” and 1995, dates instead of “current”, sorting wrong
  • 2005-02-12 searchconfig->{modules_short},$ptr->{name} configname/tablename/viewname/SIGLE ... ?
  • 2005-02-12 new addon: restrict_db : restrict search on database to save memory on database-view (and prev/next inside a databaseview) by implementing a more general/global cml !!
  • 2005-01-17 restrict wildcards: * only with 3 chars, one ? per char only. limit max. number of matches internally. a query like ‘* or * or * or * or * or * or * or * or *’ might slow down the system too much.
  • 2005-01-15 eliminated all values like ' and other utf8 quotes such as ”, as well as " at term boundaries
  • 2005-01-13 search.v2 : fails if no searchresults
  • 2005-01-10 autodetect and global type : term-search, wildcard-search, regex-search - how do I choose implicitly? interface?
  • 2004-12-?? logical search
  • 2004-12-14 how to restrict search to database/fields (interface of search-method)
  • 2004-12-14 eliminate duplicates in searchstructure : 1-1-114@1-1-114@1-2-234
  • 2004-12-14 enhanced term-splitting in the build-engine (delimiters per field), a-b, a/b, 19xx/xy etc. DIFFICULT.. which fields?
  • 2004-12-14 new splitfeatures : each feature configurable per field
  • 2004-12-13 new splitfeatures : even sel-values should be split properly !!
  • 2004-12-11 create searchmodule
  • 2004-12-10 build web-interface for the search-engine
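
Two of the finished items are easy to illustrate: the date comparison via epoch (2005-03-20) and the duplicate elimination in the searchstructure (2004-12-14). The Python sketch below is only an illustration and not the shipped code. The date format YYYY-MM-DD, the operator table and the names to_epoch, date_matches and dedupe_refs are assumptions; the @ separator is taken from the example in the entry.

```python
import calendar
import operator
import time

# Assumed set of comparison operators for date queries.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


def to_epoch(date_str):
    """Parse 'YYYY-MM-DD' as UTC midnight, return seconds since the epoch."""
    return calendar.timegm(time.strptime(date_str, "%Y-%m-%d"))


def date_matches(stored, op, wanted):
    """Compare two dates numerically after converting both to epoch."""
    return OPS[op](to_epoch(stored), to_epoch(wanted))


def dedupe_refs(packed, sep="@"):
    """Drop repeated references, keeping the first occurrence of each."""
    return sep.join(dict.fromkeys(packed.split(sep)))
```

Comparing epoch values avoids string comparison problems with differently formatted dates; dedupe_refs keeps the original order, which matters if the order of references encodes ranking.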

abandoned

  • 2005-03-28 confusing, inelegant and maybe even stupid : implement addon getlist=>1 to override the “no wildcards when less than three chars”-feature !!
  • 2005-01-11 too cpu-expensive. just replace !! : resultlist keeps field-id’s !!!??? this fools the LOGICAL OPERATORS !!!!
  • 2005-01-11 too cpu-expensive. just replace !! : search.v2 : resultlist keeps field-id’s !!!??? this fools the remove-duplicates !!!!
  • 2004-12-14 too complicated ** :change structure of searchconfig. database->feature instead of feature->database !!
 
kb/searchengine/todo.txt · Last modified: 2006/02/27 19:52 by peter