ToDo
also look at the thinktank
urgent features
- *:DB:FI needs a more elegant retrieval than regex-match !! consider special treatment of this pattern !!
- only allow searchterm>2 chars : must not apply to numerical values and compares ... this is more complex than I thought !!
- wildcard : only allowed if more characters : don't count db- and field-specifier as characters !!
- phrasesearch including “:” -> : is interpreted as db/fieldspec-separator in ml_apply() and wildcardcount !!!
- escape special chars like + . / \ [ ] ( ) | ^ $ { }
- wildcards should be normal chars in phrase search
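
A minimal sketch of how the term rules above could fit together. It is written in Python for illustration, not in the engine's own code, and everything in it is an assumption: the name compile_term, the term syntax body[:DB[:FI]] with phrases in double quotes, and the threshold of 3 literal characters. Compares and the special *:DB:FI pattern are left out, so a bare * is simply rejected here.

```python
import re

MIN_CHARS = 3  # assumed reading of "searchterm>2 chars"

# Hypothetical syntax: body[:DB[:FI]]. A phrase body is wrapped in double quotes,
# so a ":" inside the quotes is not taken for the db/field separator.
TERM_RE = re.compile(r'^(?P<body>"[^"]*"|[^:]*)(?::(?P<db>[^:]*))?(?::(?P<fi>[^:]*))?$')
NUMERIC_RE = re.compile(r'^-?\d+(\.\d+)?$')  # assumed form of numerical values


def compile_term(raw):
    """Return (regex_source, db, fi) for one search term, or raise ValueError."""
    m = TERM_RE.match(raw)
    if m is None:
        raise ValueError("cannot parse term: %r" % raw)
    body, db, fi = m.group("body"), m.group("db"), m.group("fi")

    if len(body) >= 2 and body[0] == '"' and body[-1] == '"':
        # Phrase search: wildcards and ":" are ordinary characters.
        phrase = body[1:-1]
        if len(phrase) < MIN_CHARS:
            raise ValueError("phrase too short")
        return re.escape(phrase), db, fi

    if NUMERIC_RE.match(body):
        # Numerical values are exempt from the length rule.
        return re.escape(body), db, fi

    # Count literal characters only: wildcards and the db/field
    # specifier do not count towards the minimum length.
    literal = body.replace("*", "").replace("?", "")
    if len(literal) < MIN_CHARS:
        raise ValueError("term needs at least %d literal chars" % MIN_CHARS)

    parts = []
    for ch in body:
        if ch == "*":
            parts.append(".*")           # any run of characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # + . / \ [ ] ( ) | ^ $ { } become literals
    return "".join(parts), db, fi
```

With these assumptions, compile_term('foo*:sd:st') gives the pattern foo.* restricted to db sd and field st, while 'ab*' and '*:sd:st' are rejected because too few literal characters remain.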
geek-playground
- only collect matched_terms if they are needed !!! establish new parameter for this !!!
- cache results. caching on each node ?
- $searchconfig->{oc} to specify additional configs
- $searchconfig->{table} to specify alternative views
- $si stores indexvalues uncompressed - compressing could save memory !!
- process_s_tree : check the operators for speed and convergence
- process_s_tree : NOT is ALL if second is zero
- process_s_tree : NOT is zero if first term is zero
- process_s_tree : AND is zero if one of the terms is zero
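
The three operator shortcuts for process_s_tree can be sketched as follows. This is an illustration in Python, not the engine's code: the function name eval_tree, the nested-tuple tree, and the dict-of-sets index are all hypothetical, and "NOT is ALL" is read here as "all hits of the first term".

```python
def eval_tree(node, index, looked_up=None):
    """Evaluate a search tree against an index of term -> set of record ids.

    node is either a term (str) or a tuple (op, left, right) with op in
    AND / OR / NOT. looked_up, if given, collects every term that was
    actually fetched, to make the shortcuts visible.
    """
    if isinstance(node, str):
        if looked_up is not None:
            looked_up.append(node)
        return set(index.get(node, ()))

    op, left, right = node
    first = eval_tree(left, index, looked_up)

    # AND is zero if one of the terms is zero; NOT is zero if the first
    # term is zero. In both cases the second subtree is never evaluated.
    if op in ("AND", "NOT") and not first:
        return set()

    second = eval_tree(right, index, looked_up)
    if op == "AND":
        return first & second
    if op == "NOT":
        # If the second term is zero, this is simply all of the first.
        return first - second
    if op == "OR":
        return first | second
    raise ValueError("unknown operator: %r" % op)
```

Evaluating the first operand before deciding whether the second is needed keeps an empty left side cheap. Ordering the operands so that the rarer term comes first would be a further step that this sketch does not attempt.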
finished
- 2005-03-25 added to cron.day rebuild-index every night and reboot server
- 2005-03-24 adapted get_val_dbi to deal with labels: for labeled database-entry : also store the whole term (including whitespace) in searchindex and query this term in phrasesearch to be able to look for “vhs mit sonderfunktion”:sd:st
- 2005-03-20 date search: comparison operator for dates !! (convert to epoch and cmp then??)
- 2005-02-17 how to deal with " : phrase-search, " are escaped in the url
- 2005-02-17 numeric search: how to deal with integer?
- 2005-02-17 phrase search: search for longer terms “stefan vater”
- 2005-02-17 just saw, that this is not true any more: matched_term list has duplicates
- 2005-02-12 structure data: only "current" and 1995, dates instead of "current", sorting wrong
- 2005-02-12 searchconfig->{modules_short},$ptr->{name} configname/tablename/viewname/SIGLE ... ?
- 2005-02-12 new addon: restrict_db : restrict search on database to save memory on database-view (and prev/next inside a databaseview) by implementing a more general/global cml !!
- 2005-01-17 restrict wildcards: * only with 3 chars, only one ? per char. limit the max. number of matches internally. a query like ‘* or * or * or * or * or * or * or * or *’ might slow down the system too much.
- 2005-01-15 eliminated all values like ' and other utf8-” as well as " at term bounds
- 2005-01-13 search.v2 : fails if no searchresults
- 2005-01-10 autodetect and global type : term-search, wildcard-search, regex-search - how do I select implicitly? interface?
- 2004-12-?? logical search
- 2004-12-14 how to restrict search to database/fields (interface of search-method)
- 2004-12-14 eliminate duplicates in searchstructure : 1-1-114@1-1-114@1-2-234
- 2004-12-14 enhanced term-splitting in the build-engine (delimiters per field), a-b, a/b, 19xx/xy etc. DIFFICULT.. which fields?
- 2004-12-14 new splitfeatures : each feature configurable per field
- 2004-12-13 new splitfeatures : even sel-values should be split properly !!
- 2004-12-11 create searchmodule
- 2004-12-10 build web-interface for the search-engine
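
For the duplicate elimination in the searchstructure (2004-12-14), a small sketch in Python, assuming the entries are joined by "@" exactly as in the example 1-1-114@1-1-114@1-2-234 and can be compared as plain strings. The function name dedup_structure is made up for this note.

```python
def dedup_structure(packed, sep="@"):
    """Drop repeated entries from a packed result string, keeping first-seen order."""
    if not packed:
        return packed
    # dict.fromkeys keeps insertion order, so the first occurrence of each entry wins.
    return sep.join(dict.fromkeys(packed.split(sep)))


print(dedup_structure("1-1-114@1-1-114@1-2-234"))  # prints 1-1-114@1-2-234
```

Keeping the first-seen order matters if the result list is already sorted when it reaches this step.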
abandoned
- 2005-03-28 confusing, inelegant and maybe even stupid : implement addon getlist=>1 to override the “no wildcards when fewer than three chars”-feature !!
- 2005-01-11 too cpu-expensive. just replace !! : resultlist keeps field-id’s !!!??? this fools the LOGICAL OPERATORS !!!!
- 2005-01-11 too cpu-expensive. just replace !! : search.v2 : resultlist keeps field-id’s !!!??? this fools the remove-duplicates !!!!
- 2004-12-14 too complicated ** :change structure of searchconfig. database->feature instead of feature->database !!
kb/searchengine/todo.txt · Last modified: 2006/02/27 19:52 by peter



