| Language Extension Packs
For use with dtSearch version 6.5 or
later.
dtSearch Engine/Web is supplied with stemming
rules and a noise-word file for English (US). Stemming is
the only search expansion option that is on by default
in the dtSearch end-user products, because it is almost
always useful in a search and adds little to the search
time. Unlike some other search engines, dtSearch
applies stemming at search time, so there is no need to
build indexes specifically for stemming and no need
to build separate indexes for each language in use.
The problem
With the stemming option selected, dtSearch automatically finds
plurals and many other variations. For example, a search on
"print" will also find "printers", "printing" and "printed".
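As a rough illustration of the idea, the sketch below reduces both the search term and each document word to a common stem at search time, so nothing extra has to be stored in the index. The suffix list and the function names `stem` and `search` are assumptions made for this example; they are not dtSearch's stemming rules, file format, or API.

```python
# Toy illustration only: these suffix rules are hypothetical and are not
# dtSearch's stemming rules or file format.
SUFFIXES = ("ing", "ers", "er", "ed", "s")  # checked longest-first


def stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    w = word.lower()
    for suffix in SUFFIXES:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def search(term, document_words):
    """Return the document words whose stem equals the stem of the term."""
    target = stem(term)
    return [w for w in document_words if stem(w) == target]


words = ["printers", "printing", "printed", "print", "prince", "sprint"]
matches = search("print", words)
print(matches)
```

Because the rules are applied when the search runs, swapping in a different rule set changes the results without rebuilding anything.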
However, if you are searching documents written
in other languages, the English stemming rules will cause
you to miss many word variations which do not occur in
English (e.g. verb and noun changes with gender), and
unrelated words may be matched in error.
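The effect can be sketched with a toy example. The two suffix lists below are hypothetical, chosen only to show how English-style rules miss French gender endings; they are not taken from any dtSearch or Language Extension Pack file.

```python
# Hypothetical suffix lists for illustration; not taken from any dtSearch file.
RULES = {
    "english": ("ing", "ed", "s"),
    "french": ("es", "e", "s"),
}


def stem(word, language):
    """Strip the first matching suffix for the chosen language's rules."""
    w = word.lower()
    for suffix in RULES[language]:
        if w.endswith(suffix) and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def search(term, words, language):
    """Return the words whose stem equals the stem of the search term."""
    target = stem(term, language)
    return [w for w in words if stem(w, language) == target]


french_words = ["petit", "petite", "petits", "petites"]
with_english = search("petit", french_words, "english")
with_french = search("petit", french_words, "french")
print(with_english)
print(with_french)
```

With the English-style rules only the plain plural is matched, while the French-style rules also match the feminine forms.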
Furthermore, the English noise word list, which
is designed to remove unwanted English words from your
index to keep the index size small, is not suitable for
other languages; your indexes may contain many words
which will not be useful in searches and which will add
to the size of your indexes.
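A minimal sketch of why the noise-word list matters is shown below. Both word lists are heavily shortened and hypothetical, and `index_terms` is an illustrative helper, not part of dtSearch.

```python
# Hypothetical, heavily shortened noise-word lists for illustration only.
ENGLISH_NOISE = {"the", "a", "of", "and", "is", "on"}
FRENCH_NOISE = {"le", "la", "les", "de", "et", "est", "sur"}


def index_terms(text, noise_words):
    """Return the distinct words that would be kept in an index."""
    return {w for w in text.lower().split() if w not in noise_words}


french_text = "le chat est sur la table et le chien est sur le tapis"
kept_with_english = index_terms(french_text, ENGLISH_NOISE)
kept_with_french = index_terms(french_text, FRENCH_NOISE)
print(sorted(kept_with_english))
print(sorted(kept_with_french))
```

The English list removes nothing from the French sentence, so every word would be indexed; the French list leaves only the four content words.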
The solution
Use language-specific files in place of the default US English
files. These are supplied as Language Extension Packs, which
contain files for many languages (see the list below).
All files are in Unicode format.
Language Extension Packs
* LEP400 and LEP402 also include
bilingual French/English and German/English stemming and
noise-word files, which enable search expansion on
indexes and documents containing a mix of French or German
and English text.
License: For use with dtSearch Engine or dtSearch Web
on a single server or workstation, or with dtSearch
Desktop or Network on up to 5 workstations.
Please ask about other licensing options.
- Stemming rule files and noise-word files
for each supported language.
- Test files to check the operation of
stemming in all the supplied
languages.
- Stemming Language Selector application,
which changes the stemming rules from the Windows
Start menu*.
- Multilingual Installer (English, French,
Spanish, German, Dutch)
- One year of on-line technical support and
updates.
* The user must have administrator permissions.
Needs:
- dtSearch 6.5 or later (license covers use
with dtSearch Engine or Web on a single server, or
dtSearch Desktop/Network for up to 5 users; other
licensing available).
- Windows 2000, XP, Vista, or
Server 2003/2008
- Supplied on CD-ROM
Evaluation
A 30-day evaluation version is available; it allows
English and any one other language to be tried for
comparison tests. Please complete the Enquiry Form.
Please enquire about languages not listed.
|