TERMIUMplus trilingual database
Date: Tue, 08 Jan 2002 13:55:07 -0800 (PST)
TERMIUMplus (www.termium.com) is a trilingual application that
allows translators and terminologists to search a collection of
1.5 million entries in English, French and Spanish. The system is
freely available to any employee of the Canadian federal government and
by subscription to individuals and organizations outside it. The
terms and the user interface are both trilingual.
mod_perl plays an integral role in the success of this system.
Because the server experiences significant amounts of traffic during
the middle of the day, efficient request handling is of paramount
concern. It is not uncommon to be servicing over 100 concurrent
requests at 2pm. Not only does the system perform very well, but it is
also very stable. I don't think our httpd processes have ever crashed,
and almost all requests are in the sub-second response range.
As if great performance and stability were not enough, mod_perl (Perl)
has allowed us to provide a very easy to use and enjoyable interface
to our database servers. The servers are actually on NT running a
proprietary database software package. The database software is very
good at performing both full text and exact term searches of the term
data. However, the software interface to the database engines is weak
at best and close to unusable. By using Perl to talk to the database
server's HTTP interface we were able to extract the desired results
data and then use Perl's power to reformat the results into something
pleasing and tailored to the user's preferences. Because each record
has over 100 fields and each field can have a number of sub-components,
I don't think the job would be doable in any language other than Perl!
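The reformatting step described above can be illustrated with a small sketch. It is written in Python purely for illustration (the production code is Perl) and it is not the TERMIUMplus implementation. The record layout (one `NAME: value; value` line per field), the field names, the labels and the sample record are all hypothetical, since the vendor's actual output format is not described here.

```python
def parse_record(raw):
    """Parse hypothetical 'NAME: a; b; c' lines into {name: [sub-components]}."""
    record = {}
    for line in raw.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        name, _, value = line.partition(":")  # split on the first colon only
        parts = [p.strip() for p in value.split(";") if p.strip()]
        record[name.strip()] = parts
    return record


def render(record, field_order, labels):
    """Show only the fields the user asked for, in the user's order,
    using the display labels for the user's interface language."""
    lines = []
    for name in field_order:
        if name in record:
            label = labels.get(name, name)
            lines.append(label + ": " + ", ".join(record[name]))
    return "\n".join(lines)


# Invented sample record; real records have over 100 fields.
RAW = """TERM_EN: sample term; sample synonym
TERM_FR: terme exemple
SUBJECT: Sample Subject Field"""

# Hypothetical per-user preferences: French first, subject field hidden.
PREFS = ["TERM_FR", "TERM_EN"]
LABELS = {"TERM_FR": "Terme", "TERM_EN": "Equivalent anglais"}

print(render(parse_record(RAW), PREFS, LABELS))
```

Keeping parsing separate from rendering means the same parsed record can be displayed differently for each user without querying the database again.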
In addition to reformatting the output of the database we also employ
some processing of search terms. This processing is unique to our data
collection but helps increase recall by eliminating stopwords such as
"a", "an", "le", "les", etc.
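As a rough illustration of the stopword step, here is a minimal sketch in Python (the production code is Perl, and this is not the actual TERMIUMplus logic). The stopword set contains only the four examples given above. The function name and the fallback for queries made up entirely of stopwords are assumptions added for the example.

```python
# Only the stopwords named in the text; the real list is presumably longer.
STOPWORDS = {"a", "an", "le", "les"}


def strip_stopwords(query, stopwords=STOPWORDS):
    """Drop stopwords from a search query, comparing case-insensitively."""
    words = query.split()
    kept = [w for w in words if w.lower() not in stopwords]
    # Assumption: if every word is a stopword, keep the query unchanged
    # rather than send an empty search to the database.
    return " ".join(kept) if kept else " ".join(words)


print(strip_stopwords("a turbofan engine"))
print(strip_stopwords("Les moteurs"))
```

Removing these words before the search reaches the database means a query such as "a turbofan engine" can match entries that contain only "turbofan engine", which is how the step improves recall.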
In addition to the fancy user interface, TERMIUMplus also offers a
server-to-server term translation service. This allows other search
engines to offer on-the-fly term translation as part of their
service, an excellent feature when dealing with a bilingual or
trilingual document corpus. You are welcome to see this for yourself
by visiting:
http://strategis.ic.gc.ca/engdoc/search.html
Check the Bilingual search option and try a word such as "turbofan".
As a note, I am not aware of what software the Strategis search system
was built with.
The front end of the request processing runs on a dual processor Sun
250 with 2G of RAM (we discovered how important lots of RAM is for
this level of concurrent user activity). For the database queries we
have 2 quad Xeon NT boxes which we divide between Extranet and
Internet traffic. We will be replacing the Sun 250 with a quad
processor Sun 450 with 8G of RAM.
In addition to mod_perl we use MySQL as our user sessions database and
intend to start replacing many functions of our proprietary back end
database with functions developed using mod_perl and MySQL. Linux is
our front-line development system and CVS is our version control
system. We use CVS to move our work on to a Sun staging system for
pre-release testing, and finally rsync to push final code on to the
production servers. All of our code runs as well on Linux as it does
on Solaris, with no modifications other than compile time options for
the major packages of the application.
I feel that using mod_perl to build TERMIUMplus has allowed for the
construction of a high quality service which is capable of handling a
significant user load. We have very rarely (never?) experienced any
major problems with the Apache, mod_perl, and Perl portion of our
system. Most of our operational difficulties come from the
vendor-supplied software at the database back end, where server
problems are experienced daily.
Software costs aside, I wouldn't build this application using
anything but mod_perl, Apache and MySQL!