Support.im1.net
 
 

The ICE Search Engine

 



NOTE: If you have a Microsoft® FrontPage® account: FrontPage 2000 has its own built-in search engine. If you have a FrontPage account, you should use the Web Bots included in Microsoft FrontPage. Please see our FrontPage Tips page for more information.

 


QuickJump Directory
Intro: What is ICE
What do you need
Step-By-Step Instructions
Quick Instructions (for the impatient)
Options and addendum

About ICE

The ICE Search Engine allows users to search your web server by keywords. You can easily configure an option to search only specific directories instead of the whole server. ICE is an index-based search engine; every time you upload new pages, you'll want to update the index it keeps.

 The ICE Search Engine is free software for individuals, schools, and universities. If yours is a commercial server, there may be a small shareware fee. Please see the author's home page for details. Note: As of October 1998, the author's website can no longer be found at this address.

To see it in action, go here.

 

You can take a sneak peek at the scripts using your web browser.
View ice-form.pl
View ice-idx.pl
Top of Page
im1 User Support Page

 


Tools needed
An FTP client, such as WS_FTP
Telnet software

 

 


How to Install the ICE Search Engine
    Download a copy of the two scripts you will need.

     

    ice-form.pl
    ice-idx.pl
    Edit the CGI script ("ice-form.pl") with a text editor. There are 3 things you must change, plus one optional setting.

     

    $domain="DOMAIN";
    
    $userid="USERID";
    
    $websitename="WebSite";
    
    $bodytag=" <BODY> ";
    In the $websitename variable, put the name you use for your site. This is descriptive only. For instance, if your domain name is gadgets.com but your company is called Gadgets Limited, you would replace WebSite with "Gadgets Limited".

     

    In the $userid variable, replace USERID with your UserID, which has the form webxxxxx (e.g. web2011f).

     

    In the $domain variable, replace DOMAIN with your domain name, e.g. domain.com.

     

    Optional: Replace <BODY> with your web site's own BODY tag. For instance, if your web site uses a background image, you would replace <BODY> with <BODY BACKGROUND="imagename.jpg">

     However, there are a few restrictions. You must use full paths, and you must put a backslash ( \ ) before every double quote.

     Example:

     <BODY BACKGROUND=\"http://domain.com/images/image.jpg\">  

    Edit the CGI script ("ice-idx.pl") with a text editor. There is only one thing you need to change.

     As above, just replace USERID with your UserID in $userid="USERID";
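
If a Unix-like shell with sed is at hand, the required edits from the two steps above could also be scripted rather than made by hand. This is only a sketch: the function name fill_ice_vars and the sample values are hypothetical, and it assumes the placeholder lines in the downloaded scripts read exactly as shown above and that your values contain no |, & or backslash characters.

```shell
# Hypothetical helper (not part of ICE): copy a script, filling in the placeholders.
# Usage: fill_ice_vars INPUT OUTPUT DOMAIN USERID "SITE NAME"
fill_ice_vars() {
    # Each expression only fires on a line that still holds the untouched
    # placeholder, so running this on ice-idx.pl changes $userid and nothing else.
    sed -e 's|\$domain="DOMAIN"|$domain="'"$3"'"|' \
        -e 's|\$userid="USERID"|$userid="'"$4"'"|' \
        -e 's|\$websitename="WebSite"|$websitename="'"$5"'"|' \
        "$1" > "$2"
}
```

For example, fill_ice_vars ice-form.orig ice-form.pl gadgets.com web2011f "Gadgets Limited" would write a filled-in copy to ice-form.pl. The optional $bodytag line is left alone and still has to be edited by hand.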
     
     

    FTP to your Virtual Server. Once connected to your web site directory (/usr/local/www/data/UserID), create a subdirectory called "cgi-bin" (minus the quotes of course!).

     You should now have a directory called: /usr/local/www/data/UserID/cgi-bin

     

    Upload the edited CGI scripts "ice-form.pl" and "ice-idx.pl" to this directory.

    IMPORTANT
    99% of all script problems occur when you do not upload the files in 'ASCII/Text' format!

     These script files MUST be uploaded in ASCII/TEXT format. "RAW DATA" transfers or other types of transfer extensions will not work! "Auto" mode on WS_FTP will not do this for you. Make sure the "Auto" box is UNCHECKED and the ASCII button is CHECKED.

     
     
     

    More help with FTP is available here.
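
One way to double-check an upload after the fact, assuming you can already log in to a shell on the server: a script that went up in binary mode from a Windows machine usually still carries a carriage-return character at the end of each line, and grep can detect those. The function name has_cr is made up for this sketch.

```shell
# Hypothetical helper (not part of ICE): succeeds if FILE contains a carriage return.
has_cr() {
    # $(printf '\r') expands to a single CR character, used as the grep pattern
    grep -q "$(printf '\r')" "$1"
}

# Check both scripts if they are present in the current directory
for f in ice-form.pl ice-idx.pl; do
    [ -f "$f" ] || continue
    if has_cr "$f"; then
        echo "$f: carriage returns found, upload it again in ASCII mode"
    else
        echo "$f: line endings look fine"
    fi
done
```

Run it from inside the cgi-bin directory. A file that reports carriage returns is best uploaded again with the ASCII button checked.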
    **********
     
    Telnet to your "/usr/local/www/data/UserID/cgi-bin" directory.

     

    Set the permissions for "ice-form.pl" and "ice-idx.pl" by typing the following commands at the Telnet prompt:

     cd website/cgi-bin

     chmod 755 ice-form.pl

     chmod 700 ice-idx.pl

     

    THIS IS IMPORTANT: type perl ./ice-idx.pl

     This creates the index file. Every time you upload new pages that you want to be searchable, you'll want to Telnet in and type that command.

     Now type: logout
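
Before logging out, you may want to confirm that the two chmod commands took effect. The following is a minimal sketch: mode_of is a hypothetical helper name, and the expected strings follow from the standard meaning of modes 755 and 700.

```shell
# Hypothetical helper (not part of ICE): print the 10-character permission string of FILE.
mode_of() {
    ls -l "$1" | cut -c1-10
}

# Compare each script, if present, against the mode set earlier
for pair in "ice-form.pl:-rwxr-xr-x" "ice-idx.pl:-rwx------"; do
    file=${pair%%:*}      # part before the colon
    want=${pair#*:}       # part after the colon
    [ -f "$file" ] || continue
    if [ "$(mode_of "$file")" = "$want" ]; then
        echo "$file: ok"
    else
        echo "$file: expected $want"
    fi
done
```

Mode 755 lets everyone read and execute ice-form.pl, which is what the web server needs in order to run it. Mode 700 keeps ice-idx.pl usable by its owner only.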

     

    That's it! Now, in your web pages you'll want to create a link to your new Search Engine page. Put something like this in your web pages:

     

    <a href="http://DOMAIN/cgi-bin/USERID/ice-form.pl">Search Engine</a>
    NOTE: Replace DOMAIN with your own domain name and USERID with your own UserID!
     
    To test your Search Engine, upload your completed web page with the new link into your normal Virtual Server web directory, and follow the link to the Search page. Type in a keyword and hit the "Start" button and it will give you links to all the pages where that word appears.

     


    QuickStart Instructions (for experienced users)
    Shift-click here to download ice-form.pl

    Shift-click here to download ice-idx.pl

     

    Edit in ASCII mode. For ice-form.pl, there are 3 variables that must be changed and 1 optional one; for ice-idx.pl, there is one. It is very simple. The documentation in the script will tell you what to do.

     

    Upload (in ASCII format) to your cgi-bin directory

     

    Telnet into your cgi-bin directory

     

    Type chmod 755 ice-form.pl and chmod 700 ice-idx.pl to set the permissions.

     

    Type perl ice-idx.pl to create the index.
    That's it! Now, in your web pages you'll want to create a link to your new Search Engine page. Put something like this in your web pages:

     

    <a href="http://DOMAIN/cgi-bin/USERID/ice-form.pl">Search Engine</a>
    Top of Page
    Internet Marketing 1 User Support Page
     
    ICE options and special notes
    Default settings:

     

    ICE excludes words of 3 letters or fewer from the search.

     

    If a word appears in over 60 percent of your documents, ICE excludes it from the search.
    To Search Subdirectories Only: you can configure ICE to let you choose whether to search the whole site, or a subdirectory. Edit (in ASCII!!) ice-form.pl and look at this part:

     

    # To Search Subdirectories Only
    #
    # To search subdirectories only, change the Directory Name to whatever name you want,
    #   and change "subdir" to your subdirectory.  Make sure there is a slash on the end, and
    #   *not* at the front.  Then, uncomment the code block below (delete the '#' from the 
    #   beginning of the line.
    
    
    #
    # local(@directories)=(
    #    "Directory Name (subdir/)",
    # );
    It should be self-explanatory.

     

    To return no more than a maximum number of hits:

     

    # Maximum number of hits to return
    # Example:
    #   $MAXHITS=100;
    
    # Delete '#' from next line to use this
    # $MAXHITS=100;
    Just change the $MAXHITS variable to a number of your choice, and delete the '#' at the start of that line.

     

    To Exclude Directories from the search:
    # ADVANCED USERS:
    #
    # To exclude directories from the search, put the full server paths of the
    #   subdirectories.  NOTE: once you exclude a directory, all of *its* subdirectories
    #   are excluded.  Just change the SUBDIR to your subdirectory and leave the rest alone.
    #
    # You must UNCOMMENT (remove the '#' from each line) for this to work.
    
    
    
     @excludedirs=(
    
     "/usr/local/www/data/$userid/cgi-bin",
    
    # "/usr/local/www/data/$userid/SUBDIR",
    # "/usr/local/www/data/$userid/SUBDIR",
    # "/usr/local/www/data/$userid/SUBDIR",
    
     );
    The cgi-bin directory is already excluded for security reasons. Do not change this. You can exclude directories from being publicly searchable by changing SUBDIR to the path of your subdirectory, and uncommenting the line. Add more lines if you have multiple directories you wish to exclude.

     

    Other Options: in ice-idx.pl, there are options for international characters, word length exclusion, and common word exclusion. These are self-explanatory:
     # The ICE indexer will support full international characters by
     # converting them to their html equivalent if $ISO is set.
     # This has a slightly negative impact on the indexing speed, so
     # set it to "y" only if you index files with 8 bit international
     # characters. OTHERWISE DON'T! iso2html seems to cause a memory
     # leak, causing the indexer to run forever. I'm working on it.
    $ISO="n";
    
     # Type of system (for figuring out the path delimiting character)
     # that ice-idx.pl runs on. Select one of "UNIX", "MAC", or "PC"
    $TYPE="UNIX";
    
     # Minimum length of word to be indexed
    $MINLEN=3;
    
     # Stop indexing a word that appears in over X percent of all files
    $MAXPERCENT=60;
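
To get a feel for the $MAXPERCENT threshold, you can estimate from a shell prompt how common a word is across your pages. This is a rough sketch and not how ICE itself counts: the helper name word_percent is hypothetical, it assumes your pages end in .html, and it only looks at whole-word, case-insensitive matches.

```shell
# Hypothetical helper (not part of ICE): percentage of *.html files under DIR containing WORD.
# Usage: word_percent DIR WORD
word_percent() {
    total=$(find "$1" -type f -name '*.html' | wc -l | tr -d ' ')
    if [ "$total" -eq 0 ]; then
        echo 0
        return
    fi
    # grep -l lists each matching file once; -i ignores case, -w matches whole words
    hits=$(find "$1" -type f -name '*.html' -exec grep -liw -- "$2" {} + | wc -l | tr -d ' ')
    echo $(( hits * 100 / total ))
}
```

For example, word_percent /usr/local/www/data/UserID gadgets prints a whole number. A result above 60 suggests the word would be left out of the index under the default setting.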

     

 

 

 

   

 
© copyright - 1996-2002 - im1.com