Module 5: Searching the Web
Readings
Searching

First, let me point out that a 2000 study found over 1 billion web pages, and the web has been growing ever since! If that's not intimidating enough, further research shows that there is a much larger "hidden web" (The Invisible Web). Google, the current leader among search engines, claims to index 3 billion pages, but the hidden web is estimated to hold about 500 billion pages. A good resource for examining some of these can be found at http://www.invisible-web.net/.

But what we care about is how to use the available tools to find the information we need without being sidetracked or completely lost. Our focus here is on finding what we are really looking for, whether that is a general subject area or a specific word or phrase. That distinction will guide which tool we pick.

Directories vs. Engines

Most people start searching in Yahoo!, which is an index or directory. Don't be distracted or confused by the search box at the top of the screen; the heart of this service is the way sites are organized by people into categories that are meaningful to people. The task is started by a robot, but a person must look at and categorize each site. Consequently, indexes and directories such as Yahoo! index many fewer sites than search engines do. Although these indexes are not exhaustive, they do provide a rich assortment of sites. This indexing is very similar to the work done by librarians, who select the books that the library will acquire and then put them in the appropriate locations. Obviously, this is labor intensive, so the number of sites that can actually be included is limited. The search box on Yahoo! searches only the Yahoo! categories and sites, so you are in a limited area of the Web. Once you are finished there, Yahoo! offers you links to search engines for exploring other parts of the Web.

On the other hand, there are search engines. The current leader in this area is Google. You'll notice that the whole interface there consists of a single box where you can type your target words. Complex computer-readable databases are built by software programs that wander through the web, locating new pages and indexing every word on each page. Because these programs are automatic, they can index many more sites. But because they are automatic, users must be more savvy about what they want to find. Note: the Google directories are human-built just like the Yahoo! directories.
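To make the idea of "indexing every word on the page" concrete, here is a minimal sketch in Python. It is not how Google or any real engine is implemented; the page addresses and text are invented for the example, and the function names `build_index` and `search` are my own.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map every word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical pages, invented for this example.
PAGES = {
    "example.org/rain": "The water cycle moves water from ocean to cloud to rain.",
    "example.org/bikes": "How to cycle safely to school.",
    "example.org/plumbing": "Fixing a water heater.",
}

index = build_index(PAGES)
print(sorted(search(index, "water cycle")))
```

Notice that the program knows nothing about what the pages mean. It only knows which words appear where, which is why the searcher has to choose words carefully.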

Because of these differences, each of the tools requires a different search strategy. If you're using an index, it is best to try to understand the categories that the index uses. Starting at the top, you can drill down through the various categories, narrowing the number of sites that would be appropriate. For example, if you were looking on an index page for lesson plans for elementary students studying the water cycle, you would first choose the category education, then elementary, then science and finally water. There you would probably find a number of sites dealing with water, all appropriate for elementary school children. The good news is that you would find not only what you had in mind but also other ideas from sites that indexers had placed on the same page. Compare this to browsing in the library and being able to see similar books on adjacent shelves.
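The drill-down can be pictured as walking a tree of categories. The sketch below is only an illustration: the category names come from the water-cycle example, while the site names and the `drill_down` function are invented.

```python
# Hypothetical directory. Category names follow the water-cycle example;
# the site names are invented for illustration.
DIRECTORY = {
    "Education": {
        "Elementary": {
            "Science": {
                "Water": ["Water Cycle Lesson Plans", "Rain Gauge Projects"],
                "Plants": ["Growing Beans in Class"],
            },
        },
    },
}


def drill_down(directory, path):
    """Follow a list of category names from the top of the directory."""
    node = directory
    for category in path:
        node = node[category]
    return node


sites = drill_down(DIRECTORY, ["Education", "Elementary", "Science", "Water"])
print(sites)

# Stopping one level higher shows the neighboring categories,
# much like seeing the adjacent shelves in a library.
print(sorted(drill_down(DIRECTORY, ["Education", "Elementary", "Science"])))
```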

On the other hand, if you were using a search engine then you would need to carefully choose words; your query might be "water cycle".  The pages that would be returned would specifically contain the phrase "water cycle", but they would be appropriate for many different audiences not confined to lesson plans for elementary school. 

Note: it's often hard to tell at a glance whether a site is providing an index or a search engine. Since the web is more and more driven by commercial demands, many sites offer both. For example, you'll see the links to search engines as well as the directory at Yahoo! and you'll find a directory on the Google page. For this reason, experience (relying on your own or that of experts) is important in finding the best tools.

The Price article and the Search Engine Showdown do a good job of touring you through the capabilities of the best known search tools.

Composing Better Searches
There are certain tricks that will help you to conduct better searches.  (These rules apply to most search engines, but the syntax may vary slightly.) 
  1. Is your search more appropriately done using an index or a search engine?

  2. If it's an index, what categories might be best to browse?

  3. If it's a search engine, what are the right words? (These work in most but not all search engines and the syntax can vary from tool to tool.)
    • Use capital letters for proper names.
    • Use quotes around words to be matched exactly.
    • Use + before words that must be included. 
    • Use - before words that must be excluded. 
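To see how quotes, + and - change what comes back, here is a small Python sketch of a matcher that follows these rules. It is a simplification under stated assumptions, not the behavior of any particular search engine: plain words are treated as required, capitalization is ignored, and the function names and sample pages are invented.

```python
import re
import shlex


def parse_query(query):
    """Split a query into required terms or phrases and excluded terms.

    Assumption: shlex.split is used only to keep quoted phrases together,
    so queries containing a stray apostrophe are not handled.
    """
    required, excluded = [], []
    for token in shlex.split(query):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        elif token.startswith("+") and len(token) > 1:
            required.append(token[1:].lower())
        else:
            required.append(token.lower())
    return required, excluded


def contains(text, term):
    """True if the term appears in the text as whole words."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


def matches(text, query):
    """True if the text has every required term and no excluded term."""
    required, excluded = parse_query(query)
    has_required = all(contains(text, term) for term in required)
    has_excluded = any(contains(text, term) for term in excluded)
    return has_required and not has_excluded


QUERY = '"water cycle" +lesson -college'

# Hypothetical page text, invented for this example.
page_a = "Lesson plans on the water cycle for third grade."
page_b = "The water cycle: a college lecture with lesson notes."
page_c = "Cycle to the water park."

for page in (page_a, page_b, page_c):
    print(matches(page, QUERY), page)
```

Only the first page passes: the second is dropped by -college, and the third contains both words but not the quoted phrase.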

Pandia's 17 Recommendations for Net Searching from your reading provides a good set of guidelines. The good news is that search tools are getting much more clever and will often build in some assumptions as they return hits. For example, if you search for Bill Clinton (no quotes), items with the phrase Bill Clinton will be returned closer to the top of your hits than those with Bill and Clinton in different parts of the document.

Tricks Search Engines Know
Besides all of this, search engine developers have a number of tricks operating behind the scenes. Among these are fuzzy searching (allowing for misspellings), stemming (matching multiple forms of a word), thesauruses (to find synonyms), concept searching (to find closely linked ideas), and proximity (pages with the required words close together are considered more relevant). These tricks show up in the order in which "hits" are reported. You'll also notice concept searching at work when the ad that is shown closely relates to the term you are searching for. Try some searches of your own and see if you can notice these tricks at work.
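Two of these tricks, fuzzy searching and proximity, are easy to sketch in Python. This is only an illustration of the ideas and not what any search engine actually runs; the vocabulary, the sample sentences, and the function names are invented. The fuzzy step uses `difflib.get_close_matches` from Python's standard library.

```python
import difflib
import re


def fuzzy_lookup(word, vocabulary):
    """Return the closest known word to a possibly misspelled one, or None."""
    close = difflib.get_close_matches(word.lower(), vocabulary, n=1, cutoff=0.6)
    return close[0] if close else None


def min_distance(text, word_a, word_b):
    """Smallest number of word positions between word_a and word_b.

    Returns None if either word is missing from the text.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == word_a]
    pos_b = [i for i, w in enumerate(words) if w == word_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(i - j) for i in pos_a for j in pos_b)


# Hypothetical vocabulary of indexed words.
VOCABULARY = ["gorilla", "giraffe", "zebra"]
print(fuzzy_lookup("gorrila", VOCABULARY))

# Hypothetical page text: the closer the two words, the more relevant
# the page is assumed to be.
near = "Bill Clinton spoke today."
far = "Bill paid the invoice that Clinton sent."
print(min_distance(near, "bill", "clinton"))
print(min_distance(far, "bill", "clinton"))
```

A ranking step could then sort pages so that those with the smallest distance come first, which is the behavior described above for a search on Bill Clinton.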

Another service is the ability to query using natural language.  Try asking a question such as "Where do gorillas live?" to see how well this works. You may want to try this at Ask Jeeves or Ask Jeeves for Kids.

Safe Searching for Students
Now you know how to search better, but you still have the problem of what kids may find! To deal with this there are several things you can do. 
  1. Make sure the responsibility of students is clearly defined.

     This is always the first line of defense. It makes appropriate use of Internet resources a normal responsibility of teachers, students, and parents.

  2. Make use of Internet filtering software or services, but remember these are not foolproof.

     One of the most widely used is CyberPatrol, which can be used on a whole network or a single computer. The Delaware public schools use SmartFilter, which is installed at the firewall. However, even if your filter keeps students from visiting certain pages, they can still do a search on any of the search engines that will return a forbidden page in its list of hits. The list will also show the first couple of sentences from the page, and by that time the damage may be done. To avoid that, there are some tools on the web that filter the hits before they are returned on the page. Take a look at AV Family Filter and Lycos Parental Controls.

  3. For young students, confine searches to appropriate indexes.

     A great collection of safe search tools for students can be found at Kid's Tools for Searching the Net.

Experts on the Internet

Sometimes no amount of searching will help or you have a question that needs a more specific answer. The Internet has made it very easy to find experts on almost any subject who are willing to answer your students' or your own questions. Some of the best of these are listed below in my order of preference.

When using these in the classroom, it's important to work closely with students to create questions whose answers can't be found in other ways, to be respectful of the experts' time, and to use the resources appropriately.

Assignments
  1. Recommended search tools

    There are many more tools than the ones we've covered here. Add to your assignments web page three Web search tools that you would find most useful with the students you teach or would like to reach. For each tool, include its name linked to the site and your assessment of the audience for which the tool would be most useful.

    For each tool you review, give an example of a search relevant to some curricular topic for your target group that would yield good results from this tool. For example, if you were using Yahooligans with 3rd grade social studies students, you might use the directory to find out about foods eaten at Christmas around the world.

    As always, make sure that your contributions are unique. No credit will be given for a search tool that was mentioned in the text, on the Interlit site or in the syllabus. Leave the URL to your assignments page in the WebCT forum under the appropriate heading to indicate that you have completed this and so others can take a look.

Copyright © 2002 by Pat Sine.
Send comments to sine@udel.edu