|
Module 5: Searching the Web
|
| Readings |
|
| Searching |
|
First, let me point out that a 2000 study found over 1 billion web pages, and the web has been growing ever since! If that's not intimidating enough, further research shows that there is a much larger "hidden web" (The Invisible Web). Google, the current leader among search engines, claims to index 3 billion pages, but the hidden web is estimated to hold about 500 billion. A good resource for examining some of these pages can be found at http://www.invisible-web.net/. What we care about, though, is how to use the available tools to find the information we need without being sidetracked or completely lost. Our focus here is on deciding what we are really looking for, whether that is a general subject area or a specific word or phrase. That decision will guide which tool we pick.
Most people start searching in Yahoo!, which is an index or directory. Don't be distracted or confused by the search box at the top of the screen; the heart of this service is the way sites are organized by people into categories that are meaningful to people. The task starts with a robot, but a person must look at and categorize each site. This indexing is very similar to the work of librarians, who select the books that the library will acquire and then put them in the appropriate locations. Obviously this is labor intensive, so indexes and directories such as Yahoo! cover many fewer sites than search engines do. Although they are not exhaustive, they do provide a rich assortment of sites. The search box on Yahoo! searches only the Yahoo! categories and sites, so you are working in a limited area of the Web. Once you are finished there, Yahoo! offers you links to search engines so you can explore other parts of the Web.
On the other hand, there are search engines. The current leader in this area is Google. You'll notice that the whole interface there consists of a single box where you can type your target words. Behind that box are complex computer-readable databases built by software programs that wander through the web locating new pages and then index every word on each page. Because these programs are automatic, they can index many more sites. But because they are automatic, users must be more savvy about what they want to find. Note: the Google directories are human-built just like the Yahoo! directories.
Because of these differences, each tool requires a different search strategy. If you're using an index, it is best to try to understand the categories that the index uses. Starting at the top, you can drill down through the categories, narrowing down to the sites that would be appropriate.
For example, if you were looking on an index page for lesson plans for elementary students studying the water cycle, you would first choose the category education, then elementary, then science, and finally water. There you would probably find a number of sites dealing with water, all appropriate for elementary school children. The good news is that you would find not only what you had in mind but also other ideas from sites that indexers had placed on the same page. Compare this to browsing in the library and being able to see similar books on adjacent shelves. On the other hand, if you were using a search engine, you would need to choose your words carefully; your query might be "water cycle". The pages returned would specifically contain the phrase "water cycle", but they would be aimed at many different audiences, not confined to lesson plans for elementary school.
Note: it's often hard to tell at a glance whether a site is providing an index or a search engine. Since the web is more and more driven by commercial demands, many sites offer both. For example, you'll see links to search engines as well as the directory at Yahoo!, and you'll find a directory on the Google page. For this reason, experience (your own or that of experts) is important in finding the best tools. The Price article and the Search Engine Showdown do a good job of touring the capabilities of the best-known search tools. |
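If you are curious how the two kinds of tools differ under the hood, here is a rough sketch in Python. It is only an illustration, not how Yahoo! or Google actually work: the category tree, the page titles, the page text, and the function names are all made up for this example. The first part drills down through human-made categories; the second part builds a word-by-word index and looks up the phrase "water cycle".

```python
# Hypothetical mini-directory: people file each site under nested categories.
directory = {
    "Education": {
        "Elementary": {
            "Science": {
                "Water": ["Water Cycle Lesson Plans", "Rain Gauge Activities"],
                "Plants": ["Growing Beans in Class"],
            }
        }
    }
}

def drill_down(tree, path):
    """Follow a list of category names and return whatever is filed there."""
    node = tree
    for category in path:
        node = node[category]
    return node

# Hypothetical page text that a search engine's robot has collected.
pages = {
    "Water Cycle Lesson Plans": "lesson plans about the water cycle for grade 3",
    "Hydrology Research Notes": "modeling the water cycle in river basins",
    "Rain Gauge Activities": "build a rain gauge and measure water each day",
}

def build_index(pages):
    """Map every word to the set of page titles that contain it."""
    index = {}
    for title, text in pages.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(title)
    return index

def phrase_search(pages, index, phrase):
    """Return the titles of pages that contain the exact phrase."""
    words = phrase.lower().split()
    if not words:
        return []
    # Use the index to narrow down to pages that contain every word...
    candidates = set.intersection(*(index.get(w, set()) for w in words))
    hits = []
    for title in candidates:
        text_words = pages[title].lower().split()
        n = len(words)
        # ...then keep only pages where the words appear together, in order.
        if any(text_words[i:i + n] == words
               for i in range(len(text_words) - n + 1)):
            hits.append(title)
    return sorted(hits)

index = build_index(pages)
print(drill_down(directory, ["Education", "Elementary", "Science", "Water"]))
print(phrase_search(pages, index, "water cycle"))
```

Notice that the directory returns only sites someone filed under elementary science, while the phrase search also returns the research notes: the page contains the phrase, but it was not written for children.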
| Composing Better Searches |
There are certain tricks that will help you conduct better searches. (These rules apply to most search engines, but the syntax may vary slightly.) Pandia's 17 Recommendations for Net Searching, from your reading, provides a good set of guidelines. The good news is that search tools are getting much more clever and will often build in some assumptions as they return hits. For example, if you search for Bill Clinton (no quotes), items containing the phrase Bill Clinton will be returned closer to the top of your hits than those with Bill and Clinton in different parts of the document. |
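To picture how a search tool might put the "Bill Clinton" pages first, here is a small Python sketch. It is a guess at the general idea, not any real engine's ranking method: the documents and the function names `min_gap` and `rank_by_proximity` are invented for this example. Each document that contains both words is scored by how far apart the words are, and the smallest gap is listed first.

```python
def min_gap(words, first, second):
    """Smallest distance, in words, between the two terms (None if one is missing)."""
    pos_first = [i for i, w in enumerate(words) if w == first]
    pos_second = [i for i, w in enumerate(words) if w == second]
    if not pos_first or not pos_second:
        return None
    return min(abs(a - b) for a in pos_first for b in pos_second)

def rank_by_proximity(docs, first, second):
    """Return names of documents containing both terms, closest pair first."""
    scored = []
    for name, text in docs.items():
        gap = min_gap(text.lower().split(), first.lower(), second.lower())
        if gap is not None:
            scored.append((gap, name))
    return [name for gap, name in sorted(scored)]

# Hypothetical documents for illustration only.
docs = {
    "news": "president bill clinton spoke today",
    "minutes": "the bill passed after mayor clinton objected",
    "recipe": "bill likes soup",
}

print(rank_by_proximity(docs, "Bill", "Clinton"))  # ['news', 'minutes']
```

The "recipe" document is dropped because it never mentions Clinton. This sketch ignores word order and punctuation, which a real tool would have to handle.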
| Tricks Search Engines Know |
| Besides all of this, search engine developers have a number of tricks operating behind the scenes. Among these are fuzzy searching (allowing for misspellings), stemming (matching multiple forms of a word), thesauruses (to find synonyms), concept searching (to find closely linked ideas), and proximity (pages with the required words close together are considered more relevant). These tricks show up in the order in which "hits" are reported. You'll also notice concept searching at work when the ad that is shown closely relates to the term you are searching for. Try some searches of your own and see if you can notice these tricks at work.
Another service is the ability to query using natural language. Try asking a question such as "Where do gorillas live?" to see how well this works. You may want to try this at Ask Jeeves or Ask Jeeves for Kids. |
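Two of the behind-the-scenes tricks named above, fuzzy searching and stemming, are easy to imitate in a few lines of Python. This is a toy sketch under stated assumptions: the word list is made up, the 0.8 similarity cutoff is an arbitrary choice, and `naive_stem` is a deliberately crude stand-in for the much more careful stemmers real search engines use. The fuzzy match relies on `difflib.get_close_matches` from Python's standard library.

```python
import difflib

# Hypothetical vocabulary of words the search tool already knows.
VOCABULARY = ["gorilla", "giraffe", "habitat", "search", "teacher"]

def fuzzy_match(term, vocabulary=VOCABULARY):
    """Return the closest known word to a possibly misspelled term, or None."""
    matches = difflib.get_close_matches(term.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None

def naive_stem(word):
    """Strip a few common English endings so related word forms match."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip if at least three letters would remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print(fuzzy_match("gorrila"))
print([naive_stem(w) for w in ["searching", "searches", "searched", "search"]])
```

With stemming, a query for "searching" can match pages that say "searches" or "searched", because all of them reduce to the same root.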
| Safe Searching for Students |
Now you know how to search better, but you still have the problem of what kids may find! To deal with this there are several things you can do.
This is always the first line of defense. It makes the appropriate use of Internet resources a normal responsibility of teachers, students, and parents. One of the most widely used filters is CyberPatrol, which can be used on a whole network or a single computer. The Delaware public schools use SmartFilter, which is installed at the firewall. However, even if your filter keeps students from visiting certain pages, they can still run a search on any of the search engines that returns a forbidden page in its list of hits. That list will also show the first couple of sentences from the page, and by that time the damage may be done. To avoid that, there are some tools on the web that filter the hits before they are returned on the page. Take a look at AV Family Filter and Lycos Parental Controls. A great collection of safe search tools for students can be found at Kid's Tools for Searching the Net. |
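The idea of filtering hits before they reach the page can be sketched in Python as follows. This is only an illustration of the general approach, not how AV Family Filter or Lycos Parental Controls actually work: the blocked-word list, the sample hits, and the function names are all hypothetical. Each hit's title and snippet are checked against the list, and any hit that contains a blocked word is dropped before the results are shown.

```python
import re

# Hypothetical blocklist, chosen only for illustration.
BLOCKED_TERMS = {"casino", "gambling"}

def is_safe(hit, blocked=BLOCKED_TERMS):
    """A hit is safe if no blocked term appears in its title or snippet."""
    text = f"{hit['title']} {hit['snippet']}".lower()
    words = set(re.findall(r"[a-z]+", text))
    return words.isdisjoint(blocked)

def filter_hits(hits, blocked=BLOCKED_TERMS):
    """Drop unsafe hits before the result list is displayed."""
    return [hit for hit in hits if is_safe(hit, blocked)]

# Hypothetical search results.
hits = [
    {"title": "The Water Cycle for Kids",
     "snippet": "Rain, rivers, and clouds explained."},
    {"title": "Win Big Tonight",
     "snippet": "Online casino bonuses for new players."},
]

for hit in filter_hits(hits):
    print(hit["title"])
```

Because the check happens before display, students never see the title or the first sentences of a blocked page. A simple word list like this will still miss some pages and wrongly block others, which is one reason filtering cannot be the only safeguard.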
| Experts on the Internet |
|
Sometimes no amount of searching will help, or you have a question that needs a more specific answer. The Internet has made it very easy to find experts on almost any subject who are willing to answer your students' or your own questions. Some of the best of these are listed below in my order of preference. When using these in the classroom, it's important to work closely with students to create questions whose answers can't be found in other ways, to be respectful of the experts' time, and to use the resources appropriately. |
| Assignments |
|
| Copyright © 2002 by Pat Sine. |