Internet Search Tips and Strategies

Overview

The Internet has an enormous quantity of information, with thousands of newsgroups and billions of web pages. The two questions that face any information seeker are, (1) How can I find what I want? and (2) How can I know that what I find is any good? This article treats the first question. A companion article, “Evaluating Internet Research Sources,” treats the second. Through the use of a little creativity, some patience, and a few search engines, you will be able to find just about anything you want. Many of the search tools and tool types mentioned in this article are available at “World Wide Web Research Tools.”

Let me say just a brief word or two on each of these items (and then I will go into detail later). “A little creativity” means simply that you must be able to generate some synonyms (words or phrases similar in meaning to each other) for the idea or topic you are searching for. “Some patience” means remembering that searching the Internet is like searching a library collection, where you must plan to spend more than five minutes looking in the card catalog or on the shelves. People who say, “There is nothing on the Internet about my topic,” are often those who sit down in front of the computer, type one phrase into one search engine and find nothing relevant. If you are willing to spend an hour looking around, however, you will almost certainly be well rewarded. Lastly, “a few search engines” means that you must use a variety of tools to find what you want.

Categories of Information on the Web

Before you begin searching, you first need a little understanding about how information is stored and accessed on the Web. There are basically three categories of information on the Web:

  • The Free, Visible Web. This category includes all the publicly mounted Web pages. These pages are indexed by search engines. To find information from this category, use a good search engine or directory.
  • The Free, Invisible Web. This category includes the contents of sites that provide their articles or information free to users, but that content may be accessible only by going directly to the site. In other words, search engines cannot index it. Some magazines, newspapers, reference works, and other sites are in this category. Many databases such as legal, medical, and financial are here, too. To find information from this category, you must go to the appropriate database.
  • Paid Databases over the Web. This category includes commercial databases that libraries subscribe to, containing scholarly journals, newspapers, court cases and the like. Providers like Lexis-Nexis, UMI Proquest, Infotrac, JSTOR and others are in this group. To find information from this category, you must have access to the database (through password or an on-campus computer) and search on the database directly.

Search Tool Types

Search tools fall into two main categories. (Note that a given search site may combine tools, since sites are in a constant state of change, merging, and partnering.)

  • Search Engines. A search engine consists of the interface you use to type in a query, an index of Web sites that the query is matched with, and a software program (called a spider or bot) that goes out on the Web and gets new sites for the index. The bot crawls the Web at certain intervals, in order to index new material. When you use a search engine, you are asking it to look in its index to find matches with the words you have typed in. Some search engines index not only the World Wide Web, but also Usenet newsgroups. Many search engines are now becoming reference sites which contain much more than just search capability. They may also have news, weather, free software, picture indexes, ratings of web sites, and other features. There are several hundred search engines, but they fall into a handful of types:
    • Global. This type of engine, typified by Google, Fast Search, Northern Light, HotBot, AltaVista, and others, reads pages from all over the world in many languages. These engines may index more than a billion pages.
    • Regional. Some search engines are limited geographically. For example, only information on Web sites in Australia may be indexed.
    • Targeted. These search engines limit themselves to one subject, like biography, medicine, graphics, art, fishing, and so forth.
    • Reference. These provide information from a set of reference works, such as an encyclopedia. Britannica, Bartleby and xrefer are examples.
  • Directories. Directories are categorized lists of sites picked out by human editors. Directory databases are therefore much smaller than those of search engines. However, the fact that the sites are hand picked often means that you will find very high quality sites or articles in the results. Example directories are Yahoo, Look Smart, and Snap.
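
To make the idea of "looking in the index" concrete, here is a minimal sketch of a word index and a lookup against it. The page names, page texts, and function names are all hypothetical, and real engines are far more elaborate; this only illustrates that a query is matched against stored words, not against the live Web.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page names containing it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for word in text.lower().split():
            # Strip common punctuation so "volcanoes." matches "volcanoes".
            index[word.strip('.,;:!?"()')].add(name)
    return index

def lookup(index, query):
    """Return the pages that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Hypothetical pages standing in for what a spider might have fetched.
pages = {
    "page1": "Volcanoes of Hawaii and how they erupt.",
    "page2": "Hawaii vacations: beaches, hotels, and volcanoes.",
    "page3": "Rope making for beginners.",
}
index = build_index(pages)
print(sorted(lookup(index, "hawaii volcanoes")))
```

Note that a page the spider has not yet fetched is simply absent from the index, which is one reason different engines return different results.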

Quick Guide to Choosing a Starting Place

Here are some suggestions about where to start a search.

  • I’ve got a broad subject in mind, but don’t know exactly what I’m interested in or how to look it up. For broad subjects, such as information processing, Tibet, photosynthesis, volcanoes, Samuel Johnson, rope making and so on, try an encyclopedia or similar reference site.
  • I need information about a general subject area. To find information about business, law, medicine, old master paintings, government documents and so forth, use a targeted search engine.
  • I have a well-known specific subject in mind and want to find some relevant sites. To find sites for subjects such as Hawaiian vacations, exercise routines, colleges and universities in California, dietary guidelines, art museums, try a directory.
  • I need information on a very specific and unusual subject. To find information about tea tree oil, Chihuahua coughing, paradichlorobenzene, or the Airbus 320, use a general search engine.
  • I am doing a research paper and need substantial information from many sources. For serious research, use a combination of several search engines, directories, and other tools. Read on for further help.

Quick Tip
For many questions, you can find excellent information by going to Google and typing in four to six words related to your subject.

Word Searches With Search Engines

As mentioned above, search engines index the exact words found on Web pages. Thus, when you search with a search engine, you are looking for exact words. “Choose your words well,” says the proverb. Since you do not know exactly what words may be on a particular Web page that covers the subject you are interested in, you must be creative in anticipating the possibilities.

FOREST LOG. Here is a scheme, called FOREST LOG, to help generate search terms. Suppose, for example, you wish to research the validity of testing. Here is how you might use the Forest Log scheme to generate terms.

  • FO: Forms. Forms or variants of the words you are thinking of. For example, if you search on validity of testing, you may miss a page that discusses validity of tests. So you should include the forms: test, testing, tests. Many search engines let you cover most forms with a wild card, often an asterisk, as in test*.
  • RE: Related Terms. Related to testing are measurement, assessment, performance, criteria, judgment, evaluation, and so forth. A search only on validity of test* will miss validity of performance measures.
  • ST: Synonymous Terms. What other words are synonymous with or used in place of the word test? What about exam, examination, assessment, quiz, midterm? As another example, suppose you search under apple growing, and find a few items but not what you want. What other possible phrases might be used in an article that would cover this topic? You might search on fruit tree farming, fruit orchards, Washington Red Delicious, etc.
  • LOG: Ladder of Generalization. The higher on the ladder, the more general or comprehensive the term, while the lower on the ladder, the more specific the term. Thus, for example, a list of terms related to testing, from most general to most specific, might include measurement, assessment, testing, performance test, weightlifting performance test. When you are developing a set of search terms and searching with them, if you do not get the results you want, move up or down the ladder of generalization and generate some more terms.
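
If you like to work systematically, the scheme can be written down as a small table of terms and turned into a list of candidate searches. The sketch below uses terms from the testing example above; the dictionary layout and the function name are my own assumptions, not part of any search engine.

```python
# Term lists taken from the testing example; the structure is hypothetical.
forest_log = {
    "forms": ["test", "testing", "tests"],
    "related": ["measurement", "assessment", "evaluation"],
    "synonyms": ["exam", "examination", "quiz"],
    "ladder": ["measurement", "assessment", "testing", "performance test"],
}

def candidate_queries(qualifier, scheme):
    """Pair a qualifier such as 'validity of' with every distinct term."""
    seen = []
    for terms in scheme.values():
        for term in terms:
            if term not in seen:  # skip terms already listed in another category
                seen.append(term)
    return [f"{qualifier} {term}" for term in seen]

queries = candidate_queries("validity of", forest_log)
for query in queries:
    print(query)
```

Each printed line is one search to try. When the results disappoint, add terms to one of the lists and run it again.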

There are several ways to type in a word search. Knowing the differences can help you get better results.

Keyword Search. Many search engines by default offer a keyword search. This kind of search will find all pages that contain any of the words you have specified. Moreover, the search will find the words in any order and in any location. For example, suppose you are looking for information about the formulas in shampoo. If you perform a keyword search using the phrase consumer product chemistry, the engine will return every page that mentions any one of these three words anywhere on the page. Thus, you will see pages about “Consumer Protest over Dangerous Toys,” and so forth. Fortunately, most engines list their findings (hits) in a ranked order, so that hits with all of the words will be listed before hits with only one or two of the words. And usually, pages where the words are close together will be listed earlier. However, that still means that your keyword search for consumer product chemistry will return a page containing, “Some consumer groups are advocating product warning labels on children’s chemistry sets.”
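
Here is a rough sketch of how such a keyword search behaves: any page containing at least one query word is a hit, and pages matching more of the words are listed first. The page titles and the ranking rule are simplifying assumptions; actual engines weigh many more factors, including how close together the words appear.

```python
import re

def words_of(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def keyword_search(query, pages):
    """Return (title, matched-word count) for pages with any query word."""
    terms = set(words_of(query))
    hits = []
    for title, text in pages.items():
        matched = len(terms & set(words_of(text)))
        if matched:
            hits.append((title, matched))
    # Pages matching more of the query words are listed first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Hypothetical pages; two of the texts come from the examples above.
pages = {
    "toys": "Consumer Protest over Dangerous Toys",
    "labels": "Some consumer groups are advocating product warning labels "
              "on children's chemistry sets.",
    "shampoo": "The chemistry of shampoo: what each consumer product formula contains.",
    "gardening": "Planting summer flowers",
}
results = keyword_search("consumer product chemistry", pages)
print(results)
```

In this sketch the warning-labels page ranks as high as the shampoo page, because both contain all three words. That is exactly the weakness of a plain keyword search.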

Phrase Search. Many search engines allow you to perform an exact phrase search, so that only pages containing the words you type in, in that exact order and with no words in between them, will be found. The exact phrase search is often a remedy for too many irrelevant hits. To perform an exact phrase search at a search engine that permits it, put the phrase in quotation marks: “consumer product chemistry.” If you get zero results, go back to the Forest Log and do some work!
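
The difference from a keyword search can be sketched in a few lines: the words must appear next to each other and in the same order. The function names here are hypothetical, and the sketch ignores punctuation and capitalization, which individual engines may treat differently.

```python
import re

def words_of(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def phrase_match(phrase, text):
    """True if the words of phrase occur in text consecutively and in order."""
    needle = words_of(phrase)
    haystack = words_of(text)
    n = len(needle)
    return n > 0 and any(
        haystack[i:i + n] == needle for i in range(len(haystack) - n + 1)
    )

scattered = ("Some consumer groups are advocating product warning labels "
             "on children's chemistry sets.")
together = "An introduction to consumer product chemistry for students."

print(phrase_match("consumer product chemistry", scattered))
print(phrase_match("consumer product chemistry", together))
```

The first page contains all three words but not as a phrase, so only the second page counts as a hit.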

Boolean Operators. Named after mathematician George Boole, Boolean logic involves the operators AND, OR, NOT, and occasionally NEAR. These operators are available in some engines to expand or contract your search results. The operator OR expands the search, while the others contract it. Let me explain by examples. If you type into the search engine, “summer OR flowers,” you will get a hit on every page that has either the word “summer” or the word “flowers” on it. (For the technically minded, the OR is an inclusive OR so that pages with both words will also be returned.) On the other hand, if you type in, “summer AND flowers,” only pages with both terms will be returned. As you might imagine, this will be a smaller set of hits. If you type in “summer NOT flowers,” then only pages with the word “summer” and not with the word “flowers” will be found. And similarly, if you type in “summer NEAR flowers,” only pages with the word “summer” in the close vicinity of the word “flowers” will be returned. Some pages allow you to specify the nearness of the words, such as not more than 15 or 50 words apart.
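
For the technically minded, the four operators can be sketched as tests applied to a single page. The helper names and the sample sentence are assumptions made for illustration, and the default nearness of 15 words is just one of the values mentioned above.

```python
import re

def words_of(text):
    """Lowercase a text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def has(word, text):
    """True if the word appears anywhere on the page."""
    return word.lower() in words_of(text)

def near(a, b, text, max_gap=15):
    """True if a and b occur within max_gap words of each other."""
    tokens = words_of(text)
    pos_a = [i for i, t in enumerate(tokens) if t == a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == b.lower()]
    return any(abs(i - j) <= max_gap for i in pos_a for j in pos_b)

page = "Summer is the season when most garden flowers bloom."

either = has("summer", page) or has("flowers", page)        # summer OR flowers
both = has("summer", page) and has("flowers", page)         # summer AND flowers
only_first = has("summer", page) and not has("flowers", page)  # summer NOT flowers
close = near("summer", "flowers", page, max_gap=3)          # summer NEAR flowers
print(either, both, only_first, close)
```

On this page the two words are seven words apart, so the NEAR test fails with a limit of 3 but would succeed with the default of 15.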

Search Tips

1. Use several search tools. Because of the constant indexing that search sites do, and because of the way their indexes work, some will find content that others will not. No engine has the entire Web indexed. Use several engines and several directories, and look at reference and targeted engines as well for the most thorough search. (And remember the Invisible Web!) See “World Wide Web Research Tools” for a list of good engines and other tools.

2. Read the search tips or help information at each search engine. You will learn how to perform more sophisticated searches, how to restrict or expand searches, and how to use the site more efficiently. For example, AltaVista, Excite, and InfoSeek allow the use of quotation marks to create an exact phrase search, a plus sign to indicate a word that must occur to yield a hit page, and a minus sign to exclude pages that include the word. By combining these items, you can create a very powerful and specific search: “lesson plans” +K-12 -science tells the engine to search for the exact phrase lesson plans (rather than just the two words anywhere in the document), on pages that must include the term K-12 and that must not include the word science.
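
To see how the three pieces of that search work together, here is a small sketch that splits such a query into required and excluded terms and tests a page against them. It is only an illustration under my own assumptions: the function names are hypothetical, a quoted phrase is treated as required, and terms are matched as plain substrings, which is cruder than what a real engine does.

```python
import re

def parse(query):
    """Split a query into required, excluded, and optional terms."""
    required, excluded, optional = [], [], []
    # Each match is (sign, quoted phrase, bare word); one of the last two is empty.
    for sign, phrase, word in re.findall(r'([+-]?)(?:"([^"]+)"|(\S+))', query):
        term = (phrase or word).lower()
        if sign == "-":
            excluded.append(term)
        elif sign == "+" or phrase:
            required.append(term)
        else:
            optional.append(term)
    return required, excluded, optional

def matches(query, text):
    """True if the page has every required term and no excluded term."""
    required, excluded, _optional = parse(query)
    text = text.lower()
    return all(t in text for t in required) and not any(t in text for t in excluded)

query = '"lesson plans" +K-12 -science'
print(parse(query))
print(matches(query, "Free lesson plans for K-12 history teachers"))
print(matches(query, "K-12 science lesson plans"))
```

The first page is a hit. The second is rejected because it contains the excluded word.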

3. For keyword searches, use several words. If you type in a general topic word as if you were looking through a card catalog, you are likely to receive tens or even hundreds of thousands of hits. A search for “insomnia,” for example, will return more than 20,000 hits from one engine. You might try a search on “causes insomnia cure better sleep sleepless” or something similar. Try to imagine what words might occur in an article you want and type in several of those words. Use synonyms, too. Alternatively, use an exact phrase search on “treatment of insomnia” or a directory search, starting with the subject Health or Lifestyle. Note that you can combine various phrases into one search to improve your yield. If you are researching apple growing, you might combine several phrases from different places on the ladder of generalization and type in fruit tree apple farming growing Washington Red Delicious orchards Gala Fuji Granny Smith and see what you get.

4. Guess a location. The address or URL (uniform resource locator) of a web site is, surprisingly, often guessable. Many companies use a standard form of URL, which is http://www.companynamehere.com, where “companynamehere” is the name of the company. Thus, for example, Sony is found at http://www.sony.com and you can figure out how to get to Disney, Honda, and NBC the same way. Companies with long names often abbreviate them in some guessable way, as for example http://www.nytimes.com is the New York Times, and the standard form with “adage” in the middle will bring up Advertising Age and with “popsci” in the middle will get you Popular Science.

Moreover, that middle word is often the key to a site’s content even when not identifying a famous company. What, for example, do you think you would find at http://www.fraud.com or http://www.weather.com? Try the standard form with “salami” or even “search” and see what happens. Try a few of your own areas, based on a topic (like news), an item (like a fruit), or a place.
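
The standard form is simple enough to write as a one-line rule. The function below is a hypothetical helper that only builds a candidate address from a name; it does not check that the site exists, and a guess can easily land on something other than what you expected.

```python
def guess_url(name):
    """Build the standard-form address for a short company or topic name."""
    # Keep only letters and digits, so "NY Times" becomes "nytimes".
    middle = "".join(ch for ch in name.lower() if ch.isalnum())
    return f"http://www.{middle}.com"

for name in ["Sony", "NY Times", "weather"]:
    print(guess_url(name))
```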

5. Select the wheat from the chaff by thinking about what you want. What are you looking for? Facts, opinions (anyone’s opinion or an expert’s), statistics, narratives of personal experience, eyewitness descriptions, new ideas, proven solutions, reference material? See the companion article, “Evaluating Internet Research Sources” for more information.

6. Back up to find out where you are. When you click on a hit from a search engine, you are connected directly to the page where the search terms were matched. It is not always clear exactly where you have arrived. You may see something like, “Chapter 5: The Triumph of Palladian Architecture.” Who wrote this? What is the book about, of which this is Chapter 5? To answer these questions, look at the URL (address) you have connected to. It may look something like this: http://www.some.edu/faculty/jones/architec/ch5.htm. The page you are reading is the lowest level of a series of directories beginning with the web site (www.some.edu), dropping to a “faculty” directory, then to a “jones” directory, and so on. To learn more about the context of your Chapter 5, go to the right end of the URL and chop off the “ch5.htm” and then press return. This will have the effect of backing you up one level in the hierarchy and you can see what the higher directory (“architec” in this case) is all about. You might find a book title, author, or other information. Now chop off the “architec” directory and back up to “jones” to see what is there. Back up as far as you want or need to in order to find out the information you need about the work and its author.
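
The chopping procedure can be sketched with Python's standard urllib.parse module. The function name is hypothetical, and the sketch only lists the addresses you would try, one level at a time; whether a given site actually shows anything useful at each level is another matter.

```python
from urllib.parse import urlsplit, urlunsplit

def parent_urls(url):
    """Yield the URLs reached by chopping one path segment at a time."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    while segments:
        segments.pop()  # chop off the rightmost piece
        path = "/" + "/".join(segments) + ("/" if segments else "")
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))

for parent in parent_urls("http://www.some.edu/faculty/jones/architec/ch5.htm"):
    print(parent)
```

For the example address, this lists the “architec” directory first, then “jones,” then “faculty,” and finally the home page of the site.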

If you are doing research for a report or project, be sure to write down the URL and all pertinent information so that you can cite the source properly. (See below for example citation style for this page.) If you print directly from your browser, the URL will be printed for you in an upper or lower corner. But if you save the page to disk, the URL will not be included, so be sure to copy it down.

Lifesaver tip: If you accidentally neglect to write down the URL, but have a printed or disk copy of the article, you can either look for the URL printed at the top of the page or use a search engine to locate the article again. To do this, type in from the article a four- or five-word phrase that contains the most unusual words you can find. Then perform an exact phrase search for these words. The engine will almost always take you right to the article.