The Search Strategy

How to find information

Why do we search for information?

Information, especially the knowledge we can obtain from it (which has to be the most important part), is a useful commodity. It is there to help people make better decisions, adapt to a seemingly changing world, and solve personal and world problems. It can also advance human knowledge by improving on what we know, or by showing us something new and interesting that we have never seen before. And information can, dare we say it, make money for some of us.

Who doesn't need some information if it can do all these things for a person?

Is information a need or a want?

It can be both.

Our brain has been designed to process information we receive through our eyes, ears and other senses to help us make sense of the world, and eventually grab the essential patterns we need to recognise quickly in order to survive. These patterns are essentially snippets of knowledge that we understand to be important to us. So, on a survival basis, information can be a need. On the other hand, there are times when we see information and realise, "Do we really need to know all of that?" In the latter situation, information becomes more of a want, and potentially it could be nothing more than garbage.

We all have a tendency to assume information is a need. In truth, it is up to whoever does the searching, whether for themselves or for someone else (for example, an information-seeking specialist such as a librarian), to determine whether the information is truly a need or a want.

If it is a want, and you are the librarian helping people to find information, it would be better to help users find that information for themselves rather than doing the work for them all the time. If it is a need, it would be best to help the clients directly and then create a database of results and search methods to help other potential clients, and so free up the librarian's time for more important things.

Where there is an information need and you are asked to conduct a search on behalf of someone else, it is a good idea to develop a search strategy.

What is this search strategy?

  1. INTERVIEW PROCESS
    This is the moment when you begin gathering information about the client and his/her information needs.

    The aim here is to get an accurate picture of the client's requirements. If necessary, get the requirements down in writing (or ask the client to give you an example of what he/she is after) so that you and the client know exactly what information has to be found.

    During the interview, it is not necessary for you to show how much knowledge you may have of the subject. Depending on your specialisation, you may think you already know what information the client needs. However, it is far better to assume the client knows best and has a better understanding of what information he/she needs. Your job is therefore nothing more than to locate the information for the client.

    There will be times when the client can't ask precisely for what they want. This is because:

    1. The client may not know the extent of the information resources you may have available.

    2. The client is not used to asking for what they want and might be too shy.

    3. The client feels the information could be too sensitive to disclose to you in a complete manner.

    4. The client does not know for sure what they want.

    When interviewing the client, be relaxed and make regular eye contact while concentrating on the "here and now" moment of listening to the client. Then occasionally look away and think about what you have heard and what you may be able to do for your client using the resources available to you (i.e. begin developing the draft search strategy).

    The critical part of the interview is listening, asking various neutral questions, and recording each item of the request from the client. Consider having a notepad and pen and write down the request. Then, in your own words, try to summarise the request you've heard in order to make sure you have got it right.

    If at any stage you are still not quite confident in what the client requires, then do a search with him/her. Make an appointment for the client to be with you when you conduct a search. Or alternatively, you could send a variety of different materials covering the subject; then ask the client to select the ones he/she wants.

  2. CONSTRAINTS
    Next, you may need to set boundaries as to what you can and cannot do for your client given the resources and skills available to you. Perhaps these boundaries might be negotiable depending on what other resources are suddenly made available to you by the client.

  3. PREPARE A SEARCH STRATEGY TABLE
    Prepare a search strategy table. This is essentially a set of decisions for forming a search query, organised in a table format.

    The table consists of various columns and rows. Your first column is your index. It lists all the possible primary words or phrases to search on a database. The second column contains a list of additional terms which you may wish to use to help narrow down your search results in case you have too many relevant items to select using the first term. Then all other consecutive columns may have additional terms if you feel it might be necessary to further refine the search strategy.

    Remember, these extra terms are connected with the first term using an appropriate boolean operator such as OR, AND, NOT and so on. More details about boolean operators will be discussed later on.

    Before you can create this table, it is probably a good idea to look through the notes you have written during the interview and start picking out the relevant words that will be used in your search strategy. Then go to the library and check the words against the Library of Congress Subject Headings (LCSH) reference books and write down their preferred subject heading terms. In that way you will have a high probability of finding what you want from a database using those preferred terms.

    The list of all critical words needed for your search which you extract from your notes is called an "index".

  4. CONDUCT THE SEARCH
    Now the time has come to put your search strategy table into action by conducting your search on various databases. For good searching, try to cover both print and online information. The best sources for finding information include:


    * Personal experience
    * Books
    * Journal articles
    * Expert opinions
    * Encyclopedias
    * The Web
    * Databases
    * Search engines and directories.

  5. PRESENT RESULTS AND GET FEEDBACK
    After going through your search strategy table, present all your search results and get feedback from your client.

  6. REFINE SEARCH STRATEGY
    Refine the search strategy based on the feedback you've received from your client and repeat steps 1 to 5.
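
The search strategy table from step 3 can be pictured as a small data structure. The following is only a minimal sketch in Python: the table contents and the function names are hypothetical, and it assumes that terms in the same column are alternatives (joined with OR) while each further column narrows the search (joined with AND). Your database may expect a different syntax.

```python
# Hypothetical search strategy table: each column is a list of terms.
# Assumption: terms within a column are alternatives (OR), and each
# further column narrows the result (AND).
strategy_table = [
    ["soil", "soils"],            # column 1: primary terms (the index)
    ["isotope", "radioisotope"],  # column 2: narrowing terms
]


def quote(term):
    """Put multi-word terms in quotation marks so they are searched as a phrase."""
    return f'"{term}"' if " " in term else term


def build_query(table):
    """Join the terms in a column with OR, then join the columns with AND."""
    groups = []
    for column in table:
        if not column:
            continue  # skip empty columns
        joined = " OR ".join(quote(term) for term in column)
        groups.append(f"({joined})" if len(column) > 1 else joined)
    return " AND ".join(groups)


query = build_query(strategy_table)
print(query)  # (soil OR soils) AND (isotope OR radioisotope)
```

Adding a third column to the table simply adds another AND group, which is how the extra columns refine the search.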

More details about asking neutral questions to a client during an interview

Is the client really telling you everything? Has the client told you the core of his/her needs?

To make sure the client has told you everything and you have got to the heart of the matter, use a combination of open-ended, closed-ended and neutral questions. But particularly focus on the neutral questions when finding out things like, "What does the client really need?"

Neutral questions are a powerful way of eliciting detailed responses from your client. They help you to get specific information you need in developing a good search strategy.

What are neutral questions? Well, the questions you would ask a client come in three basic types: closed, open and neutral.

If you use a closed question like "Is this what you want?", you will usually get what is known as a "closed" answer - namely, "Yes" or "No" with nothing else to glean from the client's response. Here is another example: "Would you like a magazine or book?" What you are effectively doing when using the closed question approach is restricting your client to answering with one of two choices.

Now closed questions are acceptable if all you need to do is get a simple answer for confirming certain aspects of the search request you have written down from your client. However, if you want more detailed information (especially in the early stages of an interview where you need to know exactly what the client wants), you have a choice of going for an "open" or "neutral" question approach.

With the open question approach such as "How will this information help you in your work?", your client will give you lots of information to think about and may give you ideas as to exactly what the client wants.

Yet there will be times when the information you receive from your client using the open-ended question approach will sound like gobbledygook to you simply because you may not be entirely familiar with the subject matter. Furthermore, you may get more information than you need to do the search properly.

So you take the "neutral" question approach. These are questions that narrow down the choices but are open enough to guide the client along certain paths as a way of clarifying, in his/her head and your own, what the information need is really about.

Here is an example of a neutral question: "Tell me a bit more about how you plan to use ____________?"

As you can see, it is pretty hard to give a "Yes" or "No" answer to this question. The client is forced to give a bit more detail in his/her answer. Yet the question is not so open-ended that the client can give just any answer. Clearly the client has to focus on the particular aspect the question refers to.

Remember, the aim of neutral questions is to get a reasonable answer within an appropriate range.

Other examples of neutral questions include the following:

"What do you mean by ____________?"
"If I can't find ____________, what would be the next best option?"
"So you are saying you would like me to find ____________?"
"How will you be using the information?"
"I'm not familiar with ____________. Can you explain it to me?"

More details about conducting a search

For the purposes of this document, we will assume all searching is conducted on an electronic database.

When conducting a search using a database, you will need to (i) learn the search language it uses to find information, including boolean logic; (ii) know whether the information you gather comes from a reputable source; and (iii) know whether the information is relevant to your client's needs.

The process of searching for information is called "retrieval of information" or "online interrogation". It consists of three main steps, namely:

  1. THE QUERY
    To type a search string called a "query" into a search field on a database which will result in a good probability of uncovering relevant information for your client.

  2. THE ANALYSIS
    To compare the information obtained from a database via records with the needs of the client.

  3. THE OUTPUT
    To present your search results (also called the retrieval set) of the search after using your query, and to identify the records of highest relevance to the client (if any).

Now how we actually do the searching will depend on the nature of the information we are gathering and the type of database we are using, but it is usually based on one of four main techniques:

  1. Brief search
  2. Building blocks (multiple searches connected with AND)
  3. Successive fractions (combination of NOT and AND)
  4. Citation pearl-growing (You search and find a few articles and then do another search on these articles).

Which specific search technique you should apply is up to you to decide. Just choose a searching technique and see what results you get. For example, if a brief search strategy successfully brings up lots of relevant information, you may not need to go any further. However, if you don't have the right information you want from a brief search, you may need to consider other techniques.
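
To make the difference between the building blocks and successive fractions techniques more concrete, here is a minimal sketch in Python using sets of record numbers. The records and the find() helper are hypothetical; a real database would run these steps for you through its own search language.

```python
# Hypothetical records: record number -> set of words in the record.
records = {
    1: {"soil", "isotope", "productivity"},
    2: {"soil", "water"},
    3: {"soil", "isotope", "medicine"},
    4: {"isotope", "medicine"},
}


def find(term, within=None):
    """Return the record numbers containing term, optionally within an earlier set."""
    candidates = records.keys() if within is None else within
    return {number for number in candidates if term in records[number]}


# Building blocks: search each concept separately, then combine with AND.
soil_block = find("soil")
isotope_block = find("isotope")
building_blocks = soil_block & isotope_block

# Successive fractions: start broad, then cut the set down step by step.
fraction = find("soil")                      # broad first search
fraction = find("isotope", within=fraction)  # AND isotope
fraction = fraction - find("medicine")       # NOT medicine

print(sorted(building_blocks))  # [1, 3]
print(sorted(fraction))         # [1]
```

In this sketch the building blocks approach keeps each concept as a reusable set, while successive fractions only ever holds one shrinking set.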

How much searching should I do?

While this may be up to the client to decide, you should keep in mind the "law of diminishing returns" which states that the more you search, the less new information each search will give you back.

This may seem a little ironic because we tend to think that the more we search on a particular topic, the more information we get. It is rather the opposite. As we check through what we have retrieved and realise what kind of knowledge it contains, each further search turns up less, because there is not much information left that is different from what we've already gathered.

Therefore, you will know at what point searching should end as soon as you have grabbed enough information.
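
One way to see diminishing returns at work is to count how many records in each successive search you have not already seen. The sketch below is illustrative only: the record identifiers are made up, and the stopping rule (stop when a search adds fewer new records than a chosen threshold) is an assumption, not a rule stated by any database.

```python
# Hypothetical results of four successive searches (record identifiers).
search_results = [
    ["r1", "r2", "r3", "r4", "r5"],
    ["r2", "r3", "r6", "r7"],
    ["r1", "r6", "r8"],
    ["r2", "r7", "r8"],
]


def new_records_per_search(results):
    """Count how many records in each search have not been seen before."""
    seen = set()
    counts = []
    for batch in results:
        fresh = set(batch) - seen
        counts.append(len(fresh))
        seen |= fresh
    return counts


def should_stop(counts, threshold=1):
    """Assumed stopping rule: stop once a search adds fewer than threshold new records."""
    return bool(counts) and counts[-1] < threshold


counts = new_records_per_search(search_results)
print(counts)               # [5, 2, 1, 0]
print(should_stop(counts))  # True
```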

If I have retrieved a lot of what looks like relevant information, should I download it all for my client?

There is a golden rule for every search you conduct. The rule is, "Always sample a few records before downloading".

In other words, don't download all your information just because it looks like it is all relevant. Just make sure you have the right information first by checking one or two records from your search results and then let the client decide whether or not he/she wants all the records.

This golden rule is particularly important when information will cost the client.

What kind of searching is available on a database?

The type of searching available on a database is based on two main search languages: (i) the controlled indexing language; and (ii) the natural indexing language.

A database that uses the natural indexing language is able to search through all the words of every document stored in the database. This is equivalent to doing a full-text search. It can be slow to find what you want, but it does look at all the words of every document stored in the database and picks out every instance of the words you are looking for.

Databases that use the controlled indexing language tend to rely on an index. This means the index contains a summary of what is contained in all the documents stored in the database including how to "point" to the right documents. It is usually quicker and more accurate, but sometimes this index may not be comprehensive enough to ensure all the information in the documents is properly represented.

You will also find hybrid databases containing both full-text and controlled indexing language.

If you want to find lots of relevant information quickly, do a search on a controlled indexing database. Alternatively, if you are using a natural language database, see if there is a facility to restrict your search to the titles of the documents. But if you want a more thorough search through all the documents and are prepared to spend a little more time doing so, try a natural "free text" database.
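
The difference between the two search languages can be sketched in a few lines of Python. Everything here is hypothetical (the documents, the descriptors and the function names); it only illustrates the idea that a controlled index is looked up directly, while a free-text search scans the words of every document.

```python
# Hypothetical documents with indexer-assigned subject descriptors.
documents = {
    1: {"descriptors": ["soil fertility"],
        "text": "Radioisotopes were used to measure the productivity of soil."},
    2: {"descriptors": ["water quality"],
        "text": "Samples of river water were taken near farms with poor soil."},
}

# Controlled indexing: build the index once, then look terms up directly.
subject_index = {}
for number, document in documents.items():
    for descriptor in document["descriptors"]:
        subject_index.setdefault(descriptor, set()).add(number)


def controlled_search(descriptor):
    """Look up a preferred subject heading in the index."""
    return subject_index.get(descriptor, set())


def free_text_search(word):
    """Scan every word of every document for an exact word match."""
    word = word.lower()
    return {
        number
        for number, document in documents.items()
        if word in document["text"].lower().replace(".", " ").split()
    }


print(controlled_search("soil fertility"))  # {1}
print(controlled_search("soil"))            # set()
print(free_text_search("soil"))             # {1, 2}
```

Note how the controlled search finds nothing for "soil" because it is not one of the preferred headings, while the free-text search finds both documents, including one that is really about water.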

More details about Boolean logic

The aim in conducting a search using a database is to retrieve a sufficient number of relevant records.

While it is important to choose the right words that match the client's needs precisely, sometimes you will discover in your search that the number of records retrieved is too high, and you need some way to narrow down the choices to the most highly relevant records. Or sometimes you may not have enough records and need some way to broaden the choices using a variety of other words in your search.

This is best achieved using a few additional words (or shorthand symbols) known as boolean logic.

Boolean logic is part of the database's search language. It is already built-in and ready to help you find precisely what you want. The only time when boolean logic will not help you to find what you want is when the database does not have enough information stored in it to give you the results you want. But otherwise, it is a simple and yet powerful way of quickly selecting the records of high relevance.

Now boolean operators are the means of specifying combinations of terms in a search query to help find more specific and relevant items. In other words, we use boolean logic to link terms together. The boolean operators are:

A AND B - Items that contain both A and B
A OR B - Items that contain A, or B, or both
A NOT B - Items that contain A but not B.
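
If you think of each term as the set of records that contain it, the three operators correspond to ordinary set operations. A minimal sketch in Python, using made-up record numbers:

```python
# Hypothetical retrieval sets: the record numbers matching each term.
a = {1, 2, 3, 4}   # records containing term A
b = {3, 4, 5}      # records containing term B

a_and_b = a & b    # A AND B: records containing both terms
a_or_b = a | b     # A OR B: records containing either term, or both
a_not_b = a - b    # A NOT B: records containing A but not B

print(sorted(a_and_b))  # [3, 4]
print(sorted(a_or_b))   # [1, 2, 3, 4, 5]
print(sorted(a_not_b))  # [1, 2]
```

AND and NOT can only make a retrieval set smaller or leave it the same size, while OR can only make it larger or leave it the same size.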

Some variations and alternative symbols used in some databases to represent the same boolean operators are shown below:

The symbols * and + may be used instead of AND
AND NOT may be used instead of NOT

There are several other clever little operators you can use to help make searching just that little bit more flexible and easier:

  1. THE TRUNCATION FACILITY
    For example, if you want to do a search on "city" and use the truncation symbol "?" in a search query like "cit?", you will get "citadel", "cities", "citronella" and so on in your search results. But be careful how you use the truncation method as you may get lots of irrelevant records from the database.

  2. THE PROXIMITY OPERATOR
    You specify two words and then indicate the distance between the words (i.e. number of words) when searching. For example, if you type into the search field of your database a search string of the form "WORD1(nw)WORD2" where WORD1 and WORD2 are two words you are looking for and "n" is a positive integer number (i.e. 1,2,3, etc), the database will interpret this as "WORD1 must be n words before the word WORD2".

  3. THE ADJACENCY OPERATOR
    The best way to explain this is by way of an example. Now suppose you type into the search field of your database the following: "Rabbit adj virus". This means the database will look for "Rabbit" adjacent to "virus" (i.e. "Rabbit virus" and "virus Rabbit").

  4. BRACKETS ()
    You can use brackets to specify what to search for first before doing anything else outside of the brackets.

  5. IMPLIED BOOLEAN
    Implied boolean operators are the symbols -, + and the quotation marks "...". Use + to add words, - to remove unwanted words, and quotation marks to include phrases. For example,

    "star trek" - voyager + "next generation"

    This means look for articles with "star trek" and "next generation" but don't include articles with "voyager".

    Also, words that must appear in an exact order are usually placed between quotation marks.

If, at any stage, you get stuck on how to join terms together using boolean logic or need to know the various other fancy operators for searching, just check the HELP screen of the database you are using.
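
The truncation, proximity and adjacency operators described above can be imitated in Python to show what the database is doing behind the scenes. This is a rough sketch with hypothetical function names. In particular, it assumes the proximity operator means "WORD1 comes before WORD2 with at most n words in between"; some databases may count the distance differently, so check the HELP screen.

```python
import re


def words(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def truncation(stem, text):
    """Return every word in text that starts with stem (like "cit?")."""
    return [word for word in words(text) if word.startswith(stem.lower())]


def proximity(word1, word2, n, text):
    """True if word1 appears before word2 with at most n words in between.

    Assumption: this is one reading of WORD1(nw)WORD2; other systems
    may count the distance differently.
    """
    tokens = words(text)
    first = [i for i, token in enumerate(tokens) if token == word1.lower()]
    second = [i for i, token in enumerate(tokens) if token == word2.lower()]
    return any(0 < j - i <= n + 1 for i in first for j in second)


def adjacent(word1, word2, text):
    """True if the two words sit next to each other, in either order."""
    tokens = words(text)
    pairs = set(zip(tokens, tokens[1:]))
    w1, w2 = word1.lower(), word2.lower()
    return (w1, w2) in pairs or (w2, w1) in pairs


sample = "The citadel overlooks two cities and a citronella farm"
print(truncation("cit", sample))  # ['citadel', 'cities', 'citronella']
print(adjacent("Rabbit", "virus", "A virus rabbit study"))  # True
```

The truncation example shows why care is needed: a short stem such as "cit" pulls in words that have nothing to do with cities.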

Remember, use boolean logic to narrow things down or to broaden your search results. Also, if you use less boolean logic in your search, you will expand the range of information for you to select, but the relevance rating of that information could be low. So long as your database contains enough information in the area you are interested in, always apply some boolean logic and any of the other fancy operators to get your database to retrieve exactly what you want from it.
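
As an illustration of how an implied boolean query such as "star trek" - voyager + "next generation" might be interpreted, here is a minimal Python sketch. The parsing rules are assumptions: a term with no sign or with + is treated as required, a term with - is excluded, and matching is a simple substring check. Real search engines may behave differently, and hyphenated words are not handled here.

```python
import re

# An optional sign, optional spaces, then a quoted phrase or a single word.
TOKEN = re.compile(r'([+-])?\s*(?:"([^"]+)"|([^\s"+-]+))')


def parse_implied_boolean(query):
    """Split a query into required and excluded terms.

    Assumption: unsigned and + terms are required, - terms are excluded.
    """
    required, excluded = [], []
    for sign, phrase, word in TOKEN.findall(query):
        term = (phrase or word).lower()
        (excluded if sign == "-" else required).append(term)
    return required, excluded


def matches(query, text):
    """True if text contains every required term and no excluded term."""
    required, excluded = parse_implied_boolean(query)
    text = text.lower()
    return (all(term in text for term in required)
            and not any(term in text for term in excluded))


query = '"star trek" - voyager + "next generation"'
print(parse_implied_boolean(query))
# (['star trek', 'next generation'], ['voyager'])
```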

Is relevance ranking a useful feature?

Relevance ranking on a database is a useful feature. If you find a database with relevance ranking, please use it as it will save you time in selecting your most relevant documents from your search results (or retrieval set).

What is most likely to give me the most relevant information during searching?

On your search page, look for a field called "title". Then do a search in this field. Why? Because from experience, searching the title field tends to give the highest level of relevance in the information you gather. The next best search is on the subject descriptors. The lowest level of relevance comes from searching through the abstract, lead paragraph and full-text keyword fields.
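
The idea that some fields are better indicators of relevance than others can be sketched as a simple weighted score. The weights, field names and records below are purely illustrative assumptions; they only reflect the order given above (title first, then subject descriptors, then abstract).

```python
# Assumed weights: title first, then subject descriptors, then abstract.
# The numbers are illustrative only.
FIELD_WEIGHTS = {"title": 3, "descriptors": 2, "abstract": 1}

records = [
    {"id": "A", "title": "Soil isotopes", "descriptors": "soil; isotopes",
     "abstract": "Measuring soil productivity."},
    {"id": "B", "title": "River ecology", "descriptors": "water",
     "abstract": "Sediment and soil runoff were measured."},
]


def score(record, term):
    """Add up the weight of every field in which the term appears."""
    term = term.lower()
    return sum(weight for field, weight in FIELD_WEIGHTS.items()
               if term in record.get(field, "").lower())


def rank(records, term):
    """Return the ids of matching records, best first."""
    scored = [(score(record, term), record["id"]) for record in records]
    return [record_id for points, record_id in
            sorted(scored, key=lambda pair: -pair[0]) if points > 0]


print(rank(records, "soil"))   # ['A', 'B']
print(rank(records, "water"))  # ['B']
```

Record A comes first for "soil" because the word appears in its title and descriptors, whereas record B only mentions it in passing in the abstract.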

How do I assess my search strategy for thoroughness?

So you want to know whether you have selected the right terms covering all the concepts required? You sample a few records and look at the SUBJECT DESCRIPTORS field in each record and see what terms appear. If you see a few relevant terms that are missing in your list, add them to your search strategy table list.
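
This check can be sketched in Python as a comparison between the terms in your search strategy table and the descriptors found in a few sampled records. The terms and records below are hypothetical, and the final decision about whether a candidate term is relevant still rests with you.

```python
from collections import Counter

# Terms already in the (hypothetical) search strategy table.
strategy_terms = {"soil", "isotopes"}

# SUBJECT DESCRIPTORS fields from a few sampled records (hypothetical).
sampled_descriptors = [
    ["soil", "isotopes", "radioactive tracers"],
    ["soil", "soil fertility", "radioactive tracers"],
    ["isotopes", "soil fertility"],
]


def candidate_terms(samples, known_terms):
    """Count descriptors seen in the sample that are not yet in the table."""
    counts = Counter(
        descriptor
        for record in samples
        for descriptor in record
        if descriptor not in known_terms
    )
    return counts.most_common()


candidates = candidate_terms(sampled_descriptors, strategy_terms)
for term, count in candidates:
    print(f"{term}: appears in {count} sampled records")
```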

The Relevance Paradox

Don't try to be too thorough in your searching just to find the most relevant information and to cover all the concepts. There is a thing called The Relevance Paradox which states that "the more you want to be complete in your search, the higher the number of irrelevant records that will creep into your retrieval set".

The aim is to balance your search requirements between relevance and thoroughness (or completeness).
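
In information retrieval this balance is usually described with two measures: precision (how much of what you retrieved is relevant) and recall (how much of the relevant material you retrieved). The following Python sketch uses made-up record numbers to show how a broader search raises recall but lowers precision.

```python
def precision(retrieved, relevant):
    """Share of the retrieved records that are relevant."""
    return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0


def recall(retrieved, relevant):
    """Share of all relevant records that were retrieved."""
    return len(retrieved & relevant) / len(relevant) if relevant else 0.0


# Hypothetical database in which records 1 to 5 are the relevant ones.
relevant = {1, 2, 3, 4, 5}
narrow_search = {1, 2, 3, 6}                    # small, mostly relevant set
broad_search = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}  # complete, but half is noise

print(precision(narrow_search, relevant), recall(narrow_search, relevant))  # 0.75 0.6
print(precision(broad_search, relevant), recall(broad_search, relevant))    # 0.5 1.0
```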

This law has come about because in the real world, you will be required to perform jobs like searching on databases within a set period of time as well as to minimise the costs in getting that information.

To help you get that balance you need during searching, remember this idea: "You will often find that the first 10% of the time you spend searching will already give you 90% of the requirements accomplished."

So don't spend too much time on your searching strategy.

The Principle of Diminishing Returns

This principle states that the more databases you search for new records, the fewer new records you will find. So use a few very good databases and do most of your searches on them. The only time when you might be required to search an exhaustive list of databases is if the client needs to know whether his/her Ph.D. topic is unique, whether a patent is unique, or whether a legal precedent exists.

What if my database hasn't got relevance ranking?

If you don't have relevance ranking capabilities on your database, choose your words for searching very carefully.

To begin with, don't use a common word in your search query unless you need to broaden your search results. If you do use a common word, you will get a huge number of hits and have too much "low relevance" information to sort through. Use a quality word to get information of a high relevance rating.

For example, if you are searching for information relating to the use of radioactive isotopes in measuring the productivity of soil, go for a term like "soil" and then a word like "isotope" or "radioisotopes" to help narrow down the search results and to give highly relevant information.

However, if you decide to search using a combination of common terms in the search field like "soil" and "water", you will probably still find what you are looking for, but you will not get a compact list of relevant articles because the search will simply produce too many irrelevant records to sort through properly.

Another way to narrow down the range of acceptable information while increasing its relevance is to do a search on the "title" and/or "author" field instead of the "full-text" field. You can also search using a complete and more specific phrase in quotation marks like "lipid membranes" instead of just "membranes". Or use the boolean restriction operators AND and NOT to do the same job.

What will I find on a search database?

All the databases you are likely to search usually contain a Help facility, a news facility and a log-off section. And, more importantly, a search facility as well (otherwise why would we need a database?).

There may also be a built-in thesaurus, hyperlinks, the ability to truncate terms or use Boolean logic when searching on the database, and so on. There may be an advanced search facility instead of a basic one. There may also be a full-text or a citation facility available as well.

When is it a good time to use online databases?

Use online databases only when you want the latest information. Remember, online databases usually only go back to the 1980s, so they may not be useful for historical searching.

Are search engines on the Internet a good way to find information?

It all depends on the search engine you are using. But bear in mind one thing: no single search engine, not even all of them combined, will comb through every single piece of information that exists on the Internet. There is too much information and the information changes all the time, making it impossible for search engines to sort through it all and keep their own indexes up-to-date.

Also, search engines are not consistent in how they index web pages. For example, some of the spiders (or software robots, as they are called) that look at people's web pages may index only the keywords entered by the author of a particular web page inside the "Keyword" META tag. Other spiders may index just the title, or they may index the entire page.

Hence you will find some search engines, like http://www.google.com/, claiming that they can index an entire web page. However, other search engines will not.

Another problem with search engines is that certain words selected by the spider to form an index can be spelt the same but have different meanings. And what about plural and singular words (i.e. words ending with an "...s")? Or what about verb tenses that change the ending of a word ("...ed")? Search engines are too simplistic and therefore not good at sorting out this situation properly.
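
To show what a spider that indexes only the title and the keywords META tag would actually see, here is a small Python sketch using the standard html.parser module. The PageIndexer class and the sample page are hypothetical; real spiders are far more elaborate.

```python
from html.parser import HTMLParser


class PageIndexer(HTMLParser):
    """Collect the title and the "keywords" META tag from a web page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.keywords = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            self.keywords = [k.strip().lower() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


page = """<html><head><title>Rabbit virus research</title>
<meta name="Keywords" content="rabbit, calicivirus, Australia">
</head><body><p>Field trials began this year.</p></body></html>"""

indexer = PageIndexer()
indexer.feed(page)
print(indexer.title)     # Rabbit virus research
print(indexer.keywords)  # ['rabbit', 'calicivirus', 'australia']
```

Notice that nothing in the body of the page ("Field trials began this year") reaches this index, so a search on "trials" would miss the page entirely.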

Search engines need to have more sophisticated searching technology to help them understand the linguistics of the words you use for searching.
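
A first step towards handling plurals and verb tenses is to reduce each word to a common form before comparing. The Python sketch below is deliberately crude and entirely illustrative: the handful of suffix rules is an assumption, real systems use much more thorough rules, and it does nothing about words that are spelt the same but have different meanings.

```python
# A few illustrative suffix rules, tried in order.
SUFFIX_RULES = [
    ("ies", "y"),    # cities -> city
    ("ches", "ch"),  # searches -> search
    ("xes", "x"),    # indexes -> index
    ("ed", ""),      # searched -> search
    ("s", ""),       # records -> record
]


def normalise(word):
    """Strip a few common English endings from a word."""
    word = word.lower()
    if word.endswith("ss"):
        return word  # leave words like "class" alone
    for ending, replacement in SUFFIX_RULES:
        stem = word[: len(word) - len(ending)]
        if word.endswith(ending) and len(stem) >= 3:
            return stem + replacement
    return word


def same_word(a, b):
    """True if two words reduce to the same form."""
    return normalise(a) == normalise(b)


print(same_word("cities", "city"))        # True
print(same_word("searched", "searches"))  # True
```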

Is there intelligent searching technology available today?

Yes there is. One such intelligent search engine available on the Internet is the one used by http://www.google.com/.

Google.com applies a variety of intelligent searching techniques to get the information people need or want, usually with a high relevance rating. One technique employed by this search engine is concept-based searching (or clustering). This means it searches for words that relate, through a shared concept, to the actual word you are searching for.

Does the size of a search engine make a difference?

Not necessarily. While it is potentially possible to cover slightly more information with the help of a large search engine, there is more of a concern over the quality of the information it can present and how much of it is duplicated knowledge.

The general rule of thumb is that the bigger the search engine, the more duplicated "knowledge" web sites are likely to be found, and you will also get many more hits than you require. We recommend searching on a variety of different search engines to help you find a wider range of more relevant information. So try a web site that can search on a number of search engines at once. And also use a search engine that can properly index entire web pages.

However, the best search engines are those that have intelligent searching technology and have the ability to sort out the higher quality web sites from all others. We therefore recommend Google, MSN and Yahoo as the better search engines if you want intelligent searching technology capabilities.

The basic principle of smart searching

Firstly, you should know where to look and that means choosing your database very well. Secondly, fine-tune your keywords by selecting the smallest possible subset that describes what you want before expanding the search range. Thirdly, take advantage of the available search refining options like BOOLEAN operators. Make use of the EXCLUDE operator (i.e. NOT) to narrow the search to what you want. Fourthly, rerun your query each time you make a refinement to narrow down your list of relevant hits. Fifthly, take advantage of the "query by example" or "find similar sites" option. Finally, if you think about the document, you can also anticipate the answer you want. So have a quick read of the document to give you ideas of the terms you should be searching on your database.

This thinking about the document is all about asking better questions to get better results by knowing more about the document.

How do I select a source/record/item from my search results for the benefit of my client?

After conducting a search and retrieving relevant items, you may be required on behalf of your client to select the really high quality sources to help narrow down the choices.

Now this may be important if your client is an academic. An academic will almost always need to know the quality of the source for his/her work. If this is the case, you must ask yourself and the client the following questions:

  1. What is the purpose of this search?
  2. What type of information is needed?
  3. Is cost an issue? What is the budget?
  4. How current or recent does the data need to be?
  5. How much of each record is needed?
  6. Is full-text needed?
  7. Will library holdings of records be needed?
  8. Are there any known sources?
  9. Which sources would I use for scholarly research?
  10. Is the information current or historical?
  11. What is the most cost effective way to complete this search?
  12. Which sources can I find online?

Now depending on the answers to these questions, you should be able to quickly see which items are of high quality for your client.

Remember, sources that are considered acceptable to academic people will depend on the following factors:

  • Relevance
  • Timeliness
  • Comprehensiveness
  • Specificability
  • Authoritativeness
  • Locatability
  • Acquirability and
  • Useability

So look at the search results and relevant records you have found and decide on a list of quality information to send to academic clients.

How do I evaluate information on the Internet as being of high quality?

The evaluation process is more important on the Internet than on traditional databases because there is more junk on the Internet than on carefully prepared databases.

Look at the authority and credibility of your information. Look for timeliness and currency. Look for footnoting and documentation. Look for the purpose. Look for the author of the document. Is the author well-known?

This is the test for evaluating data on the Internet.

How have databases changed for the information retrieval expert?

Early systems were command-based. They were considered too difficult to use for the inexperienced user. Only an information retrieval specialist could use them because he/she could understand the command language used by the database to obtain information. They were also very expensive because you had to log in at a US library site. For example, the search service DIALOG is one ancient retrieval system still in use in some libraries.

Fortunately the interfaces of information retrieval systems have advanced to what is known as a GUI (graphical user interface). A GUI is visually more appealing and easier to understand and navigate. It can use multiple windows, put all the data you need in front of you via pop-up menus, let you email the search results to yourself, and offer opportunities for expansion and development into multi-media documents (i.e. graphics and sounds). For example, the electronic kiosk is a system that uses a GUI.

However, GUI is still a little confusing for some people. More work needs to be done to improve the presentation and ease of finding information on any database.

And to this day, the work continues as information retrieval systems improve and become easier to use.

Why is it difficult to develop a database system that works perfectly?

Every human being has information needs. Those needs are the motivation for people to go and search for information, an activity known as "information-seeking behaviour".

However, quality research on information seeking behaviour is somewhat unsatisfactory. Apparently the work still lacks comparative information because studies conducted in this field differ widely in aims, objectives and methods.

So while people are still searching for a theory of information-seeking behaviour, changes in retrieval systems have been almost totally driven by technology and not by the needs of users and their natural information-seeking behaviour.

We are only now entering the phase of building new databases that are based on how people find information in a natural way.

What are some of the information-seeking behaviours of people?

While we are developing a more unified and accurate theory of information-seeking behaviour, the following information has been gathered:

  1. Most people are usually happy with information they can gather if it is easy to find, not necessarily if it is the best. For example, people will be happy to use the Internet because it is easy to find almost any kind of information.
  2. For more rationally-thinking people like university academics, using the Internet may be easy, but they have to look at the quality of the information. Where is it from? Are there references? Has it been peer reviewed?

Here is a slightly more detailed look at the information-seeking behaviours of three groups of people:


SCIENTISTS

  • Scholars have low reliance on libraries; and a high reliance on specific and up-to-date information from people in their fields of expertise.

  • Scholars are specific in their information needs and need it fast.
  • Conferences are important to scholars. Talking and meeting with colleagues is important as well as following up references in journals.
  • Rarely do they use books and therefore have a low reliance on libraries because they have a high reliance on talking to people in specific fields.
  • Readership of the journal is important for scholars.
  • The sources of information scholars use tend to be: (i) Informal communication; (ii) High reliance on current content; (iii) Low use of abstracting service; and (iv) High reliance on journals.

HUMANITIES (e.g. psychologists and sociologists)

  • These people are more likely to browse the libraries and look for books.
  • Basically the opposite to scientists.

SOCIAL SCIENCES

  • These people fit somewhere in between the HUMANITIES and the SCIENTISTS.