About Our Search Engine

June 2017

Our new search engine gathers results from all three OSP websites (OSP.mit.edu, COI.mit.edu and KC.mit.edu) and combines them into a single stream of results pages. A search run from any one of the sites returns results from all three, so you no longer have to hop from site to site looking for items. The search also looks inside PDF files and returns the URL of each PDF that contains a match for your search term. It also supports “fuzzy” searches, which can help when you don’t have an exact term to search for.

You will sometimes find duplicate listings of the same PDF document, since many of them are referenced from multiple locations. However, the features gained by the new search engine more than outweigh this inconvenience.

Below is a quick guide to the search conventions and operators that work in the new search engine.

The new search engine is case insensitive, which means that “compliance”, “Compliance”, “cOMPLIance” and “COMPLIANCE” all return the same results.

To do an exact match on a multi-word phrase, put quotes around the entire phrase. Hyphens are treated as spaces, so put quotes around hyphenated terms as well.
Examples: “allocation rates”, “key person”, “federal terms”, “sponsor-approved”…
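
The case and hyphen conventions above can be pictured with a small Python sketch. This is only an illustration of the described behavior, not the engine's actual code; the function names `normalize` and `contains_phrase` are made up for this example.

```python
def normalize(text):
    """Lowercase the text, treat hyphens as spaces, and split into words."""
    return text.lower().replace("-", " ").split()


def contains_phrase(phrase, text):
    """Return True if the words of `phrase` appear consecutively in `text`."""
    wanted = normalize(phrase)
    words = normalize(text)
    n = len(wanted)
    if n == 0:
        return False
    # Slide a window of n words across the text and compare word by word.
    return any(words[i:i + n] == wanted for i in range(len(words) - n + 1))


print(contains_phrase("sponsor-approved", "A Sponsor Approved budget"))
print(contains_phrase("key person", "the person holds the key"))
```

In this sketch the first call matches because case and the hyphen are ignored, while the second does not match because the two words are not adjacent and in order.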

Search Operators and Conventions

There are four search operators you should be aware of.

Wildcard “*”

This works as it does in most search engines. The * may represent any number of characters, and it can be placed at the beginning, in the middle, or at the end of the word you are searching for.

Examples:  

  • alloc*      finds any word starting with alloc
  • *ternal     finds internal, external, eternal, fraternal, etc.
  • exp*n       finds expiration, exploration, experimentation, expansion
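
To see the * patterns above in action outside the search box, here is a minimal Python sketch. It uses the standard library's `fnmatch` module, whose * has the same "any number of characters" meaning; the word list and the helper name `wildcard_matches` are assumptions for illustration, and the search engine itself may work differently internally.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, words):
    """Return the words matching a pattern in which * stands for any run of characters."""
    pattern = pattern.lower()
    # Lowercase both sides to mimic a case-insensitive search.
    return [w for w in words if fnmatchcase(w.lower(), pattern)]


sample = ["allocate", "allocation", "internal", "external",
          "expiration", "expansion", "export"]

print(wildcard_matches("alloc*", sample))
print(wildcard_matches("*ternal", sample))
print(wildcard_matches("exp*n", sample))
```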

Single character wildcard “?”

You may have used the % sign for this in other search engines, but this engine uses the “?”. Each ? stands for exactly one character, so a matching word has exactly as many characters as the search string.
Examples:

  • alloc???     finds allocate, but NOT allocated, allocation, allocable
  • ship????     finds shipment, shipping, but NOT shipped, shipper
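
The length rule for “?” can be checked with a short Python sketch that turns each ? into a regular-expression “.” (exactly one character). The helper name `single_char_matches` is hypothetical, and this is an illustration of the rule rather than the engine's implementation.

```python
import re


def single_char_matches(pattern, words):
    """Return the words matching a pattern in which each ? is exactly one character."""
    # Escape the pattern so ordinary characters stay literal,
    # then turn every escaped ? into the regex "any single character".
    regex = re.escape(pattern.lower()).replace(r"\?", ".")
    return [w for w in words if re.fullmatch(regex, w.lower())]


print(single_char_matches("alloc???", ["allocate", "allocated", "allocation", "allocable"]))
print(single_char_matches("ship????", ["shipment", "shipping", "shipped", "shipper"]))
```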

Fuzzy search “~” (words that are similar in spelling)

The engine supports fuzzy searches based on Levenshtein distance (also called edit distance): the number of single-character insertions, deletions, or substitutions needed to turn one word into another. To do a fuzzy search, put the tilde symbol ~ at the end of a single-word term. For example, to search for terms similar in spelling to “intern”, use the fuzzy search:

intern~

This will return results containing internal, intel, intent, international, etc.
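
For readers curious about how edit distance is computed, the following Python sketch implements the textbook Levenshtein algorithm. How the search engine applies the distance (for example, how far apart two words may be and still match) is not documented here, so the `max_distance` cutoff of 2 and the function names are assumptions made purely for illustration.

```python
def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_matches(term, words, max_distance=2):
    """Keep the words within max_distance edits of term (the cutoff is an assumed value)."""
    return [w for w in words if levenshtein(term.lower(), w.lower()) <= max_distance]


for word in ["internal", "intel", "intent"]:
    print(word, levenshtein("intern", word))
```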

Boosting search terms “^”

The boost operator “^” gives extra weight in the rankings to the term you consider more relevant in a two-word search. Boost values range from 1 to 5, where 5 is the highest boost.

To make the term “export” more relevant, add the ^ symbol and a boost factor directly after the term. Compare the results for these searches:

  • export^5 shipping   vs   export shipping
  • Equipment^4 Threshold   vs   Equipment Threshold
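
As a rough mental model of why boosting changes the ordering, the Python sketch below splits a query into terms and boost factors and then scores a piece of text by multiplying each term's occurrence count by its boost. The real engine's ranking formula is not described on this page and is almost certainly more involved; `parse_boosts` and `toy_score` are hypothetical names used only for this illustration.

```python
def parse_boosts(query):
    """Split a query such as 'export^5 shipping' into (term, boost) pairs."""
    pairs = []
    for token in query.split():
        term, _, boost = token.partition("^")
        # A term with no ^ gets a neutral boost of 1.
        pairs.append((term.lower(), float(boost) if boost else 1.0))
    return pairs


def toy_score(query, text):
    """Toy relevance score: boost times the number of times each term occurs."""
    words = text.lower().split()
    return sum(boost * words.count(term) for term, boost in parse_boosts(query))


page = "export control rules"
print(toy_score("export^5 shipping", page))
print(toy_score("export shipping", page))
```

In this toy model a page that mentions only “export” scores higher under the boosted query than under the plain one, so it would move up relative to pages that mention only “shipping”.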