Wildcard searches are useful when a term may vary in spelling or when you are exploring a topic before narrowing your focus. A wildcard query uses an asterisk (*) to stand for zero or more characters, or a question mark (?) to stand for exactly one character.
Single Character Replacement
Single character replacement uses a question mark (?) in place of one unspecified character in a search term. This type of search is most useful when the spelling of a word varies or the word is often misspelled. For example:
NF?B
The term NFκB can appear in several variations: with the proper Greek letter kappa, or with a lowercase or uppercase “k” in place of kappa. A single character replacement search will return records that contain NFκB, NFkB, or NFKB. Note that while this approach is likely to return more results about NFκB, it will also return records in which any other character occupies the wildcard position, such as NFIB.
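The single character behavior can be approximated outside the search tool with a regular expression. The following is a minimal Python sketch, not the search engine's actual implementation: the helper name `single_char_pattern` and the sample term list are hypothetical, and the choice to exclude hyphens and whitespace from ? follows the rules listed later in this section.

```python
import re

def single_char_pattern(query):
    """Compile a query in which each ? stands for exactly one character.

    Illustrative only: ? becomes "one character that is not a hyphen
    or whitespace"; every other character is matched literally.
    """
    parts = []
    for ch in query:
        if ch == "?":
            parts.append(r"[^\s\-]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

# Hypothetical terms standing in for words found in indexed records.
terms = ["NFκB", "NFkB", "NFKB", "NFIB", "NFB", "NF-B", "NFkappaB"]

pattern = single_char_pattern("NF?B")
matches = [t for t in terms if pattern.fullmatch(t)]
print(matches)
```

In this sketch NFIB is matched along with the intended variants, while NFB (no character in the wildcard position) and NFkappaB (more than one character) are not.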
Multiple Character Replacement
Multiple character replacement uses an asterisk (*) in place of zero, one, or several characters in a search term. Continuing the example above, an asterisk can account for several unspecified characters. For example:
NF*B
The above example will return results where the Greek letter kappa is spelled out, e.g., NFkappaB, as well as NFκB, NFkB, and NFKB.
Often you will want to place the wildcard after a common word stem to capture all variations of that word without typing each one. For example:
treat*
The above example would return treat, treated, treatment, treating, treatable, along with any other word beginning with treat.
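To see how an asterisk expands against a set of terms, the two examples above can be imitated with Python's standard `fnmatch` module, whose `*` also matches zero or more characters. This is only a rough sketch under assumptions: the function name `expand_wildcard` and the vocabulary are hypothetical, and `fnmatch` additionally gives square brackets a special meaning that the search tool may not share.

```python
from fnmatch import fnmatchcase

def expand_wildcard(query, vocabulary):
    """Return the vocabulary terms matched by a query containing *."""
    return [term for term in vocabulary if fnmatchcase(term, query)]

# Hypothetical vocabulary; a real index would hold every term in the corpus.
vocabulary = [
    "NFκB", "NFkB", "NFkappaB", "NFB", "NFIB",
    "treat", "treated", "treatment", "retreat",
]

nf_terms = expand_wildcard("NF*B", vocabulary)
treat_terms = expand_wildcard("treat*", vocabulary)
print(nf_terms)
print(treat_terms)
```

Here NF*B picks up NFB as well, because the asterisk may stand for no characters at all, and treat* skips "retreat" because the pattern is anchored at the start of the term.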
Rules for Using Wildcards
- A multiple character wildcard (*) matches any number of characters in its place, including none.
- A single character wildcard (?) matches exactly one character. It cannot match nothing, a hyphen, or whitespace.
- Wildcards can be used at the beginning, middle, or end of a single term, and a query may contain several wildcard terms. Wildcards do not work inside quotation marks, so they cannot be used in a quoted phrase.
- Wildcards also do not work in terms that contain dashes. The search engine converts dashes to spaces, and iSearch Analytics implicitly quotes these terms to keep their parts together, so the wildcard ends up inside a quoted phrase.
- A wildcard at the beginning of a term can make results return a bit more slowly, because the search engine has to check every term in the corpus to find those that match. Very short prefixes have a similar cost: a search for “g*” will take longer than a search for “ge*” because the set of terms beginning with “ge” is much smaller than the set of terms beginning with “g.”
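The performance rule above can be illustrated with a small Python sketch. It assumes, purely for illustration, that the engine keeps its terms in a sorted list; how iSearch actually stores its index is not described here, and the function names and terms are hypothetical. With a sorted list, a trailing wildcard can jump straight to the block of terms sharing a prefix, whereas a leading wildcard has to look at every term.

```python
from bisect import bisect_left

def prefix_range(sorted_terms, prefix):
    """Terms starting with prefix, located by binary search on a sorted list."""
    lo = bisect_left(sorted_terms, prefix)
    # chr(0x10FFFF) is the largest code point, so this bounds the prefix block.
    hi = bisect_left(sorted_terms, prefix + chr(0x10FFFF))
    return sorted_terms[lo:hi]

def suffix_scan(sorted_terms, suffix):
    """Terms ending with suffix; sort order does not help, so scan them all."""
    return [t for t in sorted_terms if t.endswith(suffix)]

terms = sorted([
    "gap", "gene", "genome", "germline",
    "glucose", "growth", "kinase", "oncogene",
])

g_terms = prefix_range(terms, "g")        # candidates for g*
ge_terms = prefix_range(terms, "ge")      # candidates for ge*
gene_suffix = suffix_scan(terms, "gene")  # candidates for *gene
print(len(g_terms), len(ge_terms), gene_suffix)
```

In this toy vocabulary g* has six candidates and ge* has three, while *gene requires checking all eight terms to find two matches.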
Lastly, wildcards are a powerful query tool, but the additional results they return come at the expense of precision. A wildcard matches every term in the corpus that fits the pattern, regardless of its relation to your search topic. For example, transplan* would return "transplant," "transplanted," "transplantation," etc. Searching trans* would return these terms as well as "transfer," "transport," "transient," etc. You can reduce this problem by adding other terms or filters to your search, which lowers the likelihood of false positives.
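The trade-off between a broad wildcard, a narrower one, and a broad wildcard combined with a second term can be sketched in Python as follows. The records, the helper name `record_matches`, and the simple whitespace tokenization are all assumptions made for illustration; they do not reflect how the search engine processes text.

```python
from fnmatch import fnmatchcase

def record_matches(record, wildcard, required=None):
    """True if any word fits the wildcard and, optionally, another term is present."""
    words = record.lower().split()
    hit = any(fnmatchcase(word, wildcard) for word in words)
    return hit and (required is None or required in words)

# Hypothetical record titles.
records = [
    "kidney transplantation outcomes",
    "membrane transport proteins",
    "transient receptor potential channels",
    "transplant rejection in kidney recipients",
]

broad = [r for r in records if record_matches(r, "trans*")]
narrow = [r for r in records if record_matches(r, "transplan*")]
combined = [r for r in records if record_matches(r, "trans*", required="kidney")]
print(len(broad), len(narrow), len(combined))
```

With these sample records, trans* alone matches all four, while either the longer stem or the added term narrows the result to the two records about transplantation.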
You can also use wildcard searches to query the content of fields. For example, searching abstract:* will return all applications that contain text in the abstract. Adding NOT or a minus sign ("-") in front, as in -abstract:*, will return applications with blank abstracts.
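The effect of a field wildcard and its negation can be pictured as a simple presence check. The sketch below is an illustration only: the record layout and the helper `has_text` are hypothetical, and treating a missing or whitespace-only field as blank is an assumption rather than documented behavior.

```python
def has_text(record, field):
    """Emulate field:* by checking that the field exists and is not blank."""
    value = record.get(field)
    return isinstance(value, str) and value.strip() != ""

# Hypothetical records; real applications carry many more fields.
records = [
    {"id": 1, "abstract": "Role of NFκB in inflammation."},
    {"id": 2, "abstract": ""},
    {"id": 3},
    {"id": 4, "abstract": "   "},
]

# Comparable to abstract:*
with_abstract = [r["id"] for r in records if has_text(r, "abstract")]
# Comparable to -abstract:*
without_abstract = [r["id"] for r in records if not has_text(r, "abstract")]
print(with_abstract, without_abstract)
```

The two lists are complementary: every record falls into exactly one of them, which is why the negated query is a convenient way to find records with missing abstracts.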
As with all searches, regardless of the technique used, always review the results for accuracy before continuing your analysis.