<HTML>
<HEAD>
<TITLE>How to specify queries</TITLE>
<META NAME="GENERATOR" CONTENT="Internet Assistant for Microsoft Word 2.0z">
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#000066" VLINK="#808080" ALINK="#FF0000" TOPMARGIN=0>
<TABLE>
<TR><TD><IMG SRC ="/samples/search/book08.jpg" ALIGN=Middle></TD>
<TD VALIGN=MIDDLE><H1>ActiveX<font size="-2">TM</font> Search</h1></TD></TR>
</TABLE>
<BODY>
<H1>How To Specify Queries<BR>
</H1>
<P>
Using the query page, you can do a full-text search for a word
or phrase on this web site. Searches produce a list of files
that contain the word or phrase anywhere in their text.
<P>
The rules for formulating queries are as follows:
<UL>
<LI>Queries are case-insensitive, so you can type your query in
uppercase or lowercase.
<LI>You may search for any word except those in the exception list,
which are ignored during a search. For English, the list includes
a, an, and, as, and other common words.
<LI>Words in the exception list are treated as placeholders in phrase and
proximity queries.
<LI>Punctuation marks such as the period (.), colon (:), semicolon
(;), and comma (,) are ignored during a search.
<LI>To use the specially treated characters (&amp;), (|), (^), (#), (@), ($),
((), and ()) in a query, enclose the query in quotes (&quot;).
<LI>You may use <A HREF="#Operators">boolean operators</A> (and,
or, not) and the <A HREF="#Operators">proximity operator</A> (near)
to specify additional search information.
<LI>The <A HREF="#Wildcards">wildcard character</A> (*) may be
used to match words with a given prefix. The query &quot;esc*&quot;
matches the terms &quot;ESC&quot;, &quot;escape&quot;, and so
on.
<LI><A HREF="#FreeTextQueries">Free-text queries</A> can be specified
without regard to query syntax.
<LI><A HREF="#VectorQueries">Vector space queries</A> can be specified.
<LI>OLE and file attribute <A HREF="#PropertyValueQueries">property value queries</A> can
be issued.
</UL>
<P>
<HR>
<H2><A NAME="Operators">Boolean and Proximity Operators</A></H2>
<P>
Boolean and proximity operators can be used to create a more precise
query.<BR>
<P>
<UL>
<TABLE BORDER="1">
<TR><TH ALIGN="LEFT" WIDTH=147>To search for</TH>
<TH ALIGN="LEFT" WIDTH=167>Example</TH>
<TH ALIGN="LEFT" WIDTH=229>Results</TH></TR>
<TR><TD VALIGN="TOP" >both terms in the same page
</TD><TD VALIGN="TOP" >access and basic
<BR>
<B>-or-</B> <BR>
access &amp; basic
</TD><TD VALIGN="TOP" >pages with both the words &quot;access&quot; and &quot;basic&quot;
</TD></TR>
<TR><TD VALIGN="TOP" >either term in a page
</TD><TD VALIGN="TOP" >cgi or isapi
<BR>
<B>-or-</B> <BR>
cgi | isapi
</TD><TD VALIGN="TOP" >pages with the words &quot;cgi&quot; or &quot;isapi&quot;
</TD></TR>
<TR><TD VALIGN="TOP" >the first term without the second term
</TD><TD VALIGN="TOP" >access and not basic
<BR>
<B>-or-</B> <BR>
access &amp; ! basic
</TD><TD VALIGN="TOP" >pages with the word &quot;access&quot; but
not &quot;basic&quot;
</TD></TR>
<TR><TD VALIGN="TOP" >pages not matching a property value
</TD><TD VALIGN="TOP" >not @size = 100
<BR>
<B>-or-</B> <BR>
! @size = 100
</TD><TD VALIGN="TOP" >pages whose size is not 100 bytes
</TD></TR>
<TR><TD VALIGN="TOP" >both terms in the same page, close together
</TD><TD VALIGN="TOP" >excel near project
<BR>
<B>-or-</B> <BR>
excel ~ project
</TD><TD VALIGN="TOP" >pages with the word &quot;excel&quot; near the word &quot;project&quot;
</TD></TR>
</TABLE>
</UL>
<P>
Hints:
<UL>
<LI>You can use parentheses to nest expressions within a query.
The expressions in parentheses are evaluated before the rest of
the query.
<LI>Use double quotes (&quot;) to indicate that a boolean or near
operator keyword should be treated as part of a phrase rather than as an
operator. For example, &quot;Abbott and Costello&quot; will match pages
that contain the phrase, not pages that match the boolean expression.
<LI>The Near operator returns a match if the words are within
50 words of each other.
<LI>In content queries, the Not operator can only be used after an And
operator; it can only exclude pages that match a previous content
restriction. In property value queries, the Not operator can be
used without the And operator.
</UL>
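<P>
The quoting hint above can be sketched in Python. This is only an illustration:
the helper name quote_if_needed is hypothetical, the character and keyword lists
are the ones named on this page, and rejecting text that itself contains a double
quote is an assumption, because this page does not describe how to escape one.
<PRE>
```python
# Characters this page lists as specially treated in queries.
SPECIAL_CHARS = set("&|^#@$()")
# Operator keywords named on this page.
OPERATOR_WORDS = {"and", "or", "not", "near"}


def quote_if_needed(text):
    """Wrap text in double quotes when it could be parsed as operators."""
    if '"' in text:
        # Assumption: escaping an embedded quote is not documented here.
        raise ValueError("text must not contain a double quote")
    has_special = any(ch in SPECIAL_CHARS for ch in text)
    has_keyword = any(word.lower() in OPERATOR_WORDS for word in text.split())
    if has_special or has_keyword:
        return '"' + text + '"'
    return text
```
</PRE>
<P>
With this sketch, quote_if_needed("black and white") returns the quoted phrase,
while quote_if_needed("apple tree") is returned unchanged.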
<P>
<HR>
<H2><A NAME="Wildcards">Wildcards</A></H2>
<P>
Wildcard operators are useful for finding pages with words similar
to a given word.
<P>
<UL>
<TABLE BORDER="1">
<TR><TH ALIGN="LEFT" WIDTH=146>To search for</TH><TH ALIGN="LEFT" WIDTH=171>Example
</TH><TH ALIGN="LEFT" WIDTH=223>Results</TH></TR>
<TR><TD WIDTH=146>words with the same prefix</TD><TD WIDTH=171>comput*
</TD><TD WIDTH=223>pages with words that have the prefix &quot;comput&quot;, such as &quot;computer&quot;, &quot;computing&quot;, and so on.
</TD></TR>
<TR><TD WIDTH=146>words based on the same stem word</TD><TD WIDTH=171>fly**
</TD><TD WIDTH=223>pages with words based on the same stem as &quot;fly&quot;, such as &quot;flying&quot;, &quot;flown&quot;, &quot;flew&quot;, and so on.
</TD></TR>
</TABLE>
</UL>
<HR>
<H2><A NAME="FreeTextQueries">Free-Text Queries</A></H2>
<P>
The query engine finds pages that best match the words and phrases
in a free-text query. This is done by automatically finding pages
that match the meaning, not the exact wording, of the query. Boolean,
proximity, and wildcard operators are ignored within a free-text
query.
Free-text queries are prefixed with &quot;$contents &quot;.
<BR>
<P>
<UL>
<TABLE BORDER=1>
<TR><TH ALIGN="LEFT" WIDTH=144>To search for</TH><TH ALIGN="LEFT" WIDTH=174>Example
</TH><TH ALIGN="LEFT" WIDTH=222>Results</TH></TR>
<TR><TD WIDTH=144>files that match free-text</TD><TD WIDTH=174>$contents how do I print in Excel?
</TD><TD WIDTH=222>Pages that mention printing and Excel.</TD>
</TR>
</TABLE>
</UL>
<P>
<BR>
<HR>
<H2><A NAME="VectorQueries">Vector Space Queries</A></H2>
<P>
The query engine supports vector space queries. Vector queries return
pages that match a list of words and phrases. The rank of each page
indicates how well the page matched the query.
<UL>
<TABLE BORDER="1">
<TR><TH ALIGN="LEFT" WIDTH=130>To search for</TH>
<TH ALIGN="LEFT" WIDTH=200>Example</TH>
<TH ALIGN="LEFT" WIDTH=170>Results</TH></TR>
<TR><TD>pages that contain specific words</TD>
<TD>light, bulb</TD>
<TD>files that best match the words</TD></TR>
<TR><TD>pages that contain weighted prefixes, words, and phrases</TD>
<TD>invent*, light[50], bulb[10], &quot;light bulb&quot;[400]</TD>
<TD>files that contain words prefixed by &quot;invent&quot;,
the words &quot;light&quot;, &quot;bulb&quot;,
and the phrase &quot;light bulb&quot;. The terms are weighted.</TD></TR>
</TABLE>
</UL>
<UL>
<LI>Components in vector queries are separated by commas.
<LI>Components in vector queries can be weighted using the [weight] syntax.
<LI>Pages returned by vector queries don't necessarily match every term in the query.
<LI>Vector queries work best when the results are sorted by rank.
</UL>
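<P>
As a rough illustration of the comma and [weight] syntax described above, the
following Python sketch assembles a vector query from (term, weight) pairs. The
function name vector_query is hypothetical, and quoting any term that contains a
space is an assumption modeled on the &quot;light bulb&quot; example in the table.
<PRE>
```python
def vector_query(components):
    """Join (term, weight) pairs into a comma-separated vector query.

    A weight of None leaves the component unweighted.
    """
    parts = []
    for term, weight in components:
        # Assumption: phrases (terms containing a space) are quoted.
        text = '"' + term + '"' if " " in term else term
        if weight is not None:
            text += "[" + str(weight) + "]"
        parts.append(text)
    return ", ".join(parts)


query = vector_query(
    [("invent*", None), ("light", 50), ("bulb", 10), ("light bulb", 400)]
)
print(query)  # invent*, light[50], bulb[10], "light bulb"[400]
```
</PRE>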
<HR>
<H2><A NAME="PropertyValueQueries">Property Value Queries</A></H2>
<P>
Property value queries can be used to find files that have property
values that match given criteria. The properties over which
you can query include basic file information like file name and
file size, and OLE properties including the document summary that is
stored in files created by OLE-aware applications.
<P>
There are two types of property queries: relational queries and
regular expression queries.
<UL>
<LI>Relational property queries consist of an at character (@),
a <A HREF="#PropertyNames">property name</A>, a <A HREF="#RelationalOperators">relational operator</A>,
and a <A HREF="#PropertyValues">property value</A>. For example,
to find all of the files larger than one million bytes, issue
the query &quot;@size &gt; 1000000&quot;.
<LI>Regular expression property queries consist of a pound character
(#), a <A HREF="#PropertyNames">property name</A>, and a
<A HREF="#RegularExpressions">regular
expression</A> for the <A HREF="#PropertyValues">property value</A>.
For example, to find all of the video (AVI) files,
issue the query &quot;#filename *.avi&quot;.
</UL>
<H3><A NAME="PropertyNames">Property names</A></H3>
<P>
Property names are preceded by either the at (@) or pound (#)
character. Use (@) for relational queries, and (#) for regular
expression queries.
<P>
If no property name is specified, <I>@contents</I> is assumed.
<UL>
<LI>Properties available for all files include:
</UL>
<UL>
<UL>
<TABLE>
<TR><TH ALIGN="LEFT" WIDTH=168>Property name</TH>
<TH ALIGN="LEFT" WIDTH=270>Description</TH></TR>
<TR><TD>contents</TD><TD>words and phrases in the file</TD></TR>
<TR><TD>filename</TD><TD>name of the file</TD></TR>
<TR><TD>size</TD><TD>file size</TD></TR>
<TR><TD>write</TD><TD>file last modification time</TD></TR>
</TABLE>
</UL>
</UL>
<UL>
<LI>OLE property values can also be used in queries. Web sites with
files created by most OLE-aware applications can be queried for
these properties:<BR>
</UL>
<UL>
<UL>
<TABLE>
<TR><TH ALIGN="LEFT" WIDTH=168>Property name</TH>
<TH ALIGN="LEFT" WIDTH=270>Description
</TH></TR>
<TR><TD WIDTH=168>DocTitle</TD><TD WIDTH=270>title of the document</TD>
</TR>
<TR><TD WIDTH=168>DocSubject</TD><TD WIDTH=270>subject of the document
</TD></TR>
<TR><TD WIDTH=168>DocAuthor</TD><TD WIDTH=270>the document's author
</TD></TR>
<TR><TD WIDTH=168>DocKeywords</TD><TD WIDTH=270>keywords for the document
</TD></TR>
<TR><TD WIDTH=168>DocComments</TD><TD WIDTH=270>comments about the document
</TD></TR>
</TABLE>
</UL>
</UL>
<P>
A more complete list of properties can be found <A HREF="#AllPropertyNames">here</A>.
<BR>
<H3><A NAME="RelationalOperators">Relational operators </A></H3>
<P>
Relational operators are used in relational property queries.
<P>
<UL>
<TABLE BORDER="1" >
<TR><TH ALIGN="LEFT" WIDTH=175>To search for</TH>
<TH ALIGN="LEFT" WIDTH=144>Example</TH>
<TH ALIGN="LEFT" WIDTH=234>Results</TH></TR>
<TR><TD>property values in relation to a fixed value
</TD><TD>@size &lt; 100
<BR>
@size &lt;= 100
<BR>
@size = 100
<BR>
@size != 100
<BR>
@size &gt;= 100
<BR>
@size &gt; 100
</TD><TD>files whose size matches the query</TD></TR>
<TR><TD>property values with all of a set of bits on
</TD><TD>@attrib ^a 0x820</TD><TD>compressed files with the archive bit on
</TD></TR>
<TR><TD>property values with some of a set of bits on
</TD><TD>@attrib ^s 0x20</TD><TD>files with the archive bit on
</TD></TR>
</TABLE>
</UL>
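<P>
The two bit-set operators in the table above can be modeled with ordinary bit
masks. The Python sketch below is illustrative only: the function and constant
names are hypothetical, and the value 0x800 for the compressed bit is inferred
from the 0x820 example (compressed plus archive, where archive is 0x20).
<PRE>
```python
ARCHIVE = 0x20      # archive bit, as used in the ^s example
COMPRESSED = 0x800  # inferred: 0x820 is COMPRESSED plus ARCHIVE


def all_bits_on(value, mask):
    """Model of ^a: every bit in mask is set in value."""
    return (value & mask) == mask


def some_bits_on(value, mask):
    """Model of ^s: at least one bit in mask is set in value."""
    return (value & mask) != 0
```
</PRE>
<P>
Under these assumptions, a file whose attributes are 0x20 satisfies
&quot;@attrib ^s 0x820&quot; but not &quot;@attrib ^a 0x820&quot;, because only one
of the two bits is on.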
<H3><A NAME="PropertyValues">Property values </A></H3>
<P>
<UL>
<TABLE BORDER="1">
<TR><TH ALIGN="LEFT" WIDTH=130>To search for</TH>
<TH ALIGN="LEFT" WIDTH=200>Example</TH>
<TH ALIGN="LEFT" WIDTH=170>Results</TH></TR>
<TR><TD>a specific value</TD><TD>@DocAuthor = Bill Gates
</TD><TD>files authored by &quot;Bill Gates&quot;</TD>
</TR>
<TR><TD>values beginning with a prefix</TD><TD>#DocAuthor George*
</TD><TD>files whose author property begins with &quot;George&quot;
</TD></TR>
<TR><TD>files with any of a set of extensions</TD><TD>#filename *.|(exe|,dll|,sys|)
</TD><TD>files with &quot;.exe&quot;, &quot;.dll&quot;, or &quot;.sys&quot; extensions
</TD></TR>
<TR><TD>files modified after a date</TD>
<TD>@write &gt; 96/2/14 10:00:00</TD>
<TD>files modified after February 14, 1996 at 10:00 GMT</TD></TR>
<TR><TD>files modified after a relative date</TD>
<TD>@write &gt; -1d2h</TD>
<TD>files modified in the last 26 hours</TD></TR>
<TR><TD>vectors matching a vector</TD>
<TD>@vectorprop = { 10, 15, 20 }</TD>
<TD>OLE documents with a vectorprop value of { 10, 15, 20 }</TD></TR>
<TR><TD>vectors where each value matches a criterion</TD>
<TD>@vectorprop &gt;^a 15</TD>
<TD>OLE documents with a vectorprop value in which all values in the vector are greater than 15</TD></TR>
<TR><TD>vectors where at least one value matches a criterion</TD>
<TD>@vectorprop =^s 15</TD>
<TD>OLE documents with a vectorprop value in which at least one value is 15</TD></TR>
</TABLE>
</UL>
<UL>
<LI>Be sure to use the pound (#) character before the property
name when using a regular expression in a property value, and
an at (@) character otherwise. The equal (=) relational operator
is assumed for regular expression queries.
<LI>File name (#filename) is the only property that supports regular
expressions with wildcards to the <I>left</I> of text. Wildcards
in regular expressions for all other properties must come after
a prefix.
<LI>Date and time values are of the form yyyy/mm/dd hh:mm:ss. The first two
characters of the year and the entire time can be omitted. Dates and times
are in GMT.
<LI>Dates and times relative to the current time can be expressed
with a minus (-) character followed by zero or more
integer and time unit pairs. Time units are expressed as: (y) for
years, (m) for months, (w) for weeks, (d) for days, (h) for hours, (n) for
minutes, and (s) for seconds.
<LI>Currency values are of the form x.y, where x is the whole value amount
and y is the fractional amount. There is no assumption about units.
<LI>Boolean values are (t) or (true) for true and (f) or (false) for false.
<LI>Vectors (VT_VECTOR) are expressed as an opening brace ({), a comma-separated list of values, then a closing brace (}).
<LI>Single value expressions that are compared against vectors are expressed
as a <A HREF="#RelationalOperators">relational operator</A>, then a (^a) for
<I>All Of</I> or a (^s) for <I>Some Of</I>.
<LI>Numeric values can be in decimal or hex (preceded by 0x).
<LI>The <i>contents</i> property does not support relational operators.
If a relational operator
is specified, no results will be found. For example, "@contents Microsoft"
will find documents containing Microsoft, but "@contents<b>=</b>Microsoft"
will find none.
</UL>
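<P>
The date rules in the list above can be sketched in Python as follows. This is an
illustration, not the query engine's own logic. The function names are
hypothetical, only lowercase unit letters are accepted, and the lengths used for
a month (30 days) and a year (365 days) are assumptions, since this page does not
define them.
<PRE>
```python
import re
from datetime import datetime, timedelta, timezone

# Length of each time unit in seconds. The month and year lengths are
# assumptions of this sketch; the page does not specify them.
UNIT_SECONDS = {
    "y": 365 * 86400,
    "m": 30 * 86400,
    "w": 7 * 86400,
    "d": 86400,
    "h": 3600,
    "n": 60,
    "s": 1,
}


def parse_relative_time(text):
    """Convert a value such as '-1d2h' into a timedelta before now."""
    if not text.startswith("-"):
        raise ValueError("relative times start with a minus character")
    body = text[1:]
    if re.fullmatch(r"(\d+[ymwdhns])*", body) is None:
        raise ValueError("expected integer and time unit pairs")
    seconds = 0
    for amount, unit in re.findall(r"(\d+)([ymwdhns])", body):
        seconds += int(amount) * UNIT_SECONDS[unit]
    return timedelta(seconds=seconds)


def format_query_time(moment):
    """Format an aware datetime as yyyy/mm/dd hh:mm:ss in GMT."""
    return moment.astimezone(timezone.utc).strftime("%Y/%m/%d %H:%M:%S")


print(parse_relative_time("-1d2h") == timedelta(hours=26))  # True
print(format_query_time(datetime(1996, 2, 14, 10, 0, 0, tzinfo=timezone.utc)))
```
</PRE>
<P>
The first line of output confirms that -1d2h covers 26 hours, matching the table
above. The second prints a GMT timestamp in the full yyyy/mm/dd hh:mm:ss form.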
<BR>
<H4><A NAME="RegularExpressions">Regular expressions</A></H4>
<P>
Regular expressions in property queries are defined as follows:
<UL>
<LI>Any character except *, ., ?, and | matches just itself by default.
<LI>Regular expressions can be enclosed in matching quotes ("), and must
be enclosed in quotes if they contain a space ( ) or closing parenthesis ()).
<LI>*, ., and ? behave as you might expect: * matches any number of
characters, . matches a period (.) or end of sentence, and ? matches any one character.
<LI>| is an escape character. After |, the following characters
have special meaning:
<UL>
<LI>( opens a group. Must be followed by a matching )
<LI>) closes a group. Must be preceded by a matching (
<LI>[ opens a character class. Must be followed by a matching (un-escaped) ]
<LI>{ opens a counted match. Must be followed by a matching }
<LI>} closes a counted match. Must be preceded by a matching {
<LI>, separates OR clauses
<LI>* matches zero or more occurrences of preceding expression.
<LI>? matches zero or one occurrences of preceding expression.
<LI>+ matches one or more occurrences of preceding expression.
<LI>anything else, including | matches itself
</UL>
<LI>Between [ and ] the following characters have special meaning:
<UL>
<LI>^ Match everything but following classes. Must be the first character.
<LI>] Matches ]. May only be preceded by ^, otherwise it closes the class.
<LI>- Range operator. Preceded and followed by normal characters
<LI>anything else matches itself (or begins/ends a range at itself)
</UL>
<LI>Between { and } the following syntax applies:
<UL>
<LI>|{m|} matches exactly m occurrences of the preceding expression.
(0 &lt; m &lt; 256)
<LI>|{m,|} matches at least m occurrences of the preceding expression.
(1 &lt; m &lt; 256)
<LI>|{m,n|} matches between m and n occurrences of the preceding
expression, inclusive. (0 &lt; m &lt; 256, 0 &lt; n &lt; 256)
</UL>
</UL>
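<P>
To make the escape rules above more concrete, the following Python sketch
translates a subset of this syntax into a pattern for Python's re module. It is
an approximation under stated assumptions, not the query engine's matcher: the
function name to_python_regex is hypothetical, an unescaped period is treated as
a literal period only, and character classes and counted matches are not covered.
<PRE>
```python
import re


def to_python_regex(pattern):
    """Translate a subset of the property-query syntax to a Python regex."""
    # Meaning of the characters that become special after the | escape.
    simple_escapes = {
        "(": "(?:",  # open a group
        ")": ")",    # close a group
        ",": "|",    # separate OR clauses
        "*": "*",    # zero or more of the preceding expression
        "?": "?",    # zero or one of the preceding expression
        "+": "+",    # one or more of the preceding expression
    }
    out = []
    i = 0
    while i != len(pattern):
        ch = pattern[i]
        if ch == "|":
            if i + 1 == len(pattern):
                raise ValueError("dangling escape character")
            nxt = pattern[i + 1]
            if nxt in "[{}":
                # Simplification: classes and counts are outside this sketch.
                raise NotImplementedError("classes and counts are not covered")
            # Anything else after |, including | itself, matches itself.
            out.append(simple_escapes.get(nxt, re.escape(nxt)))
            i += 2
            continue
        if ch == "*":
            out.append(".*")  # any number of characters
        elif ch == "?":
            out.append(".")   # any one character
        else:
            out.append(re.escape(ch))  # matches just itself
        i += 1
    return "".join(out)


translated = to_python_regex("*.|(exe|,dll|,sys|)")
print(translated)
```
</PRE>
<P>
The translated pattern can then be tried with re.fullmatch, for example
re.fullmatch(translated, "setup.exe", re.IGNORECASE). Passing re.IGNORECASE
here is an assumption based on the rule that queries are case-insensitive.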
<BR>
<HR>
<H2><A NAME="Examples">Query Examples</A><BR> </H2>
<P>
<UL>
<TABLE BORDER="1">
<TR><TH ALIGN="LEFT" WIDTH=246>Example</TH>
<TH ALIGN="LEFT" WIDTH=246>Results</TH></TR>
<TR><TD>@size &gt; 1000000</TD>
<TD>pages larger than one million bytes</TD></TR>
<TR><TD>@write &gt; 95/12/23</TD>
<TD>pages modified after the date</TD></TR>
<TR><TD>Apple tree</TD>
<TD>pages with the phrase &quot;apple tree&quot;</TD></TR>
<TR><TD>&quot;apple tree&quot;</TD>
<TD>same as above</TD></TR>
<TR><TD>@contents apple tree</TD>
<TD>same as above</TD></TR>
<TR><TD>Microsoft and @size &gt; 1000000</TD>
<TD>pages with the word &quot;Microsoft&quot; that are larger than one million bytes</TD></TR>
<TR><TD>&quot;microsoft and @size &gt; 1000000&quot;</TD>
<TD>pages with the phrase specified (not the same as above)</TD></TR>
<TR><TD>#filename *.avi</TD>
<TD>video files (the '#' prefix is used because the query contains a regular expression)</TD></TR>
<TR><TD>@attrib ^s 32</TD>
<TD>pages with the archive attribute bit on</TD></TR>
<TR><TD>@docauthor = William Gates</TD>
<TD>pages with the given author</TD></TR>
<TR><TD>$contents why is the sky blue?</TD>
<TD>pages that match the query</TD></TR>
<TR><TD>@size &lt; 100 &amp; #filename *.gif</TD>
<TD>GIF files less than 100 bytes in size</TD></TR>
</TABLE>
</UL>
<HR>
<H2><A NAME="AllPropertyNames">List of Property Names</A></H2>
<P>
These properties are always available for queries. Additional properties
may also be available depending on the configuration of the web server.
<P>
<UL>
<TABLE BORDER="1">
<TR><TH ALIGN="LEFT" WIDTH=165>Property Name</TH>
<TH ALIGN="LEFT" WIDTH=291>Description</TH>
</TR>
<TR><TD>Filename</TD><TD>file name</TD></TR>
<TR><TD>Size</TD><TD>file size</TD></TR>
<TR><TD>Attrib</TD><TD>file attributes</TD></TR>
<TR><TD>Write</TD><TD>last write time</TD></TR>
<TR><TD>Create</TD><TD>create time</TD></TR>
<TR><TD>Access</TD><TD>access time</TD></TR>
<TR><TD>Change</TD><TD>last change time</TD></TR>
<TR><TD>Contents</TD><TD>file contents</TD></TR>
<TR><TD>ShortFileName</TD><TD>8.3 short file name</TD></TR>
<TR><TD>DocTitle</TD><TD>title of the document</TD></TR>
<TR><TD>DocSubject</TD><TD>subject of the document</TD></TR>
<TR><TD>DocAuthor</TD><TD>the document's author</TD></TR>
<TR><TD>DocKeywords</TD><TD>keywords for the document</TD></TR>
<TR><TD>DocComments</TD><TD>comments about the document</TD></TR>
<TR><TD>DocTemplate</TD><TD>template file used for the document</TD></TR>
<TR><TD>DocLastAuthor</TD><TD>the last author to work on the document</TD></TR>
<TR><TD>DocRevNumber</TD><TD>the revision number of the document</TD></TR>
<TR><TD>DocEditTime</TD><TD>the total amount of time spent editing the document</TD></TR>
<TR><TD>DocLastPrinted</TD><TD>the last time the file was printed</TD></TR>
<TR><TD>DocCreateDTM</TD><TD>create date and time</TD></TR>
<TR><TD>DocLastSaveDTM</TD><TD>last save date and time</TD></TR>
<TR><TD>DocPageCount</TD><TD>number of pages in the document</TD></TR>
<TR><TD>DocWordCount</TD><TD>number of words in the document</TD></TR>
<TR><TD>DocCharCount</TD><TD>number of characters in the document</TD></TR>
<TR><TD>DocAppName</TD><TD>name of the application used to create the file</TD></TR>
</TABLE>
</UL>
<P>
<BR>
<BR>
<HR>
<BR>
<FONT size=-1>
Copyright &#169; 1996 Microsoft Corporation. All rights reserved.
</FONT>
</BODY>
</HTML>