Standards

  • Fw: FT word Distance exactly


    Andrew,
    thanks again for pointing out this error in the semantics of the distance
    functions. Sorry for the late response.
    Here is how the function fts:ApplyFTWordDistanceExactly should read. Please
    note the change in the return clause. As a result, your query[2] will then
    evaluate to False, because SE-3 will no longer be eliminated.
    declare function fts:ApplyFTWordDistanceExactly(
        $ as element(, fts:FTM),
        $allMatches as element(allMatches, fts:AllMatches),
        $n as xs:integer
    ) as element(allMatches, fts:AllMatches) {
        <allMatches>
        {
            for $match in $allMatches/match
            let $sorted := for $si in $match/stringInclude
                           order by $si/tokenInfo/@pos ascending
                           return $si
            (: keep the match only if every pair of neighbouring
               StringIncludes is exactly $n words apart :)
            where every $idx in (1 to fn:count($sorted) - 1)
                  satisfies fts:wordDistance(
                                $sorted[$idx]/tokenInfo,
                                $sorted[$idx+1]/tokenInfo,
                                $) = $n
            return
                <match>
                {$match/stringInclude}
                {
                    (: keep a StringExclude if some StringInclude
                       is exactly $n words away from it :)
                    for $stringExcl in $match/stringExclude
                    where some $stringIncl in $match/stringInclude
                          satisfies fts:wordDistance(
                                        $stringIncl/tokenInfo,
                                        $stringExcl/tokenInfo,
                                        $) = $n
                    return $stringExcl
                }
                </match>
        }
        </allMatches>
    }
    So, yes, as you pointed out, a StringExclude is kept as soon as it is at
    the required distance from at least one of the remaining StringIncludes.
    The same correction has to be applied to the other distance functions
    (replacing "where every $stringIncl" with "where some $stringIncl" in the
    return clause).
    The corrections will be included in the next Working Draft.
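
The corrected filter can be modelled outside XQuery. The following is a minimal Python sketch, not the Working Draft's definition. The names Match, word_distance, apply_distance_exactly and ft_contains are hypothetical. It assumes that the word distance between two positions is the number of tokens strictly between them, and that a query is true when some surviving Match has no StringExclude left; the second assumption is inferred from how query[2] is said to evaluate in this thread.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Match:
    # Token positions of the StringIncludes and StringExcludes of one Match.
    includes: List[int]
    excludes: List[int] = field(default_factory=list)


def word_distance(p: int, q: int) -> int:
    # Assumption: distance = number of tokens strictly between p and q.
    return abs(p - q) - 1


def apply_distance_exactly(all_matches: List[Match], n: int) -> List[Match]:
    result = []
    for match in all_matches:
        ordered = sorted(match.includes)
        # "where every": all neighbouring StringIncludes must be n words apart.
        if not all(word_distance(a, b) == n
                   for a, b in zip(ordered, ordered[1:])):
            continue
        # "where some": a StringExclude survives if at least one
        # StringInclude is exactly n words away from it.
        kept = [e for e in match.excludes
                if any(word_distance(i, e) == n for i in match.includes)]
        result.append(Match(list(match.includes), kept))
    return result


def ft_contains(all_matches: List[Match]) -> bool:
    # Assumption: true if some Match carries no StringExclude.
    return any(not m.excludes for m in all_matches)
```

With SI-1, SI-2 and SE-3 from query[2], SE-3 is 0 words away from SI-2, so it is kept and the query no longer evaluates to True.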
    Here are some more examples showing how distance and negation are
    intended to interact.
    query[2] = . ftcontains ("word1" && "word2" && ! "word3") with distance
    exactly 0 words
    The query matches, for example:
    <node>word0 word1 word2 word4 </node>
    and also
    <node>word0 word2 word1 word4 </node>
    provided none of the words in the node is matched by "word3". Loosely
    speaking, the query returns true for a node if it contains word1 and word2
    adjacently in any order and not immediately preceded or followed by an
    occurrence of word3.
    Hence, the following do not match:
    <node>word1 word2 word3 </node>
    <node>word2 word1 word3 </node>
    <node>word3 word2 word1 </node>
    <node>word3 word1 word2 </node>
    <node>word1 word4 word2 </node> <!-- word1 and word2 need to be adjacent -->
    <node>word13 word2 </node> <!-- where word13 is matched by both word1 and word3 -->
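
The matching and non-matching nodes listed for query[2] can be checked end to end with a small Python sketch. It is an illustration under assumptions, not the specification's algorithm. The function query2_matches and its is_word1, is_word2 and is_word3 predicates are hypothetical; tokens are split on whitespace; the word distance is taken to be the number of tokens strictly between two positions; and the query is taken to be true when some Match ends up without a StringExclude.

```python
from itertools import product


def word_distance(p, q):
    # Assumption: distance = number of tokens strictly between p and q.
    return abs(p - q) - 1


def query2_matches(text, n=0, is_word1=None, is_word2=None, is_word3=None):
    """Model of ("word1" && "word2" && ! "word3") with distance exactly n words."""
    is_word1 = is_word1 or (lambda t: t == "word1")
    is_word2 = is_word2 or (lambda t: t == "word2")
    is_word3 = is_word3 or (lambda t: t == "word3")
    tokens = text.split()

    def positions(pred):
        # 1-based positions of the tokens accepted by the predicate.
        return [i for i, t in enumerate(tokens, start=1) if pred(t)]

    # ! "word3" contributes every word3 position as a StringExclude.
    excludes = positions(is_word3)
    # "word1" && "word2": one Match per pair of positions.
    for includes in product(positions(is_word1), positions(is_word2)):
        ordered = sorted(includes)
        if any(word_distance(a, b) != n for a, b in zip(ordered, ordered[1:])):
            continue  # the StringIncludes are not n words apart
        # Corrected rule: keep an exclude if SOME include is n words away.
        kept = [e for e in excludes
                if any(word_distance(i, e) == n for i in includes)]
        if not kept:
            return True
    return False


for node in ["word0 word1 word2 word4", "word1 word2 word3", "word1 word4 word2"]:
    print(node, "->", query2_matches(node))
```

The word13 case can be modelled by passing predicates that accept "word13" for both word1 and word3, for example is_word1=lambda t: t in ("word1", "word13").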
    Yours sincerely / Mit freundlichen Grüßen,
    Jochen D
    IBM Germany B Laboratory
    DB2 Information Management Software
    Phone: +49-7031-16-2992, Fax: -4891, Email: doerre (AT) de (DOT) ibm.com
    Dear editors,
    Given the node <Node>word1 word2 word3</Node>, I apply query[1]:
    /Node ftcontains ("word1" && "word2" && "word3") with distance exactly 0
    words
    I get AllMatches[1] as:
    AllMatches
      Match
        StringInclude (pos = 1)
        StringInclude (pos = 2)
        StringInclude (pos = 3)
    The final result is True.
    I then apply query[2]:
    /Node ftcontains ("word1" && "word2" && ! "word3") with distance exactly
    0 words
    I seem to get AllMatches[2] as:
    AllMatches
      Match
        StringInclude (pos = 1)
        StringInclude (pos = 2)
    The final result is also True.
    The reason for AllMatches[2] is that the StringExclude (pos = 3), which
    is generated by ! "word3", has been dropped according to the semantics of
    ApplyFTWordDistanceExactly, because SE-3 does not have a word distance of
    0 to both SI-1 and SI-2.
    Are my two results correct? If they are, isn't this inconsistent? What is
    the intuition when "word3" becomes a don't-care?
    Can I compare SE-3 to any one of SI-1 and SI-2 rather than to both of them?
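
The difference between the two readings can be made concrete with a short Python sketch for SI-1, SI-2 and SE-3. The helper names are hypothetical, and the distance model (number of tokens strictly between two positions) is an assumption rather than the specification's fts:wordDistance.

```python
def word_distance(p, q):
    # Assumption: distance = number of tokens strictly between p and q.
    return abs(p - q) - 1


def kept_with_every(includes, exclude, n):
    # Reading questioned here: SE must be n words from ALL StringIncludes.
    return all(word_distance(i, exclude) == n for i in includes)


def kept_with_some(includes, exclude, n):
    # Alternative reading: SE must be n words from AT LEAST ONE StringInclude.
    return any(word_distance(i, exclude) == n for i in includes)


includes = [1, 2]  # SI-1, SI-2
exclude = 3        # SE-3

print(kept_with_every(includes, exclude, 0))
print(kept_with_some(includes, exclude, 0))
```

SE-3 is 1 word away from SI-1 and 0 words away from SI-2, so the "every" reading drops it and the "some" reading keeps it.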
    Thanks,
