美文网首页
IR-chapter7: computing scores in

IR-chapter7: computing scores in

作者: woodsouthmmm | 来源:发表于2017-05-01 18:48 被阅读0次

efficient scoring and ranking

FastCosineScore
  • constructing a heap to pick out top K components
  • Inexact top K document retrieval
  • index elimination
    considering documents containing terms whose idf exceeds a preset threshold
    considering documents containing many(even all) terms
  • champion list
    precompute r documents with the highest weights for each term.
    r does not to be the same for every term.(rarer term, larger)
  • static quality scoring and ordering
    net-score
    global champion list, expansion two lists sorted by g(d) value
  • impact ordering
    sorted by common ordering: document-at-a-time scoring
    sorted by uncommon ordering: term-at-a-time scoring
    ordered by a decreasing tf value,advantage:
    1.stop after considering a prefix of posting list
    2.consedering query terms in decreasing order of idf.
  • cluster pruning
    pick ,compute nearest, cluster, computing cosine similarity from q to each leader, then the closest L and its follower
    variation - b1,b2

components of an information retrieval system

  • tiered indexes
    motivation: A has fewer than K documents
    solution: we set a tf threshold of 20 for tier 1 and 10 for tier 2, meaning that the tier 1 index only has postings entries with tf values exceeding 20, and the tier 2 index only has postings entries with tf values exceeding 10.
tiered indexes
  • designing parsing and scoring function
    query parser - translate the user-specified keywords into a query with various operators
    scoring function - manual configuration or machine-learned scoring

  • putting it all together

a complete search system

results snippets: snippets of text accompanying each document in the results list for a query.

Vector space scoring and query operator interaction

Google: the semantics of a conjunctive query that only retrieves documents containing all or most query terms.

  • Boolean retrieval
  • wildcard queries
  • phase queries

相关文章

网友评论

      本文标题:IR-chapter7: computing scores in

      本文链接:https://www.haomeiwen.com/subject/wisizttx.html