2023年11月25日

Python代写 | COMP6714 ASSIGNMENT 1

本次Python代写是完成布尔检索算法

COMP6714 ASSIGNMENT 1

Q1. (25 marks)
Some Boolean retrieval systems (e.g., Westlaw) support the proximity operator /S, which
restricts the occurrences matches to be within the same sentence.
Assume that we have created an additional positional list for $, which records the positions of the end of the sentences. E.g., for the document
A B C. D E.
the position list for $ is [4, 7].
You are required to engine an algorithm to support the query A /S B. To make the task
easier, we further constrain the semantics of the query to satisfy both conditions:
• the occurrences of A and B must be within the same sentence.
• the occurrence of A must precede that of B.
For example, the above example document matches the query A /S C, but not C /S A.
You need to
• make simple modifications to the pseudocode shown in Algorithm 1, which is exactly the algorithm in Figure 2.12 in the textbook. Note that we modify the
algorithm slightly so that array indexes start from 1 instead of 0. Specifically,
– you need to insert some code between Lines 6 and 7, and perform some modifications to some lines afterwards.
– In your submitted algorithm pseudocode (named Q1(p1, p2, p$
)), clearly mark
the modifications using color or boxes.
• You can assume that there is a function skipTo(p, docID, pos), which move the
cursor of list p to the first position such that (1) the position belongs to a document
docID, and (2) the position is no smaller than pos.
Q2. (25 marks)
Consider the scenario of dynamic inverted index construction. Assume that t sub-indexes
(each of M pages) will be created if one chooses the no-merge strategy.
(1) Show that if the logarithmic merge strategy is used, it will result in at most dlog2
te
sub-indexes.
(2) Prove that the total I/O cost of the logarithmic merge is O(t · M · log2
t).
1
2 DUE ON 20:59 4 NOV, 2020 (WED)
Algorithm 1: PositionalIntersect(p1, p2, k)
1 answer ← ∅;
2 while p1 6= nil ∧ p2 6= nil do
3 if docID(p1) = docID(p2) then
4 l ← [ ];
5 pp1 ← positions(p1); pp2 ← positions(p2);
6 while pp1 6= nil do
7 while pp2 6= nil do
8 if |pos(pp1) − pos(pp2)| ≤ k then
9 add(l, pos(pp2));
10 else
11 if pos(pp2) > pos(pp1) then
12 break;
13 pp2 ← next(pp2);
14 while l 6= [ ] ∧ |l[1] − pos(pp1)| > k do
15 delete(l[1]);
16 for each ps ∈ l do
17 answer ← answer ∪ [docID(p1), pos(pp1), ps];
18 pp1 ← next(pp1);
19 p1 ← next(p1); p2 ← next(p2);
20 else
21 if docID(p1) < docID(p2) then
22 p1 ← next(p1);
23 else
24 p2 ← next(p2);
25 return answer ;
Q3. (25 marks)
After the δ encoding, the compressed non-positional inverted list is
01000101 11110001 01110000 00110000 11110110 11011
• Decode the sequence of numbers the compressed list represents.
• List the document IDs in this list.
Q4. (25 marks)
Consider the WAND algorithm described in Section 2.4 in the original paper.1
. There
is a typo in the algorithm in Line 21: it should use “terms[0 .. (pTerm-1)]”.
1Efficient Query Evaluation using a Two-Level Retrieval Process.
COMP6714 ASSIGNMENT 1 3
However, even with this fixed, there is a bug in the algorithm (Figure 2) in which the
algorithm will end up in an infinite loop.
You need to
• Identify the single lines in Figure 2 that causes the bug and describe concisely why
this will lead to a bug.
• Give a simple example illustrating this bug. You should use three terms (named
A, B, C) and k = 1. Do not include unncessary entries in the lists.

程序代写代做C/C++/JAVA/安卓/PYTHON/留学生/PHP/APP开发/MATLAB

CS代写,留学生编程代写,CS作业代写,Java代写,程序代写，代码代写 | ITCS代写

本网站支持淘宝支付宝微信支付 paypal等等交易。如果不放心可以用淘宝交易！

E-mail:itcsdx@outlook.com 微信:itcsdx

如果您使用手机请先保存二维码，微信识别。如果用电脑，直接掏出手机果断扫描。

留学生首次寻找CS代写机构的时候要注意哪些？Java代写 | Weka机器学习 | COMP90049 Project 2

CONTACT

Assignment Example

Service Scope

Recent Case

2024年10月8日

ITCS代写

Python代写 | COMP6714 ASSIGNMENT 1

CONTACT

Assignment Example

Service Scope

Recent Case

MySQL数据库学习指南：留学生如何在不同国家的课程和就业形势下脱颖而出

北美计算机留学高校整理与热门专业前景分析

留学生计算机代写常见服务有哪些？

留学生程序代写靠谱吗

留学生如何选择机器学习方向的专业

Tags