In Privacy-Preserving NLP, we discussed the DP-SGD algorithm. In this algorithm, increasing the noise multiplier and applying gradient clipping will: (Select only one)
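For reference, the two mechanisms named in this question can be sketched as a single gradient-aggregation step. This is a minimal illustration of the standard clip-then-add-Gaussian-noise recipe, not the course's own implementation; the function name `dp_sgd_noisy_gradient` and its parameters are hypothetical, and gradients are represented as plain Python lists.

```python
import math
import random


def dp_sgd_noisy_gradient(per_example_grads, clip_norm, noise_multiplier, rng=None):
    """Aggregate per-example gradients with clipping and Gaussian noise.

    Hypothetical helper for illustration only:
    - each gradient is scaled so its L2 norm is at most clip_norm,
    - the clipped gradients are summed,
    - Gaussian noise with std = noise_multiplier * clip_norm is added,
    - the result is averaged over the batch.
    """
    rng = rng or random.Random()
    dim = len(per_example_grads[0])
    summed = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Clipping bounds how much any single example can move the update.
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i, g in enumerate(grad):
            summed[i] += g * scale
    # A larger noise multiplier means more noise relative to the clip bound.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(per_example_grads) for v in noisy]
```

Setting `noise_multiplier` to 0 leaves only the clipping effect, which makes it easy to inspect the two mechanisms separately.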
A user expresses an information need of finding out the exact date Abraham Lincoln was born. Using the definition of relevance from our course’s information retrieval lectures, select the document that is most relevant to this information need.
Calculate the BLEU brevity penalty for the following scenario: Original Spanish sentence: “Tengo tanta hambre.” Our model’s proposed English sentence: “I am so very hungry” Accepted English reference sentence: “I am really hungry” (Enter the brevity penalty value.)
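The arithmetic behind this kind of question can be checked with a short sketch of the standard BLEU brevity penalty: 1 when the candidate is longer than the reference, and exp(1 - r/c) otherwise, where c and r are the candidate and reference lengths. The function name `brevity_penalty`, the whitespace tokenization, and the single-reference setup are assumptions made for illustration; the course may tokenize differently.

```python
import math


def brevity_penalty(candidate, reference):
    """Standard BLEU brevity penalty for one candidate and one reference.

    Assumes whitespace tokenization (an illustrative simplification).
    """
    c = len(candidate.split())  # candidate length in tokens
    r = len(reference.split())  # reference length in tokens
    if c == 0:
        return 0.0
    if c > r:
        # Candidates longer than the reference are not penalized.
        return 1.0
    return math.exp(1.0 - r / c)


# Scenario from the question: 5 candidate tokens against 4 reference tokens.
print(brevity_penalty("I am so very hungry", "I am really hungry"))
```

Because the proposed sentence has five tokens and the reference has four, the candidate is longer than the reference and the first branch applies, so no penalty is incurred.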
Suppose you are building an open-domain QA system for a consumer health website. The health website’s users type natural-language health questions such as “What are the early symptoms of type 2 diabetes?” or “Is ibuprofen safe to take with blood thinners?”. Your knowledge base is a corpus of approximately 2 million paragraphs drawn from publicly available medical encyclopedias, clinical guidelines, and patient-education pages. The corpus is updated quarterly with new and revised articles. You plan to use a Retriever-Reader architecture. For the retriever, you are considering a sparse approach (e.g., BM25/TF-IDF) and a dense approach (e.g., DPR). For the reader, you will fine-tune a BERT-based model.
Question Part 1: Health questions often use everyday language that differs from clinical terminology in the documents. Discuss the pros and cons of using sparse retrieval versus dense retrieval for this setting. (One pro and one con per method should be sufficient.)
Question Part 2: Because the corpus is updated quarterly, some older passages may contain outdated information. Name a specific method at the retrieval, re-ranking, or reader stage that would help the system prefer up-to-date information, and explain where in the pipeline it would be applied. (One or two sentences should be sufficient.)
When calculating a BLEU score, why is a brevity penalty necessary? (One or two sentences should be sufficient.)
Which of the following are true of parallel corpora? Select all that apply.
Which open-domain QA method augments a pretrained language model with a differentiable retrieval component and jointly learns retrieval and answering? (Select the open-domain QA method that best matches this description.)
Exam 4
Two common types of distillation towers are packed and tray type towers.
Each tray in the column forms a