[Paper Reading, WWW'23] Zero-shot Clarifying Question Generation for Conversational Search

Table of Contents

  • Preface
  • Motivation
  • Contributions
  • Method
    • Facet-constrained Question Generation
    • Multiform Question Prompting and Ranking
  • Experiments
  • Dataset
  • Result
    • Auto-metric evaluation
    • Human evaluation
  • Knowledge

Preface

  • I recently re-read some earlier papers and reorganized my old notes
  • If anything is misunderstood, corrections are welcome
  • Summary: using facet words as constraints, this paper performs zero-shot constrained generation with GPT-2 so that the generated questions are more likely to contain the facet words. It also uses prompting with eight templates, generates one candidate per template, and then uses ranking algorithms to automatically select a final result.
  • More papers can be found at: ShiyuNee/Awesome-Conversation-Clarifying-Questions-for-Information-Retrieval: Papers about Conversation and Clarifying Questions (github.com)

Motivation

Generate clarifying questions in a zero-shot setting to overcome the cold start problem and data bias.

cold start problem: lack of data makes the system hard to deploy, and a system that is hard to deploy cannot collect data

data bias: it is unrealistic to obtain supervised data covering every possible topic, and training on such data introduces bias

Contributions

  • the first to propose a zero-shot clarifying question generation system, which attempts to address the cold-start challenge of asking clarifying questions in conversational search.
  • the first to cast clarifying question generation as a constrained language generation task and show the advantage of this configuration.
  • We propose an auxiliary evaluation strategy for clarifying question generations, which removes the information-scarce question templates from both generations and references.

Method

Backbone: a checkpoint of GPT-2

  • its original inference objective is to predict the next token given all preceding text

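As a reminder (this is the standard autoregressive language-modeling objective, not something specific to this paper), GPT-2 factorizes the probability of a token sequence $x_{1:T}$ as

$$p(x_{1:T}) = \prod_{t=1}^{T} p(x_t \mid x_{1:t-1}),$$

so at inference time the decoder repeatedly samples or searches for the next token from $p(x_t \mid x_{1:t-1})$.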

Directly appending the query q and facet f as input and letting GPT-2 generate cq causes two challenges:

  • it does not necessarily cover facets in the generation.
  • the generated sentences are not necessarily in the tone of clarifying questions

We divide our system into two parts:

  • facet-constrained question generation (tackles the first challenge)
  • multiform question prompting and ranking (tackles the second challenge: rank the clarifying questions generated from different templates)

Facet-constrained Question Generation

Our model utilizes the facet words not as input but as constraints, employing an algorithm called Neurologic Decoding, which is based on beam search.

  • at step $t$, assume the candidates already in the beam are $C = \{c_{1:k}\}$, where $k$ is the beam size, $c_i = x^i_{1:(t-1)}$ is the $i$-th candidate, and $x^i_{1:(t-1)}$ are the tokens generated from decoding step $1$ to $(t-1)$

    [Figure omitted: the candidate-selection steps of Neurologic Decoding]

    • why this method better constrains the decoder to generate facet-related questions (a simplified sketch follows this list):
      • step (2), top-$\beta$ filtering, is the main reason facet words are promoted in the generations: because of this filtering, Neurologic Decoding tends to discard generations with fewer facet words regardless of their generation probability
      • step (3), grouping, is the key for Neurologic Decoding to explore as many branches as possible: it keeps the most constraint-satisfaction states (up to $2^{|f|}$) of facet-word inclusion, allowing the decoder to cover the most possibilities of ordering constraints in generation
        • if we chose the top-$k$ candidates directly, several of them might contain the same facets, leaving fewer candidates with diverse facets; by instead choosing the best candidate in each group and then taking the top $k$ across groups, the surviving candidates cover different facet combinations
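A minimal Python sketch of one such decoding step, simplified from the description above: it only handles single-token facet words and omits the in-progress constraint tracking of the full Neurologic Decoding algorithm, so it illustrates steps (2) and (3) rather than reproducing the authors' implementation. The `next_token_fn` argument is an assumed wrapper around the language model.

```python
from typing import Callable, List, Set, Tuple

Candidate = Tuple[List[str], float]  # (tokens generated so far, log-probability)

def neurologic_step(
    beam: List[Candidate],
    facet_words: Set[str],
    next_token_fn: Callable[[List[str]], List[Tuple[str, float]]],  # assumed LM wrapper
    beam_size: int,
    beta: int,
) -> List[Candidate]:
    """One simplified decoding step: expand, top-beta filter, group, re-rank."""
    # (1) expand every candidate in the beam with its possible next tokens
    expanded: List[Candidate] = []
    for tokens, logp in beam:
        for tok, tok_logp in next_token_fn(tokens):
            expanded.append((tokens + [tok], logp + tok_logp))

    def satisfied(tokens: List[str]) -> frozenset:
        # which facet-word constraints this candidate already covers
        return frozenset(w for w in facet_words if w in tokens)

    # (2) top-beta filtering: keep the candidates covering the most facet words,
    # regardless of their generation probability
    expanded.sort(key=lambda c: len(satisfied(c[0])), reverse=True)
    expanded = expanded[:beta]

    # (3) group by the set of satisfied constraints (up to 2^|f| groups), keep only
    # the highest-scoring candidate in each group, so the surviving beam covers as
    # many different facet combinations as possible, then fill the beam by score
    best_per_group: dict = {}
    for tokens, logp in expanded:
        key = satisfied(tokens)
        if key not in best_per_group or logp > best_per_group[key][1]:
            best_per_group[key] = (tokens, logp)

    survivors = sorted(best_per_group.values(), key=lambda c: c[1], reverse=True)
    return survivors[:beam_size]
```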

Multiform Question Prompting and Ranking

Use clarifying question templates as the starting text of the generation and let the decoder generate the rest of the question body.

  • if the query q is "I am looking for information about South Africa.", we give the decoder "I am looking for information about South Africa. [SEP] would you like to know" as input and let it generate the rest
  • we use multiple prompts (templates) to both cover more ways of clarification and avoid boring the user

For each query, we append each of the eight prompts to the query, forming eight inputs and generating eight questions.

  • ranking methods are then used to choose the best one as the returned question (a small sketch follows below)
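A small sketch of this prompt-and-rank loop, assuming a Hugging Face GPT-2 checkpoint. Only the template "would you like to know" comes from the example above; the other templates are illustrative placeholders rather than the paper's actual set, plain `model.generate` stands in for the facet-constrained decoder of the previous section, and ranking by language-model loss is just one simple choice of ranking method, not necessarily the one used in the paper.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

TEMPLATES = [
    "would you like to know",   # from the example above
    "are you looking for",      # illustrative placeholder
    "do you want to know",      # illustrative placeholder
    # ... remaining templates (eight in total in the paper)
]

def generate_candidates(query: str) -> list:
    """Append each template to the query and let the decoder complete the question."""
    candidates = []
    for template in TEMPLATES:
        prompt = f"{query} [SEP] {template}"
        input_ids = tokenizer(prompt, return_tensors="pt").input_ids
        with torch.no_grad():
            output_ids = model.generate(
                input_ids,
                max_new_tokens=20,
                do_sample=False,
                pad_token_id=tokenizer.eos_token_id,
            )
        text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
        candidates.append(text.split("[SEP]")[-1].strip())  # keep only the question part
    return candidates

def pick_best(questions: list) -> str:
    """Rank the candidates by per-token LM loss and return the most fluent one."""
    def lm_loss(text: str) -> float:
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            return model(ids, labels=ids).loss.item()
    return min(questions, key=lm_loss)

query = "I am looking for information about South Africa."
print(pick_best(generate_candidates(query)))
```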

Experiments

Zero-shot clarifying question generation with existing baselines (input construction is sketched after this list):

  • Q-GPT-0
    • input: query
  • QF-GPT-0:
    • input: facet + query
  • Prompt-based GPT-0: includes a special instructional prompt as input
    • input: q “Ask a question that contains words in the list [f]”
  • Template-0: a template-guided approach using GPT-2
    • input: add the eight question templates during decoding and generate the rest of the question
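For concreteness, a sketch of how the four zero-shot baseline inputs could be assembled; the strings follow the descriptions above, not the authors' released code, and the facet value is a hypothetical example.

```python
def build_zero_shot_inputs(q: str, f: str, template: str) -> dict:
    """Input strings for the four zero-shot baselines described above."""
    return {
        "Q-GPT-0": q,                                    # query only
        "QF-GPT-0": f"{f} {q}",                          # facet + query
        "Prompt-based GPT-0": f'{q} "Ask a question that contains words in the list [{f}]"',
        "Template-0": f"{q} {template}",                 # decoder completes the template
    }

print(build_zero_shot_inputs(
    "I am looking for information about South Africa.",
    "history",                   # hypothetical facet, for illustration only
    "would you like to know",
))
```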

Existing facet-driven baselines (finetuned):

  • Template-facet: append the facet word right after the question template


  • QF-GPT: a GPT-2 finetuning version of QF-GPT-0
    • finetuned on a set of tuples in the form f [SEP] q [BOS] cq [EOS]
  • Prompt-based finetuned GPT: a finetuning version of Prompt-based GPT-0
    • finetunes GPT-2 on inputs structured as: q "Ask a question that contains words in the list [f]." cq (both formats are sketched below)
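A minimal sketch of how one training example would be formatted for the two finetuned baselines, taking the special-token layouts literally from the descriptions above; cq is the reference clarifying question from the training set.

```python
def format_qf_gpt(f: str, q: str, cq: str) -> str:
    # QF-GPT finetuning example: f [SEP] q [BOS] cq [EOS]
    return f"{f} [SEP] {q} [BOS] {cq} [EOS]"

def format_prompt_finetuned_gpt(f: str, q: str, cq: str) -> str:
    # Prompt-based finetuned GPT example: q "Ask a question ... [f]." cq
    return f'{q} "Ask a question that contains words in the list [{f}]." {cq}'
```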

Note: simple facet-as-input finetuning is highly inefficient at informing the decoder to generate facet-related questions, as shown by a facet coverage rate of only 20%.

Dataset

ClariQ-FKw: has rows of (q,f,cq) tuples.

  • q is an open-domain search query, f is a search facet, cq is a human-generated clarifying question
  • The facet in ClariQ is in the form of a faceted search query. ClariQ-FKw extracts the keyword of the faceted query as its facet column and samples a dataset with 1756 training examples and 425 evaluation examples.

Our proposed system does not access the training set, while the supervised baselines use it for finetuning.

Result

Auto-metric evaluation

[Tables omitted: automatic evaluation results]

RQ1: How well can we do in zero-shot clarifying question generation with existing baselines?

  • all these baselines (the first four rows) struggle to produce any reasonable generations, except for Template-0 (though its question body is still poor)
  • we find that existing zero-shot GPT-2-based approaches cannot solve the clarifying question generation task effectively

RQ2: the effectiveness of facet information for facet-specific clarifying question generation

  • we compare our proposed zero-shot facet-constrained (ZSFC) methods with a facet-free variation of ZSFC, named Subject-constrained, which uses the subject of the query as the constraint
  • the study shows that adequate use of facet information can significantly improve clarifying question generation quality

RQ3: whether our proposed zero-shot approach can perform on par with or even better than existing facet-driven baselines

  • From both tables, we see that our zero-shot facet-driven approaches are consistently better than the finetuning baselines

Note: Template-facet rewriting is a simple yet strong baseline; both finetuning-based methods actually perform worse than it.

Human evaluation

[Figure omitted: human evaluation results]

Knowledge

Approaches to clarifying query ambiguity can be roughly divided into three categories:

  • Query Reformulation: iteratively refine the query
    • is more efficient in context-rich situations
  • Query Suggestion: offer related queries to the user
    • good for guiding users toward search topics and discovering user needs
  • Asking Clarifying Questions: proactively engage users to provide additional context
    • can be especially helpful for clarifying ambiguous queries that come without context
