Threshold Task Games: Theoretical Advances and an Empirical Study of Human Cooperative Behavior

English Original

Threshold Task Games: Theory, Platform and Experiments

ABSTRACT

Threshold task games (TTGs) are a class of cooperative games in which participants form coalitions to complete tasks associated with different rewards and thresholds for success. We provide efficient algorithms for computing approximately optimal coalition structures in TTGs. We also present non-trivial bounds on the cost of stability for this class. We put our theoretical results to practice: we design a web-based framework which allows human players to interact in a collaborative task-based model. Our analysis of human play in two different countries shows that players generally succeed in forming optimal coalition structures, and converge to approximately stable payoff divisions.

1 INTRODUCTION

A group of players needs to collaborate in order to complete a set of tasks. Their objective is twofold: first, they must form coalitions, i.e. disjoint groups of players, each working on a separate task; second, if players are self-interested, they must decide on a reasonable way of dividing revenue from their tasks.

Example 1.1. Alice, Bob and Claire sign up for an online freelance programming website (e.g. fiverr.com). The website currently has two programming tasks: a simple script task 𝑡1 that requires a total of 3 hours to complete and pays $5, and a complex programming task 𝑡2 that requires 7 hours of work and pays $15. It is possible to complete a task more than once (so 𝑡1 can be assigned to all three players). Alice and Bob can contribute 3 hours each, whereas Claire can contribute 4 hours. Assuming that the task hour load is easily divisible between the players, the (non-unique) best partition of players into work groups would be to assign Alice and Claire to 𝑡2, and have Bob work alone on 𝑡1. The next step would be to decide how one should divide task revenue amongst the players.
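The partition claim in Example 1.1 can be checked by brute force. The following is a minimal sketch, not the paper's algorithm: it enumerates every partition of the three players and scores each group by the best task its combined hours can cover. The function names (`coalition_value`, `partitions`, `welfare`) are illustrative assumptions, and a task is treated as completable when the group's hours are at least its requirement, since Alice and Claire's 7 hours exactly match 𝑡2.

```python
from itertools import combinations

# Data from Example 1.1: hours each player can contribute,
# and (required hours, reward in dollars) for each task.
weights = {"Alice": 3, "Bob": 3, "Claire": 4}
tasks = {"t1": (3, 5), "t2": (7, 15)}


def coalition_value(coalition):
    """Reward of the best task the coalition's combined hours can cover (0 if none)."""
    total = sum(weights[p] for p in coalition)
    return max((reward for hours, reward in tasks.values() if total >= hours), default=0)


def partitions(players):
    """Yield every partition of a list of players into non-empty disjoint groups."""
    if not players:
        yield []
        return
    first, rest = players[0], players[1:]
    # Pick the other members of the group containing `first`, then partition the rest.
    for k in range(len(rest) + 1):
        for others in combinations(rest, k):
            remaining = [p for p in rest if p not in others]
            for sub in partitions(remaining):
                yield [(first, *others)] + sub


def welfare(structure):
    """Total reward of a coalition structure."""
    return sum(coalition_value(group) for group in structure)


all_structures = list(partitions(list(weights)))
best_welfare = max(welfare(s) for s in all_structures)
best_structures = [s for s in all_structures if welfare(s) == best_welfare]

print(best_welfare)          # 20
for s in best_structures:    # two optimal structures, matching "non-unique"
    print(s)
```

Of the five possible partitions, two reach the maximum of $20: Alice and Claire on 𝑡2 with Bob alone on 𝑡1, and the symmetric one with Bob and Claire on 𝑡2. Exhaustive enumeration is only practical for very small player sets; it is used here purely to verify the example.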
One could reasonably argue that Claire should receive a higher share of the revenue than Alice, as she contributed more hours to the task; moreover, in order to complete 𝑡2, Claire has to be assigned to it. However, each person could have worked on 𝑡1 alone and received $5 for their efforts. Clearly, finding an appropriate payoff division is not a straightforward task.

Cooperative game theory studies situations in which players form coalitions and share revenue. One can model scenarios like the one described in Example 1.1 using a resource-task based model. Player 𝑖 controls a resource (its weight 𝑤𝑖), and a group of players can complete a task (and obtain a reward) if their combined weight meets or exceeds a certain threshold. The value of a group of players (also called a coalition) is the value of the best task it can complete given its resources. In the literature, these are known as threshold task games (TTGs). Despite their intuitive appeal, there has been little work analyzing solution concepts for TTGs or how humans behave when playing them. Our work directly addresses this gap.

1.1 Our Contribution

We study the cost of stability in TTGs: this measures the relative overhead required in order to stabilize the underlying TTG; that is, the amount of additional subsidy required in order to guarantee that there exists some payoff division that assigns each coalition a payoff of at least its value. In Section 3.1, we present a simple algorithm for computing coalition structures (partitions of the player set) that guarantee at least 1/2 of the optimal social welfare for TTGs (note that computing an optimal coalition structure for TTGs is NP-complete). Next, in Section 3.2 we provide a tight bound on the cost of stability for TTGs. Finally, we describe an online platform called Business Cats which provides a negotiation environment for playing TTGs simply and intuitively (Section 4).
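For readers who prefer symbols, the objects described above can be sketched with standard cooperative-game definitions. The notation ($T_j$, $r_j$, $m$, $\mathrm{CS}$, $\mathrm{CoS}$) is assumed here rather than taken from the excerpt. The threshold test uses a weak inequality, as Example 1.1 suggests (Alice and Claire's 7 hours exactly cover 𝑡2). The relative form of the cost of stability shown is one common convention and may differ in detail from the paper's own definition.

```latex
% Player i has weight w_i; task j has threshold T_j and reward r_j (j = 1, ..., m).
w(S) = \sum_{i \in S} w_i, \qquad
v(S) = \max\bigl(\{0\} \cup \{\, r_j : w(S) \ge T_j \,\}\bigr)

% Optimal social welfare over coalition structures (partitions) CS of the player set N.
\mathrm{OPT}(G) = \max_{\mathrm{CS}} \sum_{S \in \mathrm{CS}} v(S)

% A payoff vector x is stable if no coalition is paid less than its value.
x(S) = \sum_{i \in S} x_i \ge v(S) \quad \text{for all } S \subseteq N

% Relative cost of stability: the cheapest stable total payout, relative to OPT(G).
\mathrm{CoS}(G) = \min \Bigl\{ \tfrac{x(N)}{\mathrm{OPT}(G)} : x \ge 0,\; x(S) \ge v(S) \ \text{for all } S \subseteq N \Bigr\}
```

Under this reading, a cost of stability of 1 means that some division of the optimal welfare is already stable without any subsidy.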
We recruit participants in two countries to play TTG sessions on the Business Cats platform and analyze the results. Our analysis (Section 5) shows that human players form nearly optimal coalition structures, and arrive at core-stable payoff divisions. We identify key criteria contributing to successful play; for example, players tend to favor power-preserving proposals: players will often refuse proposals that do not assign them a payoff that is commensurate with their value in the game, as measured by their relative weight. These results provide evidence for the use of empirical negotiation frameworks to support theoretical results and induce good play from people.

5 EXPERIMENTS

We generate TTG instances, varying the number of players (3–5); weights (positive multiples of 5, no greater than 25); number of tasks (1–4); task thresholds (positive multiples of 5, no greater than 100); task values (from the number of players to 10); and whether singleton coalitions were available. We constrain our randomization so that greater task thresholds imply greater task values.

5.1 Design

We recruit 104 participants (undergraduate students from large national public universities) from two different countries. IRB approval was obtained from the institutions running the study. Participants were given a detailed tutorial, and were required to pass a comprehension quiz in order to play. Participants play a random series of games with different configurations, and are randomly matched to other players. Each participant receives a show-up fee equivalent to US $6.50 as well as a bonus (between $0 and $6.50) dependent on their total revenue in all the games they play. In total, we collected data from 857 game configurations.

We wish to study how people play the game in terms of how they form and respond to proposals.
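The instance parameters listed at the start of Section 5 could be sampled roughly as follows. This is a sketch under stated assumptions, not the authors' generator: the function name `generate_ttg_instance`, the uniform sampling, the use of distinct integer thresholds and values, and the coin flip for singleton availability are all assumptions made for illustration. Only the parameter ranges and the monotonicity constraint come from the text.

```python
import random


def generate_ttg_instance(rng):
    """Sample one TTG instance within the parameter ranges listed in Section 5."""
    n_players = rng.randint(3, 5)   # 3-5 players
    n_tasks = rng.randint(1, 4)     # 1-4 tasks

    # Weights: positive multiples of 5, no greater than 25.
    weights = [5 * rng.randint(1, 5) for _ in range(n_players)]

    # Thresholds: positive multiples of 5, no greater than 100.
    # Assumption: thresholds are distinct; they are sorted in ascending order.
    thresholds = sorted(rng.sample(range(5, 101, 5), n_tasks))

    # Values: from the number of players up to 10.
    # Assumption: distinct integers, sorted ascending, so that pairing them with
    # the sorted thresholds makes a greater threshold imply a greater value.
    values = sorted(rng.sample(range(n_players, 11), n_tasks))

    return {
        "weights": weights,
        "tasks": list(zip(thresholds, values)),         # (threshold, value) pairs
        "singletons_allowed": rng.random() < 0.5,       # assumed 50/50 split
    }


instance = generate_ttg_instance(random.Random(0))
print(instance)
```

Passing a seeded `random.Random` keeps the sampled configurations reproducible, which is convenient when the same configuration needs to be replayed by different groups.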
Specifically, we form the following hypotheses: first, that people would generally form “good” coalition structures in terms of total revenue and coalitional stability, as defined in Sections 3.1 and 3.2; second, that people respond to offers in a way that respects the power relationship in the game, as defined by their relative weight.

5.2 Analysis

We analyze the games played in terms of the number of optimal and stable coalitions formed, and how people accepted and made offers with respect to the power relationships in the game. We present aggregate results across the two countries as there was no significant difference in player behavior.

Successful Coalitions and Optimality: Tables show the games collected with respect to number of players and tasks, and the percentage of games for which players reach the optimal coalition structure. These tables show that as the number of players and tasks grows, player performance decreases (as measured by overall welfare). Figure 3 shows a histogram of the welfare achieved by players during gameplay, normalized by OPT(𝐺). Here, the 𝑥 axis is the ratio between the actual social welfare achieved in the game and the optimal social welfare. Figure 3 shows that in more than 75% of instances, players were able to extract at least 75% of the optimal revenue. In 51.7% of the game instances, players were able to reach the optimal coalition value. On average, players were able to extract 87% of the optimal value. Not shown in Figure 3 is the fact that in 86% of the cases, participants formed a non-trivial coalition structure, i.e. one containing more than one coalition. When the grand coalition 𝑁 was formed, it was the optimal choice 86% of the time. Together these results support the first part of hypothesis 1, showing that players were able to create approximately optimal coalition structures for different numbers of players and tasks.
Stability Analysis: In this section we constrain the analysis to the games in which there was a non-empty core (537 games). To measure the distance from a payoff division $\vec{x}$ to a stable payoff, we solve an optimization problem (using a linear programming formulation). Figure 4 shows a breakdown of the observed payoffs, as a function of their distance from a stable payoff. In general, 32.5% of the coalitions achieved stable outcomes, and more than 50% were at a distance of less than 20% from a stable payoff. This result supports the second half of the first hypothesis, showing that our bargaining process was able, on average, to arrive at approximately stable outcomes.

Power Analysis: To study the second hypothesis, we first consider the acceptance ratio for players of different weights in the game. Figure 5 shows the CDF of acceptance rates for the players with the highest and lowest weight in a game. For example, players with the highest weight in a game instance accepted a 40% share of the profits in approximately 20% of instances, whereas the lowest-weight players did so for approximately 70% of instances. In both cases, acceptance rates rise with share percentage, with a “jump” to more than 50% acceptance rate for shares above 30% for lowest-weight players, and approximately 50% for highest-weight players. This confirms that players with higher weights do, in general, understand their relative power in the game.

We also analyze how players respond to offers in the game. We apply the definition of Mash et al. and say an offer (𝑥𝑖, 𝑥𝑗) made to responders 𝑖 and 𝑗 with weights (𝑤𝑖, 𝑤𝑗) is power preserving if 𝑥𝑖 ≤ 𝑥𝑗 whenever 𝑤𝑖 ≤ 𝑤𝑗. The scatterplot in Figure 6 shows all offers made in the game to responder pairs (𝑖, 𝑗). All proposals lying above the 𝑦 = 1 line are non-power-preserving. Overall, 213 out of 1824 offers (11.7%) were non-power-preserving, which is considerably lower than the 40% ratio reported for WVGs.
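The power-preserving test defined above is simple enough to state as code. The sketch below follows a literal reading of the definition for one ordered responder pair (𝑖, 𝑗); the function names and the sample offers are hypothetical and are not data from the study. How ties in weight were treated in the paper's own analysis is not stated in the excerpt, so the equal-weight case here is an assumption.

```python
def is_power_preserving(x_i, x_j, w_i, w_j):
    """Literal reading of the definition for an ordered responder pair (i, j):
    the offer is power preserving if x_i <= x_j whenever w_i <= w_j."""
    return not (w_i <= w_j and x_i > x_j)


def non_power_preserving_share(offers):
    """Fraction of offers (x_i, x_j, w_i, w_j) that violate the definition."""
    violations = sum(not is_power_preserving(*offer) for offer in offers)
    return violations / len(offers)


# Hypothetical offers for illustration only: (x_i, x_j, w_i, w_j).
example_offers = [
    (3, 5, 10, 20),  # lighter responder gets less: power preserving
    (6, 4, 10, 20),  # lighter responder gets more: not power preserving
    (5, 5, 15, 15),  # equal weights, equal shares: power preserving
    (2, 7, 5, 25),   # lighter responder gets less: power preserving
]
print(non_power_preserving_share(example_offers))  # 0.25

# The reported rate can be checked directly from the counts in the text.
print(round(100 * 213 / 1824, 1))  # 11.7
```

The same function could be applied to any other power measure by passing, for example, Shapley values in place of the raw weights.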
In addition, 63.8% of non-power-preserving offers were rejected, whereas the rejection rate for power-preserving offers was just 32.6%. Put together, these results confirm the second hypothesis: people generally make offers that align with responders’ weight, and responders were more likely to accept such offers. We mention that similar results hold when other measures of player power (such as the Shapley or Banzhaf values) are used.

Active Participation: While player negotiation tactics are diverse, we do note that by actively proposing, players can significantly improve their prospects. That is, when players initiate proposals, they are more likely to receive better payoffs. Within the 857 games played, there were 1655 instances of (passive) players ending the game without making a proposal, and 1569 instances of (active) players making at least one offer. Active participation directly affects one’s prospective payoff; one simple measure of this property is one’s likelihood to benefit from cooperation. If player 𝑖 does not form coalitions with others, the highest payoff that they can secure is 𝑣({𝑖}). If player 𝑖 secures a payoff strictly greater than 𝑣({𝑖}) then they strictly benefit from cooperation. Passive players, who make no proposals, strictly benefit from cooperation in only 61% of instances; on the other hand, active players (those who made at least one proposal) strictly benefit from cooperation in 75% of the instances.

The results of our empirical study indicate that human players reach efficient, approximately stable outcomes in TTGs, and that actual bargaining processes, mirroring theoretical bargaining processes, yield approximately stable outcomes.

Article Summary

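Centered on threshold task games (TTGs) and drawing on both game theory and Transactional Analysis (TA), this article examines coalition formation and payoff division in cooperative games. The study proposes efficient algorithms for computing approximately optimal coalition structures and proves a tight bound of 2 on the cost of stability. Using the purpose-built online negotiation platform "Business Cats", the authors ran an empirical study in two countries and collected data from 857 games. The analysis finds that human players form nearly optimal coalition structures (extracting 87% of the optimal revenue on average, with 51.7% of games reaching the optimum) and tend toward stable payoff divisions (32.5% fully stable, and more than 50% within 20% of a stable payoff). Player behavior is "power preserving": players tend to make and accept payoff divisions that match players' weights, and players who actively make proposals do better. The study shows that the theoretical model agrees with actual human behavior and offers a new perspective on empirical applications of cooperative games.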

Commentary by Teacher Gao Deming (高德明)

Transactional Analysis (TA) Perspective

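This study shows people's remarkable capacity for "Adult ego state" communication in cooperative games. Players clearly recognize the resource value represented by their own weight and apply the power-preserving principle in negotiation, which reflects an active defense of their "life position". Through proactive proposals and rational responses, they build an "I'm OK, You're OK" win-win pattern of interaction, so that non-trivial coalition structures formed in 86% of cases. This communicative effectiveness deserves high praise.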

Solution-Focused Psychology Perspective

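The results fully display the wisdom of a solution focus. Rather than getting stuck on the hard question of how to divide payoffs fairly, players focused on the goal of forming the optimal coalition structure, and 51.7% of games reached the exact optimum. More encouraging still, iterated negotiation converged naturally toward stable states, which shows that the system itself has an inner drive toward a "preferred future". The 87% average revenue extraction indicates that people have an innate potential for building effective cooperative structures.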

Perspective from the Role of a Buddhist Scholar

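Viewed through dependent origination, this study reveals how the "dependent nature" (依他起性, paratantra-svabhāva) works in cooperation. Each player's weight is only a temporary attribute arising from causes and conditions, yet players do not cling to an isolated self-nature; by forming coalitions they pool resources and create collective value beyond the individual. Active proposers strictly benefited from cooperation in 75% of instances, compared with 61% for passive players, which embodies the bodhisattva-path wisdom that benefiting others benefits oneself. An experimental platform like "Business Cats" resembles a place of practice, where participants experience the joy of selfless collaboration through play.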