To some extent, most of the techniques covered so far aim to improve the accuracy, and therefore the reliability, of completions; self-consistency [1] is the clearest example. Beyond these basic prompting strategies, however, a number of other techniques can be used to improve reliability.

LLMs exhibit a range of problems, including hallucinations [2], flawed explanations when using CoT methods [2], and several biases: majority label bias, recency bias, and common token bias [3]. In addition, zero-shot chain-of-thought can be particularly biased when dealing with sensitive topics [4].

Common solutions include using calibrators to remove a priori biases, using verifiers to score completions, and promoting diversity in completions.
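To make the calibrator idea concrete, here is a minimal sketch, loosely in the spirit of the contextual calibration approach described by Zhao et al. (2021). It assumes you can obtain the model's label probabilities both for the real input and for a content-free input such as "N/A"; the function name `calibrate` and all probability values are hypothetical, and the plain divide-and-renormalize step is a simplification rather than the paper's exact procedure.

```python
def calibrate(label_probs, content_free_probs):
    """Rescale label probabilities by the model's prior on a content-free input.

    label_probs: probabilities the model assigns to each label for the real input.
    content_free_probs: probabilities for the same labels when the input carries
        no content (e.g. "N/A"); this approximates the model's built-in bias.
    Returns a renormalized list of probabilities.
    """
    if len(label_probs) != len(content_free_probs):
        raise ValueError("both lists must cover the same labels")
    if any(q <= 0 for q in content_free_probs):
        raise ValueError("content-free probabilities must be positive")

    # Down-weight labels the model favors even when there is nothing to classify.
    adjusted = [p / q for p, q in zip(label_probs, content_free_probs)]
    total = sum(adjusted)
    return [a / total for a in adjusted]


# Hypothetical numbers for the labels ["Positive", "Negative"].
content_free = [0.7, 0.3]  # the model leans "Positive" with no real input
raw = [0.6, 0.4]           # raw prediction for an actual input
calibrated = calibrate(raw, content_free)
```

With these made-up numbers, the raw prediction favors "Positive", but after dividing out the model's prior the calibrated prediction favors "Negative". When the content-free probabilities are uniform, the function returns the raw probabilities unchanged.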

Footnotes

1. Wang, X., Wei, J., Schuurmans, D., Le, Q., Chi, E., Narang, S., Chowdhery, A., & Zhou, D. (2022). Self-Consistency Improves Chain of Thought Reasoning in Language Models.
2. Ye, X., & Durrett, G. (2022). The Unreliability of Explanations in Few-shot Prompting for Textual Reasoning.
3. Zhao, T. Z., Wallace, E., Feng, S., Klein, D., & Singh, S. (2021). Calibrate Before Use: Improving Few-Shot Performance of Language Models.
4. Shaikh, O., Zhang, H., Held, W., Bernstein, M., & Yang, D. (2022). On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning.
