[PingCAP] Building a Large Language Model Assistant for Enterprise Users.pdf

Uploaded by: Zhang** | ID: 153216 | 2024-01-15 | 52 pages | 7.88 MB

Building a Large Language Model Assistant for Enterprise Users
Li Li (李粒), Head of PingCAP AI Lab

Contents
Part 1 - Introduction
Part 2 - First Attempt
Part 3 - Optimization

Part 1: Introduction

Large language models (LLMs) with private or enterprise data involved

Knowledge insertion paradigm: pre-training
Build a transformer model with 1 billion to 100 billion parameters.
Sample text (repeated on each paradigm slide): TiDB is an open-source NewSQL database that supports Hybrid Transactional and Analytical Processing (HTAP) workloads. It is MySQL compatible and can provide horizontal scalability, strong consistency, and high availability. It is developed and supported primarily by PingCAP and licensed under Apache 2.0, though it is also available as a paid product. TiDB drew its initial design inspiration from Google's Spanner and F1 papers.
GPU, Dataset, Parallel, Optimizer, RL

Knowledge insertion paradigm: fine-tuning
Fold the knowledge into the weights of the deep neural network.
FFT, PEFT, LoRA

Knowledge insertion paradigm: in-context learning or retrieval-augmented generation
Put the context into the prompt.

Prompt
Some facts:
- You are a professional assistant named TiDB Bot which can answer customer questions related to TiDB and TiDB Cloud.
The document fragments:
TiDB is an open-source
Give the context, answer the following questions:
question_from_user

Knowledge insertion paradigms compared

Paradigm        Category            Data required             Implementation time
Pre-training    -                   45 TB                     At least 3 months
Fine-tuning     Full Fine-Tuning    More than 100k samples    Days
P
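The prompt layout shown for in-context learning and retrieval-augmented generation can be sketched in a few lines of Python. This is only a minimal illustration: the function name build_prompt and the constant SYSTEM_FACT are hypothetical, and the slide does not say how the retrieved fragments are joined, so joining them with newlines is an assumption.

```python
# Hypothetical names; only the prompt wording comes from the slide.
SYSTEM_FACT = (
    "You are a professional assistant named TiDB Bot which can answer "
    "customer questions related to TiDB and TiDB Cloud."
)


def build_prompt(fragments, question):
    """Assemble facts, retrieved document fragments, and the user question."""
    # Assumption: fragments are separated by newlines.
    fragment_text = "\n".join(fragments)
    return (
        "Some facts:\n"
        f"- {SYSTEM_FACT}\n\n"
        "The document fragments:\n"
        f"{fragment_text}\n\n"
        "Give the context, answer the following questions:\n"
        f"{question}"
    )


if __name__ == "__main__":
    print(build_prompt(["TiDB is MySQL compatible."], "Is TiDB MySQL compatible?"))
```

Because the knowledge lives in the prompt rather than in the model weights, updating the assistant only requires changing the fragments passed in.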

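This talk covers TiDB Bot, a large language model assistant built by a team led by Li Li, head of PingCAP AI Lab, and its use and optimization of multi-turn dialogue, knowledge insertion paradigms, and retrieval-augmented generation. TiDB Bot can take part in multi-turn conversations, understand user queries, and give accurate answers about TiDB and TiDB Cloud. Several problems remain: OpenAI's Embedding Model has incomplete support for multilingual corpora, retrieval results are not accurate enough, and the bot sometimes answers questions unrelated to TiDB. The proposed improvements include using a self-hosted Embedding Model, introducing harmful content detection, and optimizing automated statistics tasks. The talk also describes TiDB Bot's ReRank technique and its Documentation Corpora Adjusted Question-Chunk Pairs strategy, both aimed at improving retrieval accuracy. Overall, TiDB Bot has had some success in helping users understand and use TiDB and TiDB Cloud, but it still needs further improvement and optimization.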
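To illustrate where a rerank step sits in a retrieval pipeline, here is a small retrieve-then-rerank sketch. It is not TiDB Bot's implementation: the bag-of-words embed function is a toy stand-in for an embedding model, the term-coverage score is a toy stand-in for a rerank model, and every function name is hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words term count.
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k):
    """First stage: keep the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def rerank(query, candidates, n):
    """Second stage: re-score the candidates and keep the best n."""
    q_terms = set(tokenize(query))

    def coverage(chunk):
        # Toy stand-in for a rerank model: share of query terms in the chunk.
        if not q_terms:
            return 0.0
        return len(q_terms & set(tokenize(chunk))) / len(q_terms)

    return sorted(candidates, key=coverage, reverse=True)[:n]


if __name__ == "__main__":
    docs = [
        "TiDB supports HTAP workloads.",
        "TiDB is MySQL compatible.",
        "Bananas are yellow.",
    ]
    question = "Is TiDB MySQL compatible?"
    print(rerank(question, retrieve(question, docs, k=2), n=1))
```

The two-stage shape is the point: a cheap first pass narrows the corpus, and a more careful second pass reorders the short list before the fragments go into the prompt.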