Difference between revisions of "Timeline of large language models"

Revision as of 21:04, 8 March 2023

This is a timeline of large language models.

Sample questions

The following are some interesting questions that can be answered by reading this timeline:

Big picture

Time period	Development summary	More details

Full timeline

Year	Month and date	Event type	Details
2021	May		Google announces chatbot LaMDA, but doesn't release it publicly.
2022	April		OpenAI reveals DALL-E 2.
2023	January 5		A paper discusses the concern about the potential of LLMs to influence, modify, and manipulate user preferences adversarially. As these models become more proficient in deducing user preferences and offering tailored assistance, their lack of interpretability in adversarial settings is a major concern. The paper examines existing literature on adversarial behavior in user preferences and provides red teaming samples for dialogue models like ChatGPT and GODEL. It also probes the attention mechanism in these models for non-adversarial and adversarial settings.^[1]
2023	February 14	Study	A paper presents a framework called ChatCAD, which integrates LLMs with computer-aided diagnosis (CAD) networks for medical images. ChatCAD uses LLMs to enhance the output of multiple CAD networks by summarizing and reorganizing the information presented in natural language text format. This approach merges the strengths of LLMs' medical domain knowledge and logical reasoning with the vision understanding capability of existing medical-image CAD models. The goal is to create a more user-friendly and understandable system for patients compared to conventional CAD systems. The paper suggests that LLMs can also be used to improve the performance of vision-based medical-image CAD models in the future.^[2]
2023	February 17	Study	A paper surveys the state of the art of hybrid language model architectures and strategies for complex question-answering (QA, CQA, CPS). While very large language models are good at leveraging public data on standard problems, they may require specific architecture, knowledge, skills, tasks, methods, sensitive data, performance, human approval, and versatile feedback to tackle more specific complex questions or problems. The paper identifies the key elements used with LLMs to solve complex questions or problems and discusses challenges associated with complex QA. The paper also reviews current solutions and promising strategies, using elements such as hybrid LLM architectures, human-in-the-loop reinforcement learning, prompting adaptation, neuro-symbolic and structured knowledge grounding, program synthesis, and others.^[3]
2023	February 21	Study	A paper presents a catalog of prompt engineering techniques in pattern form that have been applied successfully to solve common problems when conversing with large language models (LLMs), such as ChatGPT. Prompt patterns are reusable solutions to common problems faced when working with LLMs that can customize the outputs and interactions with an LLM. The paper provides a framework for documenting patterns for structuring prompts to solve a range of problems and presents a catalog of patterns that have been applied successfully to improve the outputs of LLM conversations. It also explains how prompts can be built from multiple patterns and illustrates prompt patterns that benefit from combination with other prompt patterns. The paper contributes to research on prompt engineering that applies LLMs to automate software development tasks.^[4]
2023	February 24	Study	A paper proposes a system called LLM-Augmenter that improves large language models by using external knowledge and automated feedback. The system adds plug-and-play modules to a black-box LLM to ground responses in external knowledge and iteratively improve responses using feedback generated by utility functions. The system is validated on task-oriented dialog and open-domain question answering, showing a significant reduction in hallucinations without sacrificing fluency and informativeness. The source code and models are publicly available.^[5]
2023	March 1	Study	A paper introduces a method to train language models like ChatGPT to understand concepts precisely using succinct representations based on category theory. The representations provide concept-wise invariance properties and a new learning algorithm that can accurately learn complex concepts or fix misconceptions. The approach also allows for the generation of a hierarchical decomposition of the representations, which can be manually verified by examining each part individually.^[6]
2023	March 6	Study	A paper explores the potential of using LLMs as zero-shot human models for human-robot interaction (HRI). Human models are important for HRI, but they are challenging to create. LLMs have consumed vast amounts of human-generated text data and can be used as human models without prior knowledge or interaction data. The authors conducted experiments on three social datasets and found that LLMs can achieve performance comparable to purpose-built models, but there are limitations such as sensitivity to prompts and spatial/numerical reasoning issues. The authors demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios through a case study on a simulated trust-based table-clearing task and a robot utensil-passing experiment. The results suggest that LLMs offer a promising but incomplete approach to human modeling for HRI.^[7]
2023	March 6	Study	A paper proposes a perspective on prompts for LLMs that distinguishes between diegetic and non-diegetic prompts, and studies how users write with LLMs using different user interfaces. The results show that when the interface offered multiple suggestions and provided an option for non-diegetic prompting, participants preferred choosing from multiple suggestions over controlling them via non-diegetic prompts. When participants provided non-diegetic prompts it was to ask for inspiration, topics or facts. Single suggestions in particular were guided both with diegetic and non-diegetic information. The paper informs human-AI interaction with generative models by revealing that writing non-diegetic prompts requires effort, people combine diegetic and non-diegetic prompting, and they use their draft and suggestion timing to strategically guide LLMs.^[8]
2023	March 7		A paper presents SynthIE, a method for synthetic data generation that prompts LLMs to work in the reverse direction, generating plausible text for a given structured output. The authors demonstrate the effectiveness of this approach on closed information extraction, where collecting ground-truth data is challenging, and no satisfactory dataset exists to date. They synthetically generate a dataset of 1.8 million data points, demonstrate its superior quality compared to existing datasets in a human evaluation, and use it to fine-tune small models (220M and 770M parameters). The models they introduce outperform existing baselines of comparable size with a substantial gap in micro and macro F1 scores. Code, data, and models are available for reproducibility.^[9]

Meta information on the timeline

How the timeline was built

The initial version of the timeline was written by FIXME.

Funding information for this timeline is available.

Feedback and comments

Feedback for the timeline can be provided at the following places:

FIXME

What the timeline is still missing

https://arxiv.org/search/?query=Large+language+model&searchtype=all&source=header

Timeline update strategy

External links

References

[1] Subhash, Varshini (5 January 2023). "Can Large Language Models Change User Preference Adversarially?". arXiv:2302.10291 [cs]. doi:10.48550/arXiv.2302.10291.

[2] Wang, Sheng; Zhao, Zihao; Ouyang, Xi; Wang, Qian; Shen, Dinggang (2023). "ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models". doi:10.48550/arXiv.2302.07257.

[3] Daull, Xavier; Bellot, Patrice; Bruno, Emmanuel; Martin, Vincent; Murisasco, Elisabeth (17 February 2023). "Complex QA and language models hybrid architectures, Survey". arXiv:2302.09051 [cs]. doi:10.48550/arXiv.2302.09051.

[4] White, Jules; Fu, Quchen; Hays, Sam; Sandborn, Michael; Olea, Carlos; Gilbert, Henry; Elnashar, Ashraf; Spencer-Smith, Jesse; Schmidt, Douglas C. (21 February 2023). "A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT". arXiv:2302.11382 [cs]. doi:10.48550/arXiv.2302.11382.

[5] Peng, Baolin; Galley, Michel; He, Pengcheng; Cheng, Hao; Xie, Yujia; Hu, Yu; Huang, Qiuyuan; Liden, Lars; Yu, Zhou; Chen, Weizhu; Gao, Jianfeng (1 March 2023). "Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback". arXiv:2302.12813 [cs]. doi:10.48550/arXiv.2302.12813.

[6] Yuan, Yang (2023). "Succinct Representations for Concepts". doi:10.48550/arXiv.2303.00446.

[7] Zhang, Bowen; Soh, Harold (6 March 2023). "Large Language Models as Zero-Shot Human Models for Human-Robot Interaction". arXiv:2303.03548 [cs]. doi:10.48550/arXiv.2303.03548.

[8] Dang, Hai; Goller, Sven; Lehmann, Florian; Buschek, Daniel (6 March 2023). "Choice Over Control: How Users Write with Large Language Models using Diegetic and Non-Diegetic Prompting". arXiv:2303.03199 [cs]. doi:10.1145/3544548.3580969. Retrieved 8 March 2023.

[9] Josifoski, Martin; Sakota, Marija; Peyrard, Maxime; West, Robert (7 March 2023). "Exploiting Asymmetry for Synthetic Training Data Generation: SynthIE and the Case of Information Extraction". arXiv:2303.04132 [cs]. doi:10.48550/arXiv.2303.04132.

@@ Line 32: / Line 32: @@
 |-
 | 2023 || March 1 || Study || A paper introduces a method to train language models like ChatGPT to understand concepts precisely using succinct representations based on category theory. The representations provide concept-wise invariance properties and a new learning algorithm that can accurately learn complex concepts or fix misconceptions. The approach also allows for the generation of a hierarchical decomposition of the representations, which can be manually verified by examining each part individually.<ref>{{cite journal |last1=Yuan |first1=Yang |title=Succinct Representations for Concepts |date=2023 |doi=10.48550/arXiv.2303.00446}}</ref>
+|-
+| 2023 || March 6 || Study || A paper explores the potential of using LLMs as zero-shot human models for human-robot interaction (HRI). Human models are important for HRI, but they are challenging to create. LLMs have consumed vast amounts of human-generated text data and can be used as human models without prior knowledge or interaction data. The authors conducted experiments on three social datasets and found that LLMs can achieve performance comparable to purpose-built models, but there are limitations such as sensitivity to prompts and spatial/numerical reasoning issues. The authors demonstrate how LLM-based human models can be integrated into a social robot's planning process and applied in HRI scenarios through a case study on a simulated trust-based table-clearing task and a robot utensil-passing experiment. The results show that LLMs offer a promising approach to human modeling for HRI, but it is incomplete.<ref>{{cite journal |last1=Zhang |first1=Bowen |last2=Soh |first2=Harold |title=Large Language Models as Zero-Shot Human Models for Human-Robot Interaction |journal=arXiv:2303.03548 [cs] |date=6 March 2023 |doi=10.48550/arXiv.2303.03548 |url=https://arxiv.org/abs/2303.03548}}</ref>
 |-
 | 2023 || March 6 || Study || A paper proposes a perspective on prompts for LLMs that distinguishes between diegetic and non-diegetic prompts, and studies how users write with LLMs using different user interfaces. The results show that when the interface offered multiple suggestions and provided an option for non-diegetic prompting, participants preferred choosing from multiple suggestions over controlling them via non-diegetic prompts. When participants provided non-diegetic prompts it was to ask for inspiration, topics or facts. Single suggestions in particular were guided both with diegetic and non-diegetic information. The paper informs human-AI interaction with generative models by revealing that writing non-diegetic prompts requires effort, people combine diegetic and non-diegetic prompting, and they use their draft and suggestion timing to strategically guide LLMs.<ref>{{cite journal |last1=Dang |first1=Hai |last2=Goller |first2=Sven |last3=Lehmann |first3=Florian |last4=Buschek |first4=Daniel |title=Choice Over Control: How Users Write with Large Language Models using Diegetic and Non-Diegetic Prompting |journal=arXiv:2303.03199 [cs] |date=6 March 2023 |doi=10.1145/3544548.3580969 |url=https://doi.org/10.48550/arXiv.2303.03199 |access-date=8 March 2023}}</ref>
