SciPort RLP

Abstract

Driven by growing concerns over the misuse of AI in spreading false information, this study investigates the potential of large language models (LLMs) to generate disinformation using advanced jailbreak prompting techniques. It employs two open-source LLMs and one commercial LLM, each presented with 24 false claims across eight thematic areas. Findings reveal that LLMs generate disinformation 68% of the time when prompted, with open-source LLMs contributing significantly to this output. Notably, disinformation was also produced without the use of specialized prompting techniques, indicating a high baseline vulnerability. Additionally, the study evaluates the LLMs’ accuracy in generating truthful content, finding over 80% success in supporting factual claims. This dual ability of LLMs to generate both disinformation and accurate information, with the former produced especially easily, highlights the urgent need for effective safeguards to prevent potential misuse.

Authors

Shalan, Shehab (Author)

Ernst, Marina (Author)

Hopfgartner, Frank (Author)

Participating institutions

Institut für Informatik
(Universität Koblenz)

Universität Koblenz
(Universität Koblenz)


Generating and Analyzing Tweet-Style Disinformation with LLMs
