Security practices in AI development

AI and Society (forthcoming)

Abstract

What makes safety claims about general-purpose AI systems such as large language models trustworthy? We show that it is not the capabilities of security tools such as alignment and red-teaming procedures, but the security practices built on these tools, that have reconfigured the image of AI safety and made such claims acceptable. After identifying what causes the gap between the capabilities of security tools and the desired safety guarantees, we critically investigate how AI security practices attempt to fill this gap and identify several shortcomings in diversity and participation. We find that these security practices are part of securitization processes aimed at supporting the (commercial) development of general-purpose AI systems whose trustworthiness can only be imperfectly tested, not guaranteed. We conclude by proposing several improvements to current AI security practices.

Similar books and articles

Use GenAI Tools to Boost Your Security Posture. Ken Huang, Yale Li & Patricia Thaine - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 305-338.
Build Your Security Program for GenAI. Ken Huang, John Yeoh, Sean Wright & Henry Wang - 2024 - In Ken Huang, Yang Wang, Ben Goertzel, Yale Li, Sean Wright & Jyoti Ponnapalli (eds.), Generative AI Security: Theories and Practices. Springer Nature Switzerland. pp. 99-132.

Author Profiles

Petr Spelda
Charles University, Prague
