The Cloudcast

The Cloudcast (@cloudcastpod) is the industry's #1 Cloud Computing podcast, and the place where Cloud meets AI. Co-hosts Aaron Delp (@aarondelp) & Brian Gracely (@bgracely) speak with technology and business leaders who are shaping the future of business. Topics will include Cloud Computing | AI | AGI | ChatGPT | Open Source | AWS | Azure | GCP | Platform Engineering | DevOps | Big Data | ML | Security | Kubernetes | AppDev | SaaS | PaaS.

Sean Falconer (@seanfalconer, Head of Dev Relations @SkyflowAPI, Host @software_daily) talks about the security and privacy of LLMs and how to prevent PII (personally identifiable information) from leaking out.

SHOW: 807

CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw

NEW TO CLOUD? CHECK OUT OUR OTHER PODCAST - "CLOUDCAST BASICS"

SHOW SPONSORS:

Want to win a Tesla Cybertruck or $100,000? Enter the WSO2 Choreo Code Challenge (before August 30th)
WSO2 Choreo - Why build a platform? Just add developers instead
CloudZero provides immediate and ongoing savings with 100% visibility into your total cloud spend

SHOW NOTES:

Skyflow (homepage)
Partially Redacted Podcast
Software Engineering Daily

Topic 1 - Our topic for today is the security and privacy of LLMs. What’s Sean’s origin story?

Topic 2 - Let’s dig into LLM security and privacy. We see this concern a lot on the podcast and we’ve touched on it with various past shows, but we haven’t dug in deep. First, let’s frame the problem. What are we talking about when we talk about LLM security and privacy?

Topic 3 - First, there is a fear that customer PII might leak out. Second, company IP or confidential info related to products or offerings might leak out. We’ve seen examples of both to date. This data could be exposed through integration into a model (query it for the answer) or in the fine-tuning or RAG stage. Either one could lead to compliance issues, lost revenue, etc. But that same at-risk data is the potential differentiation of the models. How do you mask the data and still take advantage of it?

Topic 4 - One thing I’ve noticed is that many orgs only think about privacy in relation to the fine-tuning stage, where they take a broad model and make it company-specific. It is about much more than that, though. Just like standard software development, we have different stages: how the data is collected and stored, how it is used for training and fine-tuning, how it is used after deployment and during the interaction stage, etc. How should security and privacy be handled across all phases?

Topic 5 - Let’s talk beyond LLMs for a bit. What about Data Lakes and Data Warehousing? I see this as a problem across all big data, correct?

Topic 6 - How does API security fit into this? Much of what we are talking about is at the storage and retrieval level. But, increasingly we see API issues exposing data. How does that fit in here?

Topic 7 - Let’s talk podcasts. We had Jeff, the previous host of Software Engineering Daily, on a few times. How are things over at Software Engineering Daily? Tell everyone a bit about the show.

FEEDBACK?

Email: show at the cloudcast dot net
Twitter: @cloudcastpod
Instagram: @cloudcastpod
TikTok: @cloudcastpod

March 27, 2024 26m

https://www.thecloudcast.net

LLM Security and Privacy