The Privacy Guardrail: How to Implement Warnings for On-Device LLM Failures

Leverage Chrome’s on-device Prompt API to deliver a private, infrastructure-free LLM experience. While on-device models like Gemini Nano offer improved privacy by keeping data local, hybrid AI experiences often require a fallback to a cloud model when local resources are unavailable. This post details how to implement essential transparency warnings and give users a choice to proceed, ensuring reliable functionality while maintaining user trust and preserving privacy as your core value proposition.

Using the on-device Prompt API in Chrome lets you offer a private LLM experience without the need to spin up your own infrastructure. That experience is constrained, however, by the user's browser choice and by the device they happen to be using. In this post we look at how to warn users when a request may be routed to a cloud model instead of the local one.

On-device inference with Chrome’s Prompt API and React

My colleague Jeff Huleatt wrote an article about the caveats of using the on-device model. Despite those caveats, one undeniable advantage the on-device model has over any other is improved privacy: it runs on the device (it's in the name), so no network is needed. This also benefits progressive web apps installed on a user's device that offer an offline-first experience. Paired with other PWA APIs like the File System API, this capability can enable unique privacy-preserving experiences. In some cases files never leave the device, so you could build an entirely on-device file categorizer.
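Before promising users that their data stays local, you can gate on the Prompt API's availability status. As a minimal sketch (the `canStayOnDevice` helper is illustrative, and the exact `LanguageModel` surface may vary across Chrome versions):

```javascript
// The Prompt API reports one of: "available", "downloadable",
// "downloading", or "unavailable". Only "available" means a prompt
// can run locally right now without a download or a cloud fallback.
function canStayOnDevice(availability) {
  return availability === "available";
}

// Browser usage (sketch, assuming Chrome's built-in Prompt API):
//   const status = await LanguageModel.availability();
//   if (canStayOnDevice(status)) {
//     const session = await LanguageModel.create();
//     const answer = await session.prompt("Categorize this file: ...");
//   }
```

Treating "downloadable" and "downloading" as not-yet-local is deliberate: until the model has finished downloading, a hybrid setup would route the request off device.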

Warnings when you leave the device

My other colleague Cynthia Wang had the clever idea that when you use hybrid inference and the user's device can't run the local model, you should at least show a warning. This is especially true if your value proposition is a privacy-preserving feature. In Cynthia's post she demonstrates that when you offer a hybrid AI experience (on-device plus a provider-hosted model), you should surface a warning whenever Gemini Nano is not available. You may also want to show a toggle, greyed out when no local model is available. This gives the user a reliable fallback if they are okay with sending their data off device, and it keeps the app from breaking entirely when there is no on-device model.
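One way to drive that greyed-out toggle is to derive its props from the availability status. This is a hypothetical sketch (the helper name and prop shape are illustrative, not from Cynthia's post):

```javascript
// Derive props for a "keep inference on-device" toggle from the Prompt
// API's availability status. When no local model exists, the toggle is
// rendered greyed out (disabled) and forced off, making the cloud
// fallback explicit rather than silent.
function localToggleProps(availability) {
  const localAvailable = availability === "available";
  return {
    disabled: !localAvailable,       // grey out when Nano is missing
    defaultChecked: localAvailable,  // prefer local when we can
    title: localAvailable
      ? "Inference stays on this device"
      : "No on-device model available; requests go to the cloud",
  };
}

// In a React component (sketch):
//   <input type="checkbox" {...localToggleProps(status)} /> Keep it local
```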

Here is the code block in its original form from Cynthia’s blog article.

useEffect(() => {
  const checkNanoAvailability = async () => {
    // Checking availability of the on-device model
    const { languageModelProvider, onDeviceParams } = metadataModel.chromeAdapter;
    if (await languageModelProvider?.availability(onDeviceParams.createOptions) !== "available") {
      console.warn("Gemini Nano is not available. Falling back to cloud model.");
      setShowNanoAlert(true); // IMPORTANT TO WARN USERS
    } else {
      console.log("Gemini Nano is available and ready.");
    }
  };
  checkNanoAvailability();
}, []);

<>
  {showNanoAlert && (
    <div className="modal-overlay">
      <div className="modal-content">
        {/* Omitting for brevity */}
      </div>
    </div>
  )}
</>
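The modal body is omitted above, but whatever it renders, it should resolve to a clear user choice: proceed with the cloud fallback or cancel the request. A minimal sketch of that handler (the function name and return shape are hypothetical, not from Cynthia's code):

```javascript
// Resolve the warning modal's outcome. The cloud request only runs if
// the user explicitly opts in; declining cancels the operation outright.
function handleNanoAlertChoice(proceed, runCloudInference) {
  if (!proceed) {
    return { sent: false, reason: "user declined cloud fallback" };
  }
  runCloudInference();
  return { sent: true };
}

// Wired to the modal's buttons (sketch):
//   <button onClick={() => handleNanoAlertChoice(true, sendToCloud)}>Proceed</button>
//   <button onClick={() => handleNanoAlertChoice(false, sendToCloud)}>Cancel</button>
```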

Conclusion

If you use the local on-device model and offer a privacy-preserving feature, you should always consider showing a warning when the user may be switched to a provider-hosted model. Giving users the choice of whether to proceed helps build trust and gives them transparency into how their data is being processed.
