The Privacy Guardrail: How to Implement Warnings for On-Device LLM Failures

Leverage Chrome’s on-device Prompt API to deliver a private, infrastructure-free LLM experience. While on-device models like Gemini Nano offer improved privacy by keeping data local, hybrid AI experiences often require a fallback to a cloud model when local resources are unavailable. This post details how to implement essential transparency warnings and give users a choice to proceed, ensuring reliable functionality while maintaining user trust and preserving privacy as your core value proposition.

Chrome's on-device Prompt API lets you offer a private LLM experience without spinning up your own infrastructure. That privacy, however, is limited by the user's browser choice and by the device they happen to be using. In this post we look at how to warn users when a request may be routed to a cloud model instead of the local one.

On-device inference with Chrome’s Prompt API and React

My colleague Jeff Huleatt wrote an article about the caveats of using the on-device model. Despite those caveats, one undeniable advantage the on-device model has over any other is improved privacy: it runs on the device (it's in the name), so no network is required. This also benefits progressive web apps installed on a user's device that offer an offline-first experience. Paired with other PWA APIs like the File System API, this capability enables unique privacy-preserving experiences. In some cases files never leave the device, so you could, for example, build an entirely on-device file categorizer.
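Before relying on the private path, you need to know whether the local model can run at all. Here is a minimal sketch of that check using Chrome's Prompt API (the global `LanguageModel` object and its availability states come from that API; `shouldWarnUser` and `checkLocalModel` are hypothetical helper names, not part of any library):

```javascript
// Hypothetical helper: decide whether to warn, given a Prompt API
// availability value ("unavailable", "downloadable", "downloading",
// or "available").
function shouldWarnUser(availability) {
  // Anything other than "available" means the prompt may leave the
  // device (or fail entirely), so surface a warning before proceeding.
  return availability !== "available";
}

// Hypothetical helper: query the Prompt API, guarding against browsers
// that do not expose it at all.
async function checkLocalModel() {
  if (!("LanguageModel" in self)) return "unavailable"; // no Prompt API
  return LanguageModel.availability();
}
```

The guard on `"LanguageModel" in self` matters because the Prompt API is only present in supporting Chrome versions; every other browser takes the cloud path.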

Warnings when you leave the device

My other colleague Cynthia Wang had the clever idea that when you use hybrid inference and a user is on a device where the local model is unavailable, you should at least throw a warning. This is especially true if your value proposition is a privacy-preserving feature. In Cynthia's article she demonstrates that when you offer a hybrid AI experience (on-device plus provider model), you should warn the user whenever Gemini Nano is not available. You may also want to show a toggle, greyed out when no local model is available. This gives the user a reliable fallback if they are okay with sending their data off device, and it keeps the app from breaking entirely when there is no on-device model.
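The toggle behavior described above can be sketched as a small piece of derived UI state. The names here are illustrative and not from Cynthia's article:

```javascript
// Derive the privacy toggle's UI state from local-model availability.
// Illustrative helper, not from Cynthia's article or any library.
function toggleState(nanoAvailable, userPrefersLocal) {
  return {
    // Grey out the toggle when there is no local model to switch to.
    disabled: !nanoAvailable,
    // Force the cloud path when Gemini Nano is absent, regardless of
    // the user's stated preference.
    checked: nanoAvailable && userPrefersLocal,
    // Warn that data may leave the device.
    showCloudWarning: !nanoAvailable,
  };
}
```

Keeping this as a pure function makes the policy easy to test separately from whichever component library renders the toggle.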

Here is the code block in its original form from Cynthia’s blog article.

useEffect(() => {
  const checkNanoAvailability = async () => {
    // Check availability of the on-device model
    const { languageModelProvider, onDeviceParams } = metadataModel.chromeAdapter;
    if (await languageModelProvider?.availability(onDeviceParams.createOptions) !== "available") {
      console.warn("Gemini Nano is not available. Falling back to cloud model.");
      setShowNanoAlert(true); // IMPORTANT TO WARN USERS
    } else {
      console.log("Gemini Nano is available and ready.");
    }
  };
  checkNanoAvailability();
}, []);

<>
  {showNanoAlert && (
    <div className="modal-overlay">
      <div className="modal-content">
        {/* Omitted for brevity */}
      </div>
    </div>
  )}
</>

Conclusion

If you are using the local on-device model to offer a privacy-preserving feature, you should always consider throwing a warning when the user may be switched to a provider-hosted model. Giving users the choice of whether to proceed helps maintain trust and gives them transparency into how their data is being processed.
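That choice can be made explicit in the routing decision itself. The sketch below assumes a hypothetical `resolveBackend` helper that you would wire to your warning modal's proceed and cancel buttons:

```javascript
// Route a prompt only after the user has had a chance to consent.
// Hypothetical helper; "userConsentedToCloud" would be set by the
// proceed button of the warning modal shown earlier.
function resolveBackend(nanoAvailable, userConsentedToCloud) {
  if (nanoAvailable) return "on-device"; // private path, no warning needed
  // No local model: proceed to the cloud only with explicit consent,
  // otherwise block the request rather than silently sending data off device.
  return userConsentedToCloud ? "cloud" : "blocked";
}
```

Returning an explicit "blocked" state, rather than defaulting to the cloud, keeps the privacy promise intact when the user declines.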
