Implement Gemini AI SDK with SwiftUI

artificial-intelligence May 26, 2024

In the fast-evolving landscape of artificial intelligence, bringing the power of AI models into your iOS applications not only delivers a personalized experience that meets the evolving needs of users but also gives you a competitive edge in the app market.

In this tutorial, you will learn how to build a knowledge-based assistant using SwiftUI and the Gemini Pro model. The assistant takes a prompt from the user and generates contextually relevant responses. Even if you’ve never used the Gemini APIs before, this tutorial will walk you through every step of the process in detail.

Prerequisites

Here are the prerequisites for getting started with Gemini AI integration:

  • Xcode 15.0 or higher and iOS 15 or higher
  • GoogleGenerativeAI package installed in your project
  • A Gemini API key (you can get a free one if you don't have it)

Don't worry if you don't have the last two yet, I'll cover how to get set up.

Create a new Xcode project

The first step is to set up an Xcode project. If you’re new to Xcode, follow these steps to create one:

  1. Open Xcode and select Create New Project
  2. Select App and click on Next
  3. In the next step, give your project a name. I have named it ChatWithGemini; the rest of the configuration can be left as is.

Now, you are all set with your Xcode project.

Getting Started with Google Gemini APIs

Before actually using the Gemini API, there are four crucial steps you need to follow:

  1. Setting up the API key
  2. Adding the API key to your project
  3. Adding the Google Gemini SDK package
  4. Initialising the model

Let's take a closer look at each one of these steps.

Setting up the API key

To access the Google Gemini APIs, you need an API key.

An API key is a unique identifier used to authenticate API calls. Since the Google Gemini APIs are not open, you first need to register on Google's platform and obtain a secret access key, which identifies you as an authenticated user and lets you call the APIs from your project.

Why do you need an API Key?

By requiring an API key, Google ensures that only authorized users can access its APIs, and it can also monitor and manage the usage of its services.

How to get your API Key?

To get your API key, log in to Google AI Studio. You will be redirected to the Get API key page, where you can create the key.


If you already have a Google Cloud project, you can use that existing project to create the API key. Otherwise, you can create the API key in a new project.

Once the key is generated, copy it so you can use it in your Xcode project. How to actually use it is covered in the next section.

Since every API key uniquely identifies a user, it is meant to be kept secret and not shared with anyone. If others get access to your Gemini API key, they can make calls on your behalf, and you might get billed for them.

Adding API key to Xcode project

There are a few different ways to securely store the API key in your Xcode project. Let’s use one of the popular methods: adding the API key to a .plist file.

đź’ˇ Note : The .plist file should not be added to the version control system so that no one else is able to use your API key.
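If you use Git, one straightforward way to do this is to list the file in your project’s .gitignore (the entry below assumes the file is named GenerativeAI-Info.plist, as in this tutorial):

```
# .gitignore — keep the API key file out of the repository
GenerativeAI-Info.plist
```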

To add the API key, first create a new Property List file in the root folder of your app and add the API key to it.

Right click on the project → New File

Select Property List from the available file templates and name it GenerativeAI-Info.

To add the API key to this GenerativeAI-Info.plist file, click + to add a new key named API_KEY to your Information Property List, and set its value to the API key you copied from Google AI Studio.
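If you open the plist as source code (right-click the file → Open As → Source Code), the stored key/value pair should look roughly like this; YOUR_API_KEY is a placeholder for the actual key you copied:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>API_KEY</key>
    <string>YOUR_API_KEY</string>
</dict>
</plist>
```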

So far, you have simply stored the API key, but you also need a way to fetch this API_KEY from the GenerativeAI-Info.plist file.

To fetch the API key, create a new Swift file, say APIKey, and define an enum APIKey within it with a static computed property named `default`.

enum APIKey {
  // Fetch the API key from `GenerativeAI-Info.plist`
  static var `default`: String {
    guard let filePath = Bundle.main.path(forResource: "GenerativeAI-Info", ofType: "plist")
    else {
      fatalError("Couldn't find file 'GenerativeAI-Info.plist'.")
    }
    let plist = NSDictionary(contentsOfFile: filePath)
    guard let value = plist?.object(forKey: "API_KEY") as? String else {
      fatalError("Couldn't find key 'API_KEY' in 'GenerativeAI-Info.plist'.")
    }
    if value.starts(with: "_") {
      fatalError(
        "Follow the instructions at https://ai.google.dev/tutorials/setup to get an API key."
      )
    }
    return value
  }
}

In the above code,

  1. A computed property named default is created.
  2. Then, the filePath variable is assigned the path of the file with the name “GenerativeAI-Info” of type plist from the bundle container directory.
  3. If the filePath is obtained without errors, it tries to read the contents of the file as a dictionary and then tries to find the value corresponding to the key value “API_KEY”.
  4. If the key name for the API key or the .plist file is different, you will have to update the code accordingly.

Now, you are all set to use your API key in your project files.

Adding the SDK package

The Google AI SDK for iOS is available as a Swift package named GoogleGenerativeAI, which you can simply add to your Xcode project to simplify the overall integration process. Let's see how to add it:

  1. Right click on the project → Add Package Dependencies
  2. Paste this package URL into the search bar: https://github.com/google/generative-ai-swift
  3. Click Add Package. You should now see generative-ai-swift under Package Dependencies in the project navigator on the left, and the GoogleGenerativeAI package is added to your project.

Initialise the model

The next step is to initialise the model so that you can access the functions it provides to generate responses to prompts.

To initialise the model in your SwiftUI file, first import the GoogleGenerativeAI module at the top

import GoogleGenerativeAI

and then create an instance of the model. Since Gemini is an umbrella term for a family of generative AI models, you have to specify the model you want to use and pass the API key for authorization. You can read more about the available Gemini models here.

Since we only need to pass text as input, we can use the gemini-pro model.

let model = GenerativeModel(name: "gemini-pro", apiKey: APIKey.default)

APIKey.default comes from the APIKey Swift file created in the earlier steps.
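As a quick sanity check (a minimal sketch, not part of the final app, with an example prompt of my own choosing), you can fire a one-off request right after initialising the model to confirm that the key and model name work:

```swift
import GoogleGenerativeAI

let model = GenerativeModel(name: "gemini-pro", apiKey: APIKey.default)

// One-off request to verify that the API key and model name are valid.
Task {
    do {
        let result = try await model.generateContent("Say hello in one sentence.")
        print(result.text ?? "No response found")
    } catch {
        print("Request failed: \(error)")
    }
}
```

If the setup is correct, a short greeting should be printed to the console; otherwise the error will tell you whether the key or the model name is the problem.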

Creating the UI using SwiftUI

The idea is to keep the UI basic for this tutorial and focus on the functionality. You can modify the UI as per your requirements and make it more visually appealing.

First, you need three state variables: one for the user input, one for the response received from the AI model, and one for tracking whether the model is still generating a response.

@State var userPrompt = ""
@State var response = "How can I help you today?"
@State var isLoading = false

userPrompt is initialised as an empty string because there is no user input when the app starts, and since there is no prompt to respond to yet, isLoading is set to false.

Let’s create the UI of the app.

VStack {
    Text("Welcome to Gemini AI")
        .font(.largeTitle)
        .foregroundStyle(.indigo)
        .fontWeight(.bold)
        .padding(.top, 40)
    ZStack{
        ScrollView{
            Text(response)
                .font(.title)
        }
        if isLoading {
            ProgressView()
                .progressViewStyle(CircularProgressViewStyle(tint: .indigo))
                .scaleEffect(4)
        }
    }
            
    TextField("Ask anything...", text: $userPrompt, axis: .vertical)
        .lineLimit(5)
        .font(.title3)
        .padding()
        .background(Color.indigo.opacity(0.2), in: Capsule())
        .disableAutocorrection(true)
}
.padding()

In the above code,

  1. A simple UI is created with a title at the top that says “Welcome to Gemini AI”.
  2. The response generated by the AI model is displayed in a ScrollView, and while the response is loading, a ProgressView is displayed.
  3. There is also a TextField that takes the user input, stores it in the userPrompt variable, and can expand vertically up to 5 lines as the user types.

Here is how it looks:

Accessing the model to generate response

The last step is to add the functionality to generate a response based on the user input. When the user hits return, the model should process the prompt and generate a response.

Let’s first add an onSubmit() modifier to the TextField, which calls the function generateResponse() when the user submits the prompt.

TextField("Ask anything...", text: $userPrompt, axis: .vertical)
    .lineLimit(5)
    .font(.title3)
    .padding()
    .background(Color.indigo.opacity(0.2), in: Capsule())
    .disableAutocorrection(true)
    .onSubmit {
        generateResponse()
    }

Now, it’s time to define the function generateResponse().

func generateResponse() {
    isLoading = true
    response = ""
    
    Task {
        do {
            let result = try await model.generateContent(userPrompt)
            isLoading = false
            response = result.text ?? "No response found"
            userPrompt = ""
        } catch {
            isLoading = false
            response = "Something went wrong! \n\(error.localizedDescription)"
        }
    }
}

In the above code,

  1. The isLoading variable is set to true while the response is being generated, and it is set back to false as soon as a result is received.
  2. The default model response, i.e. “How can I help you today?”, is replaced with an empty string so that only the loader appears on the screen while the response is being generated.
  3. Then, an asynchronous task is performed using Task and await; in case of any errors, the response is set to a string that describes the error.
  4. The code passes userPrompt to the generateContent function of the GenerativeModel class, whose return value is of type GenerateContentResponse. To access the response’s content as text, the text property of GenerateContentResponse is read.

This is how the app responds to the user prompt:

Note: The result is in Markdown format, so you can wrap the returned text in a LocalizedStringKey; you also need to change the type of the response state variable to LocalizedStringKey when defining it.

You’ll have to make changes in the following lines of your code:

@State var response: LocalizedStringKey = "How can I help you today?"

Task {
    do {
        let result = try await model.generateContent(userPrompt)
        isLoading = false
        response = LocalizedStringKey(result.text ?? "No response found")
        userPrompt = ""
    } catch {
        isLoading = false
        response = "Something went wrong! \n\(error.localizedDescription)"
    }
}

Output:

Conclusion

With just ~100 lines of code, you have a knowledge-based AI assistant built with SwiftUI. You’ve seen how easy it is to integrate the Gemini APIs into your app. Now it’s time for you to blend the power of the Gemini APIs and SwiftUI and build more interesting applications.

You can get the full source code from here. 

But we don’t stop there: in the next articles, we will learn how to build an interactive AI chat app that stores the user prompts as well as the responses, and that can also take an image as a prompt.

