Build voice assistant with React Native and Gemini AI

In this tutorial, we will walk step by step through building an assistant you can chat with by text or voice message, using React Native and one of the most popular language models today, Gemini AI.

Prerequisites:

  1. Node.js and npm: Ensure that Node.js and npm are installed on your machine. If not, you can follow the npm docs guide on downloading and installing Node.js and npm.
  2. Code editor: You can use any editor you like; in this tutorial, I use Visual Studio Code.
  3. Gemini AI key: Obtain your Gemini AI key by signing up at https://ai.google.dev/

Create a new React Native project

Let’s begin by creating a new project with React Native using the following command:

npx react-native init ReactNativeVoiceAssistant

To set up your environment and to run and build a React Native project, you can follow the instructions outlined in the React Native Get Started guide.

Installing dependencies

Next, we will install some packages for this project. Run the following commands one after another:

// for interaction with Rest API
npm install axios

// speech to text
npm install @react-native-voice/voice

// react native navigation
npm install @react-navigation/native
npm install react-native-screens
npm install react-native-safe-area-context
npm install @react-navigation/native-stack

// text to speech
npm install --save react-native-tts

// chat component
npm install react-native-gifted-chat

// vector icons
npm install --save react-native-vector-icons
npm install @types/react-native-vector-icons

// environment variable
npm install -D react-native-dotenv

I won't go into detail on each package here; you can find more documentation on their official sites.
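
One platform note: on iOS, packages with native code need their CocoaPods installed before the app will build. Assuming a standard React Native 0.60+ project with autolinking, run:

cd ios && pod install && cd ..

Also check the react-native-vector-icons documentation for the extra font setup it requires on each platform.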

Implementation

Creating the folder structure

In the root, create the folders src/screens, src/navigations, src/services, and src/styles. The screens folder contains all code for the WelcomeScreen and ChatScreen, the navigations folder holds the code for navigating between screens, the services folder holds the logic for accessing the REST API, and styles holds the stylesheets.
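
With the files we will create below, the layout looks like this:

src/
├── navigations/
│   └── AppNavigation.tsx
├── screens/
│   ├── ChatScreen.tsx
│   └── WelcomeScreen.tsx
├── services/
│   └── AiService.js
└── styles/
    ├── ChatStyle.ts
    └── WelcomeStyle.ts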

Creating a service for accessing the Gemini AI REST API

First, create a .env file in the root folder and store your API key here. Please remember not to publish your API key.

API_KEY=YourApiKeyHere
API_URL=https://generativelanguage.googleapis.com/v1beta/models/gemini-1.0-pro:generateContent
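
For the @env import used in the service below to resolve, react-native-dotenv must also be registered as a Babel plugin. A minimal babel.config.js, assuming the default React Native preset:

module.exports = {
  presets: ['module:metro-react-native-babel-preset'],
  plugins: [['module:react-native-dotenv']],
};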

Next, create the file src/services/AiService.js. The following code reads the variables from .env and calls the REST API via Axios.

import { API_KEY, API_URL } from "@env";
import axios from "axios";

export const getAnswerFromGpt = async (prompts) => {
  console.log(prompts);
  try {
    const client = axios.create({
      headers: {
        "Content-Type": "application/json",
      },
    });
    const response = await client.post(`${API_URL}?key=${API_KEY}`, {
      contents: prompts,
      generationConfig: {
        temperature: 0.9,
        topK: 1,
        topP: 1,
        maxOutputTokens: 2048,
        stopSequences: [],
      },
      safetySettings: [
        {
          category: "HARM_CATEGORY_HARASSMENT",
          threshold: "BLOCK_MEDIUM_AND_ABOVE",
        },
        {
          category: "HARM_CATEGORY_HATE_SPEECH",
          threshold: "BLOCK_MEDIUM_AND_ABOVE",
        },
        {
          category: "HARM_CATEGORY_SEXUALLY_EXPLICIT",
          threshold: "BLOCK_MEDIUM_AND_ABOVE",
        },
        {
          category: "HARM_CATEGORY_DANGEROUS_CONTENT",
          threshold: "BLOCK_MEDIUM_AND_ABOVE",
        },
      ],
    });

    const answer = response.data?.candidates?.[0]?.content?.parts?.[0]?.text;
    return { success: true, data: answer };
  } catch (error) {
    return { success: false, msg: error.message };
  }
};
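
For reference, the prompts argument follows the Gemini contents shape: an array of role-tagged turns, each with a parts list. A minimal sketch of calling the service directly (the sample question is illustrative):

import { getAnswerFromGpt } from './AiService';

const conversation = [
  { role: 'user', parts: [{ text: 'What is React Native?' }] },
];

getAnswerFromGpt(conversation).then(res => {
  if (res.success) {
    console.log('Answer:', res.data); // the model's reply text
  } else {
    console.warn('Request failed:', res.msg);
  }
});

The ChatScreen below builds up exactly this kind of array, appending a model turn after each reply so the conversation keeps its history.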

Creating WelcomeScreen

We need to create a src/screens/WelcomeScreen.tsx file for the screen logic and a src/styles/WelcomeStyle.ts file for the stylesheet.

// Import necessary components and libraries
import { NavigationProp, ParamListBase, useNavigation } from '@react-navigation/native';
import React from 'react';
import { Text, TouchableOpacity, View } from 'react-native';
import { SafeAreaView } from 'react-native-safe-area-context';
import MaterialCommunityIcons from 'react-native-vector-icons/MaterialCommunityIcons';
import { styles } from '../styles/WelcomeStyle';

const WelcomeScreen = () => {
  const navigation = useNavigation<NavigationProp<ParamListBase>>();
  const handleNext = () => {
    navigation.navigate('ChatScreen');
  };

  return (
    <SafeAreaView style={styles.container}>
      <View style={styles.container}>
        <View>
          <Text style={styles.title}>React Native Voice Assistant</Text>
          <Text style={styles.subtitle}>With Voice Command powered by Gemini AI</Text>
        </View>
        <MaterialCommunityIcons
          name="account-tie-voice"
          size={200}
          style={{ marginBottom: 10, marginRight: 10, color: '#10a37f' }}
        />

        <TouchableOpacity style={styles.button} onPress={handleNext}>
          <Text style={styles.buttonText}>Start Chat</Text>
        </TouchableOpacity>
      </View>
    </SafeAreaView>
  );
};

export default WelcomeScreen;

The stylesheet goes into src/styles/WelcomeStyle.ts:

import { StyleSheet } from "react-native";

// Styles
const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: "space-between",
    alignItems: "center",
    padding: 38,
  },
  title: {
    fontSize: 24,
    fontWeight: "bold",
    textAlign: "center",
    color: "#10a37f",
  },
  subtitle: {
    fontSize: 12,
    color: "#666",
    textAlign: "center",
  },
  button: {
    backgroundColor: "#10a37f",
    paddingVertical: 5,
    paddingHorizontal: 20,
    borderRadius: 20,
  },
  buttonText: {
    color: "#FFF",
    fontSize: 18,
  },
});

export { styles };

Creating ChatScreen

As with the WelcomeScreen, we create two files. Put the following code into src/screens/ChatScreen.tsx:

// Import necessary components and libraries
import Voice from '@react-native-voice/voice';
import React, { useEffect, useRef, useState } from 'react';
import { Platform, SafeAreaView, TouchableOpacity, View } from 'react-native';
import { GiftedChat, IMessage, Send } from 'react-native-gifted-chat';
import Tts from 'react-native-tts';
import MaterialCommunityIcons from 'react-native-vector-icons/MaterialCommunityIcons';
import { getAnswerFromGpt } from '../services/AiService';
import { styles } from '../styles/ChatStyle';

const initialMessages: IMessage[] = [
  {
    _id: 1,
    text: 'Hi there! How can I assist you today?',
    createdAt: new Date(),
    system: true,
    user: { _id: 1 },
  },
];

const ChatScreen = () => {
  const [recording, setRecording] = useState(false);
  const [messages, setMessages] = useState(initialMessages);
  const [result, setResult] = useState('');
  const listPrompt = useRef<any>([]);

  const onSend = (messages: IMessage[] = []) => {
    const { text } = messages[0];
    setMessages(previousMessages => {
      return GiftedChat.append(previousMessages, messages, Platform.OS !== 'web');
    });
    processTranscription(text);
  };

  useEffect(() => {
    Tts.setDefaultLanguage('en_US');
  }, []);

  useEffect(() => {
    Voice.onSpeechStart = () => {
      console.log('===speech start');
      setRecording(true);
    };

    Voice.onSpeechEnd = () => {
      console.log('===speech end');
      setRecording(false);
    };

    Voice.onSpeechError = (e: any) => {
      const errMsg: string = e.error?.message ?? '';

      if (errMsg.includes('No match')) {
        console.log('You are not speaking!');
      } else {
        console.log(errMsg);
      }

      setRecording(false);
    };

    Voice.onSpeechResults = (e: any) => {
      const prompt = e.value[0];
      if (!prompt) {
        return;
      }
      setResult(prompt);
    };

    setMessages(initialMessages);

    return () => {
      Voice.destroy().then(Voice.removeAllListeners);
      Tts.stop();
    };
  }, []);

  const stopRecording = async () => {
    try {
      await Voice.stop();
      setRecording(false);
      console.log('== stopRecording');
      if (result) {
        const newMsg = {
          _id: Math.round(Math.random() * 1000000),
          text: result,
          createdAt: new Date(),
          user: {
            _id: 2,
            name: 'User',
          },
        };

        const newMessage = [newMsg];

        setMessages(previousMessages => {
          return GiftedChat.append(previousMessages, newMessage, Platform.OS !== 'web');
        });

        processTranscription(result);
      }
    } catch (error: any) {
      console.log('== error when stop: ', error);
    }
  };

  const startRecording = async () => {
    console.log('== startRecording ');
    setRecording(true);
    Tts.stop();

    try {
      await Voice.start('en_US');
    } catch (e) {
      console.error(e);
    }
  };

  const readTheAnswer = (message: string) => {
    console.log('tts: ', message);
    Tts.speak(message);
  };

  const processTranscription = async (prompt: string) => {
    if (prompt.trim().length > 0) {
      console.log('stt: ', prompt.trim());
      listPrompt.current = [
        ...listPrompt.current,
        {
          role: 'user',
          parts: [
            {
              text: prompt.trim(),
            },
          ],
        },
      ];
      getAnswerFromGpt(listPrompt.current).then((res: any) => {
        if (res.success) {
          const newMsg = {
            _id: Math.round(Math.random() * 1000000),
            text: res.data,
            createdAt: new Date(),
            user: {
              _id: 1,
              name: 'Assistant',
            },
          };

          const newMessage = [newMsg];
          setMessages(previousMessages => {
            return GiftedChat.append(previousMessages, newMessage, Platform.OS !== 'web');
          });

          if (res.data) {
            listPrompt.current = [
              ...listPrompt.current,
              {
                role: 'model',
                parts: [
                  {
                    text: res.data.trim(),
                  },
                ],
              },
            ];
            readTheAnswer(res.data);
          }
        } else {
          console.log(res.msg);
        }
      });
    }
  };

  const renderSend = (props: any) => {
    return (
      <>
        <Send {...props}>
          <MaterialCommunityIcons name="send" style={styles.buttonSend} />
        </Send>
        <TouchableOpacity style={styles.buttonMicStyle} onPress={recording ? stopRecording : startRecording}>
          {recording ? (
            <MaterialCommunityIcons name="stop" style={styles.buttonRecordingOff} />
          ) : (
            <MaterialCommunityIcons name="microphone" style={styles.buttonRecordingOn} />
          )}
        </TouchableOpacity>
      </>
    );
  };

  const scrollToBottomComponent = () => {
    return <MaterialCommunityIcons name="arrow-down-circle-outline" size={38} color="#10a37f" />;
  };

  return (
    <SafeAreaView style={styles.container}>
      <View style={styles.container}>
        <GiftedChat
          messages={messages}
          showAvatarForEveryMessage={true}
          onSend={messages => onSend(messages)}
          user={{
            _id: 2,
            name: 'User',
            avatar: '',
          }}
          alwaysShowSend
          renderSend={renderSend}
          scrollToBottom
          scrollToBottomComponent={scrollToBottomComponent}
        />
      </View>
    </SafeAreaView>
  );
};

export default ChatScreen;

Put this code into src/styles/ChatStyle.ts:

import { StyleSheet } from "react-native";

const styles = StyleSheet.create({
  container: {
    flex: 1,
  },
  buttonSend: {
    fontSize: 20,
    marginBottom: 12,
    marginRight: 10,
  },
  buttonMicStyle: {
    alignItems: "center",
    justifyContent: "center",
    alignSelf: "center",
    marginRight: 10,
  },
  buttonRecordingOff: { fontSize: 25 },
  buttonRecordingOn: { fontSize: 25 },
});

export { styles };
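
One thing the code alone does not cover: speech recognition requires microphone access at the platform level, or recording will fail at runtime. On Android, declare the permission in android/app/src/main/AndroidManifest.xml:

<uses-permission android:name="android.permission.RECORD_AUDIO" />

On iOS, add NSMicrophoneUsageDescription and NSSpeechRecognitionUsageDescription entries with short explanation strings to the app's Info.plist.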

Creating navigation

Create a src/navigations/AppNavigation.tsx file to handle navigation between screens in the application.

import { NavigationContainer } from "@react-navigation/native";
import { createNativeStackNavigator } from "@react-navigation/native-stack";
import React from "react";
import ChatScreen from "../screens/ChatScreen";
import WelcomeScreen from "../screens/WelcomeScreen";

const Stack = createNativeStackNavigator();

const AppNavigation = () => {
  return (
    <NavigationContainer>
      <Stack.Navigator initialRouteName="WelcomeScreen">
        <Stack.Screen
          name="WelcomeScreen"
          component={WelcomeScreen}
          options={{ headerShown: false }}
        />
        <Stack.Screen
          name="ChatScreen"
          component={ChatScreen}
          options={{
            title: "Voice Assistant",
            headerStyle: {
              backgroundColor: "#d8d8d8",
            },
            headerTintColor: "black",
            headerTitleStyle: {
              fontWeight: "bold",
            },
          }}
        />
      </Stack.Navigator>
    </NavigationContainer>
  );
};

export default AppNavigation;
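
As an optional refinement, you can replace ParamListBase in WelcomeScreen with a typed route map so that navigation.navigate('ChatScreen') is checked at compile time. A minimal sketch, assuming both routes take no params:

// Route names mapped to their params; undefined means no params.
type RootStackParamList = {
  WelcomeScreen: undefined;
  ChatScreen: undefined;
};

// In WelcomeScreen:
const navigation = useNavigation<NavigationProp<RootStackParamList>>();
navigation.navigate('ChatScreen'); // now type-checked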

Next, we include the AppNavigation component in the App.tsx file by replacing its content with the following code.

import React from "react";
import { LogBox } from "react-native";
import { SafeAreaProvider } from "react-native-safe-area-context";
import AppNavigation from "./src/navigations/AppNavigation";

LogBox.ignoreLogs(["Warning: ..."]); // Ignore log notifications by message
LogBox.ignoreAllLogs(); // Ignore all log notifications

export default function App() {
  return (
    <SafeAreaProvider>
      <AppNavigation />
    </SafeAreaProvider>
  );
}

Run your application

Run on iOS:

npx react-native run-ios

Run on Android:

npx react-native run-android
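
If the bundler is not already running, start Metro first in a separate terminal:

npx react-native start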

Demo

[Video: a demo conversation with the AI assistant]
