Android API Spec
Note: All functions listed in this document are safe to call from the main or UI thread and all callbacks will be run on the main thread, unless there are explicit instructions or explanations.
LeapClient
The entrypoint of LEAP SDK. It doesn’t hold any data.
object LeapClient {
suspend fun loadModel(path: String): ModelRunner
suspend fun loadModelAsResult(path: String): Result<ModelRunner>
}
loadModel
This function can be called from UI thread. The path
should be a local path pointing to model bundle file. The app should hold the ModelRunner
object returned by this function until there is no need to interact with the model anymore. See ModelRunner
for more details.
The function will throw LeapModelLoadingException
if LEAP fails to load the model.
loadModelAsResult
This function can be called from UI thread. The path
should be a local path pointing to model bundle file. This function is merely a wrapper around loadModel
function to return a Result
.
Conversation
The instance of a conversation, which stores the message history and states that are needed by the model runner for generation.
While this Conversation
instance holds the data necessary for the model runner to perform generation, the app still needs to maintain the UI state of the message history representation.
interface Conversation {
// Chat history
val history: List<ChatMessage>
// Whether a generation is in progress
val isGenerating: Boolean
// Generating response from a text input as user message
fun generateResponse(userTextMessage: String): Flow<MessageResponse>
// Generating response from an arbitrary new message
fun generateResponse(message: ChatMessage): Flow<MessageResponse>
// Export the chat history in this convesation to a `JSONArray`.
fun exportToJSONArray(): JSONArray
}
Creation
Instance of this class should not be directly initialized. It should instead be created by the ModelRunner
instance.
Lifetime
While a Conversation
stores the history and state that is needed by the model runner to generate content, its generation function relies on the model runner that creates it. As a result, if that model runner instance has been destroyed, the Conversation
instance will fail to run subsequent generations.
history
history
value field will return a copy of the chat message history. Any mutations to its return value will not change the internal state of the generation. If there is an ongoing generation, the partial message may not be available in the return value of this field. However, it is guaranteed that when MessageResponse.Complete
is received and when the flow is completed, the history value field will be updated to have the latest message.
isGenerating
isGenerating
value field is true if the generation is still in progress. Its value will be consistent across all threads.
generateResponse
generateResponse(message: ChatMessage)
is the preferred method for response generation. It can be called from UI thread.
The return value is a Kotlin asynchronous flow . The generation will not start until the flow is collected (following the convention of flows). Refer to Android documentation on how to properly handle the flow with lifecycle-aware components.
A MessageResponse
instance will be emitted from this flow, which contains the chunk of data generated from the model.
Errors will be thrown as LeapGenerationException
in the stream. Use .catch
to capture errors
from the generation.
If there is already a running generation, further generation requests are blocked until the current generation is done. However, there is no guarantee that the order in which requests are received will be the order in which they are processed.
exportToJSONArray
Export the whole conversation history into a JSONArray
. Each element can be interpreted as a ChatCompletionRequestMessage
instance in OpenAI API schema.
See also: Gson Support.
Cancellation of the generation
Generation will be stopped when the coroutine job that runs the flow is canceled, but it may (no guarantee) keep going as long as the job of the flow is still active.
Hence, we highly recommend the generation be started within a coroutine scope associated with a lifecycle-aware component, so that the generation can be stopped if the lifecycle-aware components are destroyed. Here is an example:
job = lifecycleScope.launch {
conversation.generateResponse(userInput)
.onEach {
when (it) {
is MessageResponse.Chunk -> {
Log.d(TAG, it.text)
}
is MessageResponse.Complete -> {
Log.d(TAG, "Generation is done!")
}
else -> {}
}
}
.collect()
// Stop the generation by cancelling the job
job.cancel()
ModelRunner
An instance of the model loaded in memory. Conversation
instances should always be created from an instance of ModelRunner
. The application needs to own the model runner object – if the model runner object is destroyed, any ongoing generations may fail.
If you need your model runner to survive after the destruction of activities, you may need to wrap it in an Android Service .
interface ModelRunner {
// create a conversation instance
fun createConversation(systemPrompt: String? = null): Conversation
// create a conversation from chat message history
fun createConversationFromHistory(history: List<ChatMessage>): Conversation
// unload the model: the runner cannot be used after unload is called.
suspend fun unload()
// Start generation from the conversation instance.
fun generateFromConversation(
conversation: Conversation,
callback: GenerationCallback,
): GenerationHandler
}
createConversation
Factory method to create a conversation instance based on this model runner. As a result, the model runner instance will be used for any generation around the created conversation instance. The model runner will have access to the internal state of the created conversation.
If the model runner is unloaded, any conversation instances created from the model runner will be read only.
createConversationFromHistory
This factory method will create a conversation object with the provided chat history. It can be used to restore a conversation from persistent storage while ensuring that a living model runner is backing it.
unload
Unload the model from memory. The model runner will not be able to perform generation once this method is invoked. An exception may be thrown by any ongoing generation. It is the app developer’s responsibility to ensure that unload
is called after all generation is complete.
generateFromConversation
This function is not recommended to be called by the app directly. It is an internal interface for the model runner implementation to expose the generation ability to LEAP SDK. Conversation.generateResponse
is the better wrapper of this method, which relies on Kotlin coroutines to connect with lifecycle-aware components.
This function may block the thread. If you must use it, please call it outside the main thread.
ChatMessage
Data class that is compatible with the message object in OpenAI chat completion API.
data class ChatMessage(
val role: Role,
val content: List<ChatMessageContent>
val reasoningContent: String? = null
) {
fun toJSONObject(): JSONObject
}
ChatMessage.fromJSONObject(obj: JSONObject): ChatMessage
Fields
role
: The role of this message (seeChatMessage.Role
).content
: A list of message contents. Each element is an instance ofChatMessageContent
.reasoningContent
: The reasoning content generated by the reasoning models. Only messages generated by reasoning models will have this field. For other models or other roles, this field should benull
.
toJSONObject
Return a JSONObject
that represents the chat message. The returned object is compatible with ChatCompletionRequestMessage
from OpenAI API. It contains 2 fields: role
and content
.
See also: Gson Support.
fromJSONObject
Construct a ChatMessage
instance from a JSONObject
. Not all JSON object variants in ChatCompletionRequestMessage
of OpenAI API are acceptable. As of now, role
supports user
, system
and assistant
; content
can be a string or an array.
LeapSerializationException
will be thrown if the provided JSONObject cannot be recognized as a
message.
See also: Gson Support.
ChatMessage.Role
Roles of the chat messages, which follows the OpenAI API definition. It is an enum with the following values:
enum class Role(val type: String) {
SYSTEM("system"),
USER("user"),
ASSISTANT("assistant"),
}
SYSTEM
: Indicates the associated content is part of the system prompt. It is generally the first message, to provide guidance on how the model should behave.USER
: Indicates the associated content is user input.ASSISTANT
: Indicates the associated content is model-generated output.
ChatMessageContent
Data class that is compatible with the content object in OpenAI chat completion API. It is a sealed class.
abstract class ChatMessageContent {
fun clone(): ChatMessageContent
fun toJSONObject(): JSONObject
}
fun ChatMessageContent.fromJSONObject(obj: JSONObject): ChatMessageContent
toJSONObject
returns an OpenAI API compatible content object (with atype
field and the real content fields)fromJSONObject
receives an OpenAI API compatible content object to build a message content. Not all OpenAI content objects are accepted. Currently, only following content types are supported:Text
: pure text content.
LeapSerializationException
will be thrown if the provided JSONObject cannot be recognized as a
message.
MessageResponse
The response generated from models. The generation may take a while to finish, so the generated text will be emitted as “chunks”. When the generation completes, a complete response object will be emitted. This is a sealed class where only the following options are available:
sealed class MessageResponse {
class Chunk(val text: String) : MessageResponse()
class ReasoningChunk(val reasoning: String) : MessageResponse()
class Complete(val fullMessage: ChatMessage, val finishReason: GenerationFinishReason) : MessageResponse()
}
Chunk
is a piece of generated text.ReasoningChunk
is a piece of generated reasoning text. It will be emitted only by reasoning models.Complete
indicates the completion of a generation.- The
fullMessage
field contains the completeChatMessage
with all the content generated from this round of generation - The
finishReason
indicates why the generation is done – as of now the only expected value isSTOP
- The
Error Handling
All errors are thrown as LeapException
, which has following subclasses:
LeapModelLoadingException
: error in loading the modelLeapGenerationException
: error in generating contentLeapSerializationException
: error in serializing / deserializing data.
Gson Support
Leap Android SDK also has Gson support. The leap_gson
package should be imported to enable Gson to serialize and deserialize Leap objects. The Gson support has the same behaviors as the existing org.json
implementation.
The following types are supported:
Create Gson Object
To create a Gson object that supports Leap objects, call registerLeapAdapters
on the GsonBuilder
before creating the Gson object.
import ai.liquid.leap.gson.registerLeapAdapters
import com.google.gson.GsonBuilder
val gson = GsonBuilder().registerLeapAdapters().create()
Serializing and Deserializing Conversation History
With a Conversation
object, simply call Gson.toJson
to convert the chat message history into a JSON string. The returned JSON will be an array.
val json = gson.toJson(conversation.history)
To deserialize the conversation history from a JSON array, use LeapGson.messageListTypeToken
as the type hint for Gson.
import ai.liquid.leap.gson.LeapGson
val chatHistory: List<ChatMessage> = gson.fromJson(json, LeapGson.messageListTypeToken)