Not so long ago, I released a new Kotlin library called Telegram BotKit. This library is a thin client between your Kotlin code and Telegram’s HTTP API for bots. It can also act as a lightweight server for your bots receiving updates via long-polling.

The library API design is the result of careful crafting. It uses Kotlin’s Context Receivers to make the library easy to integrate and extend. Out of the box, the BotKit supports the full range of Bot API methods. It is also equipped with everything needed to quickly incorporate future Bot API changes.

Here is how it looks in action:

val client = TelegramBotApiClient(botApiToken)
val poller = TelegramBotApiPoller(client)

poller.start(TelegramBotUpdateListener(
    onMessage = { message ->
        message.reply(
            text = "Hello, *${message.from?.firstName ?: "stranger"}*!",
            parseMode = ParseMode.MARKDOWN,
            replyMarkup = inlineKeyboard {
                buttonLink("Telegram", "https://telegram.org")
                row {
                    button("Bot", "bot")
                    button("API", "api")
                }
            }
        )
    },
    onCallbackQuery = { callbackQuery ->
        when (callbackQuery.data) {
            "bot" -> callbackQuery.answer("🤖")
            "api" -> callbackQuery.answer("🚀")
            else -> callbackQuery.answer("🤷")
        }
    }
))

This example is packed with goodness! It might not be apparent, because a lot is hidden behind type inference. But can you spot a type-safe DSL in there? How about a suspendable callback with a context receiver?
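If the builder pattern is new to you, here is a minimal sketch of how a DSL shaped like inlineKeyboard could be put together with lambdas with receivers. It is not the BotKit implementation: Button, Keyboard, RowBuilder, KeyboardBuilder, and the KeyboardDsl marker are names made up for illustration, and only the call-site shape is taken from the example above.

```kotlin
// Hypothetical model types; the real BotKit classes may look different.
data class Button(val text: String, val url: String? = null, val callbackData: String? = null)
data class Keyboard(val rows: List<List<Button>>)

// @DslMarker keeps an inner lambda from implicitly calling methods of an outer builder.
@DslMarker
annotation class KeyboardDsl

@KeyboardDsl
class RowBuilder {
    private val buttons = mutableListOf<Button>()

    fun button(text: String, callbackData: String) {
        buttons.add(Button(text, callbackData = callbackData))
    }

    fun buttonLink(text: String, url: String) {
        buttons.add(Button(text, url = url))
    }

    fun build(): List<Button> = buttons.toList()
}

@KeyboardDsl
class KeyboardBuilder {
    private val rows = mutableListOf<List<Button>>()

    fun row(block: RowBuilder.() -> Unit) {
        rows.add(RowBuilder().apply(block).build())
    }

    // A button placed directly in the keyboard gets a row of its own.
    fun buttonLink(text: String, url: String) {
        rows.add(listOf(Button(text, url = url)))
    }

    fun build(): Keyboard = Keyboard(rows.toList())
}

fun inlineKeyboard(block: KeyboardBuilder.() -> Unit): Keyboard =
    KeyboardBuilder().apply(block).build()

fun main() {
    val keyboard = inlineKeyboard {
        buttonLink("Telegram", "https://telegram.org")
        row {
            button("Bot", "bot")
            button("API", "api")
        }
    }
    println(keyboard.rows.map { row -> row.map { it.text } }) // [[Telegram], [Bot, API]]
}
```

The type safety comes from the receivers: button is only available inside a row block, so a misplaced call fails at compile time rather than at runtime.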

Exploring and playing around with the sample is best done in an IDE. I will leave it as an exercise for a curious mind. The library is readily available on Maven Central.

In this post, I will focus on the structure of the API and the process of achieving it.

Long time in the making

I have been crafting Telegram bots for various purposes for more than 5 years. Ironically, I started my journey with the most advanced bot I have to date — Laterator.

This reminder bot was created before Telegram supported scheduled messages. It has a different use case, though: any message acts as a reminder, and the bot will nag you about it until it is marked as done. I find this much more productive than having a reading list that I never open.

[Image: Laterator intro]

[Image: Laterator chat]

Any message type Telegram supports works as a reminder: texts, pictures, channel posts, forwarded messages, etc. The trick is that the bot stores zero content, and only operates with message ids.

Security by lazynuity

The coolest part, though, is the text-based interface to snooze your reminders. Using Parsus, I created a grammar that can parse most human-writable time formats. A gentle introduction to this is in a post about parsing time. I will probably open-source the whole thing at some point.

While developing Laterator, I was building the client for the Bot API ad-hoc, serving the immediate needs of the bot. It grew organically, and was neither uniform nor complete. As I started to create more bots, I realized that it would be better to have a standalone library to separate the concerns. In some form, it lived in a monorepo for a long time. But I wanted to share it with the world!

From specks to spec

Telegram does not provide a machine-readable spec for their Bot API. Honestly, I suspect that this is done on purpose, and not due to the lack of resources. I have a number of competing theories as to why that is, but I will leave them for another day.

The only source of truth for the Bot API is the webpage that documents all the available methods and data types. This page does not provide the versioned view of the API, and instead shows only the latest state.

Side note. The last bit at least makes sense. They only have one observable production deployment of their Bot infrastructure. It runs the latest version they deployed, and there is no way to "keep using" a previous one.

On the webpage, each method and data type used in the responses is documented as a section with a table inside.

[Image: Telegram Bot API webpage]

For my library to be complete, I would have to support everything that’s described on that page. For the library to be up-to-date, I would need to quickly release a new version after Telegram publishes changes to the Bot API. This means at least some part of the API extraction process has to be automated.

Relying entirely on automation, however, is dangerous in this case. The “spec” is far from formal, and sometimes inconsistent. What is required, then, is a set of tools that aid me with migrations of the BotKit library. I can then manually verify that each migration is correct.

The way I chose to do it is via Kotlin source code generation. After each regeneration of the API sources, I can review them as a regular PR. Checking them into version control also gives the benefit of built-in change history.

I implemented all the required steps using Gradle, directly in the build logic of the library.

  1. Downloading the new version of the API description as an HTML page
  2. Extracting a model from the API
  3. Generating Kotlin sources that correspond to the data types and methods

Parsing the model from rendered HTML was an interesting task in itself, and possibly deserves a post of its own. I even found a place to sneak in a Parsus grammar there.
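To give a feel for the last step, here is a rough sketch of turning an extracted model into Kotlin source text. The FieldSpec and TypeSpec classes and the generateDataClass function are assumptions made for this illustration; the actual BotKit build logic is not shown in this post and certainly handles more cases (docs, sealed hierarchies, value classes).

```kotlin
// Hypothetical intermediate model produced by the extraction step.
data class FieldSpec(val apiName: String, val kotlinType: String, val required: Boolean)
data class TypeSpec(val name: String, val fields: List<FieldSpec>)

// Bot API field names are snake_case; Kotlin properties are camelCase.
fun String.snakeToCamel(): String =
    split('_').mapIndexed { index, part ->
        if (index == 0) part else part.replaceFirstChar { it.uppercase() }
    }.joinToString("")

// Renders one data type as Kotlin source code.
fun generateDataClass(spec: TypeSpec): String = buildString {
    appendLine("@Serializable")
    appendLine("data class ${spec.name}(")
    for (field in spec.fields) {
        val name = field.apiName.snakeToCamel()
        // Optional fields become nullable properties with a null default.
        val type = if (field.required) field.kotlinType else "${field.kotlinType}? = null"
        appendLine("    val $name: $type,")
    }
    append(")")
}

fun main() {
    val message = TypeSpec(
        name = "Message",
        fields = listOf(
            FieldSpec("message_id", "MessageId", required = true),
            FieldSpec("from", "User", required = false),
        ),
    )
    println(generateDataClass(message))
}
```

Because the output is plain source text, every regeneration shows up as an ordinary diff that can be reviewed line by line.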

Hydra of BotKit

My goal with this library is to provide a minimal level of abstraction. It should be relatively transparent what the library does with each request to the upstream API. At the same time, I want to provide the best developer experience by anticipating the common needs.

This is naturally done by separating the API into low- and high-level layers. In general, the best APIs on this planet implement the high-level layer using only the abstractions they make available to their users. This ensures that users can extend the higher layer, or even define their own that is more suitable for their needs.

The BotKit follows exactly this strategy.

[Image: BotKit API generation]

The lower level consists of the generated sources, declaring:

  1. Data types that only carry the data and no logic
  2. HTTP request methods that only make the request, dealing with serialization

The higher level has sub-layers building on top of each other:

  1. Fail-safe methods that allow handling semantic errors provided in Telegram’s response
  2. Happy-path methods that remove the noise of unwrapping
  3. Semantic helper methods that act as shortcuts for the common use cases

Let’s take a look at all of them in detail.

Data types

Data types are generated as Kotlin’s data classes — a natural fit.

@Serializable
data class Message(
    val messageId: MessageId,
    val date: UnixTimestamp,
    val chat: Chat,
    val from: User? = null,
    // ...
)

The @Serializable annotations come from kotlinx.serialization. Using this library ensures support for Kotlin features such as nullability and inline value classes.

Some of the classes like InputMedia form sealed type hierarchies to differentiate between cases. A number of primitive fields get wrapped into inline value classes like UnixTimestamp to provide better semantics and stronger type-safety.
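As a small illustration of both ideas, here is a sketch with simplified stand-ins. The type names follow the ones mentioned above, but the property names (seconds, value, media, duration) and the describe function are assumptions for this example, not the generated BotKit declarations.

```kotlin
// Inline value classes: distinct types for the compiler, wrapping a plain Long.
@JvmInline
value class UnixTimestamp(val seconds: Long)

@JvmInline
value class MessageId(val value: Long)

// Sealed hierarchy: the compiler knows every subtype, so `when` needs no `else` branch.
sealed interface InputMedia {
    val media: String
}

data class InputMediaPhoto(override val media: String) : InputMedia
data class InputMediaVideo(override val media: String, val duration: Int? = null) : InputMedia

fun describe(item: InputMedia): String = when (item) {
    is InputMediaPhoto -> "photo ${item.media}"
    is InputMediaVideo -> "video ${item.media} (${item.duration ?: "?"}s)"
}

fun main() {
    val sentAt = UnixTimestamp(1_700_000_000)
    // val wrong: UnixTimestamp = MessageId(42) // does not compile: the types are not interchangeable
    println(sentAt.seconds)
    println(describe(InputMediaVideo("file-id", duration = 12)))
}
```

Mixing up two Long-based values, such as a message id and a timestamp, becomes a compile-time error instead of a subtle runtime bug.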

The data classes also don’t have any domain-specific methods defined on them. Instead, all the domain methods are defined as extension functions, as described further. This is to ensure that the full surface of the API is extensible by users in the same way the library itself does it.

The generation process also makes sure to bring all the official docs right into the IDE.

[Image: Data types kdocs]

In today’s nice Telegram Bot API 6.9, the Message data type has a whopping 72 fields! However, in practice, only a few of these fields will have a value. Because of this, the BotKit provides a user-friendly version of toString() for all data classes that includes only the fields with a value.
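One way such a toString() could work is sketched below. The compactToString helper is hypothetical and the Message class is cut down to four fields; how the BotKit generates its own version is not covered in this post.

```kotlin
// Hypothetical helper: renders only the properties that actually carry a value.
fun compactToString(typeName: String, vararg properties: Pair<String, Any?>): String =
    properties
        .filter { (_, value) -> value != null }
        .joinToString(prefix = "$typeName(", postfix = ")") { (name, value) -> "$name=$value" }

// A cut-down stand-in for the generated class.
data class Message(
    val messageId: Long,
    val text: String? = null,
    val caption: String? = null,
    val editDate: Long? = null,
) {
    // Overrides the toString() that a data class would otherwise generate.
    override fun toString(): String = compactToString(
        "Message",
        "messageId" to messageId,
        "text" to text,
        "caption" to caption,
        "editDate" to editDate,
    )
}

fun main() {
    println(Message(messageId = 42, text = "hey")) // Message(messageId=42, text=hey)
}
```

The default data class toString() would print every null field as well, which quickly becomes unreadable in logs for a type of this size.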

HTTP requests

For each method in the Bot API spec, there is a lower level function that does an actual HTTP call with the payload.

suspend fun TelegramBotApiClient.trySendMessage(
    requestBody: SendMessageRequest
): TelegramResponse<Message> =
    executeRequest("sendMessage", requestBody) {
        httpClient.post {
            url {
                protocol = apiProtocol
                host = apiHost
                port = apiPort
                path("bot$apiToken", "sendMessage")
            }
            contentType(ContentType.Application.Json)
            setBody(requestBody)
        }.body()
    }

It is a suspend function because it makes a remote call, which lets the caller decide whether to wait for the result or run it asynchronously. Under the hood, it uses the Ktor framework.

This function also returns a TelegramResponse<T> — a generic type for the Bot API response that accounts for the case of semantic errors. Examples of such errors include usage of an already deleted message, sending a message to a user who blocked the bot, etc.
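A simplified stand-in for such a response envelope might look like the sketch below. The ok, result, description, and error_code fields mirror the JSON that the Bot API returns, but the exact shape of the BotKit class and the TelegramApiException name are assumptions made for this example.

```kotlin
// Simplified envelope; the real class is serializable and may carry more fields.
data class TelegramResponse<T>(
    val ok: Boolean,
    val result: T? = null,
    val description: String? = null,
    val errorCode: Int? = null,
)

// Hypothetical exception type for semantic errors reported by Telegram.
class TelegramApiException(val errorCode: Int?, description: String?) :
    RuntimeException("Bot API error $errorCode: $description")

// Happy-path unwrapping: return the payload or fail loudly.
fun <T : Any> TelegramResponse<T>.getResultOrThrow(): T {
    if (!ok) throw TelegramApiException(errorCode, description)
    return result ?: throw TelegramApiException(errorCode, "ok response without a result")
}

fun main() {
    val blocked = TelegramResponse<String>(
        ok = false,
        description = "Forbidden: bot was blocked by the user",
        errorCode = 403,
    )
    // Fail-safe style: inspect the response without exceptions.
    println(blocked.result ?: "failed with code ${blocked.errorCode}")
    // Happy-path style: unwrap, and handle the exception where it makes sense.
    println(runCatching { blocked.getResultOrThrow() }.isFailure)
}
```

Keeping the envelope as a plain value lets the caller choose between inspecting the error and unwrapping with an exception.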

Next level

Using the lower level methods would create a lot of noise in the user code. That’s why the BotKit provides 4 overloads for each method.

A method that does not require manually creating a data class instance, but still allows for error handling:

// API
suspend fun TelegramBotApiClient.trySendMessage(
    chatId: ChatId,
    text: String,
    parseMode: ParseMode? = null,
    // ...
): TelegramResponse<Message> =
    trySendMessage(SendMessageRequest(chatId, text, parseMode, /*...*/))


// Usage
suspend fun usage(client: TelegramBotApiClient, chatId: ChatId) {
    val response = client.trySendMessage(chatId, "hey")
    response.result?.let { message ->
        println(message)
    }
}

A happy-path method that unwraps the response, throwing an exception when it is not ok:

// API
suspend fun TelegramBotApiClient.sendMessage(
    chatId: ChatId,
    text: String,
    parseMode: ParseMode? = null,
    // ...
): Message =
    trySendMessage(SendMessageRequest(chatId, text, parseMode, /*...*/)).getResultOrThrow()


// Usage
suspend fun usage(client: TelegramBotApiClient, chatId: ChatId) {
    val message = client.sendMessage(chatId, "hey")
    println(message)
}

And now it gets a bit more interesting.

You can see that the methods above require the client type to be a receiver in the function declaration. But we want an API that allows making semantic calls like message.reply("hey"). There, the receiver has to be the message. So the client instance would have to come through the regular argument door for those methods, and through the receiver door for others. It would make the API inconsistent.

This is where the Context Receivers come in. They allow us to do exactly what we need: keep the receiver position vacant, while still providing access to the client instance in the function body.

// API
context(TelegramBotApiContext)
suspend fun sendMessage(
    chatId: ChatId,
    text: String,
    parseMode: ParseMode? = null,
    // ...
): Message =
    botApiClient.trySendMessage(SendMessageRequest(chatId, text, parseMode, /*...*/)).getResultOrThrow()


// Usage
context(TelegramBotApiContext)
suspend fun usage(chatId: ChatId) {
    val message = sendMessage(chatId, "hey")
    println(message)
}

For completeness, there is also one more overload that uses the context instead of a receiver, but does not unwrap the TelegramResponse.

The context type is relatively trivial and follows the Context Receivers best practices. It only exposes the botApiClient member seen in the sendMessage implementation.

interface TelegramBotApiContext {
    val botApiClient: TelegramBotApiClient
}

Semantic methods

Now that we have all the building blocks in place, we can provide some methods that are not part of the spec, but are generally useful for bot implementations.

// API
context(TelegramBotApiContext)
suspend fun Message.reply(
    text: String,
    parseMode: ParseMode? = null,
    // ...
): Message =
    sendMessage(chat.id, text, parseMode, /*...*/)


// Usage
context(TelegramBotApiContext)
suspend fun usage(incoming: Message) {
    val replyMessage = incoming.reply("hey")
    println(replyMessage)
}

Notice how the reply declaration no longer requires a chatId: it takes the chat id from the message being replied to.

There are many more semantic methods provided by the BotKit. Here are just a few examples.

  • Working with messages:
    • Message.replyMarkdown() and Message.replyHtml()
    • Message.forward() and Message.editText()
  • Working with chats:
    • Chat.sendMessage() and ChatId.sendMessage()
  • Working with queries:
    • CallbackQuery.answer() and InlineQuery.answer()

You can easily discover all of them via auto-completion:

[Image: Semantic methods in auto-completion]

The best part, though, is that you can just as easily introduce your own!

For instance, you might have to frequently edit the message text to append new content to it. This is how you can implement it yourself:

// User-defined semantic method
context(TelegramBotApiContext)
suspend fun Message.appendText(
    additionalText: String,
): Message =
    editText(text = (this.text ?: "") + additionalText)


// Usage
context(TelegramBotApiContext)
suspend fun usage(message: Message) {
    val updatedMessage = message.appendText("UPD: bots are cool!")
    println(updatedMessage)
}

Futures and options

For the Telegram BotKit library, this is only the beginning.

There are a number of interesting things that would make the library even better:

  • File download and upload handlers
  • Inline keyboard dialogs DSL
  • Scenario-based and local integration testing

I plan to keep evolving the library, and one day bring it to a stable release.

Conclusion

This post introduced a new kit on the block of Telegram Bot API wrappers — the Telegram BotKit for Kotlin.

In this article, I shared how this library came to be, including the challenges and the approach to the API creation. I described the structure of the API in detail, and explained how the library’s extensibility is supported by the lower and higher level separation of API elements.

The API provides at least 4 flavors of each method to satisfy most needs. Some of them are powered by Kotlin’s experimental language feature called Context Receivers. Semantic methods are built on top of this, making the user code even more readable and concise.

I hope you enjoyed the tour! I am looking forward to your feedback and feature requests.