There are times when using your computer or working online requires you to select a date and time. Tasks like choosing a doctor’s appointment slot, scheduling an email, or setting a reminder fall into this category.

In such cases, you’re typically presented with a month-view calendar widget to find the right day, followed by an option to pick a time. Mobile interfaces can get more elaborate: a cogwheel-style skeuomorphic time picker on iOS or a wall-clock-like widget on Android. They get the job done, but in a sense, they are all very mechanical.

If you compare this to how people describe time to each other, the difference is like simply tapping on your friend’s name to make a phone call versus punching in an actual phone number. Both methods yield the same result, but one is significantly more human-friendly.

Languages, English included, accommodate a myriad of ways to describe dates and times. Sometimes, it’s actually more convenient to describe time in text form instead of going through the process of clicking and scrolling. Some apps recognize that and provide users with this capability. Examples include date chips in Google Docs, reminders in Slack, and scheduled tweets in Typefully.

Date chips in Google Docs

Scheduling a tweet on typefully.com

Outline

In this series of posts, we will set out to create a library that is able to parse every common way of describing time in English. Doing the same for the language of your choice should not be too complicated afterward, and would be a take-home assignment.

We will compose our date-time parser in Kotlin using the Parsus library. I created Parsus to explore use cases exactly like these. If you want to learn more, you can watch my talk at KotlinConf 2023. There, I explain how Parsus leverages coroutines to gain some unique features.

Because Parsus is multiplatform, we will be able to use our new parser anywhere: on the backend, in native apps, and even in the browser. In fact, there is an interactive playground embedded right in this post that uses the parser compiled to JavaScript via Kotlin/JS.

The full source code of the parser is also available on GitHub.

Modeling expectations

There are indeed a lot of ways to describe time in English. But we can always start small and parse only the simplest formats, setting up the scene for later expansion.

For the scope of this post, let’s assume we want to support these formats:

  • Date: 2023-08-05, 10/12/2023
  • Time: 12:34, 5:25 pm

We can model them in a few lines of Kotlin:

data class MiniDateTime(val date: MiniDate? = null, val time: MiniTime? = null) {
    constructor(time: MiniTime) : this(null, time)
}

data class MiniDate(val year: Int, val month: Int = 1, val dayOfMonth: Int = 1)

data class MiniTime(val hour: Int, val minute: Int = 0)

We’ll let the business logic in the production code decide what it means when either a date or a time isn’t specified. The meaning could depend on the actual time context in which the model is evaluated. For instance, if it’s 10am in the user’s timezone, then a reminder for 5pm is for today, while a reminder for 8am is for tomorrow.
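As an illustration of that kind of business logic, here is a minimal sketch of resolving a time-only value against the current moment. The helper name nextOccurrence is hypothetical and not part of the parser, MiniTime is repeated only to keep the snippet self-contained, and the use of java.time assumes a JVM target.

```kotlin
import java.time.LocalDateTime

// Repeated from the model above so the sketch compiles on its own.
data class MiniTime(val hour: Int, val minute: Int = 0)

// Hypothetical helper: picks the next moment at which the given time occurs.
// If that time has already passed today (or is exactly now), it rolls over to tomorrow.
fun nextOccurrence(time: MiniTime, now: LocalDateTime): LocalDateTime {
    val today = now.toLocalDate().atTime(time.hour, time.minute)
    return if (today.isAfter(now)) today else today.plusDays(1)
}

fun main() {
    val now = LocalDateTime.of(2023, 8, 5, 10, 0) // 10am in the user's timezone
    println(nextOccurrence(MiniTime(17), now)) // 2023-08-05T17:00
    println(nextOccurrence(MiniTime(8), now))  // 2023-08-06T08:00
}
```

Keeping this decision outside the parser means the same parsed model can be interpreted differently by a reminder app, a calendar, or a log search.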

It’s time to start

Using the Parsus framework, we define our parser in a grammar. Initially, we want to parse time in a 24H format, i.e. strings like 12:34.

You can think of a grammar as a decomposition of a regex, where each component is type-safe and reusable. The most basic components are called tokens, and the more complex ones are called parsers.

class MiniDateTimeGrammar : Grammar<MiniDateTime>() {

    val digit12 by regexToken("\\d{1,2}") map { it.text.toInt() }

    val colon by literalToken(":")

    val time24: Parser<MiniTime> by parser { TODO() }

    override val root by time24 map { MiniDateTime(it) }
}

There is already a lot going on here, so let’s break it down. Inside a grammar class, we declare two tokens and two parsers:

  • digit12 token parses strings that consist of one or two digits and converts them to a number;
  • colon token gives a name to the : character, so it can be referenced later;
  • time24 is our first workhorse parser that we will implement in a second;
  • root parser override is required by the Grammar to denote an entry point for parsing.

Now, let’s compose time24.

Parsus allows writing parsers as simple procedural code. Whenever we need to parse the next thing inside a parser {} block, we just invoke the necessary parser as if it were a function. Incidentally, tokens are also parsers!

val time24: Parser<MiniTime> by parser { // this: ParsingScope
    val hours: Int = digit12()
    colon() // ignored result
    val minutes: Int = digit12()
    MiniTime(hours, minutes)
}

It works exactly like it reads. We parse the first number, then a colon, then another number. Finally, we return the parsed result as our model.

As a side note, none of the explicit type declarations are required here, but I include them so that the code is easier to follow. The same applies to most of the code snippets in this post.

With that, we can already use the grammar to parse our first input:

fun main() {
    val input = "12:34"
    val result = MiniDateTimeGrammar().parseOrThrow(input)
    println(result)
    // MiniDateTime(date=null, time=MiniTime(hour=12, minute=34))
}

Parser combinators

Writing parsers as procedural code can be useful, because they stay readable, debuggable and extremely flexible. However, when a parser is simple enough, we can also use higher-order functions on parsers to combine them.

Here is how we can rewrite the time24 parser from above.

val time24: Parser<MiniTime>
    by digit12 and ignored(colon) and digit12 map
        { (h: Int, m: Int) -> MiniTime(h, m) }

Here, we use the parser combinators and and ignored to define the exact same parser. It still parses three fragments, ignoring the result of the second one. The combined type of the and applications is Parser<Tuple2<Int, Int>>, which is Parsus's representation of a pair of values. It can be destructured to extract the values, as shown in the last call to map.

Given how handy it is to combine parsers this way, Parsus also provides an even more concise syntax based on Kotlin’s * (times) and - (unaryMinus) operators:

val time24: Parser<MiniTime>
    by digit12 * -colon * digit12 map
        { (h: Int, m: Int) -> MiniTime(h, m) }

This is the notation we are going to use in the rest of the post.
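To give a feel for how such a notation can be built, here is a toy sketch of overloading Kotlin's times and unaryMinus operators on a hand-rolled parser type. Everything in it (Toy, Skipped, regex, map) is a hypothetical stand-in written for this illustration. It is not how Parsus is implemented, and it returns a plain Pair rather than a Tuple2.

```kotlin
// Toy parser: returns the parsed value and the next offset, or null on failure.
fun interface Toy<out T> {
    fun parse(input: String, from: Int): Pair<T, Int>?
}

// Wrapper marking a parser whose result should be dropped.
class Skipped(val inner: Toy<*>)

// -parser marks the parser as skipped.
operator fun Toy<*>.unaryMinus(): Skipped = Skipped(this)

// left * -right: run both in sequence, keep only the left value.
operator fun <T> Toy<T>.times(right: Skipped): Toy<T> = Toy<T> { input, from ->
    parse(input, from)?.let { (value, mid) ->
        right.inner.parse(input, mid)?.let { (_, end) -> value to end }
    }
}

// left * right: run both in sequence, keep both values as a Pair.
operator fun <A, B> Toy<A>.times(right: Toy<B>): Toy<Pair<A, B>> =
    Toy<Pair<A, B>> { input, from ->
        parse(input, from)?.let { (a, mid) ->
            right.parse(input, mid)?.let { (b, end) -> (a to b) to end }
        }
    }

// Transforms the parsed value, leaving the offset untouched.
infix fun <T, R> Toy<T>.map(transform: (T) -> R): Toy<R> = Toy<R> { input, from ->
    parse(input, from)?.let { (value, end) -> transform(value) to end }
}

// Matches the pattern only if it starts exactly at the requested offset.
fun regex(pattern: String): Toy<String> {
    val compiled = Regex(pattern)
    return Toy<String> { input, from ->
        compiled.find(input, from)
            ?.takeIf { it.range.first == from }
            ?.let { it.value to it.range.last + 1 }
    }
}

val digit12: Toy<Int> = regex("""\d{1,2}""") map { it.toInt() }
val colon: Toy<String> = regex(":")
val time24: Toy<Pair<Int, Int>> = digit12 * -colon * digit12

fun main() {
    println(time24.parse("12:34", 0)) // ((12, 34), 5)
    println(time24.parse("12-34", 0)) // null
}
```

The unary minus binds tighter than the multiplication, which is why -colon is wrapped before the sequence is assembled.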

We need more time

Now, let’s extend the time format and support inputs such as 5pm or 10:30 am.

These are the new tokens we are adding at the beginning of the grammar:

val am by literalToken("am")
val pm by literalToken("pm")

val ws by regexToken("\\s+") // whitespace

And now the new parsers:

val afterNoon: Parser<Int> by am or pm map
    { if (it.token == pm) 12 else 0 }

// h % 12 maps 12 am to hour 0 and 12 pm to hour 12 before the offset is added
val timeAmPm: Parser<MiniTime>
    by digit12 * maybe(-colon * digit12) * -maybe(ws) * afterNoon map
        { (h: Int, m: Int?, amPm: Int) -> MiniTime(h % 12 + amPm, m ?: 0) }

The afterNoon parser will parse the am / pm ending and compute an offset for the hour. The timeAmPm parser is effectively a type-safe version of the following regex: \d{1,2}(\:\d{1,2})?\s*(am|pm).

Here we also see two new parser combinators:

  • or in afterNoon combines alternatives: if the first one fails, the next one is tried;
  • maybe falls back to null whenever the underlying parser fails.
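For comparison, here is a sketch of what the plain regex route could look like using only the Kotlin standard library. The function parseAmPm is a hypothetical name, MiniTime is repeated to keep the snippet self-contained, and the hour conversion assumes the usual convention that 12 am is midnight and 12 pm is noon.

```kotlin
// Repeated from the model above so the sketch compiles on its own.
data class MiniTime(val hour: Int, val minute: Int = 0)

// Same shape as the regex quoted above, with capture groups for each piece.
val amPmRegex = Regex("""(\d{1,2})(?::(\d{1,2}))?\s*(am|pm)""")

// Hypothetical helper: returns null when the whole input does not match.
fun parseAmPm(input: String): MiniTime? {
    val match = amPmRegex.matchEntire(input) ?: return null
    // An optional group that did not participate is reported as an empty string.
    val (hour, minute, suffix) = match.destructured
    val offset = if (suffix == "pm") 12 else 0
    return MiniTime(hour.toInt() % 12 + offset, minute.ifEmpty { "0" }.toInt())
}

fun main() {
    println(parseAmPm("5pm"))      // MiniTime(hour=17, minute=0)
    println(parseAmPm("10:30 am")) // MiniTime(hour=10, minute=30)
    println(parseAmPm("12:34"))    // null
}
```

The regex works, but the groups are addressed by position and everything is a string until converted by hand, which is the part the grammar makes type-safe and reusable.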

Now we can bring the two time formats together:

val time: Parser<MiniTime> by timeAmPm or time24

override val root by time map { MiniDateTime(it) }

The alternatives must be in this order: timeAmPm or time24. Otherwise, parsing might fail, as with the input 5:24pm. If time24 came first, it would successfully parse 5:24 as a valid time, but there would be no continuation to parse the trailing pm.
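The effect of ordering can be reproduced with a tiny hand-rolled model of ordered choice. This is only a sketch that mimics the behavior described above: MiniParser, regexAt, orElse and parsesEntire are hypothetical names for this illustration and are not Parsus APIs.

```kotlin
// A stand-in for a parser: given the input and a start offset,
// return the matched text and the next offset, or null on failure.
typealias MiniParser = (input: String, from: Int) -> Pair<String, Int>?

// Matches the pattern only if it starts exactly at the requested offset.
fun regexAt(pattern: String): MiniParser {
    val regex = Regex(pattern)
    return { input, from ->
        regex.find(input, from)
            ?.takeIf { it.range.first == from }
            ?.let { it.value to it.range.last + 1 }
    }
}

// Ordered choice: the second alternative is tried only if the first one fails.
infix fun MiniParser.orElse(second: MiniParser): MiniParser = { input, from ->
    this(input, from) ?: second(input, from)
}

// Succeeds only if the parser consumes the entire input.
fun MiniParser.parsesEntire(input: String): Boolean =
    this(input, 0)?.second == input.length

val time24 = regexAt("""\d{1,2}:\d{1,2}""")
val timeAmPm = regexAt("""\d{1,2}(:\d{1,2})?\s*(am|pm)""")

fun main() {
    println((timeAmPm orElse time24).parsesEntire("5:24pm")) // true
    println((time24 orElse timeAmPm).parsesEntire("5:24pm")) // false: "pm" is left over
    println((timeAmPm orElse time24).parsesEntire("5:24"))   // true: falls back to time24
}
```

A common rule of thumb for ordered choice is to put the longer, more specific alternative first.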

Juicy dates

Next up is parsing dates like 2023-08-05, 10/12/2023.

We know the drill already: new tokens, new parsers, nice compositions.

val digit4 by regexToken("\\d{4}") map { it.text.toInt() }
val digit12 by regexToken("\\d{1,2}") map { it.text.toInt() }

val dash by literalToken("-")
val slash by literalToken("/")

val dateSep by dash or slash

val yyyyMMdd: Parser<MiniDate>
    by digit4 * -dateSep * digit12 * -dateSep * digit12 map
        { (y: Int, m: Int, d: Int) -> MiniDate(y, m, d) }

val ddMMyyyy: Parser<MiniDate>
    by digit12 * -dateSep * digit12 * -dateSep * digit4 map
        { (d: Int, m: Int, y: Int) -> MiniDate(y, m, d) }

val date: Parser<MiniDate> by yyyyMMdd or ddMMyyyy

val dateTime: Parser<MiniDateTime>
    by maybe(date) * -maybe(ws) * maybe(time) map
        { (d: MiniDate?, t: MiniTime?) -> MiniDateTime(d, t) }

override val root by dateTime

We combine the date and time formats together in the dateTime parser. It is also good practice to delegate root to a named parser instead of putting logic there.

The grammar is finally ready for some real use-cases:

fun main() {
    fun parse(s: String) = MiniDateTimeGrammar().parseOrThrow(s)
    println(parse("09/01/2007 9:42 am"))
    println(parse("2077-12-10 5:25"))
}

Chickens, eggs, years

We may never know what came first, but what we know for sure is that in different countries people believe different things. Does 09/01/2007 represent January 9th or September 1st?

Luckily, our grammar is regular Kotlin code, so we can easily support both cases by adding a parameter:

class MiniDateTimeGrammar(
    private val dayThenMonth: Boolean = true,
) : Grammar<MiniDateTime>() {
    // ...

    val d2d2yyyy: Parser<MiniDate>
          by digit12 * -dateSep * digit12 * -dateSep * digit4 map
              { (v1: Int, v2: Int, y: Int) ->
                  if (dayThenMonth) MiniDate(y, month = v2, v1) else MiniDate(y, month = v1, v2)
              }

    val date: Parser<MiniDate> by yyyyMMdd or d2d2yyyy

    // ...
}

When creating an instance of the grammar, we can infer the parameter value from the user’s locale. The beauty of it is that this change is not entangled with the rest of the parsing logic.
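One possible way to infer that value on the JVM is to look at the locale's short date pattern and check which field letter comes first. This is a sketch under assumptions: the helper isDayBeforeMonth is hypothetical, it relies on java.time (so it does not apply to other Kotlin targets), and it falls back to the grammar's default when the pattern is not recognized.

```kotlin
import java.time.chrono.IsoChronology
import java.time.format.DateTimeFormatterBuilder
import java.time.format.FormatStyle
import java.util.Locale

// Hypothetical helper: guesses whether the locale writes the day before the month
// by inspecting its short date pattern, e.g. "M/d/yy" or "dd/MM/y".
fun isDayBeforeMonth(locale: Locale): Boolean {
    val pattern: String = DateTimeFormatterBuilder.getLocalizedDateTimePattern(
        FormatStyle.SHORT, null, IsoChronology.INSTANCE, locale
    )
    val day = pattern.indexOf('d')
    val month = pattern.indexOf('M')
    // Unrecognized pattern: fall back to the grammar's default (day first).
    if (day < 0 || month < 0) return true
    return day < month
}

fun main() {
    println(isDayBeforeMonth(Locale.US)) // false
    println(isDayBeforeMonth(Locale.UK)) // true
}
```

The result could then be passed along when constructing the grammar, for example MiniDateTimeGrammar(dayThenMonth = isDayBeforeMonth(Locale.getDefault())).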

We can also test just one of the sub-parsers in our grammar:

fun main() {
    fun parseEU(s: String) =
        MiniDateTimeGrammar(dayThenMonth = true).run { parseOrThrow(d2d2yyyy, s) }
    fun parseUS(s: String) =
        MiniDateTimeGrammar(dayThenMonth = false).run { parseOrThrow(d2d2yyyy, s) }

    println(parseEU("09/01/2007"))
    // MiniDate(year=2007, month=1, dayOfMonth=9)

    println(parseUS("09/01/2007"))
    // MiniDate(year=2007, month=9, dayOfMonth=1)
}

Interactive playground

The icing on the cake when creating a Parsus grammar is that it can be compiled to Kotlin/JS and executed in the browser. Since I am building my blog with Astro, I can embed custom components that use JavaScript directly in the blog posts. This is exactly what I did using the grammar we just created!

Type something in the input field that follows the formats we support! You can then click the Parse button or hit Enter to view the parsed output. Remember that at this point we support a very limited number of formats. You can find them sprinkled around the post.

MiniTime Parser Playground

Go back to the Outline or see the full parser source code on GitHub.

Conclusion

In this post, we started on a journey to create a multiplatform Kotlin library for parsing free-form date and time descriptions. We defined a simple data model for absolute date-time values. We learned how to declare a Parsus grammar, and how to iterate on it, expanding the number of supported formats.

The final grammar in this step is able to parse inputs such as 09/01/2007 9:42 am or 2077-12-10 5:25. As a bonus, it also supports parsing the day or the month first in the 09/01 notation, depending on the user preference.

Finally, we discovered that it’s possible to compile Parsus grammars to JavaScript and even incorporate them into widgets embedded in blog posts. Thanks to the magic of Kotlin Multiplatform and Astro!

In future posts, we will explore relative and semantic definitions of time, such as in 2 days, next Sat morning or EOD on Aug 3rd.

I’m already counting the days!