Option of Sequences in Scala

Suppose you have a dataset whose schema contains nested structures with a few nullable fields. Such fields are often modelled as Options in Scala. The alternative is to skip the Options and deal with nulls instead of Nones, but who wants that? So when you have Options that contain Options that contain Sequences, how do you access them without making your code too verbose, or making your colleagues hate you or Scala? Let's create an example and walk through a few options, pun intended!

Example Structure 🧪

Let's assume the following nested structure:

type X = String
case class A(osx: Option[Seq[X]])
case class B(oa: Option[A])

Note that I picked the type names to match the letters that Scala's library uses for its type parameters, such as A and B in Option[A].flatMap[B].

So given b, an instance of B, our task is to retrieve the osx of A:

// Given instance b of B
val b = B( Some(A( Some(Seq("a1", "a2", "a3")) )) )

// Get the Seq[X] from A.osx
var xs: Seq[X] = Nil // placeholder, reassigned by each approach below

Useful Properties 🧰

Before we move on to our experiment, we should take note of the following property of Options:

// Option is like a list that contains 0 or 1 elements
val ob: Option[B] = Option(b)
val sb: Seq[B] = ob.toSeq

To be clear, having an Option of B means that your field contains either no object of type B or exactly one.

// Conceptually, toSeq on an Option[B] behaves like this function
def optionToSeq(ob: Option[B]): Seq[B] =
  ob match {
    case Some(b) => Seq(b)    // one element
    case None    => Seq.empty // no elements
  }

This is, of course, similar to having an Option of a sequence of type B, which means that your field contains either no sequence of type B or exactly one... sequence.

val osb: Option[Seq[B]] = Option(Seq(b))
val ssb: Seq[Seq[B]] = osb.toSeq

As we know, a sequence of a sequence of type B can be flattened to a sequence of type B:

val flat: Seq[B] = Seq(Seq(b)).flatten
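
To make these properties concrete, here is a minimal sketch that checks them on small values, including the empty cases. The object name and the Int sample values are assumptions made purely for illustration; they are not part of the original experiment.

```scala
object OptionProperties {
  // Hypothetical sample values, chosen only for illustration
  val some: Option[Int] = Some(1)
  val none: Option[Int] = None

  // An Option converts to a Seq with exactly one element, or with none
  val one: Seq[Int]   = some.toSeq // equals Seq(1)
  val empty: Seq[Int] = none.toSeq // equals Seq()

  // An Option of a Seq converts to a Seq holding at most one Seq...
  val nested: Seq[Seq[Int]] = Option(Seq(1, 2, 3)).toSeq // equals Seq(Seq(1, 2, 3))

  // ...which flattens to the inner elements
  val flat: Seq[Int] = nested.flatten // equals Seq(1, 2, 3)

  // A missing sequence flattens to an empty sequence
  val flatNone: Seq[Int] = Option.empty[Seq[Int]].toSeq.flatten // equals Seq()

  def main(args: Array[String]): Unit = {
    assert(one == Seq(1) && empty.isEmpty)
    assert(flat == Seq(1, 2, 3) && flatNone.isEmpty)
    println("all properties hold")
  }
}
```

The last value is the interesting one: a None at the outer level and an empty sequence at the inner level both end up as an empty Seq after flattening.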

First Approach 🔨

Everything looks like a nail when you hold a hammer...

In the first approach we will work only with Options.

// Work with Options through B to A and then getOrElse
xs = b.oa.flatMap(_.osx).flatMap(sx => Some(sx)).getOrElse(Nil) // or
xs = b.oa.flatMap(_.osx).flatMap(Some(_)).getOrElse(Nil)

// with types written down
xs = b                // B
  .oa                 // Option[A]
  .flatMap[Seq[X]](   // Option[A].flatMap[Seq[X]]
    a => a.osx        //   A => Option[Seq[X]]
  )                   // Option[Seq[X]]
  .flatMap[Seq[X]](   // Option[Seq[X]].flatMap[Seq[X]]
    sx => Some(sx)    // Seq[X] => Some[Seq[X]]
  )                   // Option[Seq[X]]
  .getOrElse[Seq[X]]( // Option[Seq[X]].getOrElse[Seq[X]]
    Nil               // default: => Seq[X]
  )                   // Seq[X]

Learning point: This approach is easy to follow because it uses Options all the way, but it is a bit more verbose than it needs to be. Nothing wrong here!
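
One source of that verbosity is the second flatMap: wrapping a value in Some inside flatMap returns the same Option it started with, so the step can arguably be dropped. The following is a minimal sketch of that shorter form, reusing the types from the example; the object name and the getXs helper are hypothetical names introduced here for illustration.

```scala
object FirstApproachShort {
  type X = String
  case class A(osx: Option[Seq[X]])
  case class B(oa: Option[A])

  // Same lookup as the first approach, without the flatMap(Some(_)) step
  // (getXs is a hypothetical helper name, not part of the original code)
  def getXs(b: B): Seq[X] = b.oa.flatMap(_.osx).getOrElse(Nil)

  def main(args: Array[String]): Unit = {
    val b = B(Some(A(Some(Seq("a1", "a2", "a3")))))
    assert(getXs(b) == Seq("a1", "a2", "a3")) // everything present
    assert(getXs(B(Some(A(None)))).isEmpty)   // osx missing
    assert(getXs(B(None)).isEmpty)            // A missing
    println(getXs(b))
  }
}
```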

Second Approach 🌪

In the second approach we can convert an Option into a Seq and flatten that in order to retrieve the sequence of Xs.

// Convert B.oa Option to Sequence, then apply a function in flatMap to get the A.osx
xs = b.oa.toSeq.flatMap(a => a.osx.toSeq.flatMap(sx => sx)) // or
xs = b.oa.toSeq.flatMap(a => a.osx.toSeq.flatten) // or
xs = b.oa.toSeq.flatMap(_.osx.toSeq.flatten)

// with types written down (flatMap[X, Seq[X]] is the Scala 2.12 signature; Scala 2.13+ takes only flatMap[X])
xs = b                 // B
  .oa                  // Option[A]
  .toSeq               // Seq[A]
  .flatMap[X, Seq[X]]( // Seq[A].flatMap[X, Seq[X]]
    a =>               //   A => Seq[X]
      a.osx            //     Option[Seq[X]]
      .toSeq           //       Seq[Seq[X]]
      .flatten[X]      //         Seq[Seq[X]].flatten[X]
  )                    // Seq[X]

Learning point: This approach deals only with Sequences, but it complicates the conversion because the function passed to flatMap is nested more deeply.

Third Approach 🌈

In this approach, we will work with a smaller nested function and convert to Seq only at the end.

// Work with Options to get A.osx, convert to Seq of Seq and finally flatten it
xs = b.oa.flatMap(a => a.osx).toSeq.flatten // or
xs = b.oa.flatMap(_.osx).toSeq.flatten

// with types written down
xs = b              // B
  .oa               // Option[A]
  .flatMap[Seq[X]]( // Option[A].flatMap[Seq[X]]
    a => a.osx      //   A => Option[Seq[X]]
  )                 // Option[Seq[X]]
  .toSeq            // Seq[Seq[X]]
  .flatten          // Seq[X]

Learning point: This approach is simpler and less verbose because it uses Options to reach the required field, passes a smaller function to flatMap, and involves less complicated types.
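
As a sanity check, the three approaches can be compared on inputs where a value is missing at each level. This is a minimal sketch under stated assumptions: the object name, the first/second/third helper names and the sample inputs are all hypothetical, and the method bodies are copied from the one-line variants above.

```scala
object CompareApproaches {
  type X = String
  case class A(osx: Option[Seq[X]])
  case class B(oa: Option[A])

  // The three approaches, wrapped in hypothetical helper methods
  def first(b: B): Seq[X]  = b.oa.flatMap(_.osx).flatMap(Some(_)).getOrElse(Nil)
  def second(b: B): Seq[X] = b.oa.toSeq.flatMap(_.osx.toSeq.flatten)
  def third(b: B): Seq[X]  = b.oa.flatMap(_.osx).toSeq.flatten

  // Hypothetical inputs covering each level at which a value can be missing
  val samples: Seq[B] = Seq(
    B(Some(A(Some(Seq("a1", "a2", "a3"))))), // everything present
    B(Some(A(Some(Seq.empty[X])))),          // sequence present but empty
    B(Some(A(None))),                        // A present, osx missing
    B(None)                                  // A missing
  )

  def main(args: Array[String]): Unit =
    samples.foreach { b =>
      // All three approaches should agree on every input
      assert(first(b) == second(b) && second(b) == third(b))
      println(s"$b -> ${third(b)}")
    }
}
```

Note that all three collapse "missing" and "present but empty" into the same empty Seq, which is usually what you want when you only care about the elements.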

Conclusion 📣

The recommendation here is to use Options to reach the field, keep the functions passed to flatMap small, and only at the end convert the Option to a Seq and flatten it to get the result.

You can find the code of the above experiment in GitHub Gist, here:

https://gist.github.com/kyrsideris/ceccce60d2a7002d6e15184579c563ed

... and please leave a comment or suggestion for improvement! 😀

Created by Kyriakos Sideris, © 2021