Option of Sequences in Scala
Assume you have a dataset whose schema contains nested structures with a few nullable fields. Often these fields are modelled as Options in Scala. The alternative is to not model them as Options and deal with nulls instead of Nones, but who wants that? So when you do have Options that contain Options that contain sequences, how do you access them without making your code too verbose, or making your colleagues hate you or Scala? Let's create an example and walk through a few options, pun intended!
Example Structure
Let's assume the following nested structure:
type X = String
case class A(osx: Option[Seq[X]])
case class B(oa: Option[A])
Note that I have picked the names of the types here to match the letters used for type parameters in Scala's standard library (for example, A and B in Option[A].flatMap[B]).
So given b, an instance of B, our task is to retrieve the osx of A:
// Given instance b of B
val b = B( Some(A( Some(Seq("a1", "a2", "a3")) )) )
// Get the Seq[X] from A.osx
var xs: Seq[X] = Seq.empty // placeholder, reassigned by each approach below
Useful Properties
Before we move on to our experiment, we should take note of the following property of Options:
// Option is like a list that contains 0 or 1 elements
Option(b).toSeq == Seq(b) // Option[B].toSeq has type Seq[B]
To be clear, having an Option of B means that your field might contain no objects of type B, or just 1.
// Conceptually, toSeq on an Option[B] behaves like this function
def toSeq(ob: Option[B]): Seq[B] =
  ob match {
    case Some(b) => Seq(b)
    case None => Seq.empty
  }
This is, of course, similar to having an Option of a sequence of type B, which means that your field might contain no sequences of type B, or just 1... sequence.
Option(Seq(b)).toSeq == Seq(Seq(b)) // Option[Seq[B]].toSeq has type Seq[Seq[B]]
As we know, a sequence of sequences of type B can be flattened to a sequence of type B:
Seq(Seq(b)).flatten == Seq(b) // Seq[Seq[B]].flatten has type Seq[B]
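These properties can be checked on concrete values, including the empty case. The following is only a minimal sketch: the object name OptionSeqProperties and the sample values are assumptions made for illustration and are not part of the original experiment.

```scala
// Minimal sketch of the Option-to-Seq properties on concrete values.
// The object name and the sample values are hypothetical.
object OptionSeqProperties {
  val some: Option[Int] = Some(1)
  val none: Option[Int] = None

  // An Option converts to a Seq of one or zero elements
  val one: Seq[Int]  = some.toSeq // equals Seq(1)
  val zero: Seq[Int] = none.toSeq // equals Seq()

  // An Option of a Seq converts to a Seq of Seqs...
  val nested: Seq[Seq[Int]] = Option(Seq(1, 2, 3)).toSeq
  // ...which flattens to a plain Seq
  val flat: Seq[Int] = nested.flatten

  // A missing Seq flattens to an empty Seq
  val emptyFlat: Seq[Int] = (None: Option[Seq[Int]]).toSeq.flatten

  def main(args: Array[String]): Unit = {
    println(one)
    println(zero)
    println(flat)
    println(emptyFlat)
  }
}
```

The last value is the case that matters most later on: a None and an empty sequence become indistinguishable after flattening, which is exactly what we want when the caller only needs the elements.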
First Approach
Everything looks like a nail when you hold a hammer...
In the first approach we will work only with Options.
// Work with Options through B to A and then getOrElse
xs = b.oa.flatMap(_.osx).flatMap(sx => Some(sx)).getOrElse(Nil) // or
xs = b.oa.flatMap(_.osx).flatMap(Some(_)).getOrElse(Nil)
// with types written down
xs = b                          // B
  .oa                           // Option[A]
  .flatMap[Seq[X]](             // Option[A].flatMap[Seq[X]]
    a => a.osx                  // A => Option[Seq[X]]
  )                             // Option[Seq[X]]
  .flatMap[Seq[X]](             // Option[Seq[X]].flatMap[Seq[X]]
    sx => Some(sx)              // Seq[X] => Some[Seq[X]]
  )                             // Option[Seq[X]]
  .getOrElse[Seq[X]](           // Option[Seq[X]].getOrElse[Seq[X]]
    Nil                         // default: => Seq[X]
  )                             // Seq[X]
Learning point: This approach is easy to follow because it uses Options all the way, but it is a bit more verbose than it needs to be. Nothing wrong here!
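One source of the verbosity is worth spelling out: flatMap(Some(_)) wraps every value back into a Some, so it leaves an Option unchanged and could be dropped. The sketch below illustrates this under stated assumptions: the definitions of X, A and B are repeated from the example so the snippet is self-contained, and the object and value names are hypothetical.

```scala
// Sketch: the second flatMap in the first approach is an identity step.
// FirstApproachSimplified, verbose, shorter and missing are hypothetical names.
object FirstApproachSimplified {
  type X = String
  case class A(osx: Option[Seq[X]])
  case class B(oa: Option[A])

  val b = B(Some(A(Some(Seq("a1", "a2", "a3")))))

  // Original form, with the redundant flatMap(Some(_))
  val verbose: Seq[X] = b.oa.flatMap(_.osx).flatMap(Some(_)).getOrElse(Nil)

  // Same result without the identity step
  val shorter: Seq[X] = b.oa.flatMap(_.osx).getOrElse(Nil)

  // A missing A falls back to the default passed to getOrElse
  val missing: Seq[X] = B(None).oa.flatMap(_.osx).getOrElse(Nil)

  def main(args: Array[String]): Unit = {
    println(verbose == shorter)
    println(missing.isEmpty)
  }
}
```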
Second Approach
In the second approach we convert an Option into a Seq and flatten that in order to retrieve the sequence of Xs.
// Convert B.oa Option to Sequence, then apply a function in flatMap to get the A.osx
xs = b.oa.toSeq.flatMap(a => a.osx.toSeq.flatMap(sx => sx)) // or
xs = b.oa.toSeq.flatMap(a => a.osx.toSeq.flatten) // or
xs = b.oa.toSeq.flatMap(_.osx.toSeq.flatten)
// with types written down
xs = b                          // B
  .oa                           // Option[A]
  .toSeq                        // Seq[A]
  .flatMap[X, Seq[X]](          // Seq[A].flatMap[X, Seq[X]] (Scala 2.12 signature; on 2.13+ write .flatMap[X])
    a =>                        // A => Seq[X]
      a.osx                     // Option[Seq[X]]
        .toSeq                  // Seq[Seq[X]]
        .flatten[X]             // Seq[Seq[X]].flatten[X]
  )                             // Seq[X]
Learning point: This approach deals only with sequences, but it complicates the conversion because it relies on a more deeply nested function.
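As an aside, the same sequence-only idea can be written as a for-comprehension, which the compiler desugars into nested flatMap and map calls. This is an alternative notation and not part of the original experiment; the object name and the elements function are hypothetical, and X, A and B are repeated from the example to keep the sketch self-contained.

```scala
// Sketch: the sequence-only approach written as a for-comprehension.
// SecondApproachForComprehension and elements are hypothetical names.
object SecondApproachForComprehension {
  type X = String
  case class A(osx: Option[Seq[X]])
  case class B(oa: Option[A])

  // Every generator draws from a Seq, so the result is a Seq as well
  def elements(b: B): Seq[X] =
    for {
      a  <- b.oa.toSeq  // draws each A from a Seq[A]
      sx <- a.osx.toSeq // draws each Seq[X] from a Seq[Seq[X]]
      x  <- sx          // draws each X from a Seq[X]
    } yield x

  def main(args: Array[String]): Unit = {
    println(elements(B(Some(A(Some(Seq("a1", "a2", "a3")))))))
    println(elements(B(None)))
  }
}
```

The nesting is still there, but each level sits on its own line, which may be easier to read than a function nested inside flatMap.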
Third Approach
In this approach, we will work with a smaller nested function and convert to Seq only at the end.
// Work with Options to get A.osx, convert to Seq of Seq and finally flatten it
xs = b.oa.flatMap(a => a.osx).toSeq.flatten // or
xs = b.oa.flatMap(_.osx).toSeq.flatten
// with types written down
xs = b                          // B
  .oa                           // Option[A]
  .flatMap[Seq[X]](             // Option[A].flatMap[Seq[X]]
    a => a.osx                  // A => Option[Seq[X]]
  )                             // Option[Seq[X]]
  .toSeq                        // Seq[Seq[X]]
  .flatten                      // Seq[X]
Learning point: This approach is simpler and less verbose because it uses Options to reach the required field, passes a smaller function to flatMap, and involves less complicated types.
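To gain some confidence that the three approaches are interchangeable, they can be compared on every shape the nested structure can take. The following is a sketch under stated assumptions: the object name, the function names first, second and third, and the list of cases are all hypothetical, and X, A and B are repeated from the example.

```scala
// Sketch: compare the three approaches on every shape of B.
// ApproachComparison, first, second, third and cases are hypothetical names.
object ApproachComparison {
  type X = String
  case class A(osx: Option[Seq[X]])
  case class B(oa: Option[A])

  def first(b: B): Seq[X]  = b.oa.flatMap(_.osx).flatMap(Some(_)).getOrElse(Nil)
  def second(b: B): Seq[X] = b.oa.toSeq.flatMap(_.osx.toSeq.flatten)
  def third(b: B): Seq[X]  = b.oa.flatMap(_.osx).toSeq.flatten

  // Missing A, missing sequence, empty sequence, populated sequence
  val cases: Seq[B] = Seq(
    B(None),
    B(Some(A(None))),
    B(Some(A(Some(Nil)))),
    B(Some(A(Some(Seq("a1", "a2", "a3")))))
  )

  // True only if all three approaches agree on every case
  val allAgree: Boolean =
    cases.forall(b => first(b) == second(b) && second(b) == third(b))

  def main(args: Array[String]): Unit = println(allAgree)
}
```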
Conclusion
The recommendation here is to use Options to reach the field, keep the functions passed to flatMap small, and only at the end convert the Option to a Seq and flatten it to get the result.
You can find the code of the above experiment in GitHub Gist, here:
https://gist.github.com/kyrsideris/ceccce60d2a7002d6e15184579c563ed
... and please leave a comment or suggestion for improvement!