Breathe Life Into Your Data

Any chart, on its own, can be pretty boring and lifeless, which can leave the reader unsure of your point. A splash of color or a change in size helps the data take shape and makes it easier to understand. In Vega-Lite, much of the look and feel of any graphic comes from mapping data to channels. Doing this will breathe life into your data and help give it some shape. Great…but what are channels?

Channels are the visual properties of a chart (like size, shape, and color) you encode with your data. Said another way, channels are where you can customize your chart with your data. To aid you in this task, Vega-Lite provides you with numerous channels to use for chart creation.

In this chapter, we will go over encoding data points in charts by manipulating channels, to help communicate a more specific visual narrative. Specifically, we will cover color, shape, size, opacity, and conditional encoding.

To run these examples you will need to install the following Elixir packages:

  • explorer

  • vega_lite

  • kino

  • kino_vega_lite
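
One way to install these packages, in a Livebook setup cell or a standalone script, is with Mix.install. The version requirements below are assumptions rather than something this chapter specifies, so adjust them to the releases you actually use. The alias at the end is what lets the examples refer to VegaLite as Vl.

```elixir
# Version requirements are assumptions; adjust them to the releases you use.
Mix.install([
  {:explorer, "~> 0.8"},
  {:vega_lite, "~> 0.1"},
  {:kino, "~> 0.12"},
  {:kino_vega_lite, "~> 0.1"}
])

# The examples in this chapter refer to VegaLite through the short alias Vl.
alias VegaLite, as: Vl
```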

The typical code structure for a chart can be generated automatically using a Chart Smart Cell.

This will leave you with the following chart:

In this section, you will focus on changing the following line of code, which needs to be added to the template above:

...where the XXXX is the channel you want to encode (e.g., shape or opacity).
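
As a rough sketch of that idea, inferred from the charts later in this chapter rather than copied from the template, the channel can be treated as the one moving part of an otherwise fixed chart. The chart_for helper and the three sample rows are hypothetical and exist only for illustration; the snippet assumes the packages listed earlier are already installed.

```elixir
alias VegaLite, as: Vl

# Hand-typed stand-in rows for illustration only, not the real mpg data.
mpg = [
  %{"displ" => 1.8, "hwy" => 29, "class" => "compact"},
  %{"displ" => 5.7, "hwy" => 26, "class" => "2seater"},
  %{"displ" => 5.3, "hwy" => 20, "class" => "suv"}
]

# Hypothetical helper: builds the same scatter plot every time and
# varies only the channel that the "class" field is encoded on.
chart_for = fn channel ->
  Vl.new(width: 400, title: "Encoding #{channel}")
  |> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
  |> Vl.mark(:point)
  |> Vl.encode_field(:x, "displ", type: :quantitative)
  |> Vl.encode_field(:y, "hwy", type: :quantitative)
  |> Vl.encode_field(channel, "class", type: :nominal)
end

# Swap :color for :shape, :size, or :opacity to try the other channels.
chart_for.(:color)
```

Each numbered section below changes only that last encode line.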

Note: Other types of marks are listed in the Vega-Lite documentation.

1) Color

Smart Cells are a great way to create charts with minimal effort. Once created, you can convert them to code to further customize. The Chart Smart Cell is what you'll use below to color the mpg dataset by class.

Converting the Chart Smart Cell above to code will leave you with the following:

Vl.new(width: 400, title: "Color by Class")
|> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
|> Vl.mark(:point)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode_field(:color, "class", type: :nominal)

There can also be times when you need to change all of the data points to a single color (e.g., "black"). To do so, you will need to change the following line of code from...

|> Vl.encode_field(:color, "class", type: :nominal)

...to

|> Vl.encode(:color, value: "black")

Your code should now look like the code below.

Vl.new(width: 400, title: "Color All Points Black")
|> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
|> Vl.mark(:point)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode(:color, value: "black")

Note: Vl.encode_field is changed to Vl.encode. Vl.encode_field is shorthand for calling Vl.encode with the field option. Since you are not using a field from the dataset to color the points black, the "_field" shorthand is not helpful here. Use the more generic encode function instead, as shown above. You can read more about it in the documentation.
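
To make the relationship between the two functions concrete, here is a small sketch that builds the color encoding three ways and prints the resulting part of the specification. It assumes the packages listed earlier are installed, and it leaves out data and marks because only the encoding is of interest.

```elixir
alias VegaLite, as: Vl

# encode_field/4 is shorthand for encode/3 with a :field option,
# so these two charts should carry the same color encoding.
shorthand = Vl.new() |> Vl.encode_field(:color, "class", type: :nominal)
longhand = Vl.new() |> Vl.encode(:color, field: "class", type: :nominal)

# A constant color involves no field at all, so the generic encode/3 is used.
constant = Vl.new() |> Vl.encode(:color, value: "black")

# Vl.to_spec/1 returns the underlying Vega-Lite specification as a map.
IO.inspect(Vl.to_spec(shorthand)["encoding"], label: "shorthand")
IO.inspect(Vl.to_spec(longhand)["encoding"], label: "longhand")
IO.inspect(Vl.to_spec(constant)["encoding"], label: "constant")
```

Inspecting the specification this way is a handy check whenever a chart does not look the way you expect.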

To create filled-in circles as opposed to hollow circles, simply change the :point mark to :circle. Your code will then look like this.

Vl.new(width: 400, title: "Color Fill All Points Black")
|> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
|> Vl.mark(:circle)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode(:color, value: "black")

2) Shape

Working from the chart created with the Smart Cells, you can encode :shape instead of :color by changing one line of code. Change

|> Vl.encode_field(:color, "class", type: :nominal)

to

|> Vl.encode_field(:shape, "class", type: :nominal)

Your code should match the code below.

Vl.new(width: 400, title: "Shape")
|> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
|> Vl.mark(:point)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode_field(:shape, "class", type: :nominal)

Note: While there are many mark types to choose from, you must use the :point mark when encoding the :shape channel.

3) Size

Next up is Size. The only needed change, again, is the last line of code. You are removing :shape and replacing that with :size.

Vl.new(width: 400, title: "Size")
|> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
|> Vl.mark(:point)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode_field(:size, "class", type: :nominal)

Again, to change the data points to filled vs. hollow circles, change the mark from :point to :circle.

Vl.new(width: 400, title: "Size With Solid Points")
|> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
|> Vl.mark(:circle)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode_field(:size, "class", type: :nominal)

4) Opacity

You may have noticed that Vega-Lite applies some opacity by default in all of the exercises you have already completed. However, you can encode opacity yourself in the last line of code, substituting :opacity for :size.

Vl.new(width: 400, title: "Opacity")
|> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
|> Vl.mark(:point)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode_field(:opacity, "class", type: :nominal)
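
Opacity can also be set to one constant for every point, in the same way the color was set to "black" earlier. The sketch below assumes the packages listed earlier are installed; the sample rows and the value 0.4 are arbitrary choices made for illustration.

```elixir
alias VegaLite, as: Vl

# Hand-typed stand-in rows for illustration only, not the real mpg data.
mpg = [
  %{"displ" => 1.8, "hwy" => 29, "class" => "compact"},
  %{"displ" => 5.7, "hwy" => 26, "class" => "2seater"},
  %{"displ" => 5.3, "hwy" => 20, "class" => "suv"}
]

chart =
  Vl.new(width: 400, title: "Constant Opacity")
  |> Vl.data_from_values(mpg, only: ["displ", "hwy", "class"])
  |> Vl.mark(:circle)
  |> Vl.encode_field(:x, "displ", type: :quantitative)
  |> Vl.encode_field(:y, "hwy", type: :quantitative)
  # 0.4 is an arbitrary value for this sketch; opacity runs from 0 to 1.
  |> Vl.encode(:opacity, value: 0.4)

chart
```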

5) Conditional Encoding

How does conditional formatting work?

Vega-Lite’s conditions may look a little different but are simple if-then statements. Each condition has a test, a value, and a default value. The test checks whether a condition is true. If it is, the value is applied. Otherwise, the default value is applied.

Looking at the Opacity chart above, it appears that there are outliers in the dataset. Reading off the axes, the outliers are grouped where hwy (the y-axis) is greater than 20 and displ (the x-axis) is greater than 5.

To visually show the outliers, you can isolate them by changing color, shape, size, etc. Let's try to isolate the outliers using color.

Vl.new(width: 400, title: "Conditional Encoding")
|> Vl.data_from_values(mpg, only: ["hwy", "displ", "class"])
|> Vl.mark(:circle)
|> Vl.encode_field(:y, "hwy", type: :quantitative)
|> Vl.encode_field(:x, "displ", type: :quantitative)
|> Vl.encode(:color,
  condition: %{test: "datum['hwy'] > 20 && datum['displ'] > 5", value: "red"},
  value: "black"
)

One thing to notice in the code above is the use of the datum variable in your conditional test.

The datum variable represents a single row of the dataset passed into VegaLite. You can access any column within the row by writing datum['column_name'], as in datum['hwy'] above. In conditional encoding, datum is commonly used to compare a column against a constant. It can be used anywhere you need to change how a column is handled based on its value. You'll see it used more throughout other Data Visualization sections.
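
If it helps to see the idea outside of a chart, the same if-then logic can be written in plain Elixir. This is only an analogy for how the test is applied to each row, not how Vega-Lite evaluates it internally, and the rows are hand-typed stand-ins rather than the real dataset.

```elixir
# Hand-typed stand-in rows for illustration only, not the real mpg data.
rows = [
  %{"displ" => 1.8, "hwy" => 29},
  %{"displ" => 5.7, "hwy" => 26},
  %{"displ" => 5.3, "hwy" => 20}
]

# Mirrors the chart's test: hwy above 20 and displ above 5.
outlier? = fn datum -> datum["hwy"] > 20 and datum["displ"] > 5 end

# Mirrors the condition: "red" when the test passes, the default "black" otherwise.
colors =
  Enum.map(rows, fn datum ->
    if outlier?.(datum), do: "red", else: "black"
  end)

IO.inspect(colors)
# prints ["black", "red", "black"]
```

Only the second row passes both comparisons, so it is the only one that would be drawn in red.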

Wrapping Up

Encoding channels with your data is what brings life to the charts you make. Channels provide you the flexibility to customize to your heart's content and to help your users better understand the story you’re sharing with them. Even though you covered seven different scenarios, you only just scratched the surface of all the different channels that Vega-Lite makes available to you. Check them out and start experimenting to see what you can come up with.

In the next section, you’ll learn how to use VegaLite to transform data.