An Age Pyramid in Altair

3 minute read

Charting an Age Pyramid in Altair

Altair is a declarative statistical visualization library for Python, based on

Vega and Vega-Lite, and the source is available on GitHub.

import altair as alt
from vega_datasets import data
from altair.expr import datum, if_

If you are running this code in a Jupyter notebook (as opposed to a JupyterLab book),

uncomment the next cell and run it to enable rendering in the notebook session. ###

# alt.renderers.enable('notebook')

If you are using a notebook and fail to run this cell, the following error is displayed:

<VegaLite 2 object> If you see this message, it means the renderer has not been properly enabled for the frontend that you are using. For more information, see

Perhaps, you simply forgot?

If so, you may still run into trouble, as I did when I switch to the Jupyter notebook. When you run the cell, you may get this other message:

ValueError: To use the ‘notebook’ renderer, you must install the vega package and the associated Jupyter extension. See for more information.

Since I had installed Altair for Jupyter only, I needed to install the missing components in my local environment: conda install -c conda-forge notebook vega

This is a correction of the Gallery example, which renders an inverted age pyramid.

pop = data.population()

# Get the min and max of the slider tool from the dataset:
slider = alt.binding_range(min=pop.year.min(), max=pop.year.max(), step=10)

# If name is None or not given, the default slider title of "selector<nnn>" will be used;
# Note 1: The <nnn> portion change as per the number of time the chart has been refreshed.
# Note 2: name (or default string) is automatically concatenated with "_" (?) and fields.
# Note 3: To my knowldege, the slider does not take an initial value, which could be min by default,;
#             Instead, the initial position seems to be the middle of the range, but not quite.
#             Also, the initial position is not labeled.

select_year = alt.selection_single(name='Select', fields=['year'], bind=slider)

base = ( alt.Chart(pop).add_selection(select_year)
                       .transform_calculate(gender=if_( == 1, 'Male', 'Female')) )

title = alt.Axis(title='population')
color_scale = alt.Scale(domain=['Male', 'Female'], range=["steelblue", "salmon"])

# Try this: change alt.Y with alt.X, and keep all else the same: there should not be any difference.
# My guess is that these are methods for encoding axes, so the assignment does not really matter:
# its the lower case x and y assigned to them that matter.

left = ( base.transform_filter(datum.gender == 'Female')
             .encode(y=alt.Y('age:O', axis=None, sort='descending'),
                         color=alt.Color('gender:N', scale=color_scale, legend=None))
              .mark_bar().properties(title='Female') )

middle = base.encode(y=alt.Y('age:O', axis=None, sort='descending'),
                                 text=alt.Text('age:Q')).mark_text().properties(width=20, title='Age')

right = ( base.transform_filter(datum.gender == 'Male')
              .encode(y=alt.Y('age:O', axis=None, sort='descending'),
                          x=alt.X('sum(people):Q', axis=title),
                          color=alt.Color('gender:N', scale=color_scale, legend=None))
               .mark_bar().properties(title='Male') )

# Concatenate the three charts horizontally, same as using alt.hconcat(left, middle, right):
left | middle | right

Age Pyramid