Easier Plate Notation in Python using Daft

Plate notation is a useful visual method for describing graphical models, but the software can be awkward. Here we demonstrate daft-pgm, a solution using pure Python.

Plate Notation

Plate notation has become something of a standard method for describing probabilistic graphical models, a.k.a. Bayesian models or Bayesian networks. It offers a compact visual representation of model structure (graphs and subgraphs of directed edges and nodes), which makes model comparison easier, and it is very popular in machine learning.

For example, the following diagram describes a two-level hierarchical linear model which is quite complicated to fully describe in mathematical notation. The diagrammatic form offers a useful shorthand - albeit one that is incomplete, as I discuss in the Notebook below:

Example of Plate Notation
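
To give a sense of what the diagram is standing in for, here is one common way of writing a two-level hierarchical linear model in full. This is only a sketch: the symbols (group index j, observation index i, varying intercepts and slopes) are my assumptions for illustration and are not necessarily the exact model used in the Notebook.

```latex
\begin{aligned}
y_{ij}   &\sim \mathcal{N}\left(\alpha_j + \beta_j x_{ij},\ \sigma^2\right),
         \quad i = 1, \ldots, n_j,\ \ j = 1, \ldots, J \\
\alpha_j &\sim \mathcal{N}\left(\mu_\alpha,\ \tau_\alpha^2\right) \\
\beta_j  &\sim \mathcal{N}\left(\mu_\beta,\ \tau_\beta^2\right)
\end{aligned}
```

In plate notation the two index ranges become two nested plates, one over groups j and one over observations i, and each distributional statement becomes a set of directed edges into the node on its left-hand side.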

As with any mathematical notation, there are a number of ways to do it, and what we want is a quick, programmatic way to create and share the diagrams. A quick search yielded some suggestions for existing software:

All the above are reasonable methods, but I really want something that fits into the Python ecosystem and can be run within Jupyter Notebooks without needing to write and cross-compile other code.

Daft-PGM

Happily, there exists daft-pgm, a small, lightweight package for drawing plate notation diagrams purely in Python, using the long-established matplotlib package to do much of the heavy lifting. Daft is developed by Dan Foreman-Mackey, an astronomer, Bayesian statistician and developer of several great tools, including George for Gaussian process regression and emcee, an incredibly fast ensemble MCMC sampler that has seen a lot of use at Applied AI.

As per the project webpage: Daft is a Python package that uses matplotlib to render pixel-perfect probabilistic graphical models for publication in a journal or on the internet. With a short Python script and an intuitive model building syntax you can design directed (Bayesian Networks, directed acyclic graphs) and undirected (Markov random fields) models and save them in any formats that matplotlib supports (including PDF, PNG, EPS and SVG).
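
To make that description concrete, here is a minimal sketch of what such a script can look like. It is not taken from the Notebook: the model (a simple pooled linear regression), the node names, the coordinates and the helper function render_pgm are all hypothetical choices of mine. It assumes the daft-pgm package is installed, which is imported as daft, and it keeps the model description as plain data so that the daft calls are confined to one small function.

```python
# Illustrative pooled linear regression: y_i ~ Normal(beta0 + beta1 * x_i, sigma).
# All names and coordinates here are assumptions chosen for the example.

# (name, label, x, y, observed) for each node; coordinates are in grid units.
NODES = [
    ("beta0", r"$\beta_0$", 1.0, 2.5, False),
    ("beta1", r"$\beta_1$", 2.0, 2.5, False),
    ("sigma", r"$\sigma$", 3.0, 2.5, False),
    ("x", r"$x_i$", 1.0, 1.0, True),
    ("y", r"$y_i$", 2.0, 1.0, True),
]

# Directed edges as (parent, child) pairs, referring to node names above.
EDGES = [("beta0", "y"), ("beta1", "y"), ("sigma", "y"), ("x", "y")]

# Plate enclosing the repeated observations: ([x, y, width, height], label).
PLATE = ([0.5, 0.4, 2.0, 1.1], r"$i = 1, \ldots, N$")


def render_pgm(path="pooled_linear_model.png"):
    """Draw the model described by NODES, EDGES and PLATE and save it to path."""
    # Imported here so the model description can be inspected without daft.
    import daft

    pgm = daft.PGM(shape=[4.0, 3.2], origin=[0.0, 0.0])
    for name, label, x, y, observed in NODES:
        pgm.add_node(daft.Node(name, label, x, y, observed=observed))
    for parent, child in EDGES:
        pgm.add_edge(parent, child)
    rect, plate_label = PLATE
    pgm.add_plate(daft.Plate(rect, label=plate_label))

    # render() returns the matplotlib axes; save through its parent figure.
    ax = pgm.render()
    ax.figure.savefig(path, dpi=150)
    return path
```

Calling render_pgm() should write a PNG to the working directory, and changing the file extension to .pdf or .svg should switch the output format, since saving is delegated to matplotlib. Observed nodes (x and y here) are drawn shaded by default.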

That all sounds promising, and the examples on the project webpage look great, so let's give it a try:


Daft in Action: Worked Examples of Hierarchical Linear Models

The following Notebook accompanies a larger project called pymc3_vs_pystan which I wrote primarily for presentation at the PyData London 2016 Conference.

The Notebook is available in a dedicated repo on our public GitHub. The following static render lets the casual reader go through it all here too:


In Summary

Daft seems to do the job very well indeed:

  • Plate notation is a useful addition to mathematical notation for describing models, and daft-pgm makes it very straightforward to construct these diagrams entirely in Python, using a simple API based on the familiar matplotlib package.
  • The above diagrams show an easy-to-follow progression from a basic pooled linear model right up to a two-level nested hierarchical linear model.
  • Finally, even with the most complicated hierarchical model, the code required to control the daft-pgm diagram is not too onerous, and can easily be broken out into a separate class or file if desired.
  • Crucially, we didn't have to leave the Python environment, learn any new TikZ/LaTeX code, or resort to semi-automated and manual methods to construct the diagrams: they're all fully repeatable and reproducible.

I hope this was useful. Let me know if you try using daft-pgm for your own projects, and if you'd like to get involved, I believe code contributions are welcome on the project's GitHub repo.



Jonathan Sedar

Jon founded Applied AI in 2013 to bring a bespoke data science consulting service to the financial sector. His twelve years of technical and advisory experience are sought by senior audiences worldwide.