Programming Bayesian Networks with Infer.NET and Python

Note: I have moved my blog to my personal site. The most updated version of this post can be found here.

While searching for a good tool or programming library for Bayesian networks (a.k.a. probabilistic graphical models or belief networks; if you don’t know what they are, this post is not for you), I came across Infer.NET by Microsoft Research. The library looks promising. It is modern (many BN libraries and tools were developed a long time ago and are no longer maintained). It is built on a good platform (.NET). It supports various languages (C#, managed C++, F#, VB, (Iron)Python; basically any language that works on the .NET platform). It is supposed to be fast and scalable. There are some drawbacks, though: although it can be used on Mac OS and Linux with the Mono implementation of the .NET framework, it is mainly for Windows, and its API is a bit difficult to use for those who have never developed for .NET (like me). Update: I wrote a new post on installing Infer.NET on Linux with Mono (the post applies to Mac OS too).

It took me several hours to learn the basics of Infer.NET and write my first program with it in Python. This post is a short tutorial on Infer.NET for people like me. Its main purpose is my own reference, but I hope it can help others too. Note that this post is just a draft. I will update it as I learn more about Infer.NET and will move the finished article to my academic or personal website, hopefully with articles on other BN tools and libraries.

Install Infer.NET for Python

A .NET variant of Python, called IronPython, runs Python code on the .NET platform and can use .NET libraries (I heard that Microsoft would include IronPython in future versions of Visual Studio). You will need:

  • .NET platform (or Mono if you are on Mac OS or Linux). I chose .NET instead of Mono, although my main computer runs Mac OS, because I wanted to avoid any unnecessary hassle caused by an unofficial implementation. You should install .NET version 4.
  • (optional) Visual Studio .NET, especially if you want to use C#, C++, VB, or F#. I installed VS.NET 2010 Ultimate.
  • Infer.NET; you should take note of the directory where Infer.NET was installed.
  • IronPython
  • and a good text editor, like Emacs, but Notepad is fine.

Basic usage

As with other BN tools, you construct a BN by first defining random variables and then assigning prior distributions (chosen from a list of supported distribution types) to some of them. However, unlike many BN tools where conditional probability distributions (CPDs) are restricted to certain forms (CPTs and linear Gaussians), Infer.NET seems quite flexible because it uses the full capabilities of object-oriented programming, generic programming, and of course operator overloading. In theory, you can easily build custom expressions for CPDs. In Infer.NET, you specify a conditional linear Gaussian p(Y|X) ~ N(A*X+b, sigma^2) in its natural form as Variable.GaussianFromMeanAndVariance(A*X + b, sigma**2). In other tools you would write something like LinearGaussian(A, b, sigma^2) after specifying that Y is conditioned on X (observe that X is hidden in that expression of p(Y|X)).
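To make the linear Gaussian CPD concrete, here is a minimal sketch in plain Python (no Infer.NET required) of the closed-form marginal and posterior for the scalar case X ~ N(m, v), Y|X ~ N(a*X + b, sigma^2). The function names are my own and are not part of any library; they only illustrate what the CPD means, not how Infer.NET computes it.

```python
def linear_gaussian_marginal(prior_mean, prior_var, a, b, noise_var):
    """Marginal of Y when X ~ N(prior_mean, prior_var) and
    Y | X ~ N(a*X + b, noise_var). Returns (mean, variance)."""
    return a * prior_mean + b, a * a * prior_var + noise_var


def linear_gaussian_posterior(prior_mean, prior_var, a, b, noise_var, y):
    """Posterior of X given the observation Y = y for the same model.
    Returns (mean, variance)."""
    # Precisions (inverse variances) add up when Gaussians are multiplied.
    precision = 1.0 / prior_var + a * a / noise_var
    mean = (prior_mean / prior_var + a * (y - b) / noise_var) / precision
    return mean, 1.0 / precision


# Hypothetical numbers: X ~ N(0, 1), Y | X ~ N(2*X + 1, 1), observed Y = 5.
post_mean, post_var = linear_gaussian_posterior(0.0, 1.0, 2.0, 1.0, 1.0, 5.0)
print("Posterior of X: mean %.3f, variance %.3f" % (post_mean, post_var))
```

A closed form like this is handy as a sanity check for small models before trusting the output of an approximate inference engine.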

Note that inference in Infer.NET is only approximate, like BUGS (and unlike tools such as BNT, which support exact inference).

The following are the basic steps for using Infer.NET with Python for inference.

Initialization

First, you need to import basic .NET libraries and Infer.NET libraries into Python:

import System
from System import *
import clr

import sys
sys.path.append(r'C:\Path\to\your\Infer.NET\bin\Release')

clr.AddReferenceToFile("Infer.Compiler.dll")
clr.AddReferenceToFile("Infer.Runtime.dll")

# import all namespaces
import MicrosoftResearch.Infer
import MicrosoftResearch.Infer.Models
import MicrosoftResearch.Infer.Distributions
from MicrosoftResearch.Infer import *
from MicrosoftResearch.Infer.Models import *
from MicrosoftResearch.Infer.Distributions import *

Those lines import most of the things you will need.

Construct models

You define a random variable by calling static methods (or class methods) of the Variable class. To create a CPD for a random variable that is conditioned on other variables, you need to define the parents first, then define the distribution of the child using the appropriate expressions of its parents. Specifically:

  • To define a prior distribution of a random variable X, use
    X = Variable.<distribution function>(<parameters>)
    

    For example,

    X = Variable.Bernoulli(0.8)
    

    to create a Bernoulli random variable X with p(X=T) = 0.8,

    X = Variable.Discrete(Array[float]([0.5, 0.3, 0.2]))
    

    to create a 3-value discrete random variable X with p(X=0)=0.5, p(X=1)=0.3, p(X=2)=0.2,

    X = Variable.GaussianFromMeanAndVariance(20.0, 2.0**2)
    

    to create a Gaussian random variable X with mean 20 and variance 2^2.

  • To create Y with a CPD conditioned on continuous variables X1,…,Xn, define X1,…,Xn first, then define Y using the appropriate distribution function, supplying expressions of X1,…,Xn as its parameters. For example:
    Y= Variable.GaussianFromMeanAndVariance(2*X1+X2, X3^2)
    

    to create (Y|X1,X2,X3) as Gaussian with mean (2*X1+X2) and variance X3^2.

  • To create Y with a CPD conditioned on discrete variables X1,…,Xn, define X1,…,Xn first, then use cascaded condition blocks on X1,…,Xn. The condition blocks are:
    • with (Variable.Case(Xk, x)): specifies the case for Xk=x.
    • with (Variable.Switch(Xk)): is equivalent to multiple Case blocks, iterating over all values of integer variable Xk.
    • with (Variable.If(Xk)): specifies the case when boolean variable Xk is true. Use IfNot for the complement.

    As an example:

    Y = Variable.New[bool]()
    with (Variable.Case(X,0)):
        Y.SetTo(Variable.Bernoulli(0.1))
    with (Variable.Case(X,1)):
        Y.SetTo(Variable.Bernoulli(0.5))
    

    to specify a CPD: P(Y=1|X=0) = 0.1, P(Y=0|X=0) = 0.9, P(Y=1|X=1) = P(Y=0|X=1) = 0.5. Important note: for scalar variables, you must create them outside condition blocks and use SetTo() method inside condition blocks, as in the previous code fragment. See this guide for more details.

  • To create Y with a CPD conditioned on both continuous and discrete variables, combine the previous cases. For example:
    Y = Variable.New[float]()
    with (Variable.Case(Z,0)):
        Y.SetTo(Variable.GaussianFromMeanAndVariance(X, 2.0**2))
    with (Variable.Case(Z,1)):
        Y.SetTo(Variable.GaussianFromMeanAndVariance(0.9*X, 3.0**2))
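One Python-specific caveat when passing a variance written as a square: on Python numbers, ^ is bitwise XOR, not exponentiation, so a literal such as 2^2 silently evaluates to 0. The short sketch below shows the difference; variance_from_std is a hypothetical helper of my own, not an Infer.NET function. How ^ behaves between Infer.NET Variable objects depends on the library's operator overloads, which this sketch does not cover.

```python
# On Python integers, ^ is bitwise XOR; ** is the power operator.
xor_result = 2 ^ 2       # bitwise XOR of 2 and 2
power_result = 2 ** 2    # 2 squared


def variance_from_std(std):
    """Hypothetical helper: square a standard deviation and return a float,
    so the value can be passed where a double variance is expected."""
    return float(std) ** 2


print("2 ^ 2 = %d, 2 ** 2 = %d" % (xor_result, power_result))
print("variance for std 40: %.1f" % variance_from_std(40))
```

Writing the variance as a plain literal (for example 1600.0) or through a helper like this avoids the problem entirely.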
    

Inference

First, create an inference engine with ie = InferenceEngine(). Then specify observations (evidence) by assigning values to the ObservedValue property of the observed random variables. Inference (computing posteriors) is carried out by ie.Infer(X). For example:

Y.ObservedValue = 5.0
print "Posterior of X:", ie.Infer(X)

An example

Consider two redundant sensors S1 and S2 measuring a physical value X, and a (rough) prediction F of X. The graph is as follows: X->X1, X->X2, S1->X1, S2->X2, X->F. X1 and X2 are the measurements (the array Xh in the code), while S1 and S2 represent the states of the sensors (0: normal, 1: open circuit (damaged), 2: strange value). All X* and F are continuous. X is N(25, 40^2) (actually, the exact prior is not important). F|X is N(X, 3^2). Xk|X,Sk=0 is N(X,1). Xk|X,Sk=1 is N(-46,1). Xk|X,Sk=2 is N(X,40^2). Given observations of X1, X2 and F, compute the posterior distributions of X, S1 and S2. The Python program is given below. It is easy to follow and quite readable.

import System
from System import *
import clr

import sys
sys.path.append(r'C:\path\to\Infer.NET\bin\Release')

clr.AddReferenceToFile("Infer.Compiler.dll")
clr.AddReferenceToFile("Infer.Runtime.dll")

# import all namespaces
import MicrosoftResearch.Infer
import MicrosoftResearch.Infer.Models
import MicrosoftResearch.Infer.Distributions
from MicrosoftResearch.Infer import *
from MicrosoftResearch.Infer.Models import *
from MicrosoftResearch.Infer.Distributions import *

# The model
X = Variable.GaussianFromMeanAndVariance(25.0, 1600.0)  # variance 40**2; note that ^ is XOR in Python

nSensor = Range(2)
S = Variable.Array[int](nSensor)
with (Variable.ForEach(nSensor)):
    S[nSensor] = Variable.Discrete(Array[float]([0.99, 0.001, 0.009]))
Xh = Variable.Array[float](nSensor)
with (Variable.ForEach(nSensor)):
    with (Variable.Case(S[nSensor],0)):
        Xh[nSensor] = Variable.GaussianFromMeanAndVariance(X, 1)
    with (Variable.Case(S[nSensor],1)):
        Xh[nSensor] = Variable.GaussianFromMeanAndVariance(-46, 1)
    with (Variable.Case(S[nSensor],2)):
        Xh[nSensor] = Variable.GaussianFromMeanAndVariance(X, 1600)

F = Variable.GaussianFromMeanAndVariance(X, 9)

# The inference
ie = InferenceEngine()
Xh.ObservedValue = Array[float]((23.0, -45.0))
F.ObservedValue = 22.0
print "Posterior of X:", ie.Infer(X)
print "Posterior of S:", ie.Infer(S)
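
Because this model has only nine joint sensor-state combinations and is Gaussian in X given the states, the posterior can also be computed exactly by enumeration. The following is a minimal cross-check in plain Python (no Infer.NET), assuming the model exactly as described in the text; the function names are my own. Since Infer.NET's inference is approximate, its numbers may differ somewhat from this calculation.

```python
import math
from itertools import product

STATE_PRIOR = [0.99, 0.001, 0.009]  # P(S_k = 0, 1, 2), as in the program


def log_normal_pdf(y, mean, var):
    """Log density of N(mean, var) evaluated at y."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (y - mean) ** 2 / var)


def condition(mean, var, y, noise_var):
    """Update X ~ N(mean, var) on an observation y ~ N(X, noise_var).
    Returns (new mean, new variance, log evidence of y)."""
    log_ev = log_normal_pdf(y, mean, var + noise_var)
    gain = var / (var + noise_var)
    return mean + gain * (y - mean), (1.0 - gain) * var, log_ev


def exact_sensor_posterior(readings, forecast):
    """Return (marginal posteriors of each S_k, posterior mean of X,
    posterior variance of X) by enumerating all sensor states."""
    components = []  # one (states, log weight, mean, var) per combination
    for states in product(range(3), repeat=len(readings)):
        # Prior X ~ N(25, 40**2), then the forecast F | X ~ N(X, 3**2).
        mean, var, log_w = condition(25.0, 1600.0, forecast, 9.0)
        for y, s in zip(readings, states):
            log_w += math.log(STATE_PRIOR[s])
            if s == 1:
                # Open circuit: the reading does not depend on X.
                log_w += log_normal_pdf(y, -46.0, 1.0)
            else:
                noise = 1.0 if s == 0 else 1600.0
                mean, var, log_ev = condition(mean, var, y, noise)
                log_w += log_ev
        components.append((states, log_w, mean, var))

    # Normalize the weights in log space to avoid underflow.
    top = max(c[1] for c in components)
    weights = [math.exp(c[1] - top) for c in components]
    total = sum(weights)
    weights = [w / total for w in weights]

    marginals = [[0.0] * 3 for _ in readings]
    for (states, _, _, _), w in zip(components, weights):
        for k, s in enumerate(states):
            marginals[k][s] += w

    # Moments of the resulting mixture of Gaussians over X.
    x_mean = sum(w * c[2] for c, w in zip(components, weights))
    x_second = sum(w * (c[3] + c[2] ** 2) for c, w in zip(components, weights))
    return marginals, x_mean, x_second - x_mean ** 2


marginals, x_mean, x_var = exact_sensor_posterior([23.0, -45.0], 22.0)
print("Exact posterior of X: mean %.3f, variance %.3f" % (x_mean, x_var))
print("Exact posterior of S1: %s" % marginals[0])
print("Exact posterior of S2: %s" % marginals[1])
```

With the observations used above, this calculation puts most of the posterior mass on S1 being normal and S2 being an open circuit, which matches intuition: the first reading agrees with the forecast, while the second sits next to -46.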

5 Responses to Programming Bayesian Networks with Infer.NET and Python

  1. Pingback: Install Infer.NET on Linux | Truong's Weblog

  2. nrolland says:

    yeah, except that with such a licence, what is the point of investing time in it ?

  3. Devendra says:

    I am able to import all of the modules except
    “from MicrosoftResearch.Infer.Distributions import *” last import
    I am using ubuntu 12.04. I installed mono and I run ironpython(ipy.exe) by downloading the source.

  4. ciabo says:

    Hi
    I’m new in the world of Infer.NET. Do you know a way to run a project written in c# with mono instead of using IronPython??
    Thanks
