How to make a presentation with Augmented Reality using Processing

At IAR2010 we did our presentation using a very simple piece of software which I wrote using Processing.

FakePress at IAR2010

The software was written in a hurry, in a single afternoon, so it turned out to be not a general-purpose utility but a single-shot program written to support that one presentation.

Nonetheless, people actually enjoyed the presentation and the way we used fiducial markers, and many people at the event asked me if I could give them the software.

So I thought of publishing it over here so that anyone interested could take a look.

one of our AR slides

Again: it is very basic, but you can use it as some sort of tutorial to get started with AR using Processing.

So, back to business: here is the zip containing the sources.

Source code of the software used for the AR presentation

The Processing sketch basically takes the contents of your data directory and attaches them to fiducial markers, so that when you hold up a marker to the camera, the corresponding content is shown.

(Note: the previous sentence turned out not to be clear enough; it means that if you don't already have a "data" directory set up in your Processing sketch directory, you should create one and place your contents in there, as explained in the rest of the post :) )

But let’s start from the beginning.

The program uses the TUIO library for Processing. That means that before starting your presentation you need to start the TUIO server, so that it can connect to your webcam and start tracking the markers that enter/exit the field of view. I used the reacTIVision server, as it is simple to install (you don't actually need to install it, merely unzip it to your hard drive) and it supports all major platforms.

One issue I needed to address was that when reacTIVision is running it locks your webcam, not letting you use it inside your Processing sketch. So I used an additional webcam mounted on top of my monitor, aligned with the one built into the frame of my laptop, thus obtaining basically the same point of view from both webcams. (A trick for OSX users: not many external USB webcams are Mac-friendly; I bypassed this problem by using this little software that turns many models of USB webcams into Mac-friendly ones.)

two webcams on my mac

So: one camera is used by reacTIVision and the other one by Processing.

The webcam capture in Processing was done using the OpenCV framework and its Processing library.

So, let’s go through the source code a bit, so that I can explain the mess inside it :)

(again, sorry for the chaotic coding: it was all done in a few hours just for the conference, hope you find it useful anyway).

The main idea is that each marker has an ID (the reactivision classical amoeba markers have numeric IDs, and we used them). Each marker triggers a content named with its ID (for example marker with ID 20 could trigger the “20.mov” video file in your sketch’s data directory).
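This ID-to-filename naming convention can be sketched as a tiny lookup helper. The class and method names below are hypothetical, not from the original sketch:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: maps a numeric marker ID to the content file it
// triggers, following the "<id>.<extension>" convention described above.
class ContentMapper {
    private final Map<Integer, String> extensions = new HashMap<>();

    // Declare which kind of content a marker triggers (e.g. "mov", "png", "wav").
    public void register(int markerId, String extension) {
        extensions.put(markerId, extension);
    }

    // Returns e.g. "20.mov" for marker 20, or null for an unknown marker.
    public String fileFor(int markerId) {
        String ext = extensions.get(markerId);
        return ext == null ? null : markerId + "." + ext;
    }
}
```

With this in place, marker 20 registered as a movie resolves to "20.mov", which you would then load from the sketch's data directory.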

First of all, we load the required libraries:

(Minim, to play sounds)

import ddf.minim.*;
import ddf.minim.signals.*;
import ddf.minim.analysis.*;
import ddf.minim.effects.*;

(OpenCV for webcam capture)

import hypermedia.video.*;

(the standard Processing video library to play videos)

import processing.video.*;

(the TUIO library to connect to the reacTIVision server)

import TUIO.*;
In the variable definitions section at the top of the code I created one variable for each content item: a Movie variable for each movie, an AudioPlayer variable for each sound, and a PImage variable for each image.
The variables are named after the marker ID that triggers the related content.
For the sake of bad programming, I pre-loaded all the contents up front, so that they would all be in memory and just be triggered when needed during the presentation.
Much better strategies can be used, but this is the way you'll find it done in the example. It actually is the best solution for the responsiveness of your presentation, but it doesn't work so well if you plan on presenting a lot of AR slides. In that case you should load content only when it's needed: you'll get a short lag when doing this, but there are ways to prevent it too, for example by preloading only the previous/next slides of the currently shown one, or tricks like that.
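The "preload only the neighbors of the current slide" idea can be sketched as a small pure-Java function. The names are illustrative, not part of the original sketch: given the ordered list of marker IDs and the current one, it returns the set of slides that should stay resident in memory.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a preload window: keep only the previous,
// current, and next slide's content loaded at any time.
class PreloadWindow {
    public static Set<Integer> resident(List<Integer> order, int current) {
        Set<Integer> keep = new LinkedHashSet<>();
        int i = order.indexOf(current);
        if (i < 0) return keep;                         // unknown marker: keep nothing
        if (i > 0) keep.add(order.get(i - 1));          // previous slide
        keep.add(order.get(i));                          // current slide
        if (i < order.size() - 1) keep.add(order.get(i + 1)); // next slide
        return keep;
    }
}
```

On each slide change you would load whatever entered this set and release whatever left it, trading a bounded memory footprint for an occasional (hidden) load.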
I won’t go through all the lines of code, but only highlight the key ones.
A Java Hashtable is used in the software to store the status of the various contents: either playing or not playing.
The setup() method starts everything up and preloads the content using the initContent() method.
In this last method you will find the commands needed to load the various contents from your data directory. (Note: this is hardcoded in the source, so you will need either to customize it to your needs or to have movies and images named exactly as I associated them to IDs.)
Let’s see how various contents are loaded:
a Movie:
m4 = new Movie(this, "4.mov", 30);

m4.loop();
m4.speed(0);
Here I am loading a movie and putting it to sleep by setting its speed to 0. This is the only way I found to stop a movie and its audio using the Processing video library (some sort of bug seems to be present in its stop and pause methods). When triggered by the corresponding fiducial marker, we set the speed back to 1 for playback.
For images, the basic PImage is used:
p0 = loadImage("p0.png");
For sounds, we use Minim's loaders:
s7 = minim.loadFile("7.wav", 2048);
The software implements TUIO callbacks to react to fiducial markers being placed in front of the camera.
In the addTuioObject method, for example, we first catch the new objects in the field of view:
String sid = "" + tobj.getSymbolID();
and then we use the symbol ID to trigger the playing state on the hashtable:
String pstate = (String) playing.get(sid);
if (pstate.equals("N")) {
  playing.put(sid, "S");
  … (it goes on with a long chain of IF..ELSE statements to check which content must be triggered)
}
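The long IF..ELSE chain can be avoided by keeping one action per marker ID in a map and dispatching through it. This is a hypothetical alternative, not the original code; it mirrors the "N" (not playing) to "S" (started) transition kept in the Hashtable:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatch table: one Runnable per marker ID replaces the
// IF..ELSE chain, while a state map tracks what is already playing.
class TriggerTable {
    private final Map<String, Runnable> actions = new HashMap<>();
    private final Map<String, String> playing = new HashMap<>();

    public void on(String sid, Runnable action) {
        actions.put(sid, action);
    }

    // Fires the marker's action only on the "N" -> "S" transition;
    // returns false if that marker's content is already playing.
    public boolean trigger(String sid) {
        String pstate = playing.getOrDefault(sid, "N");
        if (!pstate.equals("N")) return false;
        playing.put(sid, "S");
        Runnable action = actions.get(sid);
        if (action != null) action.run();
        return true;
    }

    // Called when the marker leaves the field of view.
    public void release(String sid) {
        playing.put(sid, "N");
    }
}
```

Each registered Runnable would do whatever the corresponding IF branch did: set a movie's speed back to 1, play a Minim AudioPlayer, and so on.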
In the removeTuioObject method, called by the TUIO library when a fiducial marker leaves the field of view, we do exactly the opposite, turning off the content related to the marker that just left.
Everything else takes place in the draw() method of the sketch.
First of all, the camera input is read and displayed (from the second camera, providing a background for our AR slide content):
opencv.read();
image( opencv.image(), 0, 0 );
Then the getTuioObjects() method offered by the TUIO library is used to handle, one by one, the playing of the contents related to the in-sight markers:
Vector tuioObjectList = tuioClient.getTuioObjects();
for (int i = 0; i < tuioObjectList.size(); i++) {
  TuioObject tobj = (TuioObject) tuioObjectList.elementAt(i);
  String sid = "" + tobj.getSymbolID();
  if (playing.get(sid) == null) {
    playing.put(sid, "N");
  }
  String pstate = (String) playing.get(sid);
  … (goes on checking which content needs to be played) …
}
Each TuioObject instance contains both a session ID and a marker ID. This lets you tell which fiducial marker is on screen and, by comparing against session IDs you recorded previously, whether you are dealing with a new appearance of that marker or the continuation of a previous one.
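That "new appearance vs. continuation" check boils down to remembering the last session ID seen for each symbol ID. A minimal hypothetical helper (the class and method names are mine, not the sketch's):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical tracker: remembers the last TUIO session ID seen for
// each fiducial symbol ID, to tell fresh appearances from continuations.
class SessionTracker {
    private final Map<Integer, Long> lastSession = new HashMap<>();

    // Returns true the first time this (symbolId, sessionId) pair is seen,
    // i.e. when the marker has (re)entered the field of view.
    public boolean isNewAppearance(int symbolId, long sessionId) {
        Long prev = lastSession.put(symbolId, sessionId);
        return prev == null || prev != sessionId;
    }
}
```

In the sketch you would feed it tobj.getSymbolID() and tobj.getSessionID() each frame and only (re)start content when it reports a new appearance.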

When we decide what to do with the marker, we can use the TuioObject's getScreenX() and getScreenY() methods to find out where on screen we need to display the content, and use these coordinates to show our images/movies…

For example at the lines like this one:

image(p0, tobj.getScreenX(width) - ww/2, tobj.getScreenY(height) - hh/2, ww, hh);

in which the coordinates are used to center an image on the marker.
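The centering arithmetic is just an offset by half the content's size. Isolated as a hypothetical helper for clarity:

```java
// Hypothetical helper isolating the centering math from the image() call:
// place a w x h content so its center lands on the marker's screen position.
class Center {
    public static int[] topLeft(int markerX, int markerY, int w, int h) {
        return new int[] { markerX - w / 2, markerY - h / 2 };
    }
}
```

So a 40x20 image anchored on a marker at (100, 100) gets drawn with its top-left corner at (80, 90).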

And that’s just about all there is to it! :)

Feel free to contact me or comment if you find any problem or have any doubt.
