Latest Entries »

This post is for people who can’t root or unlock Motorola Atrix after trying through all the guide from Internet. If you haven’t touched
your phone, you may want to take a look at these resources first:
http://forum.xda-developers.com/forumdisplay.php?f=995
http://atrix4ghacks.com/atrix-4g-hacks/all-you-need-to-know-about-hacking-atrix-4g/

First I managed to root my phone with the guide here and unlock the bootloader with the guide here using a sbf from here. Note that unlocking will wipe out all your app and data, so
you better backup first. When you reboot your phone, you should see “unlock” at the top left corner. To test if you are root, download Terminal Emulator from Market and in terminal, type “su”, Sudouser will ask you (yes YOU) for permission.

But life is not always easy, I happened to screw up the Sudouser app as well as the su in /system/bin/su, and when I come back to the Pete’s awesome Motorola root tools, it doesn’t work anymore. The script fails with some errors like “adb can’t get permission using local.prop” (I found this quite strange cos the unroot and grant ADB.root functions which seem to rely on the same method still work flawlessly).
After some desperate reboots, I decided to give up on Pete’s tools and find a more robust way to root. Thanks to the awesome Android community I was able to find out quite easily.
http://forum.xda-developers.com/showpost.php?p=15872818

However when I execute “/preinstall/dosu” it throws an error saying “groups command undefined”. Although the command still leave me at root after execution, the Sudouser.apk is corrupted
and can not be opened once installed. The work around for this is to use Pete’s 1-click tool to gain temporary ADB root access first, then deal with the mounting, copying and installing later.
Here are the detail steps:

First download this and this, and extract into a same folder
Open cmd and type in

moto-fastboot.exe flash preinstall preinstall.img
moto-fastboot.exe reboot

After that, open Pete’s root tools and run “Temp ADB root procedure”, wait for the device to reboot twice. Once you are granted root access to adb, the rest are very similar to XDA’s guide
except that we no longer need to run /preinstall/dosu. Note that we are only su in ADB not android terminal, so running the same script in Android terminal emulator will not work
adb shell
this will leave us in a new shell begins with ‘#’ in stead of ‘$’, this indicates we have full control of the device.

/bin/mount -o remount,rw /system
cp /preinstall/su /system/bin/
chmod 6755 /system/bin/su
PATH=/system/bin:$PATH pm install /preinstall/Superuser.apk

If the result is Success, congratulations, you are now master of the universe!

There is still one more issue, if you go to Settings->Application->Manage Application->Running, you will notice that our Ram has been reduced to ~300Mb. That sucks. To fix this issue we need
to flash one more time.
First, we need to have moto-fastboot, you can find your appropriate version here
In your Terminal Emulator, type:

su
cat /proc/cmdline

If you see tegrapart=mbr:d00:100:800,kpanic:2100:400:800: download this
and if you see tegrapart=mbr:1100:100:800,kpanic:2500:400:800: download this. Unzip
this file to the folder that you previously extract moto-fastboot
Reboot your phone into fastboot mode( (volume-down on boot, then volume-up), connect your USB cabble, go to the moto-fastboot folder and execute:

moto-fastboot flash boot ORFR234-tegrapart-d00100.boot.img (if you see tegrapart=mbr:d00:100:800,kpanic:2100:400:800: previously or otherwise moto-fastboot flash boot ORFR234-tegrapart-1100100.boot.img)

Finally

moto-fastboot reboot

And your precious RAM is back 🙂

Advertisements

Yummo

This is game play trailer for Yummo, our “game programming” course project.
It is an action-shooting- puzzle game for Android and iPhone. Powered by Corona SDK.
The music is hedgewars’s theme song.

We did this in 2 months (2 weeks for coding) with a team of 5 programmers, 2 designers and 1 artist. This is
the first project that I’m involved as a lead programmer. Overall we did a good job, though we screwed up at the
presentation. There were a lot of challenges, both came from the software (Corona, Lua) and the hardware.
– Lua does not support OOP. Our code make use of OOP extensively, therefore it would be disaster if we did it in the
hard core way. In the end we could find a solution with a technique called Metatable.
– Hardware: Yummo uses a lot of sprite animation. The first version of Yummo lags on real device. Therefore we need to optimize the spritesheet using some memory management.
– Physics: Our game is based on real-time physics simulation, mostly collision detection. Although Box2D is good, it sometimes causes some crashes and data inconsistency. To fix this, instead of throwing everything into the collision detector, we only register some classes and trigger the rest via event dispatcher. Luckily for us, our game is small compared to a FPS, so we can do this manually.

Overall, this is the best academic project that I have ever worked on. I’m quite satisfied with my team although the game was not well admired by the judges (professors, lab technicians). But who cares about those losers. They can’t even pronounce Id Software correctly…

Looking forward to summer to see my first game on Appstore 😀

OOP in LUA

Currently I’m writing a game for Android using Corona SDK. And its main scripting language is LUA. My first impression of LUA
was that it is extremely easy to learn. However, being a lightweight scripting language, LUA doesnt support OOP in the direct way like Java/C++. Instead you have to go the round way using its metatable feature. Detail can be found via http://www.lua.org/pil/16.html.

So what if I want to implement my classes in separate files. It brought me surprise that there’s not much information about this on the Internet. So, feeling a bit frustrated, I decided to do it on my own.

First let’s start with the standard way. I.e. declare everything in your main.lua

    Account = {balance = 0}
    
    function Account:new (o)
      o = o or {}
      setmetatable(o, self)
      self.__index = self
      return o
    end
    
    function Account:deposit (v)
      self.balance = self.balance + v
    end
    
    function Account:withdraw (v)
      if v > self.balance then error"insufficient funds" end
      self.balance = self.balance - v
    end

    SpecialAccount = Account:new()
    s = SpecialAccount:new{limit=1000.00}
    function SpecialAccount:withdraw (v)
      if v - self.balance >= self:getLimit() then
        error"insufficient funds"
      end
      self.balance = self.balance - v
    end
    
    function SpecialAccount:getLimit ()
      return self.limit or 0
    end

    function s:getLimit ()
      return self.balance * 0.10
    end

Now let’s do the partitioning
Account class (account.lua)

module  (...,package.seeall)

function new (ret)
	local Account = {balance=0}

	function Account:new (ret) 
		ret = ret or {}
		setmetatable (ret,self)
		self.__index = self
		return ret
	end

	function Account:deposit (amount)
		self.balance = self.balance + amount
	end

	function Account:withdraw (amount)
		if (amount>self.balance) then
			print ("not sufficient money")
		else
			self.balance = self.balance - amount
		end
	end

	return Account:new (ret)
end

SpecialAccount (specialaccount.lua)

module (...,package.seeall)
Account = require ("account")

function new (arg)
	local SpecialAccount = Account.new (arg)
	
        -- method override
	function SpecialAccount:withdraw (amount)
		print ("asdf")
		if (amount>self.balance+self.getLimit()) then
			print ("not surficient money& limit")
		else
			self.balance = self.balance - amount
		end
	end

	function SpecialAccount:getLimit ()
		return self.limit
	end

	return SpecialAccount
end

To use these 2 classes, first we need to import them

SpecialAccount = require ("specialaccount")
Account = require ("account")

The rest is just to create their instances (notice the use of “.” not “:”) and calling corresponding methods

local s =  SpecialAccount.new ({balance=200,limit=500})
s:deposit (50)
s:withdraw (700)
print (s.balance)
local a = Account.new ({balance=100})
a:withdraw (101)
print (a.balance)

2D n-body problem

The n-body problem is a classic example used in parallel programming. The straight forward solution takes O(N^2) time complexity, whereas the Barnes-Hut reduce it to O(NlogN) based on the assumption that, a group of bodies that is far enough from a body can be clustered in to a BIG body.

Based on that assumption, we can divide the space into sub 4 equal rectangles (2D) or 8 cubes (3D). Then we build a quad tree (2D) or oct tree (3D) whose root is the bounding box of all n bodies and divide each node into 4(2D)/8(3D) until each leaf contains no more than 1 body. Each node will need to store its bounding box, its mass (total mass of the children) and center of gravity, which is calculated by the formula:

COM = (C1*M1+C2*M2+C3*M3+C4*M4)/M (for quad tree)
COM = (C1*M1+…+C8*M8)/M (for quad tree)
where Ci is the position of Child ith, Mi is the respective mass and M is the total mass of all children

To compute the total force acting on body X, we just need to traverse down the tree, at each node Y, we calculate the distant R of the node to X. if R >= d*theta, where d is the largest size of the bounding box enclosing Y, theta is a constant (in my code I choose theta = 1.2), we treat Y as a body and calculate its force on X. Otherwise, we go down to Y’s children.

I have written 2 versions. One is Quadtree, and the other uses OpenMp. The quadtre version is no doubt faster but openmp is much shorter and easy to implement. And it considerably faster than naive O(N^2) version.

To compile the sourcode under Linux, you need to add “-lX11 -lm” as the program will rely on XWindows to draw
To compile OpenMp version, add -openmp

QuadTree version

#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/Xos.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#include <iostream>
#include <sys/time.h>
#include <vector>

#define N 1000000
#define G 6.673e-11
#define T_MAX 100
#define dt 1
#define THETA 1.2
#define  X_RESN	800       /* x resolution */
#define  Y_RESN	800       /* y resolution */

using namespace std;


typedef struct complextype 
{
    float real, imag;
} Compl;

struct  QTreeNode {
	float x0,y0,x1,y1,mass;
	float COM[2]; // center of mass
	bool is_leaf;
	vector<int> id;	
	struct QTreeNode** children;
};



Window        win;  /* initialization for a window */
GC black_gc; // to draw the body
GC white_gc; // to clear the body
GC gc;
Display       *display;
unsigned int  screen; /* which screen */

float m[N]; // mass 
float f[N][2],v[N][2],p[N][2]; // forces, velocity, position
int n,t_max;

void init_display () {
    unsigned 
    int           width, height,  /* window size */
                  x, y, /* window position */
                  border_width, /*border width in pixels */
                  display_width, display_height;
                  /* size of screen */
                  char *window_name = "Mandelbrot Set", 
                  *display_name = NULL;
    unsigned
    long          valuemask = 0;
    XGCValues     values;
    XSizeHints    size_hints;
    Pixmap        bitmap;
    XPoint        points[800];
    FILE          *fp, *fopen ();
    char          str[100];

	XSetWindowAttributes attr[1];

    /* Mandlebrot variables */
    int           i, j, k;
    
    /* connect to Xserver */

    if ( (display = XOpenDisplay (display_name)) == NULL ) {
	  fprintf (stderr, 
                  "Cannot connect to X server %s\n",
                  XDisplayName (display_name) );
	  exit (-1);
    }
	
    /* get screen size */

    screen = DefaultScreen(display);
    display_width = DisplayWidth(display, screen);
    display_height = DisplayHeight(display, screen);

    /* set window size */

    width = X_RESN;
    height = Y_RESN;

    /* set window position */

    x = 0;
    y = 0;

    /* create opaque window */
	border_width = 4;
    win = XCreateSimpleWindow(display, 
                  RootWindow(display, screen),
                  x, y, width, height, border_width, 
                  BlackPixel(display, screen), 
                  WhitePixel(display, screen));

    size_hints.flags = USPosition|USSize;
    size_hints.x = x;
    size_hints.y = y;
    size_hints.width = width;
    size_hints.height = height;
    size_hints.min_width = 300;
    size_hints.min_height = 300;

    XSetNormalHints(display, win, &size_hints);
    XStoreName(display, win, window_name);
    /* create graphics context */
	/*
	values.foreground = WhitePixel (display, screen);
    white_gc = XCreateGC(display, win, valuemask, &values);
	values.foreground = BlackPixel (display, screen);
	black_gc = XCreateGC (display, win, valuemask, &values);
	*/
	gc = XCreateGC (display, win, valuemask, &values);
    XSetBackground(display, gc, 
                  WhitePixel (display, screen));
    XSetForeground(display, gc, 
                  BlackPixel (display, screen));
    XSetLineAttributes(display, gc, 1, 
                  LineSolid, CapRound, JoinRound);

    attr[0].backing_store = Always;
    attr[0].backing_planes = 1;
    attr[0].backing_pixel = BlackPixel(display, screen);

    XChangeWindowAttributes(display, win, 
                  CWBackingStore | CWBackingPlanes | CWBackingPixel, 
                  attr);

    XMapWindow(display, win);
    XSync(display, 0);

}


float abs (float x) {
	return (x>=0)?x:-x;
}

void build_qtree (struct QTreeNode* aNode) {
	if (aNode->id.size () < 2) {
		aNode->is_leaf = true;
		return;
	}
	aNode->children = new QTreeNode*[4];
	float x0,y0,x1,y1;
	x0 = aNode->x0; 
	x1 = aNode->x1;
	y0 = aNode->y0;
	y1 = aNode->y1;
	for (int i=0;i<4;++i) { 
		aNode->children[i] = new QTreeNode;
		aNode->is_leaf = false;
		if (i==0||i==2) {
			aNode->children[i]->x0 = x0;
			aNode->children[i]->x1 = (x0+x1)*0.5;
		}
		else {
			aNode->children[i]->x0 = (x0+x1)*0.5;
			aNode->children[i]->x1 = x1;
		}
		if (i<2) {
			aNode->children[i]->y0 = y0;
			aNode->children[i]->y1 = (y0+y1)*0.5;
		}
		else {
			aNode->children[i]->y0 = (y0+y1)*0.5;
			aNode->children[i]->y1 = y1;
		}
	}
	for (int i = 0;i<aNode->id.size ();++i) {
		// decide which child the body belongs to 
		int pid = aNode->id[i];
		float x = p[pid][0];
		float y = p[pid][1];
		for (int j=0;j<4;++j) {
			if (x>=aNode->children[j]->x0 && x<=aNode->children[j]->x1 && y>=aNode->children[j]->y0 && y<=aNode->children[j]->y1) {
				aNode->children[j]->id.push_back (pid);
				break;
			}
		}
	}
	for (int i=0;i<4;++i)
		build_qtree (aNode->children[i]);
}

void compute_mass (struct QTreeNode* aNode) {
	if (aNode->id.size () == 0) return;
	aNode->mass = 0;
	aNode->COM[0] = aNode->COM[1] = 0;
	if (aNode->is_leaf) {
		aNode->mass = m[aNode->id[0]];
		aNode->COM[0] = p[aNode->id[0]][0];
		aNode->COM[1] = p[aNode->id[0]][1];
	}
	else {
		for (int i=0;i<4;++i) {
			struct QTreeNode* child = aNode->children[i];
			compute_mass (child);
			aNode->mass+=child->mass;
			aNode->COM[0]+=child->mass*child->COM[0];
			aNode->COM[1]+=child->mass*child->COM[1];
		}
		aNode->COM[0]/= aNode->mass;
		aNode->COM[1]/= aNode->mass;
	}
}

void force (const float& x1, const float& y1, const float& x2, const float& y2, const float& m1, const float& m2, float& Fx, float& Fy) {
	float r = sqrt ((x1-x2)*(x1-x2) + (y1-y2)*(y1-y2));
	if (r>0) {
		r*=r*r;
		Fx +=G*m1*m2*(x2-x1) / r;
		Fy +=G*m1*m2*(y2-y1) / r;
	}
}
void cal_force (struct QTreeNode* aNode, const float& mm, const float& x, const float& y, float& Fx, float& Fy) {
	if (aNode->is_leaf) {
		if (aNode->id.size () == 0) return;
		int pid = aNode->id[0];
		force (x,y,p[pid][0],p[pid][1],mm,m[pid], Fx, Fy);
		return;
	} 
	float r = sqrt (pow (aNode->COM[0]-x,2) + pow (aNode->COM[1]-y,2));	
	float d = max (abs (aNode->x1-aNode->x0), abs (aNode->y1-aNode->y0));

	// if only cluster of bodies is far enough
	if (r>d*THETA) {
		force (x,y,aNode->COM[0],aNode->COM[1],mm,aNode->mass,Fx,Fy);
		return;
	}
	for (int i=0;i<4;++i) 
		cal_force (aNode->children[i], mm, x,y,Fx,Fy);
}

void compute_force (struct QTreeNode* aNode) {
	for (int i=0;i<n;++i) {
		cal_force (aNode, m[i], p[i][0], p[i][1], f[i][0],f[i][1]);	
	}
}
/*
void traverse (struct QTreeNode* aNode) {
	if (aNode->id.size ()==0) cout<<"nil"<<endl;
	else if (aNode->is_leaf) cout<<"L: "<<aNode->id[0]<<endl;
	else {
		for (int i = 0;i<aNode->id.size();++i) 
			cout<<aNode->id[i]<<" ";
		cout<<endl;
		for (int i=0;i<4;++i)
			traverse (aNode->children[i]);
	}
}
*/

main (int argc, char* argv[])
{
	init_display ();     	 
	FILE* fin = fopen ("nbody.txt","r");
    /* Calculate and draw points */
    Compl         z, c;
    float         lengthsq, temp;
	fscanf (fin,"%d %d",&n,&t_max);

		
	for (int i = 0;i<n;++i) {
		m[i] = 1;
		f[i][0] = f[i][1] = v[i][0] = v[i][1] = 0;
		fscanf (fin, "%f %f",&p[i][0],&p[i][1]);
	}

	float p_new[n][2],v_new[n][2];
	cout<<n<<endl;

	struct timeval tv1,tv2;
	struct timezone tz1,tz2;
	gettimeofday (&tv1,&tz1);
	for (int t=0;t<t_max;++t) {
		// build quad tree
		struct QTreeNode* root = new QTreeNode;
		root->x0 = root->y0 = 0;
		root->x1 = root->y1 = 1;
		root->is_leaf = false;
		for (int i=0;i<n;++i) 
			root->id.push_back (i);
		build_qtree (root);
		//traverse (root);
		compute_mass (root);
		compute_force (root);
		
		for (int i=0;i<n;++i) {
			v_new[i][0] = v[i][0] + f[i][0]*dt/m[i];
			v_new[i][1] = v[i][1] + f[i][1]*dt/m[i];
			p_new[i][0] = p[i][0]+v_new[i][0]*dt/m[i];
			p_new[i][1] = p[i][1]+v_new[i][1]*dt/m[i];
		}

		for (int i=0;i<n;++i) {
			int x,y;
			x = floor (p[i][0]*X_RESN);
			y = floor (p[i][1]*Y_RESN);
			//XSetForeground (display, gc, WhitePixel (display,screen));
			//XDrawPoint (display, win, gc, x,y);
			v[i][0] = v_new[i][0];
			v[i][1] = v_new[i][1];
			p[i][0] = p_new[i][0];
			p[i][1] = p_new[i][1];
			x = floor (p[i][0]*X_RESN);
			y = floor (p[i][1]*Y_RESN);
			//cout<<x<<" "<<y<<endl;
			//XSetForeground (display, gc, BlackPixel (display,screen));
			//XDrawPoint (display, win, gc, x, y);
		}
		//sleep (1);
		
	}
	gettimeofday (&tv2,&tz2);
	printf ("Eslaped time = %ld\n", (tv2.tv_sec-tv1.tv_sec)*1000000 + tv2.tv_usec-tv1.tv_usec);
    XFlush(display);

    /* Program Finished */

}

OpenPM version

#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/Xos.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>
#include <iostream>
#include <sys/time.h>
#include <omp.h>

#define N 10   				// number of bodies
#define G 6.673e-11
#define T_MAX 100
#define N_THREADS 7			// default number of threads
#define dt 1
#define  X_RESN	800       /* x resolution */
#define  Y_RESN	800       /* y resolution */

using namespace std;


typedef struct complextype 
{
    float real, imag;
} Compl;


Window        win;  /* initialization for a window */
GC black_gc; // to draw the body
GC white_gc; // to clear the body
GC gc;
Display       *display;
unsigned int  screen; /* which screen */

void init_display () {
    unsigned 
    int           width, height,  /* window size */
                  x, y, /* window position */
                  border_width, /*border width in pixels */
                  display_width, display_height;
                  /* size of screen */
                  char *window_name = "Mandelbrot Set", 
                  *display_name = NULL;
    unsigned
    long          valuemask = 0;
    XGCValues     values;
    XSizeHints    size_hints;
    Pixmap        bitmap;
    XPoint        points[800];
    FILE          *fp, *fopen ();
    char          str[100];

	XSetWindowAttributes attr[1];

    /* Mandlebrot variables */
    int           i, j, k;
    
    /* connect to Xserver */

    if ( (display = XOpenDisplay (display_name)) == NULL ) {
	  fprintf (stderr, 
                  "Cannot connect to X server %s\n",
                  XDisplayName (display_name) );
	  exit (-1);
    }
	
    /* get screen size */

    screen = DefaultScreen(display);
    display_width = DisplayWidth(display, screen);
    display_height = DisplayHeight(display, screen);

    /* set window size */

    width = X_RESN;
    height = Y_RESN;

    /* set window position */

    x = 0;
    y = 0;

    /* create opaque window */

    border_width = 4;
    win = XCreateSimpleWindow(display, 
                  RootWindow(display, screen),
                  x, y, width, height, border_width, 
                  BlackPixel(display, screen), 
                  WhitePixel(display, screen));

    size_hints.flags = USPosition|USSize;
    size_hints.x = x;
    size_hints.y = y;
    size_hints.width = width;
    size_hints.height = height;
    size_hints.min_width = 300;
    size_hints.min_height = 300;

    XSetNormalHints(display, win, &size_hints);
    XStoreName(display, win, window_name);
    /* create graphics context */
	/*
	values.foreground = WhitePixel (display, screen);
    white_gc = XCreateGC(display, win, valuemask, &values);
	values.foreground = BlackPixel (display, screen);
	black_gc = XCreateGC (display, win, valuemask, &values);
	*/
	gc = XCreateGC (display, win, valuemask, &values);
    XSetBackground(display, gc, 
                  WhitePixel (display, screen));
    XSetForeground(display, gc, 
                  BlackPixel (display, screen));
    XSetLineAttributes(display, gc, 1, 
                  LineSolid, CapRound, JoinRound);

	//cout<<"here"<<endl;
    attr[0].backing_store = Always;
    attr[0].backing_planes = 1;
    attr[0].backing_pixel = BlackPixel(display, screen);

    XChangeWindowAttributes(display, win, 
                  CWBackingStore | CWBackingPlanes | CWBackingPixel, 
                  attr);

    XMapWindow(display, win);
    XSync(display, 0);
}

main (int argc, char* argv[])
{
	init_display ();     	 
    /* Calculate and draw points */
    Compl         z, c;
    float         lengthsq, temp;
	int n,t_max;
	FILE* fin = fopen ("nbody.txt","r");
	fscanf (fin, "%d %d",&n,&t_max);
	int n_threads = (argc==2)?atoi(argv[3]):N_THREADS;
	double m[n];
	double f[n][2],v[n][2],p[n][2];
	
	#pragma omp parallel for 
	for (int i = 0;i<n;++i) {
		m[i] = 1;
		f[i][0] = f[i][1] = v[i][0] = v[i][1] = 0;
		fscanf (fin,"%lf %lf",&p[i][0],&p[i][1]);
	}
	double p_new[n][2],v_new[n][2];
	omp_set_num_threads (n_threads);
	struct timeval tv1,tv2;
	struct timezone tz1,tz2;
	gettimeofday (&tv1,&tz1);
	for (int t=0;t<t_max;++t) {
		//cout<<t<<endl;
		#pragma omp parallel for 
		for (int i=0;i<n;++i) {
			for (int j=0;j<n;++j) {
				double r = sqrt (pow (p[j][0]-p[i][0],2) + pow(p[j][1]-p[i][1],2));
				r*=r*r;
				if (r>0) {
					f[i][0]+=G*m[i]*m[j]*(p[j][0]-p[i][0]) / r;
					f[i][1]+=G*m[i]*m[j]*(p[j][1]-p[i][1]) / r;
				}
			}
			v_new[i][0] = v[i][0] + f[i][0]*dt/m[i];
			v_new[i][1] = v[i][1] + f[i][1]*dt/m[i];
			p_new[i][0] = p[i][0]+v_new[i][0]*dt/m[i];
			p_new[i][1] = p[i][1]+v_new[i][1]*dt/m[i];

		}
		#pragma omp barrier

		#pragma omp parallel for 
		for (int i=0;i<n;++i) {
			int x,y;
			x = floor (p[i][0]*X_RESN);
			y = floor (p[i][1]*Y_RESN);
			#pragma omp critical 
			{
				XSetForeground (display, gc, WhitePixel (display,screen));
				XDrawPoint (display, win, gc, x,y);
			}
			v[i][0] = v_new[i][0];
			v[i][1] = v_new[i][1];
			p[i][0] = p_new[i][0];
			p[i][1] = p_new[i][1];
			x = floor (p[i][0]*X_RESN);
			y = floor (p[i][1]*Y_RESN);
			#pragma omp critical 
			{
				XSetForeground (display, gc, BlackPixel (display,screen));
				XDrawPoint (display, win, gc, x, y);
			}
		}
		//sleep(1);
	}
	gettimeofday (&tv2,&tz2);
	printf ("Eslaped time = %ld\n", (tv2.tv_sec-tv1.tv_sec)*1000000 + tv2.tv_usec-tv1.tv_usec);
    XFlush(display);

    /* Program Finished */

}

K-d Tree Build

The first version of K-d tree I wrote was a trivial median-split build, and it ran quite slow. Then I found out a technique called Surface Area Heuristic (SAH). Basically, the method works under the assumption that the time of intersecting a ray with a voxel is based on its surface area and the number of triangles inside that voxel. And to find the split plane, it will check all the bounding face of every triangles in the scene. A benefit to this approach is axis-align (or planar) triangles will be preferred, making the algorithm perfect for scene in which there are many flat, parallel surfaces.

So I spent most of my last 2 weeks reading and implementing that build. Although the build is now O(NlogN), the code is till very poorly optimized.

The code is made available at:
http://code.google.com/p/vcraytracer/source/browse/#svn/trunk/wxRaytracer/raytracer/Accel

My implementation is based on this paper:
On building fast kd-Trees for Ray Tracing, and on doing that in O(NlogN)

The SAH is described by Vlastimir Havran in his dissertation which is considered a bible for K-d tree
Heuristic ray shooting algorithm

It took me almost 3 days to understand the method, because I’m rather new to this acceleration scheme. I think I will write down some of my thoughts on this wonderful paper when I have time.

Morphing plug-ins for Realflow

This basically is a bunch of Python scripts I wrote during my attachment at Film Magik, ltd. At that time we have to do a 15 sec animation for a movie distributing company.

Here is the final animation

Some breakdown

Screen shots (Click to enlarge)




Finally, you can download the source code here.

Documentation can be found here:

Point light usually mimics the illuminance of light bulb in which light is emitted in all direction from source.
As the name suggests, Computer Graphics often treat the light source as a single point in space. And by this way, the shadow generated by point light is always hard-edged. But sometimes, we want point light to give soft-edge shadow like in Arealight. This article will explain a trick to implement a basic point light that gives soft-edge shadow.

Here is how the render process happens
1. Primary ray is shoot from camera through each pixel’s coordinate of the view plane.
2. The renderer compute the intersection point of the ray with objects in the scene and return a ShadeRec (shading record) in which the information of the hit point (location, normal, material pointer,…) is stored.
3. From the hit point, a shadow ray is shoot to each light in the scene to determine whether that point is in shadow. If not, compute the incoming radiance from that light and add it to the final color

Let’s start the implementation by first looking at the Point light in Maya
Maya's Pointlight
The most important attributes are intensity, color, and shadow. Every kind of light must have all these 3 attributes. Under shadow, look for “raytrace shadow”, expand it and you will notice 3 sub attributes, i.e. shadow radius, # shadow rays, and # shadow depth. In this tutorial, I will implement the most 2 important (according to me) attributes which are shadow radius and number of shadow rays. Shadow depth is less important and will be 1 by default.

First, we need to define the Light class:

class Light {	
	public:
		Light(void);
		virtual Vector3D	get_direction(const ShadeRec& sr) = 0;				
		virtual RGBColor	L(const ShadeRec& sr);		
		virtual bool in_shadow (const Ray& ray, const ShadeRec& sr) const; //shadow test

		void set_num_shadow_rays (int num_sr);
		int get_num_shadow_rays () const;

	protected:
		bool isArea;   // check if this light is arealight
		int num_shadow_rays; // number of shadow rays being sampled
};

And PointLight class, which is a sub class of Light

class PointLight : public Light
{
public:
	PointLight(void);
	void set_ls (float _ls);
	void set_shadow_radius (float _sr);
	virtual Vector3D get_direction (const ShadeRec& sr);
	virtual RGBColor L (const ShadeRec& sr);
	virtual bool in_shadow (const Ray& ray, const ShadeRec& sr) const;

private:
	float ls,shadow_radius;
	RGBColor color;
	Vector3D source;
};

The most important method that contribute to the shadow generated by point light is get_direction ()

Vector3D PointLight::get_direction (const ShadeRec& sr) {
	return ((sr.hit_point - source).hat()); // normalize the vector
}

By default, this implementation will generate, hard-edge shadow. The reason for this is because the light source is fixed. To fix this, we just need to jitter the source’s location. The shadow_radius control the spreading of the penumbra and is the same as “light radius” attribute in Maya

Vector3D PointLight::get_direction (const ShadeRec& sr) {
	Point3D sample_source;

	sample_source.x = source.x + pow (-1,(float) (rand ()%2))*shadow_radius*rand_float ();
	sample_source.y = source.y + pow (-1,(float) (rand ()%2))*shadow_radius*rand_float ();
	sample_source.z = source.z + pow (-1,(float) (rand ()%2))*shadow_radius*rand_float ();
	
	Vector3D temp = sr.hit_point - sample_source;
	return (temp.hat());
}

Now, when you jitter the light source’s location, a point near the edge of the old shadow, may make it to the light without hitting any object, and therefore is not in shadow. This will give us the below image. Look at the edge, there are noises indicating that many shadow rays did not hit anything because of the variant light source. To eliminate the noise, we just need to increase the number of samples per pixel. The image below is rendered with 1024 samples per pixel

Pointlight 1024samples

By default, we use Monte Carlo integration to estimate the color at each pixel. This is done through sampling the rays and calculate the average color return by each ray. However, this may be costly to reduce noise to acceptable level. Instead of firing many rays at the beginning, we can also shoot ray at each hit point which result in same result but lesser computation since the ray shooting process is pushed inside the pixel loop. The “Shadow Rays” attribute in Maya does exactly the same function.

This image below is rendered with 16 pixel samples and 16 shadow rays.

16 pixel samples, 16 shadow rays

Ambient Occlusion is a vital factor that contribute to global illumination. To see why AO is important, take a look at the following 2 images (from the GPU Gems book)
Diffuse only
The right leg of the dinosour is lit with same amount of light as the left leg, which is something unlikely happen in real world. Note that in real world, light comes from almost everywhere. And because of this characteristic, it is very unlikely that some under-part of the dinosour (e.g. the chest and the under-face of the tail) are complely dark as they are shaded with diffuse shading.

The following picture is shaded with ambient occlusion and you can notice dramatic improvement (note the soft shadow)
Dinosour shaded with Ambient Occlusion

To shade a point using AO, we need to compute the reflected radiance at that point.
Reflected radiance

Where f

is the brdf of the material.
Li

is the incident radiance, this is computed by shooting shadow rays in many directions from the hemisphere over the point being shaded. I use Kevin Suffern’s skeleton renderer. The original code is commented out.

My modified code:

RGBColor AmbientOccluder::L (ShadeRec& sr) {
	w = sr.normal;
	// jitter up vector in case normal is vertical
	v = w^Vector3D(.0072,1,.0034);
	v.normalize ();
	u = v^w;
	/*
	Ray shadow_ray;
	shadow_ray.o = sr.hit_point;
	shadow_ray.d = get_direction (sr);
		
	if (!in_shadow (shadow_ray, sr)) 
		return (ls*color)*min_amount;
	else return (black);
	*/
	
	
	Ray shadow_ray;
	int num_unoccluded = 0;
	for (int i = 0;i<num_ray_shots;++i) {
		shadow_ray.o = sr.hit_point;
		shadow_ray.d = get_direction (sr);
		
		if (!in_shadow (shadow_ray, sr)) num_unoccluded++;
	}
	float accessibility = 1.0*num_unoccluded / num_ray_shots;
	return (ls*color)*min_amount*accessibility;
	
	
}

You need to multiply 1.0 at line 25 to type cast the RHS to float, otherwise accessibility will easily be 0 which results in the following image:

I used Multijittered sampling to sample the direction over the hemisphere. Let’s say we create 256 samples for the hemisphere, however, we don’t need to fire rays in all 256 directions. This following images was rendered with 256 pixel samples and 256 AO samples but we need only to shoot 1 ray.
256_256_1_1min

Although we need to shoot 1 ray, the pixel sample still can do the trick. But since pixel sample is expansive, we will try to decrease it, fortunately 16 samples per pixel almost terminate all the jaggy artifacts, so we will use 16 samples per pixel from now on.

The following 2 images are rendered with 16/16/16 and 16/256/16 (pixel samples/AO samples/number of ray shots) respectively, the render time is 45 and 46 seconds respectively

You can see the 1st image (16/16/16) is more noisy due to the small number ofn AO samples, whereas the 2nd image (16/256/16) is less noisy and the render time is just about the same.

The next image rendered with (16/256/64) is almost noise-free, the render time is 2:50 minutes which is about the same as (256/256/1) in which noise is very visible.

Conclusion: We should use ray sampling as much as simple and only rely on pixel sampling to anti aliasing like e.g. jaggy edge. However shooting ray at hit point level is sometime much more expensive. Consider the path tracing algorithm when you need to shoot extra rays every time a ray hits a diffuse surface. The number of rays would go exponentially. In that case, pixel sampling is the preferred solution.