---------------

Performance Log

---------------

8/2/11 - IPod Touch 64/64/16 
	Test 1
	~ 16 FPS - Restricted colour access to RGBA class to prepare for RGBA4444 texture format
	21020.0ms   64.1%	 	C_VoxelTestSection::Update(float)
		19780.0ms   34.6%	 	C_VoxelTestSection::VoxelUpdate(float)
			7520.0ms   11.5%	 	C_VoxelData::GenerateOutline(int, int, int)
			5900.0ms    8.2%	 	FluidSimulation(C_VoxelDataSim*)
			5340.0ms    6.7%	 	C_VoxelDataSim::Active(int)
			4730.0ms    5.2%	 	memcpy
			4120.0ms    4.2%	 	BlitBlockMan(int, int, int, C_VoxelData*, C_VoxelData::C_RGBA*)
		16160.0ms   15.1%	 	C_VoxelTestSection::CalculateLighting()
		8440.0ms    7.4%	 	C_VoxelDataLit::ApplyLighting()
		3240.0ms    2.6%	 	ColourLookup(unsigned char, unsigned char)
	20260.0ms   15.0%	 	C_BatchedRenderer::Draw()
	21190.0ms   14.8%	 	glTexImage2D
	
	Test 2 - 16 bit colours
	~ 19 FPS
	21740.0ms   72.4%	 	C_VoxelTestSection::Update(float)
		15100.0ms   43.3%	 	C_VoxelTestSection::VoxelUpdate(float)
			7890.0ms   19.6%	 	C_VoxelData::GenerateOutline(int, int, int)
			4010.0ms    7.7%	 	FluidSimulation(C_VoxelDataSim*)
			3700.0ms    6.0%	 	C_VoxelDataSim::Active(int)
			3210.0ms    4.7%	 	BlitBlockMan(int, int, int, C_VoxelData*, C_VoxelData::C_RGBA*)
			2730.0ms    3.6%	 	memcpy
		13220.0ms   15.8%	 	C_VoxelDataLit::ApplyLighting()
		13240.0ms   14.5%	 	C_VoxelTestSection::CalculateLighting()
	10870.0ms   10.0%	 	C_BatchedRenderer::Draw()
	9720.0ms    8.2%	 	glTexImage2D
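
Note on the 16 bit colour change: a minimal sketch of packing an 8-bit-per-channel colour into a 16-bit RGBA4444 value, which halves the bytes per texel (4 to 2). `PackRGBA4444` and `Expand4To8` are hypothetical names, not functions from this project, and the layout assumed is the one used by `GL_UNSIGNED_SHORT_4_4_4_4` (red in the most significant nibble).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: pack four 8-bit channels into one 16-bit RGBA4444 value.
// Only the top four bits of each channel survive; red lands in bits 15-12.
inline std::uint16_t PackRGBA4444(std::uint8_t r, std::uint8_t g,
                                  std::uint8_t b, std::uint8_t a)
{
    return static_cast<std::uint16_t>(((r >> 4) << 12) |
                                      ((g >> 4) << 8)  |
                                      ((b >> 4) << 4)  |
                                       (a >> 4));
}

// Hypothetical helper: widen a 4-bit channel back to 8 bits (0xF becomes 0xFF).
inline std::uint8_t Expand4To8(std::uint8_t nibble)
{
    assert(nibble <= 0x0F); // caller must pass a single nibble
    return static_cast<std::uint8_t>((nibble << 4) | nibble);
}
```

The precision loss is the low four bits of every channel, so gradients may band; whether that is acceptable depends on the art.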
	
	- added compiler flags
		-DNDEBUG (to strip assert code from release builds)
		-ffast-math (faster maths, apparently with some precision problems.)
		-ftree-vectorize (enables automatic loop vectorisation)
		-marm (generates ARM rather than Thumb instructions)

7/2/11 - IPod Touch 64/64/16 - added colour scaling on outlines.
	Test 1
	~ 14 FPS
	38380.0ms   73.3%	 	C_VoxelTestSection::Update(float)
		25740.0ms   40.1%	 	C_VoxelTestSection::VoxelUpdate(float)
			10940.0ms   14.5%	 	C_VoxelData::GenerateOutline(int, int, int)
			5780.0ms    6.9%	 	C_VoxelDataSim::Active(int)
			5940.0ms    6.6%	 	FluidSimulation(C_VoxelDataSim*)
			5940.0ms    5.9%	 	memcpy
		19850.0ms   18.3%	 	C_VoxelTestSection::CalculateLighting()
		15000.0ms   12.7%	 	C_VoxelDataLit::ApplyLighting()
	14680.0ms   11.0%	 	glTexImage2D
	12540.0ms    8.5%	 	C_BatchedRenderer::Draw()
	
	Test 2
	~ 15 FPS - using colour lookup table
	38780.0ms   70.3%	 	C_VoxelTestSection::Update(float)
		20790.0ms   37.7%	 	C_VoxelTestSection::VoxelUpdate(float)
			5800.0ms   10.5%	 	C_VoxelData::GenerateOutline(int, int, int)	
			4290.0ms    7.7%	 	memcpy
			4190.0ms    7.6%	 	FluidSimulation(C_VoxelDataSim*)
			3240.0ms    5.8%	 	C_VoxelDataSim::Active(int)
			2540.0ms    4.6%	 	BlitBlockMan(int, int, int, C_VoxelData*, unsigned char*)
		12120.0ms   21.9%	 	C_VoxelTestSection::CalculateLighting()
		4480.0ms    8.1%	 	C_VoxelDataLit::ApplyLighting()
	7770.0ms   14.0%	 	glTexImage2D
	4810.0ms    8.7%	 	C_BatchedRenderer::Draw()
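
Note on the colour lookup table: a minimal sketch of how a per-channel multiply and divide can be replaced by a single table read. It assumes the two `unsigned char` arguments are a channel value and a light level, which the log does not confirm; `ScaleTable` and `Lookup` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical table: entries[light * 256 + value] = value * light / 255.
// Built once at start-up; costs 64 KB of memory.
struct ScaleTable
{
    std::array<std::uint8_t, 256 * 256> entries;

    ScaleTable()
    {
        for (int light = 0; light < 256; ++light)
            for (int value = 0; value < 256; ++value)
                entries[static_cast<std::size_t>(light) * 256 + value] =
                    static_cast<std::uint8_t>((value * light) / 255);
    }

    // One array read replaces a multiply and a divide per channel.
    std::uint8_t Lookup(std::uint8_t value, std::uint8_t light) const
    {
        return entries[static_cast<std::size_t>(light) * 256 + value];
    }
};
```

A 64 KB table is larger than many mobile L1 data caches, so the gain depends on access patterns; a smaller table indexed by a quantised light level would be one alternative.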
		
	Test 3
	~ 16 FPS - various inner loop optimisations
	22180.0ms   63.8%	 	C_VoxelTestSection::Update(float)
		26480.0ms   35.2%	 	C_VoxelTestSection::VoxelUpdate(float)
			9570.0ms   11.9%	 	C_VoxelData::GenerateOutline(int, int, int)
			5900.0ms    6.7%	 	memcpy
			5820.0ms    6.0%	 	C_VoxelDataSim::Active(int)
			5710.0ms    5.5%	 	FluidSimulation(C_VoxelDataSim*)
			4850.0ms    4.4%	 	BlitBlockMan(int, int, int, C_VoxelData*, unsigned char*)
		19620.0ms   16.6%	 	C_VoxelTestSection::CalculateLighting()
		9840.0ms    7.5%	 	C_VoxelDataLit::ApplyLighting()
		4430.0ms    3.2%	 	ColourLookup(unsigned char, unsigned char)
	21470.0ms   14.0%	 	C_BatchedRenderer::Draw()
	21440.0ms   13.2%	 	glTexImage2D
		
3/2/11 - IPhone 64/64/16
	~ 42 FPS - With first pass fluid simulation, moved static scene blits from update to init.

	36800.0ms   64.3%	 	C_VoxelTestSection::Update(float)
		29480.0ms   41.9%	 	C_VoxelTestSection::VoxelUpdate(float)
			13100.0ms   16.3%	 	C_VoxelData::GenerateOutline(int, int, int)
			12900.0ms   14.6%	 	BlitBlockMan(int, int, int, bool)
				8560.0ms    8.4%	 	C_VoxelData::BlitBlockTexture(int, int, int, int, int, int, unsigned char*, int)
				2950.0ms    2.6%	 	C_VoxelData::GenerateOutline(int, int, int)
				1180.0ms    0.9%	 	C_VoxelData::BlockMultiply(int, int, int, int, int, int, unsigned char*)
			8040.0ms    6.0%	 	FluidSimulation(C_VoxelDataSim*)
			4990.0ms    3.3%	 	C_VoxelDataSim::Active(int)
		19400.0ms   12.3%	 	C_VoxelTestSection::CalculateLighting()
		31760.0ms   18.5%	 	C_VoxelDataLit::ApplyLighting()
	12030.0ms    6.3%	 	glTexImage2D

3/2/11 - IPod Touch 64/64/16
	~ 16 FPS - With first pass fluid simulation, moved static scene blits from update to init.

	23170.0ms   51.7%	 	C_VoxelTestSection::Update(float)
		15310.0ms   30.4%	 	C_VoxelTestSection::VoxelUpdate(float)
			4660.0ms    7.5%	 	C_VoxelData::GenerateOutline(int, int, int)
			3390.0ms    4.7%	 	C_VoxelDataSim::Active(int)
			3680.0ms    4.5%	 	BlitBlockMan(int, int, int, bool)
	23330.0ms   19.7%	 	C_VoxelTestSection::CalculateLighting()
		25240.0ms   19.7%	 	C_VoxelDataLit::DirectionalLight(int, int, int, int, int, int, int, int, int, unsigned char)
			25780.0ms   19.2%	 	C_VoxelDataLit::LightBeam(int, int, int, int, int, int, unsigned char)
	24880.0ms   14.2%	 	glTexImage2D
	23330.0ms   12.2%	 	C_BatchedRenderer::Draw()
	24480.0ms   12.0%	 	C_VoxelDataLit::ApplyLighting()

30/1/11 - IPod Touch 64/64/16
	~ 10 FPS - With first pass gas Simulation
	1726720.0ms   72.5%	 	C_VoxelTestSection::Update(float)
		1356670.0ms   56.7%	 	C_VoxelTestSection::VoxelUpdate(float)
			694980.0ms   28.9%	 	Simulation()
			350110.0ms   14.5%	 	C_VoxelData::BlitBlock(int, int, int, int, int, int, unsigned char const*)
			143780.0ms    5.9%	 	C_VoxelData::GenerateOutline(int, int, int)
			54930.0ms    2.2%	 	BlitBlockMan(int, int, int, bool)
			53210.0ms    2.1%	 	C_VoxelData::BlitBlockTexture(int, int, int, int, int, int, unsigned char*, int)
	322160.0ms   13.1%	 	C_VoxelTestSection::CalculateLighting()
			312920.0ms   12.7%	 	C_VoxelData::LightBeam(int, int, int, int, int, int, unsigned char)
			271260.0ms   10.9%	 	C_VoxelData::ApplyLighting()
	180190.0ms    7.2%	 	glTexImage2D

27/1/11 - IPod Touch 64/64/16
	~ 17 FPS - restricted view to top-down. only using 16 y-slices to render scene
	18300.0ms   57.5%	 	C_VoxelTestSection::Update(float)
		13770.0ms   23.0%	 	C_VoxelTestSection::CalculateLighting()
			6220.0ms    8.9%	 	LightBeam(int, int, int, unsigned char)
		16340.0ms   16.4%	 	C_VoxelTestSection::GenerateOutline()
		20260.0ms   16.1%	 	C_VoxelTestSection::VoxelUpdate(float)
	27890.0ms   19.0%	 	C_BatchedRenderer::Draw()
	21530.0ms   13.3%	 	glTexImage2D
			
27/1/11 - IPod Touch 64/64/16, two character scene, beam lighting
	Test 1
	~ 6FPS
	72180.0ms   48.3%	 	C_VoxelTestSection::Update(float)
		31530.0ms   20.1%	 	C_VoxelTestSection::CalculateLighting()
		16230.0ms   10.0%	 	C_VoxelTestSection::UpdateBuffers()
		16050.0ms    9.6%	 	C_VoxelTestSection::GenerateOutline()
		10440.0ms    6.0%	 	C_VoxelTestSection::VoxelUpdate(float)
	40080.0ms   22.5%	 	C_BatchedRenderer::Draw()
		40810.0ms   22.3%	 	C_Display::DrawArrays(int, int, int)
	35120.0ms   19.0%	 	C_VoxelTestSection::Draw()
		34970.0ms   18.9%	 	glTexImage2D
	9230.0ms    4.9%	 	bzero
	
	Test 2:
	~ 7FPS
	Light beams only on two axes
	Disabled compiling for Thumb
	14870.0ms   44.4%	 	C_VoxelTestSection::Update(float)
		5750.0ms   13.4%	 	C_VoxelTestSection::UpdateBuffers()
		6500.0ms   12.2%	 	C_VoxelTestSection::CalculateLighting()
		5680.0ms    9.7%	 	C_VoxelTestSection::GenerateOutline()
		6440.0ms    7.5%	 	C_VoxelTestSection::VoxelUpdate(float)
	23760.0ms   23.6%	 	C_RenderSection::Draw()
	23440.0ms   21.7%	 	C_VoxelTestSection::Draw()
	5710.0ms    4.9%	 	bzero

24/1/11 - IPod Touch 64/64/16, two character scene, beam lighting
	Test 1: 3 slices
	~ 6 FPS
	13710.0ms   47.4%	 	C_VoxelTestSection::Update(float)
		7640.0ms   19.4%	 	C_VoxelTestSection::CalculateLighting()
		4330.0ms    9.4%	 	C_VoxelTestSection::UpdateBuffers()
		4160.0ms    7.9%	 	C_VoxelTestSection::GenerateOutline()
		3940.0ms    6.6%	 	C_VoxelTestSection::VoxelUpdate(float)
	17570.0ms   24.1%	 	C_RenderSection::Draw()
		20060.0ms   23.9%	 	C_Display::DrawArrays(int, int, int)
	17100.0ms   18.6%	 	C_VoxelTestSection::Draw()
	5060.0ms    4.9%	 	bzero

	Test 2: 1 slice, single texture and more polygons
	~5 FPS (seems geometric complexity has caused further issues).
	2390.0ms   23.5%	 	C_RenderSection::Draw()
		3000.0ms   23.8%	 	C_Display::DrawArrays(int, int, int)
	3100.0ms   16.6%	 	C_VoxelTestSection::Update(float)
	2680.0ms   12.9%	 	C_VoxelTestSection::Draw()
	10080.0ms   45.2%	 	-[EAGLView presentFramebuffer]

24/1/11 - IPhone 64/64/16, two character scene, beam lighting
	Test 1: 3 slices
	~ 16 FPS
	5740.0ms   80.3%	 	C_VoxelTestSection::Update(float)
		2140.0ms   24.7%	 	C_VoxelTestSection::GenerateOutline()
		2240.0ms   22.7%	 	C_VoxelTestSection::CalculateLighting()
		1890.0ms   18.2%	 	C_VoxelTestSection::VoxelUpdate(float)
		1250.0ms   11.1%	 	memcpy
		270.0ms    2.0%			C_VoxelTestSection::UpdateBuffers()
	2120.0ms   14.3%	 	C_Display::Clear(bool)
	
	Test 2: 1 slice, single texture, more polygons
	~ 12-22 FPS
	Looks like it is suffering from overdraw
	2020.0ms   66.2%	C_VoxelTestSection::Update(float)
	610.0ms   16.0%	 	C_Display::Clear(bool)
	460.0ms   10.2%	 	memset

24/1/11 - Mac 64/64/16, two character scene, beam lighting, single texture
	Test 1
	~ 40FPS
	1856.5ms   42.6%	 	C_VoxelTestSection::Draw()
	2131.3ms   33.6%	 	SDL_SoftStretch
	885.9ms   10.1%			C_VoxelTestSection::Update(float)

24/1/11 - PC 64/64/16, two character scene, beam lighting
	Test 1:
	11-6ms		Draw
		5-3ms		Render	
		6-3ms		TexUpdate
	Test 2: Using a single texture
	12-8ms		Draw
		8-3ms		Render
		5-1ms		TexUpdate

22/1/11 - IPod Touch 64/64/16, two character scene, beam lighting (not rendered)
	~ 10 FPS
	43400.0ms   46.7%	 	C_VoxelTestSection::Update(float)
		14420.0ms   11.3%	 	LightBeam(int, int, int, unsigned char)
		7310.0ms    5.3%	 	memcpy
		2300.0ms    1.5%	 	BlitBlockTexture(int, int, int, int, int, int, unsigned char*, int)
	25690.0ms   25.9%	 	C_VoxelTestSection::Draw()
	14720.0ms   13.4%	 	C_RenderSection::Draw()

22/1/11 - IPhone 64/64/16, two character scene, beam lighting  (not rendered)
	~ 33 FPS
	36120.0ms   59.1%	 	C_VoxelTestSection::Update(float)
	10070.0ms   11.9%	 	C_VoxelTestSection::Draw()
	5200.0ms    3.6%	 	C_RenderSection::Draw()

20/1/11 - IPhone (not rendered)
	CPU Test on 64/64/16, simple moving block scene, NEGZ lighting
	~ 30 FPS
	
	CPU Test on 128/128/8, simple moving block scene, NEGZ lighting
	~ 30 FPS
	
	CPU Test on 128/128/16, simple moving block scene, NEGZ lighting
	~ 19 FPS

19/1/11 - Mac - Performance on 128/128/32, simple moving block scene
	Test 1
	First implementation of slices
	~ 13 FPS
	20798   52.1%	 	C_DisplayPrimitive::DrawTextured()
	16332   39.4%	 	C_VoxelTestSection::Draw()
	1522    3.5%	 	C_VoxelTestSection::Update(float)
	
	Test 2
	Rendering via static batch
	~19 FPS
	3954   60.2%	 	C_VoxelTestSection::Draw()
	2194   24.8%	 	C_BatchedRenderer::Draw()
	555    5.4%	 		C_VoxelTestSection::Update(float)

	Test 3
	Batched 32 textures into single texture
	~ 27 FPS
	4893   44.2%	 	C_VoxelTestSection::Draw()
	3623   29.2%	 	C_BatchedRenderer::Draw()
	1114    8.1%	 	C_VoxelTestSection::Update(float)
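
Note on batching the slice textures: a minimal sketch of computing texture coordinates for one tile inside a single atlas texture, so all slices can be drawn with one texture bind. The row-major grid layout and the names `UVRect` and `AtlasTileUV` are assumptions for illustration; the log does not say how the 32 textures were arranged.

```cpp
#include <cassert>

// Hypothetical UV rectangle in normalised [0, 1] texture space.
struct UVRect { float u0, v0, u1, v1; };

// Tiles are assumed to be packed row-major in a grid of columns x rows.
inline UVRect AtlasTileUV(int tile, int columns, int rows)
{
    assert(columns > 0 && rows > 0);
    assert(tile >= 0 && tile < columns * rows);

    const int   col = tile % columns;
    const int   row = tile / columns;
    const float w   = 1.0f / static_cast<float>(columns);
    const float h   = 1.0f / static_cast<float>(rows);

    return { col * w, row * h, (col + 1) * w, (row + 1) * h };
}
```

For 32 slices an 8 x 4 grid would be one possible arrangement, e.g. `AtlasTileUV(9, 8, 4)` selects the tile in column 1, row 1.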

19/1/11 - Problem Identified with rendering z-runs.
	Although this significantly improves rendering performance, it hinders the possibility of dynamic lighting per voxel,
	as each voxel is no longer rendered individually.
	Prompted experimentation with texture mapping runs.
	Prompted experimentation with 'slices' approach.

15/1/11 - Work PC - Performance on 128/128/32, randomly blitting 500 blocks per frame 
	Test 1
	~ 43 FPS
	~ 170,000 vertices being sent to DrawArrays	
	0 - 32ms BatchUpdate
	0 - 16ms VoxUpdate
	0 - 16ms BatchDraw

	Test 2
	Simplified inner loop of batch update (converting voxel data to surface data)
	~ 66 FPS
	~ 170,000 vertices being sent to DrawArrays	
	0 - 16ms BatchUpdate
	0 - 16ms VoxUpdate
	0 - 16ms BatchDraw

14/1/11 - MAC - Performance on 128/128/32, randomly blitting 500 blocks per frame 
	Test 1 
	~ 16 FPS
	~ 500,000 vertices being sent to DrawArrays
	4775   38.3%	 	C_BatchedRenderer::Draw()
	3272   23.9%	 	C_VoxelTestSection::Draw()
	2853   18.8%	 	BlitBlock(int, int, int, int, int, int, unsigned char*)
	2126   12.7%	 	FastBatchTriangleSet(float*&, unsigned char*&, int&, int, unsigned char const*, Vector3*)
	
	Test 2
	Reduced the amount of geometry sent to graphics card,
	grouping z-runs of same coloured voxels in a single primitive submit.
	~ 25 FPS
	~ 170,000 vertices being sent to DrawArrays
	6324   57.9%	 	C_VoxelTestSection::Draw()
	2048   15.5%	 	BlitBlock(int, int, int, int, int, int, unsigned char*)
	2083   13.9%	 	C_BatchedRenderer::Draw()
	795    4.6%			FastBatchTriangleSet(void*&, unsigned char*&, int&, int, unsigned char const*, short*)
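
Note on z-run grouping: a minimal sketch of collapsing one column of voxels into runs of equal colour, so that each run can be submitted as a single primitive instead of one per voxel. `ZRun` and `BuildZRuns` are hypothetical names, and treating colour 0 as "empty" is an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical run record: a span of identically coloured voxels along z.
struct ZRun
{
    int          start;   // z index of the first voxel in the run
    int          length;  // number of voxels in the run
    std::uint8_t colour;  // shared colour index
};

// Collapse one z-column into runs of equal colour, skipping empty voxels.
inline std::vector<ZRun> BuildZRuns(const std::vector<std::uint8_t>& column)
{
    std::vector<ZRun> runs;
    const int depth = static_cast<int>(column.size());
    int z = 0;
    while (z < depth)
    {
        const std::uint8_t colour = column[static_cast<std::size_t>(z)];
        int end = z + 1;
        while (end < depth && column[static_cast<std::size_t>(end)] == colour)
            ++end;                       // extend the run while the colour matches
        if (colour != 0)                 // assumed: 0 means no voxel
            runs.push_back({ z, end - z, colour });
        z = end;
    }
    return runs;
}
```

The saving scales with how uniform the columns are; a column of alternating colours produces as many primitives as before.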

14/1/11 - PC - Performance on 128/128/32, randomly blitting 500 blocks per frame 
	Test 1:
	~ 12-13FPS
	40-52ms	BatchUpdate
	17-25ms VoxelUpdate	
	9-20ms	BatchDraw
	
	Test 2: 
	Hard wired Batch vertex setting functionality in the inner loop of BatchUpdate
	~ 17FPS
	16-30ms	BatchUpdate
	17-25ms VoxelUpdate	
	9-20ms	BatchDraw

14/1/11 - PC Performance on 128/128/32, randomly blitting 1000 blocks per frame
	Test 1:
	~ 10FPS
	Instruments - unchecked invert call stack
	9630   49.0%	 	BatchRenderPrimitive(C_DisplayPrimitive*, Matrix44&, long long, unsigned char const*, int)
	6010   26.7%	 	C_BatchedRenderer::Draw()
	3154   11.8%	 	BlitBlock(int, int, int, int, int, int, unsigned char*)

	Test 2:
	Preformatted triangles for submission when created.
	~ 11FPS
	27795   46.3%	 	BatchRenderPrimitive(C_DisplayPrimitive*, Matrix44&, long long, unsigned char const*, int)
	16643   27.0%	 	C_BatchedRenderer::Draw()
	7497   13.5%	 	BlitBlock(int, int, int, int, int, int, unsigned char*)

14/1/11 - PC Performance on 64/64/8
	Test 1: 
	Fixed bug with new rendering. Now more polygons going down the render pipe.
	~9FPS
	395   33.1%	 	C_BatchedRenderer::Draw()
	364   11.9%	 	C_BatchedRenderer::SetVertex(C_Batch*, Vector3, Vector2*, Matrix44&, char const*, int)
	453   10.6%	 	C_Batch::NewVertexColour(float*, char const*)
	
	Test 2:
	Implemented marching cubes variant.
	~141FPS
	1933   29.4%	C_BatchedRenderer::Draw()
	886	11.0%	 	C_Batch::NewVertexColour(float*, unsigned char const*)
	1008   10.6%	C_BatchedRenderer::SetVertex(C_Batch*, Vector3, Vector2*, Matrix44&, unsigned char const*, int)
	1090    9.2%	Vector3::operator=(Vector3 const&)

13/1/11 - PC Performance on 32/32/4
	Test 1: 
	~37FPS 
	1486   30.1%	 	Matrix44::Multiply(Vector3&)

	Test 2: 
	added flags to skip matrix Multiply
	~60FPS
	6329   23.4%	 	C_BatchedRenderer::Draw()
	5924   20.1%	 	C_BatchedRenderer::SetVertex(C_Batch*, int, C_DisplayPrimitive*, Matrix44&, Vector4 const&, int)
	5364   17.2%	 	C_Batch::NewVertexColour(float, float, float, float, float, float, float)
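
Note on skipping the matrix multiply: a minimal sketch of one way such a flag could work, caching whether a transform is the identity so the per-vertex multiply can be bypassed. `Vec3`, `Transform` and `Apply` are hypothetical stand-ins, not the project's `Vector3` or `Matrix44`, and the actual flags used are not described in the log.

```cpp
// Hypothetical 3-component vector.
struct Vec3 { float x, y, z; };

// Hypothetical 3x4 affine transform with a cached identity flag.
// The flag must be kept in sync by whatever code modifies m.
struct Transform
{
    float m[3][4];
    bool  isIdentity;
};

inline Vec3 Apply(const Transform& t, const Vec3& v)
{
    if (t.isIdentity)
        return v; // skips 9 multiplies and 9 adds per vertex

    return {
        t.m[0][0] * v.x + t.m[0][1] * v.y + t.m[0][2] * v.z + t.m[0][3],
        t.m[1][0] * v.x + t.m[1][1] * v.y + t.m[1][2] * v.z + t.m[1][3],
        t.m[2][0] * v.x + t.m[2][1] * v.y + t.m[2][2] * v.z + t.m[2][3]
    };
}
```

The branch is cheap when most primitives share the same state, since it predicts well inside a batch loop.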

	Test 3: 
	moved float -> char conversion outside loop. 
	Simplified accessor functions inside NewVertexColour
	~70FPS
	2833   25.7%	 	C_BatchedRenderer::Draw()
	2709   19.3%	 	C_BatchedRenderer::SetVertex(C_Batch*, int, C_DisplayPrimitive*, Matrix44&, Vector4 const&, int)
	1576   10.0%	 	C_Batch::NewVertexColour(float, float, float, char, char, char, char)
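
Note on moving the float -> char conversion outside the loop: a minimal sketch of converting a colour once per primitive and then copying the packed bytes to every vertex. `ColourF`, `Colour8`, `ToByte` and `FillVertexColours` are hypothetical names; the real `NewVertexColour` signature is only known from the profiler output above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical colour types for illustration.
struct ColourF { float r, g, b, a; };
struct Colour8 { std::uint8_t r, g, b, a; };

// Clamp a channel to [0, 1], then scale to 0-255 and round to nearest.
inline std::uint8_t ToByte(float channel)
{
    const float clamped = channel < 0.0f ? 0.0f : (channel > 1.0f ? 1.0f : channel);
    return static_cast<std::uint8_t>(clamped * 255.0f + 0.5f);
}

// The four float -> byte conversions happen once, before the per-vertex copy,
// instead of once per vertex.
inline void FillVertexColours(const ColourF& colour,
                              std::vector<Colour8>& out,
                              std::size_t vertexCount)
{
    const Colour8 packed = { ToByte(colour.r), ToByte(colour.g),
                             ToByte(colour.b), ToByte(colour.a) };
    out.assign(vertexCount, packed);
}
```

With the colour already packed, each vertex write is a plain 4-byte copy, which is also what makes the later memcpy-based variants possible.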

	Test 4:
	removed more accessors from newVertexColour in favour of memcpy.
	~80FPS
	3964   31.1%	 	C_BatchedRenderer::Draw()
	1364   17.4%	 	C_BatchedRenderer::SetVertex(C_Batch*, int, C_DisplayPrimitive*, Matrix44&, char const*, int)
	985   10.0%			C_Batch::NewVertexColour(float, float, float, char const*)

	Test 5:
	removed more accessors from newVertexColour in favour of memcpy.
	~87FPS
	2861   32.5%	 	C_BatchedRenderer::Draw()
	977   13.1%			C_BatchedRenderer::SetVertex(C_Batch*, int, C_DisplayPrimitive*, Matrix44&, char const*, int)
	1216   11.8%	 	C_Batch::NewVertexColour(float*, char const*)
	1356    8.4%	 	Vector3::operator=(Vector3 const&)
	1060    6.0%	 	operator+(Vector3 const&, Vector3 const&)

	Test 6:
	moved code out of inner loop
	~91FPS
	4498   33.9%	 	C_BatchedRenderer::Draw()
	2036   12.8%	 	C_Batch::NewVertexColour(float*, char const*)
	2104   12.2%	 	C_BatchedRenderer::SetVertex(C_Batch*, Vector3, Vector2*, Matrix44&, char const*, int)
	1712    9.1%	 	Vector3::operator=(Vector3 const&)
	1805    8.8%	 	operator+(Vector3 const&, Vector3 const&)