Happen to have a Cray XC-40? Q3Map2 will work on it.
-
VolumetricSteve
- Posts: 449
- Joined: Sat Nov 06, 2010 2:33 am
Happen to have a Cray XC-40? Q3Map2 will work on it.
I've recently started a new job where...for some reason, I'm given largely unfettered access to a Cray XC-40. Naturally, left alone with it long enough, I got q3map2's source isolated from a fork of NetRadiant and compiled it on its own with the Cray proprietary compilers. I can't believe it worked with so little modification, but it's there.
Since the XC-40 is a cluster system, what I've built will only work on one node at a time, but I've got 32 cores per node.
If my bosses don't immediately escort me out, I'm hoping I can make it even more Cray-friendly and use the Aries network interface to spread the work across multiple nodes, but that's a much bigger mountain to climb.
While I watched it compile all I could think was:
https://www.youtube.com/watch?v=wcW_Ygs6hm0
Last edited by VolumetricSteve on Mon Jun 29, 2015 7:01 pm, edited 1 time in total.
-
HM-PuFFNSTuFF
- Posts: 14376
- Joined: Thu Mar 01, 2001 8:00 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
I approve of this.
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
I don't understand this, but I approve of it anyway.
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
seremtan wrote: I don't understand this, but I approve of it anyway.
Sounds technical but with largely unfettered access to a Cray XC-40 you're on a winner VolumetricSteve.
Please keep us advised.
[color=#FFBF00]Physicist [/color][color=#FF4000]of[/color] [color=#0000FF]Q3W[/color]
-
YourGrandpa
- Posts: 10075
- Joined: Mon Apr 17, 2000 7:00 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
I want to be the clan leader.
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
Approve yes, understand no.
[size=85][color=#0080BF]io chiamo pinguini![/color][/size]
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
I approve of this, too. Although can you please explain what advantage there is in running Q3Map2 on a supercomputer vs a normal PC?
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
But can you make it run Crysis? 
-
VolumetricSteve
- Posts: 449
- Joined: Sat Nov 06, 2010 2:33 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
I guess the real advantage is that I'm not (directly) paying for the power that runs the system. 
Also, it should be fast beyond reason, so that's always fun.
I think my next step is to make an XC40-friendly version of this:
http://www.ciole.net/quake_bench/
(ghost, if you're out there, would you condone my use of your scripts for my madness?)
A few years back I had a 6-core AMD at 4 GHz that could run that benchmark in 40 seconds. I'm betting that 32 cores on a Cray will produce some interesting results.
My hope is...if I can really wrap my head around what it is q3map2 is doing under the hood, just for shits and giggles I'd like to see if I could get it to send jobs out across the cluster. My site has a very healthy number of compute-nodes available and I could plausibly get 10 to myself before the system even felt it.
Even if I only got 10, that's 320 cores and 1.2 TB of RAM (although I'd be stunned if q3map2 ever allocated more than 2 GB).
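As a rough sketch of the multi-node idea, the snippet below splits a list of independent work items into near-equal contiguous ranges, one per node. The `split_work` helper and the item count are hypothetical, and this says nothing about how q3map2 actually divides its work internally; only the 10 nodes and 32 cores per node come from the figures above.

```python
from typing import List


def split_work(num_items: int, num_nodes: int) -> List[range]:
    """Split item indices 0..num_items-1 into num_nodes contiguous, near-equal ranges."""
    base, extra = divmod(num_items, num_nodes)
    ranges = []
    start = 0
    for node in range(num_nodes):
        # The first `extra` nodes take one additional item each.
        size = base + (1 if node < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


NODES = 10            # nodes mentioned in the post
CORES_PER_NODE = 32   # cores per node mentioned in the post
TOTAL_CORES = NODES * CORES_PER_NODE

if __name__ == "__main__":
    # 1003 is a made-up number of work items, purely for illustration.
    for node, chunk in enumerate(split_work(1003, NODES)):
        print(f"node {node}: items {chunk.start}..{chunk.stop - 1} ({len(chunk)} items)")
    print(f"total cores: {TOTAL_CORES}")
```

Contiguous ranges keep the bookkeeping trivial, but they only balance well if every item costs about the same; uneven items would probably want a shared work queue instead.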
Does anyone know what id Software used to compile the original Quake 3 maps? I'd love to do a comparison/infographic/whatever.
For the curious among you:
http://www.cray.com/products/computing/xc-series
-
Lieutenant Dan
- Posts: 1151
- Joined: Mon Jul 24, 2006 2:25 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
QUAKE3 MAP BENCHMARK 1.3 - RESULTS
==================================
OS = Win 6.2
CPU = AMD FX(tm)-8350 Eight-Core Processor
RAM = 4095 MByte
==================================
Map Compile = 00:01
Vis = 00:04
Bspc = 00:16
Lightning = 00:20
Total = 00:42
Meh.
-
VolumetricSteve
- Posts: 449
- Joined: Sat Nov 06, 2010 2:33 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
How did you get 42 seconds on an FX-8350 when I got 40 seconds on an 1100T years ago?
Though...thank you....you just saved me like 200 bucks because I was about to upgrade my old rig for kicks.
How odd. Anyway, the Cray XC-40 uses Intel Haswell processors and I'm growing increasingly convinced that inclusive cache does better in q3map2 than AMD's exclusive cache. There could be a million other things going on at the CPU architecture level as well, but the difference in cache layout seems like a biggie to me.
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
VolumetricSteve wrote: How odd. Anyway, the Cray XC-40 uses Intel Haswell processors and I'm growing increasingly convinced that inclusive cache does better in q3map2 than AMD's exclusive cache. There could be a million other things going on at the CPU architecture level as well, but the difference in cache layout seems like a biggie to me.

-
VolumetricSteve
- Posts: 449
- Joined: Sat Nov 06, 2010 2:33 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
I'm weighing cache layout against other internal optimizations. I can see the advantages of inclusive cache, but I don't quite see why it results in the absolute floor-mopping it does - particularly when it comes to q3map2. I can get behind a "well duh" when it comes to Intel vs AMD on general HPC performance, but why this application specifically?
Edit : Tito's allegedly handmade vodka is pretty great.....at causing typos.
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
Multiple cores on the AMD CPU are having to look at all cache levels for data and creating exchanges between cores?
or
Floating point instructions ?
or
Scheduling is shit ?
or
Moar
[color=red] . : [/color][size=85] You knows you knows [/size]
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
I see the lightning, but where's the thunder?
[size=85][url=http://gtkradiant.com]GtkRadiant[/url] | [url=http://q3map2.robotrenegade.com]Q3Map2[/url] | [url=http://q3map2.robotrenegade.com/docs/shader_manual/]Shader Manual[/url][/size]
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
-
VolumetricSteve
- Posts: 449
- Joined: Sat Nov 06, 2010 2:33 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
Thanks again to misantropia for fixing the bspc code so it can be compiled on *nix systems. I just got that to open on the XC-40 but it doesn't work quite right...that'll need some more work.
I'm trying to set up a benchmark that's as close to identical to ghost's standard-ish benchmark as I can get. I think I have all the pieces I need now...though I was hoping to do q3dm7. I thought that was what the test used, but it looks like it comes with q3dm1.
Results of my first trial runs:
bsp = 2/2 seconds (no errors)
vis = 2/2 seconds (has issues with PicoLoadModel so md3 models weren't accounted for)
light = 28/17 seconds (no errors)
bspc = fails, reports " 38777ERROR: Tried parent "
Not sure what's going on there, but the most interesting thing about this can't be shown by the numbers unless I get a stopwatch or something. When the lightmap phase does its dumps between bounces, that adds a good second or two per bounce. Our filesystem here is notoriously horrible, and it shows across lots of applications. All of our researchers gripe about the Lustre filesystem day and night. I'm curious to see if I can track down a GPFS system here and see if that makes a difference.
Watching the bounces process (IlluminateRawLightmap) was like nothing I've ever seen q3map2 do. Typically, I see some variance in compute time over each bounce "pass"; for instance, it would go much faster in the first few ticks, then spend some more time on subsequent ticks. This was more like a machete through warm butter. Nothing slowed down, it didn't matter which pass it was on, it was just a bullet until the end of each set.
What's even stranger is the second time I ran the test, all the results were the same except light...which is now clocking in at 17 seconds. Something must be going on in the background that provides some serious optimization, but I don't know where to start with hunting that down. I just ran it a 3rd time.....16 seconds.
Edit:
I tried compiling bspc with a different compiler, it still fails the same way but I assume at a different cycle of the same loop.
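In place of a stopwatch, repeated runs like the 28/17/16 second light results above could be timed with a small harness along these lines. This is only a sketch: `time_command` is a hypothetical helper, and the command in the example is a do-nothing placeholder to be swapped for whatever stage is being measured. The first run is reported separately because it may include one-off costs such as a cold file cache; that is a guess, not something established in this thread.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command exits with a non-zero status.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Placeholder command that does nothing; replace it with the real
    # compile stage (bsp, vis, light) you want to time.
    cmd = [sys.executable, "-c", "pass"]
    timings = time_command(cmd, runs=3)
    print(f"first run : {timings[0]:.2f} s")
    print(f"later runs: median {statistics.median(timings[1:]):.2f} s")
```

Comparing the first run against the median of the later ones makes a warm-up effect easy to spot without reading too much into any single number.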
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
E-PENIS
[color=red][WYD][/color]S[color=red]o[/color]M
-
VolumetricSteve
- Posts: 449
- Joined: Sat Nov 06, 2010 2:33 am
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
This is interesting: I recompiled q3map2 with more optimizations and other tinkering (no code changes, only compile-time changes).
It builds and runs...but now it reports that the map file has a leak where it didn't before so now BSP can't create a prt file to pass on to VIS.
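The cause isn't established here, but one usual suspect when only the compiler flags change (an assumption on my part) is floating-point behavior: aggressive optimization can reorder arithmetic, and floating-point addition is not associative, so a value sitting right on a threshold can land on the other side. A tiny illustration of the effect itself, unrelated to q3map2's actual code:

```python
# Same three numbers, added in two different orders.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # one evaluation order
right = a + (b + c)  # the reordered version

print(left == right)   # the two sums differ in the last bit
print(left - right)    # the size of the discrepancy

# A strict threshold test can flip on that last bit.
THRESHOLD = 0.6  # arbitrary value chosen for illustration
print(left <= THRESHOLD, right <= THRESHOLD)
```

If something like this is behind the new leak, building with stricter floating-point settings would be one way to test the theory; which flag does that depends on the compiler.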
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
Can you use the computational power to turn Radiant into a REPL-like experience?
Re: Happen to have a Cray XC-40? Q3Map2 will work on it.
Steve, how come the Cray isn't powerful enough to just smash the times regardless of how unoptimised the code is for its architecture (if that's the right word)?
I'd have thought you just have enough power to crush anything made for commercial systems? - is that not the case?

