• SIEMENS PLM has officially announced that a new release, NX Nastran 9.1, is now available for download. The release contains numerous corrections and initial prototype support for GPU processing. A complete description of product improvements can be found in the NX Nastran 9.1 Release Guide.
• With the new support for Graphics Processing Units (GPUs) and MIC (Many Integrated Core) hardware, NX Nastran users can reduce run times for certain solution types, such as modal frequency response solutions.
• The initial GPU/MIC support release is a prototype implementation that can be tested by all NX Nastran users. In a later release it will be made part of the NX Nastran Advanced Bundle and hence require an advanced license.
• The capability will work on hardware with AMD GPU cards or the NVIDIA Fermi and Kepler GPU cards. Initially, the functionality is enabled for the NX Nastran DCMP module for matrix decomposition and the FRRD1 module for frequency response computations. The decomposition performance improvements will be most noticeable on sparse meshes with a large number of degrees-of-freedom. The frequency response performance improvements will be most noticeable on models with a large number of modes.
• In subsequent releases, the capability will be supported by more modules and will be more generally applicable.
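For readers unfamiliar with what the FRRD1 module computes, here is a rough NumPy sketch of a modal frequency response solve. This is purely illustrative, not NX Nastran code, and all sizes and values are toy assumptions: at each excitation frequency a complex linear system in modal coordinates is solved, which is why both the mode count and the number of frequency points drive run time, and why the operation is a candidate for GPU acceleration.

```python
import numpy as np

def modal_frf(freqs, K, M, C, force):
    """Solve (K - w^2 M + i*w*C) q = f at each frequency (modal coordinates)."""
    responses = []
    for f in freqs:
        w = 2.0 * np.pi * f
        Z = K - w**2 * M + 1j * w * C  # complex modal impedance
        responses.append(np.linalg.solve(Z, force))
    return np.array(responses)

# Toy model: 3 modes, diagonal stiffness/mass, coupled damping.
# Coupled damping matrices (as mentioned in the thread) force a full
# complex solve per frequency instead of a cheap diagonal division.
K = np.diag([1.0e4, 4.0e4, 9.0e4])
M = np.eye(3)
C = 0.01 * K + np.array([[0.0, 1.0, 0.0],
                         [1.0, 0.0, 1.0],
                         [0.0, 1.0, 0.0]])
force = np.ones(3)
q = modal_frf(np.linspace(1.0, 50.0, 100), K, M, C, force)
print(q.shape)  # one complex response vector per frequency point
```

With thousands of modes (as in the e10k-e40k examples below), each per-frequency solve becomes a large dense complex factorization, which is the part a GPU can take over.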
A very important piece of information is missing here: GPU computing is Linux-only functionality for NXN 9.1. BTW, you probably meant "sparse matrix", not "sparse meshes"; these are completely different animals.
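To make that correction concrete: decomposition cost is driven by the size and sparsity pattern of the matrix (and the fill-in created during factorization), not by mesh density. A minimal SciPy sketch, purely illustrative and not NX Nastran code, of factoring a large sparse matrix, the kind of work a decomposition module such as DCMP performs:

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# A large, very sparse SPD matrix (tridiagonal, like a 1-D stiffness matrix).
n = 10_000
main = 2.0 * np.ones(n)
off = -1.0 * np.ones(n - 1)
K = sp.diags([off, main, off], [-1, 0, 1], format="csc")

# Sparse LU factorization followed by a solve; for real FE models the
# expensive dense blocks inside this factorization are what a GPU offloads.
lu = spla.splu(K)
x = lu.solve(np.ones(n))
print("residual norm:", np.linalg.norm(K @ x - 1.0))
```

Two models with the same element count can factor very differently depending on bandwidth and fill-in, which is why "sparse matrices with a large number of degrees of freedom" is the meaningful statement.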
Yes, DCMP and FRRD1 for now, if you have an AMD/NVIDIA GPU.
On Intel MIC hardware (Xeon Phi), the MKL layer can automatically offload any sufficiently large computation to the MIC core, so this theoretically applies to all modules in all solutions.
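For reference, MKL's Automatic Offload is controlled through environment variables rather than anything in the Nastran input deck. A sketch of the relevant settings, assuming an MKL version with Automatic Offload support and a Xeon Phi coprocessor installed in the host (check your MKL documentation for the exact behavior on your version):

```shell
# Enable MKL Automatic Offload to the Xeon Phi coprocessor.
export MKL_MIC_ENABLE=1

# Optionally pin the fraction of work sent to the coprocessor (0.0-1.0);
# by default MKL divides the work automatically based on problem size.
export MKL_MIC_WORKDIVISION=0.5

# Emit a per-call offload report, useful for answering the question below
# of whether a "sufficiently large" computation was actually offloaded.
export OFFLOAD_REPORT=2
```

Note that MKL applies its own internal size thresholds before offloading, so small BLAS/LAPACK calls stay on the host even with these variables set.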
Note that the same level of support (same hardware and dmap modules) was delivered on Windows with NX Nastran 10.
See the "GPU Computing" article in the NX Nastran 10 Release Guide for more information:
The question for me is the "automatically offload" and "sufficiently large computation" statements. Is there some threshold? Our mid-size dynamics models are in excess of 5M DOF, with 400-500 modes, and the GPU is not participating in the solve... I would have thought the problem was of decent enough size to trigger the GPU. I'm not sure if there is a way to force GPU computing just to evaluate performance improvements...
The question I asked at the NX 9.1 release still stands: any practical example? So far, I have only seen the release material from Siemens, or a simple copy/paste of it. Has anybody tried it, and does anybody care to report performance metrics?
In the upcoming FEMAP V11.2 release, which also comes with NX NASTRAN V10, activation of the GPU Computing button is available.
According to the NX Nastran V10 Release Guide (see also https://iberisa.wordpress.com/2015/02/18/62-nueva-version-de-nx-nastran-v10-0-diciembre-2014/):
"AMD GPU cards, and the NVIDIA GPU cards are supported for NX Nastran matrix decomposition (DCMP module) and frequency response (FRRD1 module) computations"
FRRD1 Performance Example
AMD 24 core Magny-Cours, Tahiti GPU (4GB)
The damping definition in the model produced coupled damping matrices.
Modes were computed up to the given frequency, where e10k = 1785 modes, e20k = 3631 modes, e30k = 5576 modes, and e40k = 7646 modes. GPU memory was exhausted around 10,000 modes.
Yes, I am perfectly aware of, and have read, the published documentation and Siemens examples; that is not the point. I could equally claim this performance gain on my own, but the problem is I can't: the models we use do not trigger the GPU computing routines. I know how to set up the environment variables in the NASTRAN rcf, but GPU computing doesn't get triggered, which is precisely what the FEMAP check box would accomplish as well.
What I was really asking was two-fold:
1) Instead of copying/pasting Siemens examples I have already read, has anyone implemented GPU Computing in NX NASTRAN and successfully gained significant performance advantages? Again, not the published Siemens test cases: real-life experience (or even an experiment)
2) Since our models don't seem to trigger the automatic involvement of GPU Computing, is there a way to force the use of GPU Computing? I just want to see it in action, both in the f04 and on the GPU's performance monitor...