I have a FEM model consisting of 88 000 nodes and about 86 000 Quad4 shell elements. The model is distributed between two FEM models connected in an Assembly FEM by bolted connections and contact elements.
An iterative contact analysis is performed using the linear solver SOL 101. In total, 18 load combinations (subcases) need to be solved.
The following typical data are found in the F04 file:
MEMORY REQR'D TO AVOID SPILL=20403 K WORDS (=0.15GB)
MINIMUM MEMORY REQUIREMENT=6913 K WORDS (=0.05GB)
MEMORY AVAILABLE=2315574 K WORDS (17.3GB)
The 64-bit solver nastran64Lw.exe is chosen in order to utilize more than 8GB RAM. The values in parentheses above are calculated by me based on this. I use the Advanced Simulation version NX 10.0.3.5.
The size of the scratch files (Memfile, Scratch and SCR300) in the F04 file is reported to be less than 3.5GB.
The problem is solved on an HP Z840 with an Intel Xeon E5-2680 v3 processor (30MB cache, 2.5GHz, 12 cores). My system has 64GB RAM and a 500GB SSD disk. Using the default values in Nastran then gives a RAM size of 0.45*64 = 29GB for Nastran. The scratch memory is by default 20% of this, and therefore about 6GB.
Different values of the MEM and SMEM parameters have been tried to reduce the solution time, both higher and lower than the default values, but the effect on the solution time is negligible. This seems sound, since the entire problem is small and should fit inside the RAM without needing to spill scratch data to disk.
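For reference, this is how such experiments can be driven from the command line. A hedged sketch only: the deck name is a placeholder, and the mem/smem/parallel keyword spellings should be checked against the NX Nastran Installation and Operations Guide for your version:

```
rem Hypothetical run: override open-core memory and in-memory scratch for one job
nastran64 model.dat mem=12gb smem=4gb parallel=4 scratch=yes
```

The same keywords can be varied between runs to compare elapsed times in the F04 file.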
The only parameter that has reduced the solution time so far is PARALLEL, for instance set to four. This reduces the total solution time from about 52 minutes to 43 minutes.
From the Task Manager I observe that the CPU usage when running with parallel=4 is typically about 4%, with shorter peaks up to about 20%. The RAM used is about 12GB, while the total RAM installed is, as mentioned, 64GB.
To me it seems the system resources are not being used optimally by Nastran (or by me).
I would highly appreciate recommendations on how to utilize my computer better. Where is the bottleneck, and how can I improve the performance when running in SMP?
DMP could maybe be a solution, for instance by distributing the 18 load combinations between several processes. But am I correct that this is not possible on my current computer, which has only one processor, even though it has 12 cores? Does anyone have experience with installing more processors and running in DMP?
Having looked into various forum posts recommending a big SSD and large RAM, I finally convinced my management to invest in a strong and expensive computer with a lot of RAM for FEM analysis. At the moment I find it difficult to answer when my management asks how the new computer meets my expectations; it has a lot of resources that are not being used…
Geir Olav Guddal
Harding Safety as
You write "The 64 bit solver nastran64Lw.exe is chosen", but this is not correct. To activate the NX Nastran ILP 64-bit solver you need to use "nastran64L.exe"; this is a common error I see many times.
My opinion regarding the ILP 64-bit solver is "use it only when required": only when the normal LP 64-bit "nastran64.exe" solver gives an error because not enough RAM memory is allocated for the Nastran solver (and so you take advantage of available RAM above 8GB). In that case Nastran will write the following error message in the F06 file: "USE THE ILP64 VERSION OF NX NASTRAN". Please note that the regular LP 64-bit "nastran64.exe" solver with 4 bytes per word is faster than the ILP 64-bit solver with 8 bytes per word, so for models that are not big I suggest using it (in your case, 88,000 nodes is a small model).
Also, with the LP 64-bit solver do not use mem=8GB (you will get an error); use mem=7.99GB.
Since you have a 12-core CPU, I suggest using PARALLEL=12 to take advantage of SMP computing. If HyperThreading is activated on your CPU, I suggest deactivating it in the BIOS of the computer.
And finally the Nastran SCRATCH directory: the most radical way to speed up the Nastran solver is to use a fast PCI-Express 3.0 x4 NVMe SSD drive for the Nastran scratch (for instance, the Samsung SSD 950 PRO 512GB M.2), which features a data transfer speed of 2500 MB/s, i.e., around 5 times faster than a regular SSD drive. Impressive!! The more I/O that Nastran performs, the more it helps.
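The scratch location can be redirected at run time. A sketch, with a placeholder drive letter and path; the sdirectory keyword spelling should be confirmed in the NX Nastran Installation and Operations Guide:

```
rem Hypothetical: point the Nastran scratch directory at the fast NVMe drive
nastran64 model.dat sdirectory=E:\nastran_scratch scratch=yes
```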
Thanks for your advice!
You are right that I misunderstood the differences between some of the Nastran executables.
I have now tried a large number of combinations of different executables (32-bit and 64-bit) and MEM and SMEM settings. For my particular problem it seems to matter little which solver I use. The MEM and SMEM settings also seem to have a small effect, but MEM=7.99GB and SMEM=4GB give a small improvement compared to the default values. An SMEM of 4GB gives no I/O for the scratch files, and is an increase from the standard value of 0.2*MEM = 1.6GB that would result from the default SMEM with MEM=7.99GB. A setting of parallel=6 gave the best performance with respect to utilization of the cores.
Disabling HyperThreading did not have a big impact on the solution time for my particular problem, but it looks like the CPU usage is higher after this change. Running with parallel=12 and HyperThreading disabled increased the solution time by 3 minutes and gave 100% CPU usage. Consequently, I am not sure whether it is good to use HyperThreading or not. However, from the Nastran Installation and Operations Guide I see that disabling HyperThreading is indeed recommended.
From the above you see that I am still a little unsure about the optimum settings, but the good thing is that the default settings seem to be OK.
Most of my problems consist of shell elements, and I think I will most of the time manage within the 7.99GB RAM limitation of the LP-64 solver.
I understand it is recommended to specify as little memory as possible with the MEM keyword. However, when having a big internal memory of, let's say, 64GB, is it then a good idea to specify MEM=7.99GB as the default when using the LP-64 solver? The system should then have plenty of RAM left for other processes.
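One way to make such a default permanent, assuming the standard Nastran configuration file (.rcf) mechanism discussed later in this thread, is to place the keywords there instead of repeating them on every command line. The values below are simply the ones from this discussion, not a recommendation:

```
$ Hypothetical .rcf entries (one keyword=value per line; $ starts a comment)
mem=7.99gb
smem=4gb
parallel=6
```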
Finally, does anyone have experience with the ESTIMATE program mentioned in the Nastran Installation and Operations Guide? Is it useful, and how is it best used?
Thanks a lot!
Geir Olav Guddal
..., but the good thing is that the default settings seems to be OK.
Dear Geir Olav,
The above statement is the important key. In fact, since NX Nastran V10.0 I am very happy with the new Nastran configuration file (.rcf) settings: the default configuration helps users take better advantage of machines with more memory easily and automatically, without the need to ask for too many RAM resources, which leads to performance issues. Using the default NX Nastran settings plus the DIRECT SPARSE SOLVER is the way I run shell models.
Regarding mem=estimate: forget it entirely. It is the old method, and I have always found it to be of low accuracy.
You can take a look at the following post on my blog, where I talk about NX NASTRAN Performance Optimization and Hardware Requirements:
I also increase the Nastran BUFFSIZE keyword to its maximum useful value:
$Buffsize=8193   if DOF <= 100000
$Buffsize=16385  if 100000 < DOF <= 400000
$Buffsize=65537  if DOF > 400000
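Applied to the model in this thread, a rough count (assuming about 6 DOF per shell node, which is an estimate on my part) gives 88 000 nodes x 6 DOF, roughly 528 000 DOF, which falls in the last band:

```
$ Rough sizing for this model (assumption: ~6 DOF per node)
$ 88 000 nodes x 6 DOF = 528 000 DOF  ->  DOF > 400000
buffsize=65537
```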
Hi Geir Olav,
What you are describing (7.99GB MEM, 4GB SMEM) seems like a good default configuration. We have similar machines and use the same settings :-) The rest of the RAM will be used by Nastran I/O through Windows, which should help with writing results.
Honestly, I'm not sure SMP would help that much, but DMP would... DMP is absolutely doable on a single CPU; Nastran recognizes the cores. SSDs in RAID 0 would make a tremendous difference, and even adding cache software (e.g. PrimoCache, but many others exist) that uses some of your RAM as an I/O cache could add a little extra...
Thanks for your comments. I would like to run the DMP and LDSTAT method using the keywords "dmp=p DSTAT=1", for instance by dividing the 18 load cases into three groups to be solved in parallel.
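As a hedged sketch only (I have not verified the exact keyword spellings against the Parallel Processing User's Guide, and the hosts keyword in particular is an assumption), a DMP run on a single machine would look something like:

```
rem Hypothetical: 3 DMP processes on the local host (requires DMP licensing)
nastran64 model.dat dmp=3 dstat=1 hosts=localhost
```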
The problem is that I am doing something wrong and am not able to start the analysis.
In my case, with 1 processor, 12 cores and 18 load cases, how should this be defined in the Solver Parameters dialogue as shown below?
Geir Olav Guddal
Hi Geir Olav,
You can add your DMP instructions in the "additional keywords" field. First, though, there is some setup to do before one can run DMP; take a look at the Parallel Processing User's Guide, Paragraph 7.3, for instructions.
Thanks for your comments regarding DMP. I have checked with Siemens, and to run a DMP analysis I need an enterprise license (NXN001 or NXN021) of NX Nastran in addition to the add-on module NX Nastran DMP.
Unfortunately I don't have these licenses and am not able to test whether this investment is worth the money or not. In case anyone out there has an interest in comparing SMP and DMP, I attach my sample problem. I also think this is of interest to many users.
Short problem description:
Due to the contact elements, a large number of iterations is required. The solution time could possibly be reduced by using DMP and dividing the load cases between, for instance, three processes. I am particularly interested in results from a single computer with multiple cores.
Geir Olav Guddal
Harding Safety as
Dear Geir Olav,
Inspecting the mesh quality of your NX AdvSim model in FEMAP (it is very easy!!) against the NX NASTRAN admissible thresholds, you have some elements that fail the checks, so I suggest improving your model:
Check Element Quality: 85862 Element(s) Selected...

Quality Check   Number Failed   Worst Value
Quad Skew       0               33.68721
Quad Taper      183             0.908086
Quad Warp       0               0.00505076
Quad IAMin      1               28.87704
Quad IAMax      23              158.4539
Quad AR         0               35.88587
Tria Skew       4               0.600091
Tria IAMax      1               163.6902
Tria AR         0               26.81386

200 Elements Failed out of 85563 Checked.
If I run a Normal Modes/Eigenvalue analysis using SMP parallel processing with 4 cores, I note that the solution takes around one minute and 2GB of scratch disk space, so this model is not big enough to take advantage of DMP in a linear static analysis (by the way, a first natural frequency of f1 = 51.1 Hz means that the structure is quite stiff!!).