From time to time we perform modal random response analyses on solid models with 1,000,000 nodes or more. In the frequency response step, NX Nastran always allocates all of the machine's memory, regardless of the memory configuration in nast11.rcf. If I run the job on a 64 GB machine it grabs 64 GB; on a 128 GB machine it grabs 128 GB. The strange thing is that in a modal analysis using a relatively small number of modes, the solver does not have much to do in this step. If I run the same job on an older NEi Nastran installation, memory allocation is about 10 to 15 GB.
This memory problem makes this type of analysis unnecessarily slow, because the machine starts paging. Is there any workaround?
Can you post the rcf file along with the log and f04 files from a run? Those should contain all the info needed to track down what is happening.
Can you clarify what you are looking at when you say Nastran allocates all of the machine memory? Are you looking at the Windows Task Manager performance view, or something else?
The rcf file (memory=.45*physical) has limited Nastran to 45% of physical memory, and the f04 file confirms that this is what was allocated. The memory high-water mark from the f04 shows the following:
This indicates that the memory high-water mark was about 25 GB, but it also confirms that the DDRMM module performed over 400 GB of I/O (this is the data recovery portion of the run).
Also in the f04, I found the following, where multiple passes were required during the data recovery:
10:11:27 53:07 926.1G 89385.0 10449.5 103.3 MPYAD BGN P=7
10:18:13 59:53 1010.9G 86804.0 10807.2 357.7 MPYAD PASS= 1
10:23:41 65:21 1095.7G 86797.0 11132.6 325.5 MPYAD PASS= 2
10:29:05 70:45 1180.4G 86798.0 11453.5 320.9 MPYAD PASS= 3
10:34:46 76:26 1265.2G 86810.0 11790.2 336.7 MPYAD PASS= 4
10:40:22 82:02 1350.0G 86797.0 12122.8 332.6 MPYAD PASS= 5
10:46:00 87:40 1434.7G 86797.0 12457.7 334.8 MPYAD PASS= 6
10:51:21 93:01 1518.5G 592.4G 12775.9 2326.4 MPYAD END
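To put numbers on those passes, here is a minimal Python sketch that reads the lines quoted above and reports the wall-clock time and I/O between consecutive log lines. It assumes the second field is elapsed time in mm:ss and the third field is cumulative I/O in GB; that column interpretation and the helper names are my assumptions, not something taken from the Nastran documentation.

```python
# The eight MPYAD lines quoted above, copied verbatim from the f04 excerpt.
F04_EXCERPT = """\
10:11:27 53:07 926.1G 89385.0 10449.5 103.3 MPYAD BGN P=7
10:18:13 59:53 1010.9G 86804.0 10807.2 357.7 MPYAD PASS= 1
10:23:41 65:21 1095.7G 86797.0 11132.6 325.5 MPYAD PASS= 2
10:29:05 70:45 1180.4G 86798.0 11453.5 320.9 MPYAD PASS= 3
10:34:46 76:26 1265.2G 86810.0 11790.2 336.7 MPYAD PASS= 4
10:40:22 82:02 1350.0G 86797.0 12122.8 332.6 MPYAD PASS= 5
10:46:00 87:40 1434.7G 86797.0 12457.7 334.8 MPYAD PASS= 6
10:51:21 93:01 1518.5G 592.4G 12775.9 2326.4 MPYAD END
"""


def elapsed_to_seconds(stamp):
    """Convert an elapsed-time stamp such as '53:07' (assumed mm:ss) to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def parse_line(line):
    """Pick out elapsed time, cumulative I/O (assumed GB) and the module label."""
    fields = line.split()
    return {
        "elapsed_s": elapsed_to_seconds(fields[1]),
        "cum_io_gb": float(fields[2].rstrip("G")),
        "label": " ".join(fields[6:]),
    }


def step_deltas(rows):
    """Return (label, seconds, I/O in GB) for each pair of consecutive lines."""
    steps = []
    for prev, cur in zip(rows, rows[1:]):
        seconds = cur["elapsed_s"] - prev["elapsed_s"]
        io_gb = round(cur["cum_io_gb"] - prev["cum_io_gb"], 1)
        steps.append((cur["label"], seconds, io_gb))
    return steps


rows = [parse_line(line) for line in F04_EXCERPT.splitlines() if line.strip()]
steps = step_deltas(rows)
total_seconds = rows[-1]["elapsed_s"] - rows[0]["elapsed_s"]
total_io_gb = round(rows[-1]["cum_io_gb"] - rows[0]["cum_io_gb"], 1)

for label, seconds, io_gb in steps:
    print(f"{label:<14} {seconds:>5d} s  {io_gb:>6.1f} GB")
print(f"MPYAD total: {total_seconds} s, {total_io_gb} GB of I/O")
```

Under those assumptions, each interval costs roughly five to seven minutes and about 85 GB of I/O, and the total of 592.4 GB agrees with the delta field on the MPYAD END line.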
Assuming you were looking at the Windows Task Manager, it showed all of the memory being used, and the machine became somewhat unresponsive, this is likely another case where the Windows I/O cache, and not Nastran itself, is the process grabbing all of the memory (although Nastran is certainly causing the requests for I/O).
The default limit for the Windows I/O cache is ridiculously large and can grab the entire memory. I did a Femap blog post on this topic a while back, and there is a utility that can reset this limit to something reasonable, like 40% of the RAM, so that the combination of Nastran and the I/O cache does not exceed the machine limit.
Here is a link to the presentation,
We can provide the utility if you agree that this seems to be the likely problem. The other option would be to find a way to increase the memory allocation to the DDRMM module, which seems to be the source of the resource issue you are encountering. I will look at options for doing that as well.
Thank you for answering. Your assumption is correct: I was looking at Windows Task Manager and Resource Monitor. And the systems do become unresponsive, at least those with pagefile.sys on slower drives.
It sounds reasonable that this is an I/O cache issue. (I once read your presentation, but it seems I did not understand all of it.) So I would appreciate your utility.
What does it mean that multiple passes are needed during data recovery?
I have attached a zip file with the cache tool and a sample bat file that you can edit and use to run the tool. You can execute the tool only when you want to make this big run (the setting goes back to the default when you reboot), or do as I do and put it in your startup folder so it runs when the desktop boots up.
Another option is to run the batch file when Femap starts. You can enable this in the Preferences/Library/Startup form. The "8" in my batch file is the cache limit in GB. So, for example, your machine has 128 GB total and Nastran is allocating about 60 GB; if we set the cache limit to 50 GB, there is still 18 GB left for the OS and other processes to function. Your machine then does not become unresponsive, and you can at least check job status, etc. I have found no bad effects from setting this all of the time via the Windows startup folder.
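The arithmetic behind that example can be written down as a small Python sketch, in case you want to try other machine sizes. The function name and the error check are hypothetical and only illustrate the budget; they are not part of the cache tool.

```python
def os_headroom_gb(total_ram_gb, nastran_gb, cache_limit_gb):
    """Memory left for the OS and other processes after Nastran and the I/O cache.

    Hypothetical helper for illustration only; all values are in GB.
    """
    headroom = total_ram_gb - nastran_gb - cache_limit_gb
    if headroom < 0:
        # Nastran plus the cache limit would exceed physical RAM.
        raise ValueError("Nastran allocation plus cache limit exceeds total RAM")
    return headroom


# Values from the example: 128 GB machine, about 60 GB for Nastran,
# and a 50 GB limit on the Windows I/O cache.
headroom = os_headroom_gb(128, 60, 50)
print(f"Left for the OS and other processes: {headroom} GB")
```

With the values from the example, this leaves 18 GB. On a 64 GB machine the same Nastran and cache settings would not fit, so both numbers would need to be scaled down.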
As far as the passes in the f04 go, it means that Nastran could not fit the entire operation into available RAM and had to make multiple passes (paging) to complete the calculations. In your case the message occurs when the DDRMM module executes. I believe the memory available to this module is controlled by the "buffpool" setting. This is being set in your rcf file as "buffpool=20.0X", so 20% of the total Nastran memory is used by buffpool. You could try increasing this to 30.0X; it might reduce the number of passes, which could speed the run up.
Thank you for the tool. This is the solution! I have run a few small experiments.