The answer would certainly be different depending on whether a buffer costs nothing or is very expensive. Besides price, many other parameters (WIP, due times, process limits such as temperature or age) influence such a decision.
Leaving all of that out of scope, the simplest approach would be to place a buffer at every meaningful position in your lines (between each SingleProc or ParallelProc) and to set up an ExperimentManager that first varies each buffer's capacity in two steps (min, max) or three steps (min, mid, max). If throughput is your main goal, just pick the winning configuration. Increasing the number of steps and considering more result values (instead of solely throughput) would be the next iteration.
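To give a feeling for what such an experiment amounts to, here is a minimal sketch in plain Python, outside of Plant Simulation. It is not the ExperimentManager's API; the buffer names, the capacity levels, and the `toy_throughput` function are made-up assumptions standing in for your real model and its simulation runs.

```python
from itertools import product


def build_design(buffer_names, levels):
    """Return every combination of capacity levels as a list of dicts."""
    return [
        dict(zip(buffer_names, combo))
        for combo in product(levels, repeat=len(buffer_names))
    ]


def pick_best(design, evaluate):
    """Evaluate each configuration and return the best one with its score."""
    scored = [(evaluate(cfg), cfg) for cfg in design]
    best_score, best_cfg = max(scored, key=lambda pair: pair[0])
    return best_cfg, best_score


def toy_throughput(cfg):
    # Hypothetical stand-in for one simulation run, NOT a real model:
    # each buffer adds throughput with diminishing returns.
    return sum(cap / (cap + 2) for cap in cfg.values())


# Assumed names and levels (min, mid, max), purely for illustration.
buffers = ["Buffer1", "Buffer2", "Buffer3"]
design = build_design(buffers, levels=(1, 5, 10))
best_cfg, best_score = pick_best(design, toy_throughput)

print(len(design), "configurations")
print("best:", best_cfg, "score:", round(best_score, 3))
```

Note that the number of configurations grows as levels to the power of buffers (here 3^3 = 27), which is why starting with only two or three steps per buffer keeps the first experiment manageable.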
Usually it's a little more complicated, of course.