Increasing shared memory limits on gpu25/27
Predrag Punosevac
predragp at andrew.cmu.edu
Thu May 11 09:38:00 EDT 2023
Hi Brian,
The honest answer is that I have no idea whether raising the limit on shared
memory segments is a good or a bad idea. The good news is that one of our
staff members, Piggy Yarroll, has done some significant work on the Linux
kernel, so he might be able to give us some ideas. I would not be
surprised to find out that some other Auton Lab account holders have a much
better idea of the right course of action than I do.
My knee-jerk reaction is that kernel variables are set the way they are for
a reason. The prevailing philosophy in the OpenBSD camp, to which I belong,
is that one should not touch them unless one really knows what she/he is
doing. I don't :-( The only tunables I have ever played with in the Auton
Lab are the ZFS ZIL and SLOG.
https://www.truenas.com/blog/zfs-zil-and-slog-demystified/
In my personal experience, it takes a real computer scientist (which I am
not) even to measure performance correctly. Most benchmarks I have seen
around the internet are just plain wrong because the tests are set up
incorrectly. If I really knew anything about squeezing performance out of
the system I would be making big bucks working in industry :-)
Best,
Predrag
On Thu, May 11, 2023 at 3:35 AM Brian Yang <brianyan at andrew.cmu.edu> wrote:
> Hi Predrag,
>
> I've been running into issues with loading larger data batches on these
> machines to make use of the bigger GPUs. Basically we seem to have ample
> RAM / swap / shared memory / GPU memory available to load much larger
> batches, but I think I'm running up against the system-wide limit on the
> number of shared memory segments.
>
> Would it be possible for us to increase the number of shared memory
> segments allowed on GPU25 / 27, maybe from 4096 to 16384?
>
> I checked the current limit with "cat /proc/sys/kernel/shmmni". I think
> you can change the limit with "echo 16384 > /proc/sys/kernel/shmmni",
> although you would also need to do something like "echo
> "kernel.shmmni=16384" >> /etc/sysctl.conf" to make the change persist on
> rebooting.
>
> If there's something I'm missing / a reason why increasing the limit is a
> bad idea, please let me know. There might be a way to rewrite my data
> loader to avoid this issue, but I think this would be much easier if it's
> doable without major side effects.
>
> Thanks,
> Brian
>
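
Before changing anything, it may help to see how close a machine actually is
to the limit Brian mentions. The sketch below is read-only and rests on two
assumptions: that the segments in question are System V segments (the kind
kernel.shmmni limits), and that the kernel exposes its segment table at
/proc/sysvipc/shm. The function names and the SHMMNI_FILE / SHM_TABLE
variables are made up for illustration.

```shell
#!/bin/sh
# Read-only helpers: compare System V shared memory segments in use
# against the kernel.shmmni limit. Nothing here modifies the system.

# Hypothetical override variables, so the helpers can be pointed at
# copies of the files instead of the live /proc entries.
SHMMNI_FILE="${SHMMNI_FILE:-/proc/sys/kernel/shmmni}"
SHM_TABLE="${SHM_TABLE:-/proc/sysvipc/shm}"

# Print the current system-wide limit on the number of segments.
shm_limit() {
    cat "$SHMMNI_FILE"
}

# Print the number of segments currently allocated.
# The table has one header line, then one line per segment.
shm_in_use() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$SHM_TABLE"
}

# Print a one-line summary of usage against the limit.
shm_report() {
    limit=$(shm_limit)
    used=$(shm_in_use)
    echo "segments in use: $used of $limit"
}
```

Running shm_report on gpu25 or gpu27 while the data loader is active would
show whether usage really approaches the limit. If it stays well below,
the bottleneck is probably elsewhere and raising shmmni would not help.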