Hello, getting some errors on our PStorage - not many ... should I care about them or is it "normal" that they can happen? Thx Andreas Schnederle-Wagner
Hello Andreas,

These messages indicate some slowdown on the pstorage side. According to the messages posted, pstorage-mount is slowing things down - the disks and network are fine. Basically it means that some I/O request took longer than 8 seconds to complete. According to your logs the delay is always in pstorage-mount. I've been hearing about this problem more and more often recently; I believe it's not a coincidence and it needs to be investigated properly. Please contact support - this smells like a bug we need to investigate.
Ok, thanks. I've seen the recommendation - it does make sense. It might not solve the problem entirely, but it should make things better.

Please note that the cpuunits setting is not an actual limit - it is the container's weight, i.e. the share of CPU time the container gets. The node's own processes also have their share, which is defined in /etc/vz/vz.conf as the VE0CPUUNITS= value. Having the node's CPUUNITS at 1000 while containers are at 10k+ or even 100k starves the node's processes in the CPU scheduler. Since pstorage-mount is one of those processes, you should consider increasing cpuunits for the node, or decreasing all the containers' cpuunits to more reasonable values. E.g. by decreasing all containers' cpuunits by a factor of 100 you keep the same distribution of CPU time between containers, yet grant a bit more time to the ve0 processes.

Besides, to avoid scheduling issues for containers with a small share, it's recommended to keep the min and max CPUUNITS among the containers not too far apart - say, max shouldn't be larger than min*10. Take a look at "vzcpucheck -v" or "vzlist -aHo ctid,cpuunits" to check whether your environment is optimal in terms of cpuunits.
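For instance, something along these lines (just a quick sketch; the sort pipe is only my suggestion to make the min/max spread easier to spot):

Code:
# vzcpucheck -v
# vzlist -aHo ctid,cpuunits | sort -n -k2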
Alright - I configured it now so that the root node gets 500,000 CPUUNITS and the CTs get 50,000 CPUUNITS - will then adjust containers based on how important they are ... but not as widely spread as before ... ;-)

Unfortunately it did not solve the "8000 ms" problem ... got another warning a few minutes ago! :-/

11-05-16 14:02:10 CLN WRN IO requests on client #2224 serviced more than 8000 ms

Also updated the ticket.

p.s.) What about CTID 1 - it only has 1000 units by default - should I also set it to 50k?
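For reference, roughly what I ran (a sketch from memory - CTID 101 is just an example, repeat per container and adjust the numbers to your own setup):

Code:
# vzctl set 101 --cpuunits 50000 --save     # per-container weight
# vi /etc/vz/vz.conf                        # set VE0CPUUNITS="500000" for the node itself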
It might not resolve the issue entirely, but it should at least improve the situation. Can you compare the volume of messages before and after?

Also, you can renice the pstorage-mount process to give it a higher priority, e.g.:

Code:
# renice -n -19 -p $(pgrep pstorage-mount)

Or you can use another nice value of your choice.
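To verify the new priority took effect, a quick check like this should do (any equivalent ps invocation works just as well):

Code:
# ps -o pid,ni,comm -p $(pgrep pstorage-mount)

Keep in mind the nice value only applies to the running process - if pstorage-mount is restarted you'd have to reapply it.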
Alright - will try this one. But the root nodes are NOT short on CPU ... so this should normally not be a problem?!?

When restarting a node participating in PStorage - is it normal to get lots of those errors?!? Shouldn't the client request the data from another CS sooner?
You're missing the point. The problem is not the actual CPU load, but the scheduling latency. Supposedly.

Restarting a node shouldn't trigger such messages. These are not errors - just warnings about long I/O.

That's not how it works. A "write" operation is not considered complete until the data has been successfully written to all chunks involved in the I/O.
alright ... now I got it! ;-) Unfortunately I'm not really experienced with CPU Scheduling ... if that's the "problem" - what can we do?!
Dear Andreas, it's a bad idea to discuss this in parallel with support and with me on the forum at the same time - that throws in a lot of confusion. I'll let support do their job.
I posted in a different thread recently... after we upgraded our RAID controller firmware and Intel SSD drive firmware these incidents went way down and almost disappeared. I suspect the Intel SSD firmware, since even though we're using the recommended model we had some really wonky behavior on a Windows server built late last year until we updated those drives' firmware. Intel has updated the drive firmware multiple times in the past six months.
Hi Steve, thx for your input. We are using Samsung SM863 SSDs. Will check if there is a new Firmware.
I posted a different error in http://forum.odin.com/threads/neighbour-table-overflow-on-hardware-node.338149/, a "Neighbour table overflow" error. Per that thread we increased the ARP cache (neighbour table) sizes for IPv4 and IPv6, and in the 14 days since we have had zero "8000 ms" warning messages. Previously we got some every few days.
Steve, that's quite interesting. Although that would only help if you are seeing the "overflow" messages, I assume.
Hi Steve, thx for pointing that out - but "unfortunately" no overflow messages on our servers ... just lots of "client serviced more than 8000 ms" errors we can't get rid of ... and that on an SSD-only storage sub-system ... ;-)
Pavel, I agree that logically it should only affect the neighbour overflow message. I forgot to mention that I did go back and look, and most of our recent "8000 ms" instances did not come with the "neighbour" message. However, the only other change was a week earlier (the 11th), when we installed Update 11 Hotfix 11, and we got both messages after that (the 16th). We have not had the "8000 ms" warning since. If I do see one I'll post back, but I think this is our longest period by far.

Edit: the behavior for the "neighbour" issue was different too... the CS was up and down over (from memory) 10 minutes or thereabouts. Usually we get just 1-3 "8000 ms" event entries.
Following up on this thread, we upgraded our cluster storage servers to Virtuozzo 7 over the past few weeks. The "serviced more than 8000 ms" messages started cropping up again and became daily by the time we were done upgrading (2-10 times or so per day, in bunches).

Remembering this thread, I found that VZ 7 ships these settings in /etc/sysctl.d/99-vzctl.conf, which contains, among other lines:

net.ipv4.neigh.default.gc_thresh2=2048
net.ipv4.neigh.default.gc_thresh3=4096
net.ipv6.neigh.default.gc_thresh2=2048
net.ipv6.neigh.default.gc_thresh3=4096

Notably, gc_thresh1 is left at the default of 128.

I created an alphabetically-later-named (so it loads last) file /etc/sysctl.d/z-filename.conf with the lines we had used in Virtuozzo 6:

net.ipv4.neigh.default.gc_thresh3 = 8192
net.ipv4.neigh.default.gc_thresh2 = 4096
net.ipv4.neigh.default.gc_thresh1 = 1024
net.ipv6.neigh.default.gc_thresh3 = 8192
net.ipv6.neigh.default.gc_thresh2 = 4096
net.ipv6.neigh.default.gc_thresh1 = 2048

After running "sysctl -p /etc/sysctl.d/z-filename.conf" to import the settings, the message hasn't been logged since (4 days).
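In case it helps anyone else check whether they're hitting the same thing, a quick way to compare the current neighbour table size against the thresholds (my own ad-hoc check, not from any official doc):

Code:
# sysctl net.ipv4.neigh.default.gc_thresh1 net.ipv4.neigh.default.gc_thresh2 net.ipv4.neigh.default.gc_thresh3
# ip -4 neigh show | wc -l     # current IPv4 neighbour entries
# ip -6 neigh show | wc -l     # current IPv6 neighbour entries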
Sure. We didn't have any "8000 ms" errors for several days in a row there, but over the last few days we've had roughly one per day, all but one between 4:12 am and 4:16 am. That seems oddly specific, but looking at when our overnight backups run I'm not sure it can be tied to them or anything like that.