vmsyslog filling up vCenter database and how to stop it

Posted December 11th, 2014 in VMware by Dieter

In our datacenter, a few customers' vCenter databases were filling up at an alarming rate. After checking in MSSQL which tables were the culprit (sp_spaceused is handy for this), we found that the VPX_EVENT and VPX_EVENT_ARG tables kept growing. Sure enough, in vCenter we saw a constant stream of these alerts:


Also in the vmkernel.log file we saw similar messages:

2014-11-18T22:52:03.354Z cpu40:3167700)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T22:52:03.355Z cpu40:3167700)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T22:55:03.383Z cpu42:3167699)ALERT: vmsyslog logger lost 756 log messages
2014-11-18T22:55:03.384Z cpu42:3167699)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T22:55:03.386Z cpu42:3167699)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T22:55:03.395Z cpu37:3167700)ALERT: vmsyslog logger lost 852 log messages
2014-11-18T22:58:03.419Z cpu44:3167699)ALERT: vmsyslog logger lost 991 log messages
2014-11-18T22:58:03.422Z cpu37:3167700)ALERT: vmsyslog logger lost 992 log messages
2014-11-18T22:58:03.424Z cpu37:3167700)ALERT: vmsyslog logger lost 2 log messages
2014-11-18T23:01:03.454Z cpu44:3167699)ALERT: vmsyslog logger lost 1090 log messages
2014-11-18T23:01:03.467Z cpu36:3167700)ALERT: vmsyslog logger lost 1102 log messages
2014-11-18T23:01:03.468Z cpu36:3167700)ALERT: vmsyslog logger lost 3 log messages
2014-11-18T23:04:03.482Z cpu44:3167699)ALERT: vmsyslog logger lost 1073 log messages
2014-11-18T23:04:03.494Z cpu38:3167700)ALERT: vmsyslog logger lost 1153 log messages
2014-11-18T23:07:03.521Z cpu38:3167699)ALERT: vmsyslog logger lost 525 log messages
2014-11-18T23:07:03.532Z cpu38:3167700)ALERT: vmsyslog logger lost 893 log messages
2014-11-18T23:10:03.554Z cpu37:3167699)ALERT: vmsyslog logger lost 718 log messages
2014-11-18T23:10:03.566Z cpu37:3167700)ALERT: vmsyslog logger lost 796 log messages
2014-11-18T23:10:03.568Z cpu37:3167700)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T23:10:03.568Z cpu37:3167700)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T23:13:03.589Z cpu44:3167699)ALERT: vmsyslog logger lost 939 log messages
2014-11-18T23:13:03.595Z cpu40:3167700)ALERT: vmsyslog logger lost 987 log messages
2014-11-18T23:16:03.628Z cpu46:3167699)ALERT: vmsyslog logger lost 811 log messages
2014-11-18T23:16:03.629Z cpu44:3167700)ALERT: vmsyslog logger lost 926 log messages
2014-11-18T23:19:03.659Z cpu38:3167700)ALERT: vmsyslog logger lost 844 log messages
2014-11-18T23:19:03.670Z cpu46:3167699)ALERT: vmsyslog logger lost 846 log messages
2014-11-18T23:19:03.671Z cpu46:3167699)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T23:22:03.690Z cpu42:3167700)ALERT: vmsyslog logger lost 868 log messages
2014-11-18T23:22:03.691Z cpu42:3167700)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T23:22:03.704Z cpu36:3167699)ALERT: vmsyslog logger lost 948 log messages
2014-11-18T23:25:03.723Z cpu42:3167700)ALERT: vmsyslog logger lost 609 log messages
2014-11-18T23:25:03.741Z cpu46:3167699)ALERT: vmsyslog logger lost 585 log messages
2014-11-18T23:28:03.760Z cpu42:3167700)ALERT: vmsyslog logger lost 756 log messages
2014-11-18T23:28:03.774Z cpu40:3167699)ALERT: vmsyslog logger lost 790 log messages
2014-11-18T23:31:03.794Z cpu42:3167700)ALERT: vmsyslog logger lost 882 log messages
2014-11-18T23:31:03.806Z cpu40:3167699)ALERT: vmsyslog logger lost 963 log messages
2014-11-18T23:31:03.808Z cpu40:3167699)ALERT: vmsyslog logger lost 1 log messages
2014-11-18T23:34:03.828Z cpu42:3167700)ALERT: vmsyslog logger lost 719 log messages
2014-11-18T23:37:03.851Z cpu39:3167699)ALERT: vmsyslog logger lost 1616 log messages
2014-11-18T23:37:03.862Z cpu36:3167700)ALERT: vmsyslog logger lost 761 log messages

An issue with vmsyslog, so it seems. Our syslog configuration (as shown by esxcli system syslog config get), however, looked perfectly fine:


loghost = <none>
default_timeout = 180
logdir_unique = false
rotate = 8
logdir = <none>
size = 1024
loghost = tcp://, tcp://
rotate = 8
size = 1024

Digging a bit deeper into the syslog error log revealed this:


2014-11-18T23:37:03.861Z vmsyslog.msgQueue        : ERROR ] - lost 761 log messages
2014-11-18T23:37:03.866Z vmsyslog.loggers.network : ERROR   ] - socket error : [Errno 32] Broken pipe
2014-11-18T23:40:03.895Z vmsyslog.msgQueue        : ERROR ] - lost 1045 log messages
2014-11-18T23:40:03.899Z vmsyslog.loggers.network : ERROR   ] - socket error : [Errno 32] Broken pipe
2014-11-18T23:43:03.933Z vmsyslog.msgQueue        : ERROR ] - lost 661 log messages
2014-11-18T23:43:03.937Z vmsyslog.loggers.network : ERROR   ] - socket error : [Errno 32] Broken pipe
2014-11-18T23:46:03.969Z vmsyslog.msgQueue        : ERROR ] - lost 761 log messages
2014-11-18T23:46:03.973Z vmsyslog.loggers.network : ERROR   ] - socket error : [Errno 32] Broken pipe
2014-11-18T23:49:04.002Z vmsyslog.msgQueue        : ERROR ] - lost 573 log messages
2014-11-18T23:49:04.006Z vmsyslog.loggers.network : ERROR   ] - socket error : [Errno 32] Broken pipe
2014-11-18T23:52:04.036Z vmsyslog.msgQueue        : ERROR ] - lost 665 log messages
2014-11-18T23:52:04.040Z vmsyslog.loggers.network : ERROR   ] - socket error : [Errno 32] Broken pipe

At this stage, I was starting to wonder whether the other side was actually working fine. We use VMware Syslog Collector to capture these logs, and although the service appeared to be running, we noticed this in the debug log:

(C:\ProgramData\VMware\VMware Syslog Collector\Logs\debug.log)

2014-11-19 06:26:27,345 - sc.twisted - Log observer <bound method DefaultObserver._emit of <twisted.python.log.DefaultObserver instance at 0x01095F80>> failed.
Traceback (most recent call last):
File "twisted\python\context.pyo", line 59, in callWithContext
File "twisted\python\context.pyo", line 37, in callWithContext
File "twisted\internet\selectreactor.pyo", line 154, in _doReadOrWrite
File "twisted\python\log.pyo", line 221, in err
--- <exception caught here> ---
File "twisted\python\log.pyo", line 292, in msg
File "twisted\python\log.pyo", line 623, in _emit
exceptions.IOError: [Errno 9] Bad file descriptor
2014-11-19 06:26:27,348 - sc.core - TCP connection lost from
2014-11-19 06:26:27,348 - sc.log - Error cleaning up state: ''
2014-11-19 06:26:47,546 - sc.core - New TCP connection from IPv4Address(TCP, '', 58438)
2014-11-19 06:26:47,549 - sc.twisted - Unhandled Error
Traceback (most recent call last):
File "twisted\python\log.pyo", line 84, in callWithLogger
File "twisted\python\log.pyo", line 69, in callWithContext
File "twisted\python\context.pyo", line 59, in callWithContext
File "twisted\python\context.pyo", line 37, in callWithContext
--- <exception caught here> ---
File "twisted\internet\selectreactor.pyo", line 146, in _doReadOrWrite
File "twisted\internet\tcp.pyo", line 455, in doRead

After some digging around to see how this service actually works (it is basically a Python script), I noticed that the bundled Twisted framework is quite old (v8.2). I then built a replacement SyslogCollectorLibrary.zip (the whole Python 2.6 environment, compressed) in which I included the latest version of Twisted for Python 2.6 (v12.0). I stopped the service, replaced the zip file, and restarted the service, and almost immediately the flow of events stopped.

The modified SyslogCollectorLibrary.zip is posted on the VMware Communities forum (https://communities.vmware.com/docs/DOC-28466); let me know if this patch works for you as well!

Connection failed after large file transfer between Mac and Windows

Posted November 7th, 2011 in Networking by Dieter

I recently switched to the Mac side, and so far I hadn't had any big problems. That is, until one rainy day I suddenly lost the ability to connect to the Windows shares on my desktop! The really strange thing is that nothing had changed…

I searched high and low for solutions and found out all I had to do was change two values in the Windows registry :)

You can read all about it on http://alan.lamielle.net/2009/09/03/windows-7-nonpaged-pool-srv-error-2017, but for those who don't want to click, here is a quick summary:

Apparently Windows doesn't like it when you transfer large files over the SMB protocol, which leads to memory allocation errors. Change the following registry keys to fix this once and for all:


HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management\LargeSystemCache (change to "1")

HKLM\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters\Size (change to "3")
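If you prefer not to edit the registry by hand, the same two changes can be applied with a .reg file. This is a sketch based on the keys above (the filename is up to you; double-check the paths before importing):

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management]
"LargeSystemCache"=dword:00000001

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\LanmanServer\Parameters]
"Size"=dword:00000003
```

Import it by double-clicking the file, then restart the "Server" service.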


After changing these values and restarting the "Server" service, I was able to connect to my beloved shares again :)


Ldapsearch without all the line wrapping

Posted January 8th, 2011 in Networking by Dieter

Anyone who has used ldapsearch (you know, that handy-dandy tool to query your LDAP database) in a script will eventually notice one big problem. For reasons beyond my understanding, the original author of this tool decided that it would be cool to wrap output lines longer than 76 characters. Guess what: it's not… I searched the almighty Internet for a solution and found several, but the following one I liked the most.

perl -p00e 's/\r?\n //g'
# Example
ldapsearch -xLLL -h oh.mighty.ldapserver -b dc=example,dc=com | perl -p00e 's/\r?\n //g'

Why? It's short, does its job fast, and Perl is widely supported. I found this little gem on a mailing list, but because it's so easily overlooked, I'm posting it here.

Happy New Year!

Posted January 1st, 2011 in Random by Dieter

Although I don't have a lot of visitors, I wish each and every one of you a great 2011. May the best day of last year become the worst one of this year!

Setting up a fileserver with Samba, OpenLDAP and Kerberos

Posted December 29th, 2010 in Networking, Tutorial by Dieter

I searched high and low for a good guide on how to set up a Samba server that uses Kerberos for authentication and OpenLDAP to get the users. I stumbled upon the Ubuntu Community Guide, which gave me some insight on how to get Samba to play along with Kerberos, but it didn’t provide any details about LDAP integration. After some searching and a lot of testing, I finally completed this complex puzzle.

Because I've recently acquired the good habit of documenting the stuff I do (well, at least I try to…), I'm putting it here for future reference. I'm also sure it will be of some use to someone else :) . I've tested this on Ubuntu 10.04 and 10.10 without significant problems. I've set up a Kerberos KDC using the Ubuntu Server Guide (Kerberos) and the Ubuntu Server Guide (Kerberos and LDAP). So let's get going: first, we'll install some packages.

sudo apt-get install samba libnss-ldap krb5-user

This installs samba (duh…), krb5-user (needed for the Kerberos part), and libnss-ldap (needed for the LDAP part). Doing this will trigger some configuration screens. When configuring ldap-auth-config, use the following settings:

LDAP server Uniform Resource Identifier: <your LDAP-server, something like ldap://ldap.example.com>

Distinguished name of the search base: <your search base, something like dc=example,dc=com>

LDAP version to use: 3

Make local root database admin: No

Does the LDAP database require login: No
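For reference, those answers are written to /etc/ldap.conf (the libnss-ldap configuration). With the example values above, the relevant lines should look roughly like this:

```
base dc=example,dc=com
uri ldap://ldap.example.com
ldap_version 3
```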

Next, you’ll be presented with some configuration screens for Kerberos, type in the following:

Default Kerberos version 5 realm: <your Kerberos realm, something like EXAMPLE.COM>

Now, if you've done your homework properly and your DNS server has the appropriate SRV records, the wizard will notice this and you're done. Otherwise, you'll have to give it some more information, like the FQDN of the server hosting the Kerberos KDC daemon. After that, you're back at the command line. Execute the following command to update your /etc/nsswitch.conf file:

sudo auth-client-config -t nss -p lac_ldap
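This enables the lac_ldap profile for NSS. Afterwards, /etc/nsswitch.conf should list ldap as a source for the relevant databases, roughly like this:

```
passwd:         files ldap
group:          files ldap
shadow:         files ldap
```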

Switch to the computer that acts as the Kerberos KDC and generate a keytab for the Samba server using the following commands:

kadmin -p admin/admin
kadmin: addprinc -randkey cifs/fileserver.example.com
kadmin: ktadd -k /path/to/keytab -e rc4-hmac:normal cifs/fileserver.example.com

Replace fileserver.example.com with the FQDN of the Samba server; this is important! Next, copy the resulting keytab to the Samba server, for example by using scp (you can verify its contents there with klist -k /etc/krb5.keytab). Next, we'll adjust some files. Edit /etc/ldap/ldap.conf as follows:

# LDAP Defaults
# See ldap.conf(5) for details
# This file should be world readable but not world writable.
BASE    dc=example,dc=com
URI     ldap://ldap.example.com
#SIZELIMIT      12
#TIMELIMIT      15
#DEREF          never

And finally, make the following adjustments to /etc/samba/smb.conf:

[global]
   workgroup = LINKUP.LOCAL
   server string = %h server (Samba, Ubuntu)
   dns proxy = no
   log file = /var/log/samba/log.%m
   max log size = 1000
   syslog = 0
   panic action = /usr/share/samba/panic-action %d
   security = ADS
   realm = LINKUP.LOCAL
   password server = your.kdc.server
   kerberos method = dedicated keytab
   dedicated keytab file = /etc/krb5.keytab
   encrypt passwords = true
   obey pam restrictions = no
   unix password sync = no
   passwd program = /usr/bin/passwd %u
   passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
   pam password change = yes
   map to guest = bad user
   usershare allow guests = no

[public]
   comment = "Public share for everyone"
   path = /data/public
   browsable = yes
   guest ok = yes
   read only = no
   create mask = 0755

[private]
   comment = "Private share"
   path = /data/private
   browsable = yes
   guest ok = no
   read only = no
   create mask = 0755
   # This is a group from OpenLDAP
   valid users = @testgroup

In this example there are two shared folders; make sure whatever you want to share actually exists (in my example, /data/public and /data/private). You can run testparm to verify the smb.conf syntax. Believe it or not, that was all that was necessary! Do the following quick tests to check that everything is working properly:

getent passwd

This should return your users, including the users from OpenLDAP. Next, go to a client computer and log in. Make sure your user got a valid ticket (check with klist) and try the following:

smbclient -k \\\\fileserver.linkup.local\\public

If this results in an smb prompt, your work is done! If not, leave a comment and I'll try to help you out :)

LPIC-1: part 1

Posted December 27th, 2010 in Certificates by Dieter

It wasn't simple and required a lot more experience than the Microsoft certificates did, but I managed to pass my first exam! The first thing I noticed was the quality of the available course material: while the MS books are fairly detailed and clear, the books covering LPI certification are surprisingly short and lacking in detail. That means a fair amount of personal experience is required if you want to pass. Luckily, I had my fair share of Linux experience, but still…

Nonetheless, still only 3 exams and €405 (!) to go to become LPIC-1 certified (and there's also LPIC-2 and 3!)…

Site finally finished!

Posted December 26th, 2010 in Random by Dieter

It has taken quite some time, but without further ado, I present you my new site :) I hope you all like it, because every minute I spent working on this I should have spent finishing my school assignments. Nonetheless, I don't think putting this site together was a waste of time, mainly because I finally have a place to dump my non-stop rambling! Even more importantly, I now have a place where I can save information I find on the internet, knowing that the original source might be offline within the next few weeks (for example, the bits and pieces for implementing a Kerberos KDC and client login!)

Thanks for visiting and leave a comment if you like the site!

From now on MCTS!

Posted December 26th, 2010 in Certificates by Dieter

After a long, hard time of studying, I finally acquired my first two Microsoft certificates! For those who don't know me IRL: the last few weeks I've been busy studying for the 70-640 exam (Configuring Windows Server 2008 Active Directory) and the 70-642 exam (Configuring Windows Server 2008 Network Infrastructure), and the effort finally paid off! With a very nice score of 880 and a more average 716 respectively, I passed both exams and became a Microsoft Certified Technology Specialist! Needless to say, I'm more than happy with this achievement :)