GFI Software

Welcome to the GFI Software community forum! For support please open a ticket from https://support.gfi.com.

Serious problem since 9.03 [message #129660] Fri, 13 May 2016 16:58
Bud Durland
Messages: 586
Registered: December 2013
Location: Plattsburgh, NY
We upgraded to 9.03 over the weekend (just before 9.04 was released, such is our luck). Since then, we have had a lot of trouble with delays processing the message queue. The queue climbs into the hundreds of messages, and it often takes several HOURS for mail to be delivered.

We are using Debian 7, and the mail store is on a 3 TB btrfs volume. This was working fine with version 8.5.3.

Is this one of the 'stability issues' that was corrected in version 9.04? I'd rather not downgrade unless I really have to. I don't want to confuse and annoy users by flip-flopping between web mail interfaces.

I'm being told to fix it now, and it doesn't help that we just had a consultant on-site preaching O365.
Re: Serious problem since 9.03 [message #129662 is a reply to message #129660] Fri, 13 May 2016 17:13
Pavel Dobry (Kerio)
Messages: 2057
Registered: October 2003
Location: Czech Republic
There is no known bug in 9.0.3 that would cause the queue to grow. I recommend checking the status of the emails in the queue, the mail log, and also the debug log with "Queue operations" messages enabled.
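Pavel's log check is easy to script. Below is a minimal sketch run against a synthetic log snippet; the real debug log location (commonly /opt/kerio/mailserver/store/logs/debug.log on a Linux install) and the exact message wording are assumptions, and "Queue Operations" must first be enabled in the Debug log options of the admin console.

```shell
# Demo against a synthetic snippet; on a real server, point the grep at the
# Kerio Connect debug log (the path above is an assumption -- verify it on
# your installation).
cat > /tmp/kc_debug_sample.log <<'EOF'
[13/May/2016 16:58:01] {queue} Queue operations: message scheduled for delivery
[13/May/2016 16:58:05] {queue} Delivery delayed: recipient's mailbox busy
[13/May/2016 16:58:09] {smtp} Unrelated SMTP chatter
EOF

# Surface queue-operation entries and per-recipient delays:
grep -Ei 'queue operations|mailbox busy' /tmp/kc_debug_sample.log
```

On a live server, `grep -c` over the real log gives a quick feel for how often deliveries are being deferred.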

[Updated on: Fri, 13 May 2016 17:13]


Re: Serious problem since 9.03 [message #129664 is a reply to message #129662] Fri, 13 May 2016 17:38
Bud Durland
Messages: 586
Registered: December 2013
Location: Plattsburgh, NY
Nothing in the Debug log stands out as an error. Ditto the Mail log. In the queue monitor, the status is either blank or 'recipient's mailbox busy'. The busy message accounts for about 45% of the messages in the queue, and it affects many users.

Update:
Searching for 'error' or 'fail' in the debug/queue-processing log finds only a message about failing to 'encode the x-ffoter header for the domain'.

Searching 'Delay' shows many delays for 'mailbox busy' as shown above.

Update 2:
We tried to remove some messages from the queue. Now they are still there, but are displaying "Spoofed Sender Check Cannot be Performed". We do not have spoofed sender checking enabled in the security tab of the admin page.

[Updated on: Fri, 13 May 2016 19:19]


Re: Serious problem since 9.03 [message #129704 is a reply to message #129664] Mon, 16 May 2016 17:17
Bud Durland
Messages: 586
Registered: December 2013
Location: Plattsburgh, NY
One of the things I've noticed is that there is a significant delay ('Outlook Not Responding') when messages are sent. I see the message hit the Outbox folder, but it sometimes takes a minute or more to recover and send. I see nothing of interest in any of the server logs, but my local KOFF debug log has these lines, and I'm not sure what they mean. Extra spaces removed for readability:

[16/05/2016 11:05:27.592](9628){err}{convertor} In MapiConvertor\MapiConvertor.cpp:238 (storeFromMapiToMime)
[#216] (B59E)  Exception of class HResultException: MapiConvertor\MapiName.cpp(46), MapiName::fromPropertyTag: 0x80004005 E_FAIL // Conversion has failed.

[16/05/2016 11:05:27.593](9628){err}{scp-worker} In SCProvider\ConvertorHelper.cpp:272 (ConvertorHelper::convertMapiStore)
[#217] (B59E) Exception of class HResultException: SCProvider\ConvertorHelper.cpp(266), ConvertorHelper::convertMapiStore: 0x80004005 E_FAIL // HRESULT: 0x80004005 E_FAIL

[16/05/2016 11:05:27.593](9628){err}{scp-worker} In SCProvider\Worker_msgstore.cpp:88 (SyncRequestStore::processSyncMsgStore)
[#218] (B59E) Exception of class HResultException: SCProvider\Worker_msgstore.cpp(229), SyncRequestStore::putStoreToServer: 0x80004005 E_FAIL 

[16/05/2016 11:05:27.593](9628){err}{scp-worker} In SCProvider\Worker_refresh.cpp:314 (SyncRequestRefresh::testResultSynchronizedPersonalStore)
[#219] (B59E) Personal store synchronization failed, proceeding to synchronization of hierarchy. 
Re: Serious problem since 9.03 [message #129706 is a reply to message #129664] Mon, 16 May 2016 17:41
j.a.duke
Messages: 239
Registered: October 2006
Bud Durland wrote on Fri, 13 May 2016 11:38
Nothing in the Debug log stands out as an error. Ditto the Mail log. In the queue monitor, the status is either blank or 'recipient's mailbox busy'. The busy message makes up about 45% of the messages in the queue, and it is many users.


Bud,

I was seeing this with 8.5.x on a Mac Mini with a software RAIDed ThunderBolt enclosure.

I fixed the problem by moving to a hardware RAIDed ThunderBolt enclosure and upgrading to 9.0.x.

My 'mailbox busy' issue would move between accounts; there were 4 or 5 repeat offenders, but often the culprit was a random account. A reboot would cure the problem for a day or two before it returned.

How full is your mail store volume?

I'm trying to remember all the variables that contributed to the issue, but in the end, it seemed to come down to both disk speed and KC version.

Cheers,
Jon
Re: Serious problem since 9.03 [message #129710 is a reply to message #129706] Mon, 16 May 2016 19:36
Bud Durland
Messages: 586
Registered: December 2013
Location: Plattsburgh, NY
j.a.duke wrote on Mon, 16 May 2016 11:41


How full is your mail store volume?

I'm trying to remember all the variables that contributed to the issue, but in the end, it seemed to come down to both disk speed and KC version.


As far as we can tell, it is a disk issue. In this case the volume is on a SAN configured to support VMware HA. The SAN supports many other VMs, and we're working on the conclusion (based in part on observation) that the I/O channel to the SAN is saturated. We're moving our mail store (3.0 TB volume, 1.3 TB used) to a different, more lightly loaded SAN to see if that helps.

We're also changing the volume's file system from btrfs to ext4. I'm not sure that's a factor, though. We had switched from ext4 to btrfs when we increased the size of the mail store, hoping for better performance. It ran fine under 8.5.x, though we didn't really see any performance increase; it was when we upgraded to 9.0.3 that we started having trouble. I don't know whether the file system has any impact in this case, but we're switching back to make sure.
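One quick way to sanity-check the "saturated I/O channel" theory without installing anything is to sample the kernel's per-disk counters twice; `iostat -x 5` from the sysstat package shows the same picture with a %util column if it's available. A rough Linux-only sketch (the device-name pattern is an assumption; adjust it to your disks):

```shell
# Sum sectors read/written across common block devices from /proc/diskstats.
# In that file, field 3 is the device name, field 6 is sectors read, and
# field 10 is sectors written. Take two samples a few seconds apart and
# compare the deltas while the queue is backed up vs. while it is idle.
sample() {
    awk '$3 ~ /^(sd|vd|xvd|nvme)/ {r += $6; w += $10} END {print r+0, w+0}' /proc/diskstats
}
first=$(sample)
sleep 2
second=$(sample)
echo "sectors (read written): before=[$first] after=[$second]"
```

Large write deltas during a queue backlog, combined with high `iowait` in `top`, would support the saturation theory.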
Re: Serious problem since 9.03 [message #146315 is a reply to message #129710] Sun, 28 July 2019 13:38
simion.chis
Messages: 20
Registered: February 2014

Hi Bud,
this is an old thread, but can you share what you learned from that experience?
I'm about to change my server (I have to), and I'm not sure what is best for me.
Right now I have KC 8.x on Windows Server 2013. The store is on a local 1 TB HDD (700 GB in use, ~650,000 files, 70 mailboxes).

I want to switch to Linux (Debian). I cannot decide (lack of experience):
1) local storage, or some remote file system / NAS?
2) ext4 or btrfs?

Any advice is more than welcome!

Thank you,
Simion
Re: Serious problem since 9.03 [message #146326 is a reply to message #146315] Tue, 30 July 2019 18:06
Maerad
Messages: 275
Registered: August 2013
simion.chis wrote on Sun, 28 July 2019 13:38
Hi Bud,
this is an old thread, but can you share what you learned from that experience?
I'm about to change my server (I have to), and I'm not sure what is best for me.
Right now I have KC 8.x on Windows Server 2013. The store is on a local 1 TB HDD (700 GB in use, ~650,000 files, 70 mailboxes).

I want to switch to Linux (Debian). I cannot decide (lack of experience):
1) local storage, or some remote file system / NAS?
2) ext4 or btrfs?

Any advice is more than welcome!
Yeah, with that many mailboxes it's time to get a bit more professional; at least it sounds like it's running on a small, self-built server :)

For Kerio, if you use any kind of online access (webmail, ActiveSync, IMAP, etc.), I would first check the I/O you currently have.
External storage is, IMHO, out of the question unless you can connect it with FC(oE) or some other low-latency network, and at least 4 Gbit/s if you're going to use a fast RAID.
If you keep it local (nothing bad about that, IMHO; I still prefer local storage unless there is FC in play and at least two SANs), get some SATA or SAS enterprise hardware with a good hardware RAID controller and enjoy RAID 10, 5, or 6. SSD is fine too. It really depends on the I/O.
The underlying file system doesn't matter at all. ext4 and btrfs have some differences, but nothing that could be a dealbreaker for Kerio in any way.

Heck, depending on the I/O, you could also get by with two 1-2 TB SSDs in RAID 1 :)

And last but not least: if you are not too much into the Linux part, I would suggest sticking with Windows. Yes, license costs are a pain, but it's easier to manage AND to get external help if something's messed up.
Re: Serious problem since 9.03 [message #146342 is a reply to message #146326] Fri, 02 August 2019 09:18
simion.chis
Messages: 20
Registered: February 2014

Maerad, thank you for your advice.

Yep! I need to get a little more professional. I cannot afford FC, so I'll go with local storage. That was also my first choice, but now it's better cemented. And if something goes wrong, it's your fault too :) (small joke)

I am thinking of btrfs because of its snapshot and send/receive features. Right now the backup is made with the Windows backup service (onto an external HDD) and also with KC backup (21 hours for a full backup).
From what I've read, btrfs is not adopted (as default) in Debian. Same problem with ZFS. I am more comfortable with the Debian distro.
Maybe I'll go with RAID 10 for storage, but I still don't know how to solve the backup. rsync for 700,000 or more files? I think that is a bit of a crazy idea.

Re: Serious problem since 9.03 [message #146351 is a reply to message #146315] Fri, 02 August 2019 23:12
Bud Durland
Messages: 586
Registered: December 2013
Location: Plattsburgh, NY
simion.chis wrote on Sun, 28 July 2019 07:38

this is an old thread, but can you share what you learned from that experience?
I'm about to change my server (I have to), and I'm not sure what is best for me.
Right now I have KC 8.x on Windows Server 2013. The store is on a local 1 TB HDD (700 GB in use, ~650,000 files, 70 mailboxes).
Some advice, based on experience (some more painful than others):
On similar hardware, Connect runs much faster on Linux than on Windows. At least that was the case when we migrated from Win2008R2 to Debian about 5(?) years ago. Win2016 may have narrowed the gap, but I have neither the time nor the resources to test.

Using internal storage is fine, as long as you use hardware-based RAID (RAID 10 being preferred) and fast disks. Many people are using SSDs for storage with good success, and I wouldn't be afraid to put high-end Samsung or Intel SSDs in the system. Some of the newer NVMe drives are proving to be pretty robust in such an environment. NVMe RAID is a bit specialized, though, so probably more expensive.

If you do move to Linux, investigate using rsync for backup. The native backup system in Connect is OK, but takes forever on larger data stores (which is kinda where you are). You might be able to attach a cheap NAS to the network (or server) and use it as the rsync target. There is very little difference between ext4 and btrfs, so I'd stick with ext4.

