Optimizing disk utilization on a 3.2TB store [message #136781] Tue, 19 September 2017 02:35
Hartz
Messages: 10
Registered: June 2014
Location: Australia
Hi all,

Looking for suggestions on optimizing storage for our Kerio Connect server.

Domains: 24
Users: 412
Mailbox Store: 3.2TB
Scenario: every user must keep every message, going back to the dawn of time, just in case.

We are about to migrate to a new server with additional storage on a Dell Compellent SAN. CentOS with ZFS pool for mail store is what I am investigating (we already run CentOS).

I really want to try deduplication, but ZFS dedupe seems out of reach: 961 million 4k blocks for our existing 3.2TB store would require 307GB of memory to hold the dedupe tables (provided my calcs are correct). My test of a single 200GB domain produced a dedupe ratio of 1.2 (well shy of the 2.0 generally recommended before dedupe is worthwhile), so it doesn't seem worth it either way; that ratio did strike me as low, though.
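
In case anyone wants to check my maths, here's the back-of-the-envelope version; the ~320 bytes per dedupe-table (DDT) entry is the commonly quoted in-core figure, not something I've measured on this pool:

    # Rough DDT memory estimate (a sketch; 320 bytes/entry is an assumption)
    blocks = 961_000_000          # ~961 million 4k blocks in the 3.2TB store
    bytes_per_ddt_entry = 320     # commonly cited in-core cost per unique block
    ddt_gb = blocks * bytes_per_ddt_entry / 1e9
    print(f"estimated DDT size: {ddt_gb:.1f} GB")   # -> estimated DDT size: 307.5 GB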

ZFS compression seems like an instant yes: improved disk I/O performance with only a hit to CPU load, which is acceptable in our scenario.
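
For reference, this is roughly what enabling it and checking the payoff would look like; the dataset name is just a placeholder for wherever the store ends up, and the snippet simply shells out to the stock zfs tool:

    # Sketch: enable lz4 on the mail-store dataset ("tank/mailstore" is a placeholder)
    import subprocess

    dataset = "tank/mailstore"
    subprocess.run(["zfs", "set", "compression=lz4", dataset], check=True)
    # Once data has been written, see how much it actually saved
    subprocess.run(["zfs", "get", "compression,compressratio", dataset], check=True)

Worth remembering that compression only applies to blocks written after it's enabled, so compressratio only means something once the store has been copied in.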

Is there anything I am missing? Any alternate solutions? Some built in dedupe would be amazing.


Regards,
H.

[Updated on: Tue, 19 September 2017 03:27]

Re: Optimizing disk utilization on a 3.2TB store [message #136804 is a reply to message #136781] Tue, 19 September 2017 23:52
Bud Durland
Messages: 586
Registered: December 2013
Location: Plattsburgh, NY
I'm not 100% certain, but from what I've seen of how Kerio stores messages on the server, I'm not sure a de-dupe application would be all that effective anyway. During a recent "all is well" GFI road map webinar, they said that a new message store database is coming in early 2018 (?) that will yield radically higher performance for searches and such. Maybe they will actually use a real dbms rather than flat text files.

How do you plan to migrate the message store? We used rsync and a creative script for a similar-sized store to minimize user downtime.
Re: Optimizing disk utilization on a 3.2TB store [message #136805 is a reply to message #136804] Wed, 20 September 2017 00:07
Hartz
Messages: 10
Registered: June 2014
Location: Australia
Hi Bud,

I plan to rsync over n days and then do a final sync overnight for the cutover. We already use rsync for backups rather than the built-in backup process, as it's quicker and just as effective.
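
Roughly this shape, for anyone curious; the paths, hostname and service name below are placeholder assumptions for a stock Kerio Connect install on Linux rather than details of our actual setup:

    # Cutover sketch: pre-seed the new store while Kerio is running, then stop
    # the service so the store is quiescent and do one last pass.
    import subprocess

    STORE = "/opt/kerio/mailserver/store/"   # assumed default store location
    DEST = "newhost:" + STORE                # placeholder hostname

    def sync():
        # -a preserves perms/times, -H keeps hard links, --delete mirrors removals
        subprocess.run(["rsync", "-aH", "--delete", STORE, DEST], check=True)

    sync()                                   # repeat nightly over the n days
    subprocess.run(["systemctl", "stop", "kerio-connect"], check=True)  # assumed service name
    sync()                                   # final overnight pass, then cut over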

Interesting news about the dbms in the roadmap. Definitely won't be rolling that update out on day 1.
Re: Optimizing disk utilization on a 3.2TB store [message #136816 is a reply to message #136805] Wed, 20 September 2017 14:43
Bud Durland
Messages: 586
Registered: December 2013
Location: Plattsburgh, NY
Just to be clear, GFI did NOT specifically say there was a dbms-based message store in the future, just that there would be a new datastore mechanism that would provide radically improved performance. I just don't see how that will happen without some type of dbms.
Re: Optimizing disk utilization on a 3.2TB store [message #145761 is a reply to message #136781] Fri, 10 May 2019 20:01
robvas
Messages: 4
Registered: November 2018
Location: USA
Hartz wrote on Mon, 18 September 2017 20:35
Hi all,
We are about to migrate to a new server with additional storage on a Dell Compellent SAN. CentOS with ZFS pool for mail store is what I am investigating (we already run CentOS).

I really want to try deduplication, but ZFS dedupe seems out of reach: 961 million 4k blocks for our existing 3.2TB store would require 307GB of memory to hold the dedupe tables (provided my calcs are correct). My test of a single 200GB domain produced a dedupe ratio of 1.2 (well shy of the 2.0 generally recommended before dedupe is worthwhile), so it doesn't seem worth it either way; that ratio did strike me as low, though.

ZFS compression seems like an instant yes: improved disk I/O performance with only a hit to CPU load, which is acceptable in our scenario.

Is there anything I am missing? Any alternate solutions? Some built in dedupe would be amazing.
Old thread, but I figured I would chime in:

From the ZFS website:

RAM Rules of Thumb
If this is all too complicated for you, then let's try to find a few rules of thumb:

For every TB of pool data, you should expect 5 GB of dedup table data, assuming an average block size of 64K.

This means you should plan for at least 20GB of system RAM per TB of pool data, if you want to keep the dedup table in RAM, plus any extra memory for other metadata, plus an extra GB for the OS.


4K blocks would be too small to use, IMO. You should be good with 64-96GB of RAM on the storage side.
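
Plugging the original 3.2TB store into that rule of thumb lands right at the low end of that range:

    # Rule-of-thumb RAM to keep the whole DDT resident (20GB per TB + 1GB for the OS)
    pool_tb = 3.2
    ram_gb = pool_tb * 20 + 1
    print(f"plan for at least {ram_gb:.0f} GB of RAM")   # -> plan for at least 65 GB of RAM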