RE: Storing duplicate attachments [UPDATED AGAIN]
RE: Storing duplicate attachments [UPDATED AGAIN] - 1.May2005 12:32:00 PM
|
|
|
chrishaas
Posts: 54
Joined: 20.May2004
Status: offline
|
Thanks for keeping up with me on this. I hope more people use this so that we can get duplicate attachments removed from future builds.
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 10.May2005 11:19:00 PM
|
|
|
Guest
|
It strikes me that this should be a built-in function of this product, since sending identical attachments to multiple users is such a frequent occurrence in most organizations. For that matter, true SIS (single-instance storage) functionality should really be implemented, since that is essentially the result of a well-designed (normalized) database.
GFI, what do you say?
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 12.May2005 8:57:00 AM
|
|
|
chrishaas
Posts: 54
Joined: 20.May2004
Status: offline
|
Technically this does handle it as SIS. If you send one message to 20 people, Exchange stores only one copy and so does GFI (one copy goes to the Archive mailbox, which GFI scans). The problem is that if that mail is then forwarded on to someone else without the attachment changing, GFI will store another copy because, as far as it is concerned, it is a new message. I'm pretty sure Exchange does the same thing, though.
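To illustrate the forwarded-copy case described above, here is a minimal sketch of finding duplicate attachments by content hash. It is not the tool's actual code: the function names, the in-memory sample data, and the `(attachment_id, bytes)` layout are all assumptions made for illustration. Only the use of SHA-512 comes from this thread.

```python
import hashlib
from collections import defaultdict


def sha512_hex(data: bytes) -> str:
    """Return the SHA-512 digest of an attachment body as a hex string."""
    return hashlib.sha512(data).hexdigest()


def group_duplicates(attachments):
    """Group attachment ids by the hash of their content.

    `attachments` is an iterable of (attachment_id, bytes) pairs; this
    layout is hypothetical and not taken from the real database schema.
    """
    groups = defaultdict(list)
    for attachment_id, data in attachments:
        groups[sha512_hex(data)].append(attachment_id)
    return dict(groups)


# Hypothetical sample: ids 1 and 2 are the same file, stored twice because
# the message was forwarded; id 3 is a different file.
sample = [
    (1, b"quarterly-report"),
    (2, b"quarterly-report"),
    (3, b"holiday-photo"),
]

groups = group_duplicates(sample)

# Every group keeps one copy; the rest are redundant.
redundant = sum(len(ids) - 1 for ids in groups.values())
print("redundant copies:", redundant)
```

Two attachments land in the same group only if their bytes are identical, so an attachment that really did change when forwarded would still be stored separately.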
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 13.Jun.2005 4:01:00 AM
|
|
|
PACEGMBH
Posts: 62
Joined: 23.Mar.2005
From: DE - Berlin
Status: offline
|
After a couple of weeks I wanted to take a look at how much space is wasted in my 3 GB archive because of duplicate attachments. Unfortunately the Shapoc tool doesn't work anymore.
It exits with the following error:
Error retrieving stats. Application exiting.
Chris, do you have any idea?
Rgds, John
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 19.Jun.2005 5:52:00 PM
|
|
|
chrishaas
Posts: 54
Joined: 20.May2004
Status: offline
|
I get this sometimes, too, when the SQL Server is busy. I probably need to increase the connection timeout period, but I just haven't had time yet. Usually after trying 2 or 3 times it goes through. If you're really curious and have time, you can pause the GFI archiver and anything else that might be pounding the DB, run the program, and then start things up again. When I update the program with a timeout fix I'll report here. BTW, here's where I'm at now:
code:
Current number of files   : 48,792
Optimized number of files : 39,730
Difference                : 9,062 (18.573%)
Current size of files     : 5,539,386,938
Optimized size of files   : 3,864,426,437
Difference                : 1,674,960,501 (30.237%)
I also go through the DB every couple of months and try to remove as much spam as possible, as well as attachments that shouldn't be there in the first place. Luckily we don't have to worry about the Sarbanes-Oxley (SOX) Act here.
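The "try 2 or 3 times" workaround mentioned above can be automated. The following is only a sketch of a generic retry wrapper, not how the tool itself works: `run_with_retries`, `flaky_stats_query`, and the use of `TimeoutError` to stand in for a busy server are all assumptions for illustration.

```python
import time


def run_with_retries(operation, attempts=3, delay_seconds=0.0,
                     retry_on=(TimeoutError,)):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Only exceptions listed in `retry_on` trigger another attempt; the
    last such exception is re-raised if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on as error:
            last_error = error
            if attempt < attempts:
                time.sleep(delay_seconds)  # give the busy server a moment
    raise last_error


# Hypothetical stand-in for the stats query: fails twice, then succeeds.
calls = {"count": 0}


def flaky_stats_query():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("server busy")
    return "stats"


result = run_with_retries(flaky_stats_query, attempts=3)
print(result, "after", calls["count"], "attempts")
```

Retrying only helps when the failure is transient, such as a busy server. A query that always needs longer than the timeout allows would still fail on every attempt.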
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 21.Jun.2005 8:08:00 PM
|
|
|
PACEGMBH
Posts: 62
Joined: 23.Mar.2005
From: DE - Berlin
Status: offline
|
Hi Chris,
I tested your tool again after a reboot, with no one connected. I had no luck. I also copied the database to another server and got the same behavior there.
Do you have any other ideas?
Regards, John
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 24.Jun.2005 3:25:00 AM
|
|
|
SWZzorg
Posts: 1
Joined: 23.Jun.2005
From: Son
Status: offline
|
Hello,
I tried running this tool against a backup database, but I still get the error that it can't connect.
I run it locally on the Mailarchive server. serverip: 10.1.1.7 databasename: TestMailarchive
I think I am answering the tool's first question incorrectly. What should my input be?
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 1.Jul.2005 10:12:00 AM
|
|
|
mjohnson
Posts: 17
Joined: 23.May2005
Status: offline
|
How about a version that can handle a DB that has 11 gigs of data?
sp_spaceused
database_name  database_size  unallocated space
-------------  -------------  -----------------
mailarc        20264.25 MB    10028.23 MB

reserved     data         index_size  unused
-----------  -----------  ----------  -------
10479896 KB  10186352 KB  288888 KB   4656 KB
Please enter the database name/IP:db-ov
Should I use your current NT credentials? [Y/n/q]:y
Successfully connected to SQL Server
Please enter the database name that contains your GFI Mail Archiver database. Enter @list to list databases, @quit to quit:mailarc
Database appears to be a valid GFI Mail Archiver database
The next step will alter the physical schema of the database. Do you wish for me to continue? [Y/n/q]:y
Column already exists, continuing.
Should I begin creating SHA-512 hashed? This process might take a while. [Y/n/q] y
Error retrieving stats. Application exiting
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 4.Jul.2005 12:46:00 PM
|
|
|
chrishaas
Posts: 54
Joined: 20.May2004
Status: offline
|
Okay, I've updated it again. It's not an 11GB barrier exactly; the program was timing out when connecting to larger databases. The default timeout is 30 seconds, but I've raised it to 120 seconds, which should be enough. If anyone needs more I can kick it up further. The link above contains the most recent version.
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 5.Jul.2005 4:33:00 AM
|
|
|
mjohnson
Posts: 17
Joined: 23.May2005
Status: offline
|
OK, now it dies:
Column already exists, continuing.
Should I begin creating SHA-512 hashes? This process might take a while. [Y/n/q] y
There are 67,199 total attachments (65,999 of which need hashing) totaling 8,732,863,805 bytes.
Unhandled Exception: System.InvalidCastException: Specified cast is not valid.
   at GFIMailArchiverSHAPOC.Module1.CreateHashes()
   at GFIMailArchiverSHAPOC.Module1.Main()
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 18.Jul.2005 3:51:00 PM
|
|
|
cameron
Posts: 6
Joined: 20.Dec.2004
Status: offline
|
Successfully updated hashes.
Current number of files   : 96,067
Optimized number of files : 76,698
Difference                : 19,369 (20.162%)
Current size of files     : 9,174,574,947
Optimized size of files   : 6,537,154,574
Difference                : 2,637,420,373 (28.747%)
Almost a third of the DB is wasted... GFI, please try to put this in V3!
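For anyone who wants to check figures like these by hand, the arithmetic behind the "Difference" lines can be sketched as follows. The function name `savings` is hypothetical, and rounding to three decimals is an assumption based on how the percentages are printed. The input numbers are the ones from the report above.

```python
def savings(current: int, optimized: int):
    """Return (absolute difference, percentage saved relative to current)."""
    difference = current - optimized
    percent = round(100 * difference / current, 3)
    return difference, percent


# Figures taken from the report in this post.
file_diff, file_pct = savings(96_067, 76_698)
size_diff, size_pct = savings(9_174_574_947, 6_537_154_574)

print(f"files: {file_diff:,} ({file_pct}%)")
print(f"bytes: {size_diff:,} ({size_pct}%)")
```

The byte percentage is higher than the file percentage, which suggests the duplicated attachments are larger than average. By size the saving is closer to 29% than to a full third.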
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 4.Aug.2005 2:00:00 AM
|
|
|
PACEGMBH
Posts: 62
Joined: 23.Mar.2005
From: DE - Berlin
Status: offline
|
Hi Chris, unfortunately your new version dies on my server, too. The error message is the same one that mjohnson gets.
Do you have any suggestions?
Rgds,
John
|
|
|
|
RE: Storing duplicate attachments [UPDATED AGAIN] - 8.Aug.2005 3:44:00 AM
|
|
|
robgibbs
Posts: 33
Joined: 15.Jul.2004
From: Richmond
Status: offline
|
This sounds wonderful. One question: after the duplicates are deleted, are the messages correctly linked to the ONE remaining copy so that the attachments can still be retrieved through the MailArchive web interface?
I would assume so, but needed to ask.
Tks
|
|
|
|