The forum got a haircut!

neptronix (Administrator, staff member; joined Jun 15, 2010; 17,528 messages; Utah, USA)
I spent this Sunday cleaning up our database and file system, because we have so much data that our server will barf if you just look at it wrong. 11 years of data can get pretty insane in size.

One thing to note is that ~9000 members were wiped from the ranks. These are users who signed up at some point in the last 10 years but never logged in or posted. Quite amazing how many accounts met those criteria. Prospective spammers perhaps?

If you guys see anything missing, let me know. I have a backup. :)


Another thing: we have half a gigabyte of text in the form of private messages. Totally nuts!
I'm going to delete PMs that are older than 5.5 years. So if you have any 'sentimental' PMs you're keeping around, make sure to copy and paste them or save the page as a file, however you want to do it.

I'm giving you all 30 days' notice before the super ancient PMs are evicted.
 
Half a gig is nothing. I haven't deleted my own old PM's for specific reasons. They're like old emails I keep.

Why the name change from ebike photos and video, to now "build logs"?
 
John in CR said:
Half a gig is nothing. I haven't deleted my own old PM's for specific reasons. They're like old emails I keep.

Why the name change from ebike photos and video, to now "build logs"?

Because it's BRILLIANT! "Build Logs" is what has ALWAYS BEEN MISSING!
Vote Nep for God!

There could be new sections for:
"Me go so fast on ebike" videos
"Help, my commercial ebike crapped out"
"Don't do this at home!" for battery fires and explosion threads
"E bike porn" for photos of cool new stuff
"Learn & Chuckle" for threads by LockH
"Get annoyed, flame and burn" for threads by (or invaded by) You-Know-Who
Too bad it's impossible to re-categorize everything.
 
John in CR said:
Half a gig is nothing. I haven't deleted my own old PM's for specific reasons. They're like old emails I keep.

It might not seem like much, but it's the difference between us being able to either get more performance and stability out of the system, or cut our server bills. The database is huge and i'm looking for cutting data anywhere i can because i don't want to touch the posts. :)

John in CR said:
Why the name change from ebike photos and video, to now "build logs"?

Because that's what most people are using that subforum for ( fits the description too ). The weird naming always bugged me, and i decided to be a dictator and change it. :pancake:

Actually on second thought, i changed it to E-Bike Build Threads / Photos & Video, so that all the people who are used to the old name don't get too confused over the matter.
 
How about a plugin that allows automatic export of all PMs as individual messages into a local folder, saved by PM name and author? Has anyone made something like that?

Cuz we're gonna need it. At least, I will.


I have 27 pages of PMs I've saved for reference over the years. That's just in the Inbox, and doesn't even include the subfolders with things I've categorized for easier finding (I'm gonna guess without looking that those contain at least that much more). And it doesn't include my replies to people, which will probably be that much more again.

I'd REALLY not like to have to dig them up locally every time I need them, especially since the export functions all suck (see details below).



The only functions that exist in the forum right now to export them are to take one page at a time, in one folder at a time, and Export as (CSV, CSV (Excel), or XML).

None of these works in a really usable way, because the exported file is all just one giant page of text. Anything quoted just has the word quote with some characters around it, on each end, and if there wasn't a CR/LF between it and the replied text, it'll just be concatenated together.

CSV doesn't even save who the message is from or to, or the date, time, etc--it's just the message text, more or less.

XML has that info, at least, but still has the other problems.

The only way I can see to save things in a usable way is to manually open every single PM, one at a time, and copy/paste into a freshly made local file, and then save that file with the date, subject, sender/receiver as part of the name (so I can find the info later).

None of the attachments are exported, either. I don't see any way to do that other than opening every single PM one at a time, seeing if it has an attachment, and then saving that locally, too.

If I have to do this manually, it's probably going to take me weeks, probably months, so you are going to have to give a MUCH MUCH longer grace period than just 30 days.

I may need at least till the start of the new year to get it done, since I cannot spend very much time each day on it, and I expect it'll take a few minutes per PM to do. With thousands of messages, that's a very very long time.

And spending that time saving the PMs means not helping anyone on the forum for those weeks or months--I won't have time to read anything or help with people's projects, etc.


So either we need to delay this "haircut" quite a long time, or a better, actually useful, export function needs to be installed or created first.
 
Here's a question for you:

Have you made a test copy of the forum, and then wiped the PMs from the test copy to see what performance difference there is?

Cuz if there's no difference, there's no reason to put everyone thru all this.
 
Dang, you got me looking back at those PM's. Occasionally a good one, such as Mr. Magnets giving me a link to some flashlights that I save separately and don't need the original. A few thanking me for things like helping get a troll off their backs, but I made the world a better place and that's over and done with. Oh, I'd say half are the trolls themselves trolling because doing it on the board isn't enough for them, though every now and then one apologizes for it. I don't think I need to save my old PM's.

Maybe you just need a checkbox so that some of us can opt in to the haircut, while Amberwolf can leave his unchecked and keep his PMs saved. That would probably solve most of the data problem.
 
Trim away. Anyone that has pages of saved PM's ought to manage them. NOT the forum's job.
 
Yeah, i could come up with a way to export old PMs before making this change.

You are right, the export to CSV per page option will create an ugly mess of files.. there should be something more like export to html..

Looking at the database structure, it wouldn't be too hard to export the PMs. Of course, half an hour of searching turns up no prebuilt tools to do this, so it's up to me to write the code.
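
As a rough idea of what such an export could look like, here is a minimal Python sketch that writes one HTML file per PM and names each file by date, subject and sender, as requested earlier in the thread. It is not the code that was actually written for the forum. The row layout (`msg_id`, `message_time`, `message_subject`, `message_text`, `author`, `recipient`) is hypothetical and only assumes the rows have already been read out of the database; BBCode is left unconverted and attachments are not handled.

```python
import html
import re
from datetime import datetime, timezone
from pathlib import Path


def _safe(text, limit=60):
    """Reduce arbitrary text to characters that are safe in a filename."""
    cleaned = re.sub(r"[^A-Za-z0-9._-]+", "-", text).strip("-")
    return cleaned[:limit] or "untitled"


def pm_filename(row):
    """Name the file by date, message id, subject and sender so it is easy to find."""
    # message_time is assumed to be a Unix timestamp (hypothetical field name).
    sent = datetime.fromtimestamp(row["message_time"], tz=timezone.utc)
    stamp = sent.strftime("%Y-%m-%d_%H%M%S")
    subject = _safe(row["message_subject"])
    author = _safe(row["author"])
    return f"{stamp}_{row['msg_id']}_{subject}_from-{author}.html"


def render_pm(row):
    """Render one PM as a small standalone HTML page, escaping all user text."""
    sent = datetime.fromtimestamp(row["message_time"], tz=timezone.utc)
    subject = html.escape(row["message_subject"])
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        f"<title>{subject}</title></head><body>\n"
        f"<h1>{subject}</h1>\n"
        f"<p>From: {html.escape(row['author'])}<br>\n"
        f"To: {html.escape(row['recipient'])}<br>\n"
        f"Sent: {sent.isoformat()}</p>\n"
        # <pre> keeps the original line breaks of the message body.
        f"<pre>{html.escape(row['message_text'])}</pre>\n"
        "</body></html>\n"
    )


def export_pms(rows, out_dir):
    """Write every PM in rows to its own file under out_dir and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for row in rows:
        path = out / pm_filename(row)
        path.write_text(render_pm(row), encoding="utf-8")
        written.append(path)
    return written
```

Including the message id in the filename keeps two PMs with the same subject, sender and timestamp from overwriting each other.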
 
I've noticed that you are seldom appreciated for the time you spend working on this forum. So here it is; thanks neptronix.
 
Hwy89 said:
I've noticed that you are seldom appreciated for the time you spend working on this forum. So here it is; thanks neptronix.

Thank you :)
 
Yes, thank you. I know I couldn't do the programming stuff (I could probably hack stuff until it did something different, and if I did it long enough I might find solutions to certain things, but it's not programming and not like what you probably do :oops: ).

Regarding the exports, I'm in the process of exporting all the pages as each of the three types it does have, just in case no other solutions end up being possible, but I really hope that some other solution becomes available.

The best solution is to leave the PMs as they are. ;)


How exactly does the PM section of the system/database really work, access-wise? Unless the system is doing something i'd consider unusual, I don't see a reason why the PMs would make a significant difference to performance of the public part of the forum. I would expect they would be a separate section of the database, if not a separate database completely, and not be accessed unless they're being read or written by the member they're for or by.


Is it possible to do the test I suggested before, using a backup of the forum running on a test server, and comparing the performance before and after a "haircut"?
 
I don't know what has gone wrong, but the PM section is broken now.

The sent messages folder is completely empty, and the inbox has about twice as many pages of messages as it did before.

So I presume (hope!) that all of my sent messages are now in my inbox. That makes my existing exports useless, because I don't know where I left off: the stuff isn't displayed on the pages the same way, nor are the page numbers the same.

First I will have to move my sent messages individually back to the sent messages folder. That could take a few weeks or more by itself, before I can restart the process of saving pages of messages.

I haven't gone thru my other subfolders to see if stuff is still there or not.

Before I do any of this, could you tell me if it is something you can restore at your end without data loss?

Or if it is going to require my manual fix?


Side note: Harold in CR posted this thread about a different problem in the PM section, too:
https://endless-sphere.com/forums/viewtopic.php?f=1&t=95417
 
Working on it right now. phpBB uses negative values to store some data and things got garbled. Don't bother manually fixing it, i should have it taken care of in 30 mins or less.
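
The post does not say which values got garbled, so the following is only an illustration of how this kind of breakage can happen, not a diagnosis. To my understanding, phpBB identifies the built-in PM folders with small integers, some of them negative (inbox 0, sentbox -1, outbox -2). Assuming, purely for illustration, that those ids pass through a column or conversion that cannot hold negative numbers and clips them to 0, every sent message would turn into an inbox message, which would match an empty sent folder and an inbox roughly twice as long.

```python
from collections import Counter

# Folder ids as I understand phpBB's constants; treat these values as an assumption.
PRIVMSGS_INBOX = 0
PRIVMSGS_SENTBOX = -1
PRIVMSGS_OUTBOX = -2


def clamp_unsigned(value, max_value=255):
    """Mimic storing an integer in an unsigned column that clips out-of-range values."""
    return max(0, min(value, max_value))


def folder_counts(folder_ids):
    """Count how many messages sit in each folder id."""
    return Counter(folder_ids)


# Hypothetical mailbox: two inbox messages, three sent, one in custom folder 3.
before = [PRIVMSGS_INBOX, PRIVMSGS_INBOX,
          PRIVMSGS_SENTBOX, PRIVMSGS_SENTBOX, PRIVMSGS_SENTBOX, 3]
after = [clamp_unsigned(folder_id) for folder_id in before]

print(folder_counts(before))
print(folder_counts(after))
```

In this toy mailbox the three sent messages end up counted under the inbox id after clamping, while the custom folder is untouched.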
 
Okay, this is fixed. Sorry for the bugs... i've been doing database surgery today.

I'm going to quit for the day and resume the last of the surgery very very early tomorrow morning.. like 6am pacific standard time.
There's not much left to do. :)
 
Thanks--it looks normal, but I'm still going to re-export everything from the beginning in case anything is not where it was before, so nothing gets missed, in case you can't make a better export tool before the haircut.




BTW, the forum also is having a "hiccup" problem where it is creating multiple copies of the same posts.

In some cases this even results in several listings of a thread in a row in my subscribed threads page in the UCP, though they are all pointing to the same thread (I think they each are pointing to the individual dupe posts).

It's possible that some of the posts are from people clicking the submit button multiple times when it doesn't quickly go to the next page, as it has been rather slow at certain times, but I had one of my own posts duplicated, and I definitely didn't click the button more than once. I just went to another thread in a different tab, and when I eventually came back to the first one after someone's reply, it had posted normally...except there were two copies of my post.

Some threads, like this one
https://endless-sphere.com/forums/viewtopic.php?p=1397437#p1397437
have several copies of the same post. If I counted right, this one has nine of the same post total (original and 8 copies).
 
i know all about it. side effect of the surgery i was doing. There will be more tomorrow morning. I will go and clean up things later.
 
I think an old backup of the PM stuff must have been restored.

Some of my latest PMs are missing, at least in the Sent messages folder, and the inbox.

There was also a message in my spam folder that I'd deleted that was back again.

I had sent a response to Thoroughbread on Saturday, and it's not there anymore, and neither is his response including contact info, or mine after that including tracking info. None of the messages Dogman and I had been exchanging this weekend are there either, including contact info. There may be other messages missing from the inbox that I hadn't gotten to reading or replying to; I don't know.

I verified in a different browser that they're definitely not there.

Can you please restore these messages?
 
If you r trying to save half a gig by deleting PMs the forum needs a new sponsor. Sorry to be blunt
 
flathill said:
If you r trying to save half a gig by deleting PMs the forum needs a new sponsor. Sorry to be blunt

The problem is not disk space.
 
Some good news. I managed to find the memory/cpu savings i needed by performing a few dozen rounds of optimizations to the post data over the last 24 hours. I ended up getting a better improvement than i was looking for, but it was a hell of a lot more work.

The PM data is safe and i will not be removing old PMs.

</thread>
 
fechter said:
I've been cleaning up some of the duplicate posts. There are a lot of them.

Thank you so much for that. I've been focusing on trying to get the optimizations done.. but each one ended up locking up the posts table and therefore whenever you'd make a new post/reply, the site would hang for 15-30 minutes.. and go figure, people would try to submit the same post again..

Better to have too much data than not enough in this case.. :lol:
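
For the cleanup, resubmitted posts like the ones described above can be found mechanically. The sketch below is one possible approach, not the method the moderators actually used: it treats a post as a duplicate when the same author posted identical text in the same thread within an hour of the previous copy. The field names (`post_id`, `topic_id`, `poster_id`, `post_time`, `text`) and the one-hour window are assumptions chosen for illustration.

```python
from collections import defaultdict


def find_duplicate_posts(posts, window_seconds=3600):
    """Return ids of posts that repeat an earlier identical post by the same author.

    The earliest copy in each run is kept; later copies count as duplicates
    when they follow the previous copy within window_seconds.
    """
    groups = defaultdict(list)
    for post in posts:
        key = (post["topic_id"], post["poster_id"], post["text"].strip())
        groups[key].append(post)

    duplicates = []
    for same in groups.values():
        same.sort(key=lambda p: (p["post_time"], p["post_id"]))
        previous = same[0]
        for post in same[1:]:
            if post["post_time"] - previous["post_time"] <= window_seconds:
                duplicates.append(post["post_id"])
            previous = post
    return sorted(duplicates)
```

Comparing each copy with the one before it, rather than with the first, lets a long chain of resubmits during a slow period be caught even if the whole chain spans more than an hour. An identical post made much later is left alone, since it may be intentional.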
 
Just tried to update the CA download post by replacing a png and a pdf and attached file references got completely scrambled, showing attachments interchanged and missing.

I re-downloaded all old confused attachments, deleted the confused versions, re-inserted stuff into the page and got it so it displays okay in Preview. Now I can't save it and get this:





This looks like the number of post edits is too large. There should be 400+ edits on this post.
I have opened this post for edit in other windows and just tried to save it after doing nothing to it and always get the same error. I am locked out of changing the post. Same problem across different browsers so it's not a browser issue.

  • Have you re-defined the vartype of this 'edit count' column in the table or the code?
  • Can you restore the post to its original form of a day ago and get this so I can edit it again?
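
The error message itself is not reproduced here, so whether the column type is really the cause is unconfirmed. As a quick check of the suspicion in the first question, this small sketch works out which of the standard MySQL integer types could hold an edit count of 400 or more. The ranges follow from the documented storage sizes; how the forum's actual column is defined is not known.

```python
# Storage size in bits of MySQL's integer column types.
MYSQL_INT_BITS = {"TINYINT": 8, "SMALLINT": 16, "MEDIUMINT": 24, "INT": 32, "BIGINT": 64}


def int_range(type_name, unsigned=False):
    """Return the (lowest, highest) value an integer column type can store."""
    bits = MYSQL_INT_BITS[type_name.upper()]
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def fits(value, type_name, unsigned=False):
    """True if value can be stored in the given column type without overflow."""
    low, high = int_range(type_name, unsigned)
    return low <= value <= high


for name in MYSQL_INT_BITS:
    print(name, "signed:", fits(400, name), "unsigned:", fits(400, name, unsigned=True))
```

Only a TINYINT column, signed or unsigned, is too small for 400, so a change to that type would be consistent with a post that can no longer be saved once its edit count is incremented.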

This post is for distribution of the CA3 Guide and I need to get it fixed. That particular post is hyperlinked from PDF documents and many posts... ( https://endless-sphere.com/forums/viewtopic.php?p=571345#p571345 ) A scrambled version displays now.
 