synthetikal.com Forum Index


The Hive Filez Offline Searchable
user

Joined: 04 Apr 2005
Posts: 9
303.36 Points

Wed Apr 06, 2005 11:57 pm

Yes, thanks to nubee for taking the time.

Thinking about the hive database has occupied a few idle minutes for me here and there. When the hive was up I would, in my mind's eye, mock up pseudocode to perform a non-intrusive, below-the-radar spidering of the archives.

Now many of the threads are available as static pages of HTML, and I'm back to idle fantasy, this time thinking about reverse engineering the static pages back into database records. A doable, if somewhat tedious, job.
Or not; programming has its own rewards, and the finished product would be quite worthwhile.
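
A minimal sketch of what that reverse engineering might look like in Python, using only the standard library. The actual markup of the archived hive pages is not shown in this thread, so the layout here is purely an assumption: each post field is taken to sit in an element whose class is "author", "date", or "body", in that order. The class names and the `PostExtractor` name are hypothetical.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    """Collect one dict per post from a static thread page.

    Assumed (hypothetical) layout: elements with class "author", "date"
    and "body", appearing in that order for every post.
    """

    FIELDS = ("author", "date", "body")
    VOID = {"br", "img", "hr"}  # tags that never get a closing tag

    def __init__(self):
        super().__init__()
        self.records = []    # finished records
        self._current = {}   # record being assembled
        self._field = None   # field we are currently inside, if any
        self._depth = 0      # nesting depth inside that field

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID:
            return
        if self._field is not None:
            self._depth += 1  # nested markup inside a field, e.g. <b>
            return
        cls = dict(attrs).get("class")
        if cls in self.FIELDS:
            self._field = cls
            self._depth = 1
            self._current[cls] = ""

    def handle_endtag(self, tag):
        if tag in self.VOID or self._field is None:
            return
        self._depth -= 1
        if self._depth == 0:
            closed, self._field = self._field, None
            self._current[closed] = self._current[closed].strip()
            if closed == "body":  # body is assumed to end the post
                self.records.append(self._current)
                self._current = {}

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data


sample_page = (
    '<div class="post"><span class="author">user</span>'
    '<span class="date">Wed Apr 06, 2005 11:57 pm</span>'
    '<div class="body">Yes, <b>thanks</b> to nubee.</div></div>'
)
extractor = PostExtractor()
extractor.feed(sample_page)
print(extractor.records)
```

Real pages would need the class names swapped for whatever the archive actually uses, and images or unusual markup would likely need extra handling.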

Is this item available as a database dump? That'd be the ticket.

So anyhow, I'm speculating. For the sheer entertainment value I guess, because in time it will probably all be back up, or available on every fileshare from here to godknowswhere.

~user

Never wear a shirt that matches your pants. Wear a wrinkled shirt whenever possible. Your shirt should never be tucked in completely. Button the top button without wearing a tie. This will maximize your "nerd" mystique.
Back to top
nubee
Master Archiver
Joined: 18 Feb 2005
Posts: 215
Location: homeless
18648.26 Points

Mon Apr 11, 2005 3:48 pm

...

Last edited by nubee on Sat Apr 16, 2005 6:53 am; edited 1 time in total
Back to top
orqan

Joined: 10 Apr 2005
Posts: 1
27.92 Points

Fri Apr 15, 2005 1:41 am

I get the error message that says the site has reached its bandwidth limit. Are there any other ways of downloading the archive, like P2P?

thanks
Back to top
nubee
Master Archiver
Joined: 18 Feb 2005
Posts: 215
Location: homeless
18648.26 Points

Fri Apr 15, 2005 6:40 am

I just uploaded them to RapidShare, to tide things over until next month and as another download source; see the first post in this thread for the URLs. :D
Back to top
CherrieBaby
chouchou
Joined: 01 Mar 2005
Posts: 67
3070.02 Points

Fri Apr 15, 2005 8:33 am

User, I didn't spider the hive, but I spidered one of the recent hive file archive sites, so I have many of the HTML files, though my archive is far from complete.

I nearly wrote that program to put the hive posts back into a database. I still have the WSH code. I never finished it, but it just needs code added to do the actual database import, which I could do if you needed it. The hard bit, parsing the post info, is done. Naturally the system can't cope with images very well, and there may be some other issues for a minority of posts.

PS: I'm willing to complete and document my WSH script to re-database-ise them if anyone seriously needs it. PM me if you really need it. We need to standardise on the actual database required. I'll probably do it first for SQL Server, which is what I'm most familiar with.
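
For illustration only, here is a rough sketch of the missing import step. It is not the WSH script mentioned above: it is written in Python and uses SQLite as a stand-in for SQL Server, and the `posts` table layout and the record keys ("author", "date", "body") are assumptions, since the thread has not settled on a schema.

```python
import sqlite3

# Hypothetical schema; the real one would need to be agreed on first.
SCHEMA = """
CREATE TABLE IF NOT EXISTS posts (
    id     INTEGER PRIMARY KEY,
    thread TEXT NOT NULL,
    author TEXT NOT NULL,
    posted TEXT NOT NULL,
    body   TEXT NOT NULL
)
"""


def import_posts(conn, thread, records):
    """Insert already-parsed post records for one thread.

    Each record is a dict with "author", "date" and "body" keys.
    Returns the number of rows inserted.
    """
    conn.execute(SCHEMA)
    rows = [(thread, r["author"], r["date"], r["body"]) for r in records]
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT INTO posts (thread, author, posted, body) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, just for the demo
inserted = import_posts(
    conn,
    "The Hive Filez Offline Searchable",
    [{"author": "user",
      "date": "Wed Apr 06, 2005 11:57 pm",
      "body": "Yes, thanks to nubee for taking the time."}],
)
print(inserted, conn.execute("SELECT author, posted FROM posts").fetchall())
```

Parameterised inserts (the `?` placeholders) matter here because post bodies will contain quotes and other characters that would break hand-built SQL strings.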
Back to top
nubee
Master Archiver
Joined: 18 Feb 2005
Posts: 215
Location: homeless
18648.26 Points

Fri Apr 15, 2005 9:21 am

user, the archives mentioned in the first post of this thread contain all the HTML files available from the online archives that you talk about getting some of. It would really save the servers here if you downloaded that instead of fetching from the original site, as compressing gives a large reduction in size.
Back to top
user

Joined: 04 Apr 2005
Posts: 9
303.36 Points

Fri Apr 15, 2005 10:52 am

nubee,

yes, I got that ok from the 'mobile info' site. I must have been one of the first customers.

And yes, that is interesting cherriebaby.

Appreciated. I feel a bit like a vulture, being so enthusiastic about picking over these scraps of the downed site, but .. er, there you go,

cheers,

~vulture
Back to top
user

Joined: 04 Apr 2005
Posts: 9
303.36 Points

Fri Apr 15, 2005 11:04 am

>PM me

hey cherrie,

I've done just that. Sorry for the redundant lines of communication, but having not got any back-and-forth PM happening yet, I don't quite trust it,

~user

"with all-new minty flavour"
Back to top
Guest

0.00 Points

Mon Apr 18, 2005 11:09 am

Regarding a program that can turn the HTML pages of the hive back into its forum state:
If anybody does pursue this, or wants to arrange a way of actually doing this with the help of other board members, then this is something that we would definitely be interested in, encourage, and make room for.

I understand that it is definitely possible, and it is something which we would generously host and include in our resources.

It's funny, we are also trying to extract this forum (synthetikal) into indexed HTML topics, something we are having a little trouble with. To date we have found that spiders will probably be the answer, but the work is again tedious.

Nubee's archive site is a great resource for our Site, and we welcome more collaborative work to achieve a greater good

Syn
Back to top
user

Joined: 04 Apr 2005
Posts: 9
303.36 Points

Tue Apr 19, 2005 3:00 am

Hi

syn:

Quote:
It's funny, we are also trying to extract this forum (synthetikal) into indexed HTML topics, something we are having a little trouble with. To date we have found that spiders will probably be the answer, but the work is again tedious.


How do you mean that? Something like, put the contents of a thread into one static html page?

~user
Back to top
user

Joined: 04 Apr 2005
Posts: 9
303.36 Points

Wed Apr 20, 2005 7:01 am

ubb with some modifications? or what?
Back to top
Guest

0.00 Points

Wed Apr 20, 2005 8:44 am

Thanks for your reply User,

Yes, we need to turn this forum, i.e. all the topics, categories, and their posts, into static HTML web pages,

similar to our Hiveboard Online Revided Files.

We need software that can tear out the posts and index them in order on HTML pages.
Spiders don't seem to do what we need.

It needs to be nothing flash, just plain text, with each category having every post in it to search.
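
As a rough illustration of the "nothing flash" output side, the sketch below writes already-extracted posts, in order, into one plain static HTML page. How the posts get extracted is left out. The `render_topic` name and the record keys ("author", "date", "body") are assumptions for the example, not anything the board software provides.

```python
from html import escape


def render_topic(title, posts):
    """Return one plain static HTML page listing posts in the order given.

    Each post is a dict with "author", "date" and "body" keys. All text is
    escaped so that post content cannot break the page markup.
    """
    parts = [
        f"<html><head><title>{escape(title)}</title></head><body>",
        f"<h1>{escape(title)}</h1>",
    ]
    for number, post in enumerate(posts, start=1):
        heading = f"{number}. {escape(post['author'])}, {escape(post['date'])}"
        parts.append(f'<h2 id="post{number}">{heading}</h2>')
        parts.append(f"<pre>{escape(post['body'])}</pre>")
    parts.append("</body></html>")
    return "\n".join(parts)


page = render_topic(
    "The Hive Filez Offline Searchable",
    [
        {"author": "syn", "date": "Wed Apr 20, 2005 8:44 am",
         "body": "Spiders don't seem to do what we need."},
        {"author": "user", "date": "Wed Apr 20, 2005 12:34 pm",
         "body": "Look for a <limit> clause."},
    ],
)
print(page)
```

Because the output is plain text inside simple tags, any desktop or site search tool can index it without special handling.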

syn
Back to top
user

Joined: 04 Apr 2005
Posts: 9
303.36 Points

Wed Apr 20, 2005 12:34 pm

Hey nooow,

Quote:
Yes, we need to turn this forum, i.e. all the topics, categories, and their posts, into static HTML web pages,


I guess one of the hurdles is the way that the pages seem to cap the posts at 15. Not being familiar with the specifics of this bb, I'd say that you'd want to look for a configurable option, or a 'switch' in the URL query string, or what would very likely be a minor edit in the SQL query fetching the posts, typically a 'limit' clause. Do one of those, temporarily if you want, and do a scripted 'fetch', 'wget' or the like to cycle through the URLs by topic number. If you used a scripting language to do the fetch you could do some mining of info while the pages are being fetched. Grab the first 'Post subject' field and massage it into a filename, so the item gets a friendlier name, e.g. The_Hive_Filez_Offline_Searchable.html, or process URLs, or ..

That would get you close to the state of the archived hive files.
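
A small sketch of the two mechanical pieces described above: cycling through URLs by topic number, 15 posts per page, and massaging a post subject into a friendlier filename. The query parameter names `t` and `start`, the base URL, and all function names are assumptions for illustration; check them against the board's real URLs. The actual download step is left out so the example runs without a network.

```python
import re
from urllib.parse import urlencode


def topic_url(base, topic_id, start=0):
    """Build the URL for one page of a topic (parameter names are assumed)."""
    query = {"t": topic_id}
    if start:
        query["start"] = start
    return f"{base}?{urlencode(query)}"


def page_urls(base, topic_id, post_count, per_page=15):
    """List the page URLs needed to cover every post in one topic."""
    starts = range(0, max(post_count, 1), per_page)
    return [topic_url(base, topic_id, start) for start in starts]


def subject_to_filename(subject, ext=".html"):
    """Turn a post subject into a filesystem-friendly filename."""
    words = re.findall(r"[A-Za-z0-9]+", subject)
    return ("_".join(words) or "untitled") + ext


BASE = "http://example.invalid/viewtopic.php"  # placeholder, not a real board
for url in page_urls(BASE, topic_id=42, post_count=31):
    print(url)
print(subject_to_filename("The Hive Filez Offline Searchable"))
```

Each URL could then be handed to `wget` or to `urllib.request.urlopen`, ideally with a pause between requests so the fetch stays easy on the server.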

Disclaimer: I rarely get anything finished for long before discovering that there was a _much_ easier way to do it. There's probably an admin option to dump the whole thing to html.

ciao
Back to top
Davidus

Joined: 18 Feb 2005
Posts: 7
Location: Australia
267.90 Points

Fri Apr 22, 2005 12:59 pm

I've checked out the new version of the mfi site, but haven't saved it all yet; everything seems to work. Do you know if the Hive posts java is posting are different to what is in the Hive archive? What a huge amount of work nubee has done on this! (as well as Syn, java & many others) Anyway, a big thumbs-up to nubee for his mammoth effort on what is a true information "vault."
Thanks man ;) It's people like you that make me want to keep on living! :D
Back to top
nubee
Master Archiver
Joined: 18 Feb 2005
Posts: 215
Location: homeless
18648.26 Points

Sat Apr 23, 2005 6:05 am

In the new posts section there are a lot more threads. I'm not sure whether they are what's been posted online; I think some of them are but others are not. I did get a lot of extras, which are contained in a separate index but are indexed in the search engine.

Best thing is to have a look. If you're feeling up to it, keep a backup copy of the archive, then open afsearch.exe in the afsearch folder in the archive, and you can add and remove files in the search engine index yourself. But be warned, it has a tendency to get mucked up, so keep a backup.
Back to top
All times are GMT + 5.5 Hours

 



Powered by phpBB 2.0.11 © 2001, 2002 phpBB Group

Igloo Theme Version 1.0 :: Created By: Andrew Charron