|
|
| Author |
Message |
user
|
| Joined: 04 Apr 2005 |
| Posts: 9 |
|
303.36 Points
|
|
|
appetite for archives
Wed Apr 06, 2005 11:57 pm |
|
|
Yes, thanks to nubee for taking the time.
Thinking about the hive database has occupied a few idle minutes for me here and there. When the hive was up I would, in my mind's eye, mock up pseudo code to perform a non-intrusive, below the radar, spidering of the archives.
Now many of the threads are available as threads on static pages of html and I'm back to idle fantasy; this time thinking about reverse engineering the static pages back into database records. A doable, if somewhat tedious job.
Or not; programming has its own rewards, and the finished product would be quite worthwhile.
Is this item available as a database dump? That'd be the ticket.
So anyhow, I'm speculating. For the sheer entertainment value I guess, because in time it will probably all be back up, or available on every fileshare from here to godknowswhere.
~user
Never wear a shirt that matches your pants. Wear a wrinkled shirt whenever possible. Your shirt should never be tucked in completely. Button the top button without wearing a tie. This will maximize your "nerd" mystique. |
 |
nubee
Master Archiver
|
| Joined: 18 Feb 2005 |
| Posts: 215 |
| Location: homeless |
18648.26 Points
|
|
|
Mon Apr 11, 2005 3:48 pm |
|
|
...
Last edited by nubee on Sat Apr 16, 2005 6:53 am; edited 1 time in total |
 |
orqan
|
| Joined: 10 Apr 2005 |
| Posts: 1 |
|
27.92 Points
|
|
|
Fri Apr 15, 2005 1:41 am |
|
|
I get an error message that says the site has reached its bandwidth limit. Are there any other ways of downloading the archive, like p2p?
thanks |
 |
nubee
Master Archiver
|
| Joined: 18 Feb 2005 |
| Posts: 215 |
| Location: homeless |
18648.26 Points
|
|
|
Fri Apr 15, 2005 6:40 am |
|
|
I just uploaded them to rapidshare, to get by until next month and as another download source; see the first post in this thread for the URLs. |
 |
CherrieBaby
chouchou
|
| Joined: 01 Mar 2005 |
| Posts: 67 |
|
3070.02 Points
|
|
|
Fri Apr 15, 2005 8:33 am |
|
|
User - I didn't spider the hive but I spidered one of the recent hive file archive sites so I have many of the html files but my archive is far from complete.
I nearly wrote that program to put the hive posts back into a database. I still have the WSH code. I never finished it, but it just needs code added to do the actual database import, which I could do if you needed it. The hard bit, parsing the post info, is done. Naturally the system can't cope with images very well and there may be some other issues for a minority of posts.
PS: I'm willing to complete and document my WSH script to re-database-ise them if anyone seriously needs it. PM me if you really need it. We need to standardise on the actual database required. I'll probably do it first for SQL Server, which is what I'm most familiar with. |
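For anyone wondering what the "parse post info, then import" step described above might look like, here is a minimal sketch in Python rather than WSH. It is not CherrieBaby's script. The markup it expects (a `span` with class `author`, `date` or `body` for each field of a post) is a made-up assumption, the names `PostParser` and `import_posts` are hypothetical, and SQLite stands in for SQL Server so the example runs without a server. Real archive pages would need their own parsing rules.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical field markers; the real archive pages will differ.
FIELDS = ("author", "date", "body")


class PostParser(HTMLParser):
    """Collects posts from <span class="author|date|body"> elements."""

    def __init__(self):
        super().__init__()
        self.posts = []      # finished posts, one dict each
        self._field = None   # field currently being read, if any
        self._current = {}   # post under construction

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in FIELDS:
            self._field = cls
            self._current[cls] = ""

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            finished = self._field
            self._field = None
            # Assumption: the body is the last field of each post.
            if finished == "body":
                self.posts.append(self._current)
                self._current = {}


def import_posts(html, conn):
    """Parse one static page and insert its posts; returns the post count."""
    parser = PostParser()
    parser.feed(html)
    parser.close()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (author TEXT, date TEXT, body TEXT)"
    )
    rows = [
        (p.get("author", ""), p.get("date", ""), p.get("body", "").strip())
        for p in parser.posts
    ]
    conn.executemany("INSERT INTO posts VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Called with a page's HTML and a connection such as `sqlite3.connect(":memory:")`, `import_posts` creates the `posts` table if needed and adds one row per post. Images and nested markup inside a post body are not handled, which matches the caveat in the post.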
 |
nubee
Master Archiver
|
| Joined: 18 Feb 2005 |
| Posts: 215 |
| Location: homeless |
18648.26 Points
|
|
|
Fri Apr 15, 2005 9:21 am |
|
|
| user, the archives mentioned in the first post of this thread contain all the html files available from the online archives that you mention getting some of. It would really save the servers here if you downloaded those instead of from the original site, as compressing reduces the size a lot. |
 |
user
|
| Joined: 04 Apr 2005 |
| Posts: 9 |
|
303.36 Points
|
|
|
got that ok
Fri Apr 15, 2005 10:52 am |
|
|
nubee,
yes, I got that ok from the 'mobile info' site. I must have been one of the first customers.
And yes, that is interesting cherriebaby.
Appreciated. I feel a bit like a vulture, being so enthusiastic about picking over these scraps of the downed site, but .. er, there you go,
cheers,
~vulture |
 |
user
|
| Joined: 04 Apr 2005 |
| Posts: 9 |
|
303.36 Points
|
|
|
cherriebaby
Fri Apr 15, 2005 11:04 am |
|
|
>PM me
hey cherrie,
I've done just that. Sorry for the redundant lines of communication, but having not got any back-and-forth pm happening yet, I don't quite trust it,
~user
"with all-new minty flavour" |
 |
|
|
|
software
Mon Apr 18, 2005 11:09 am |
|
|
Regarding a program that can turn the html pages of the hive back into its forum state:
If anybody does pursue this, or wants to arrange a way of actually doing this with the help of other board members, then this is something that we would definitely be interested in, and encourage, and make room for.
I understand that it is definitely possible, and it is something which we would generously host and include in our resources.
It's funny, we are also trying to extract this forum (synthetikal) into indexed html topics, something we are having a little trouble with. To date we have found that spiders will probably be the answer, but the work is again tedious.
Nubee's archive site is a great resource for our site, and we welcome more collaborative work to achieve a greater good.
Syn |
 |
user
|
| Joined: 04 Apr 2005 |
| Posts: 9 |
|
303.36 Points
|
|
|
Tue Apr 19, 2005 3:00 am |
|
|
Hi
syn:
| Quote: |
|
It's funny, we are also trying to extract this forum (synthetikal) into indexed html topics, something we are having a little trouble with. To date we have found that spiders will probably be the answer, but the work is again tedious.
|
How do you mean that? Something like, put the contents of a thread into one static html page?
~user |
 |
user
|
| Joined: 04 Apr 2005 |
| Posts: 9 |
|
303.36 Points
|
|
|
hive bb was?
Wed Apr 20, 2005 7:01 am |
|
|
| ubb with some modifications? or what? |
 |
|
|
|
phpbb forum ported to Static index topics
Wed Apr 20, 2005 8:44 am |
|
|
Thanks for your reply User,
Yes, we need to turn this forum, i.e. all the topics and categories and their posts, into static html web pages,
Similar to our Hiveboard Online Revided Files.
We need software that can tear out the posts and index them in order on html pages.
Spiders don't seem to do what we need.
It needs to be nothing flash, just plain text, with each category having every post in it to search.
syn |
 |
user
|
| Joined: 04 Apr 2005 |
| Posts: 9 |
|
303.36 Points
|
|
|
Wed Apr 20, 2005 12:34 pm |
|
|
Hey nooow,
| Quote: |
|
Yes, we need to turn this forum, i.e. all the topics and categories and their posts, into static html web pages,
|
I guess one of the hurdles is the way that the pages seem to cap the posts at 15. Not being familiar with the specifics of this bb, I'd say that you'd want to look for a configurable option, or a 'switch' in the url query string, or what would very likely be a minor edit in the sql query fetching the posts, typically a 'limit' clause. Do one of those, temporarily if you want, and do a scripted 'fetch', 'wget' or the like to cycle through the urls by topic number. If you used a scripting language to do the fetch you could do some mining of info while they are being fetched. Grab the first 'Post subject' field and massage it into a filename, so the item gets a friendlier name, ie. The_Hive_Filez_Offline_Searchable.html, or process urls or ..
That would get you close to the state of the archived hive files.
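A rough sketch of the two pieces described above, in Python: building the list of urls to cycle through by topic number, and massaging a 'Post subject' into a friendlier filename. The `?t=N` query-string form, the example base url and the function names `topic_urls` and `subject_to_filename` are assumptions for illustration only; check what this board actually uses before relying on any of it. The fetching itself is left out, so this runs without touching any server.

```python
import re


def topic_urls(base_url, first_topic, last_topic):
    """Build one url per topic number (assumes a '?t=N' query string)."""
    return [f"{base_url}?t={n}" for n in range(first_topic, last_topic + 1)]


def subject_to_filename(subject):
    """Keep the letters and digits of a subject, joined by underscores."""
    words = re.findall(r"[A-Za-z0-9]+", subject)
    if not words:
        # Nothing usable in the subject; fall back to a fixed name.
        return "untitled.html"
    return "_".join(words) + ".html"
```

For example, `subject_to_filename("The Hive Filez: Offline, Searchable!")` gives `The_Hive_Filez_Offline_Searchable.html`. Each url from `topic_urls` could then be handed to wget or a fetch routine, with the page saved under the name derived from its first subject field.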
Disclaimer: I rarely get anything finished for long before discovering that there was a _much_ easier way to do it. There's probably an admin option to dump the whole thing to html.
ciao |
 |
Davidus
|
| Joined: 18 Feb 2005 |
| Posts: 7 |
| Location: Australia |
267.90 Points
|
|
|
The Hive Files
Fri Apr 22, 2005 12:59 pm |
|
|
I've checked out the new version of the mfi site, but haven't saved it all yet; everything seems to work. Do you know if the Hive posts java is posting are different to what is in the Hive archive? What a huge amount of work nubee has done on this! (as well as Syn, java & many others) Anyway, a big thumbs-up to nubee for his mammoth effort on what is a true information "vault."
Thanks man. It's people like you that make me want to keep on living! |
 |
nubee
Master Archiver
|
| Joined: 18 Feb 2005 |
| Posts: 215 |
| Location: homeless |
18648.26 Points
|
|
|
Sat Apr 23, 2005 6:05 am |
|
|
In the new posts section there are a lot more threads. I'm not sure whether they are what's been posted online; I think some of them are but others are not. I did get a lot of extras, which are contained in a separate index but are indexed in the search engine.
The best thing is to have a look. If you're feeling up to it, keep a backup copy of the archive, then open up afsearch.exe in the afsearch folder in the archive, and you can add and remove files in the search engine index yourself. But be warned, it has a tendency to get mucked up, so keep a backup. |
 |
|
|
|
Powered by phpBB 2.0.11 © 2001, 2002 phpBB Group
Igloo Theme Version 1.0 :: Created By: Andrew Charron
|