It's all in the title.
Dongle
- Larvae

- Posts: 47
jboogie
- Larvae

- Posts: 30
It's probably because these forums have safeguards to prevent crawlers and spiders from copying anything and everything here.
IMO it's very rude to use a spider or site copier to copy a board without asking permission first.
It wastes bandwidth and slows down the board for everyone else, just so you get a copy of shit you'll never read...
Or maybe I'm an asshole.
Vesp
- Administrator
- Foundress Queen





- Posts: 3,130
Nah, it is fine.. I was going to do it actually -- but I never got around to it.
It isn't that much bandwidth -- I mean, the actual MySQL database is only ~17 MB, and then all the other files only put it at max maybe 5 GB, and that is being generous.
I don't know why it wouldn't work, other than the fact that it is HTTPS, which I'd almost certainly bet is the problem --- try it out on hyperlab as well.. bet you have the same problem..
But boy, if you can get HTTrack to mirror hyperlab.. I'd try to do that, then get it so Google translates all of it... and then share.
Wish I could help... not too sure how I can though... Unless of course I install another SMF forum under non-HTTPS, and display that... hmmm
Sedit
- Global Moderator
- Foundress Queen





- Posts: 2,099
If you don't mind me asking, Dongle, why the interest in copying this site? To be honest, I'm not very comfortable at the moment when someone I know little about shows a desire to mirror this site without having stated their intended use.
Vesp, if you desire a copy then the best way would be to sign in through CORE, then DL and zip all the files. Obviously, clean the database of personal information before releasing it publicly.
Dongle
- Larvae

- Posts: 47
Sedit,
I have no intention of mirroring this site for republication on the web. I believe that this information is sacred and, having lived through the Hive debacle, I don't want to ever lose any of it. The only reason we have as much of the Hive as we do is precisely because of such backups. I didn't mention the reason because I thought it blatantly obvious.
Nevertheless, Vesp can attest to the fact that I did discuss this issue with him.
While you may not know me, I've lurked in this area for years. Some of us like knowledge for knowledge's sake and nothing more.
I've read your many posts over the years at SM and PN and even had thread discussions with you.
Sedit
- Global Moderator
- Foundress Queen





- Posts: 2,099
Sorry if I came off as rude, but surely you must understand the caution behind my words.
jboogie
- Larvae

- Posts: 30
Well, I just gave it a shot fer shits and giggles, and it works fine.
Scroll down to see my advice on how to make WinHTTrack capture the Vespiary.
UTFSE!
Not this one, but the one on HTTrack's webpage. Look for the tutorial entitled "Capture URL! Tutorial".
If you run into the problem with setting the proxy, then UTFSE and look for the resolution for IPv6...
Vesp
- Administrator
- Foundress Queen





- Posts: 3,130
Quote
Nevertheless, Vesp can attest to the fact that I did discuss this issue with him.
That is true.
Even though I make MySQL backups on a daily basis, and download the attached files weekly, or at least try to... it is still a very good thing to have static HTML files with all of the content present, for the reason he pointed out.
NeilPatrickHarris
- Dominant Queen




- Posts: 274
Does it just download the login page over and over? Then it's because of the requirement for authentication. I had the same problem trying to spider psychonaut. The only way I was able to get it done was by having a mod temporarily allow anonymous access to the forum so authentication wasn't required. WinHTTrack supports authentication, but no matter what I did I couldn't get it to work on these forums.
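A rough way to confirm that symptom on an existing mirror is to hash every saved page and see how many are byte-for-byte identical. This is only a sketch: the function name and the `*.html` pattern are assumptions, and login pages that embed a per-request session token would not hash identically, so a low count does not rule the problem out.

```python
import hashlib
from collections import Counter
from pathlib import Path


def duplicate_page_report(mirror_dir, pattern="*.html"):
    """Return (total_pages, size_of_largest_group_of_identical_pages)."""
    digests = Counter()
    for path in Path(mirror_dir).rglob(pattern):
        if path.is_file():
            # Identical file contents produce identical SHA-256 digests.
            digests[hashlib.sha256(path.read_bytes()).hexdigest()] += 1
    total = sum(digests.values())
    largest_group = digests.most_common(1)[0][1] if digests else 0
    return total, largest_group
```

If the largest group is close to the total, the crawler most likely saved the same page (for example the login form) for nearly every URL.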
Dongle
- Larvae

- Posts: 47
NPH, that's exactly what happened. I saw 1.2 GB and was like, WOW. Then I looked and it was all the same damn login page! I tried URL capturing and the live, to no avail.
Jboogie, scroll down to see my thanks....
...................../´¯/)
....................,/¯../
.................../..../
............./´¯/'...'/´¯¯`·¸
........../'/.../..../......./¨¯\
........('(...´...´.... ¯~/'...')
.........\.................'...../
..........''...\.......... _.·´
............\..............(
..............\.............\...
With love, of course.
Why can't you just provide me with the command line params? Fer shits and giggles?
Vesp
- Administrator
- Foundress Queen





- Posts: 3,130
I wonder, is there a way to get it so it crawls this site constantly - but not using any significant bandwidth and updates changed pages and archives the newly formed ones?
Don't start or participate in trolling!
NeilPatrickHarris
- Dominant Queen




- Posts: 274
Dongle, hate to say it but I tried everything under the sun and couldn't get the authentication to work. It works great with some forums; I seriously think it has to do with PHP.
Vesp, you can use bandwidth throttling with WinHTTrack. In fact I always use the bandwidth throttling, and I limit the number of open connections out of kindness because I don't want to intrude on a forum's bandwidth too badly. As far as having it constantly going, I've never tried it. I'm certain there is software out there that can do that; otherwise perhaps you can create a script that uses command-line httrack and schedule it daily or something.
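A minimal sketch of such a script, assuming the command-line `httrack` binary is installed and a first mirror already exists in the output directory. The URL, output path, rate and connection count are placeholders, not values from this thread, and the flags (`-O`, `--update`, `--max-rate`, `--sockets`) should be checked against `httrack --help` for your version. The function only builds the command; running it daily would be left to cron or the Windows Task Scheduler.

```python
import shlex


def build_httrack_command(url, output_dir, max_rate_bytes=25000, sockets=2,
                          scan_rules=("-*logout*",)):
    """Build the argument list for a throttled, incremental httrack run."""
    cmd = [
        "httrack", url,
        "-O", output_dir,                 # where the mirror lives
        "--update",                       # refresh an existing mirror
        f"--max-rate={max_rate_bytes}",   # bandwidth cap in bytes per second
        f"--sockets={sockets}",           # number of simultaneous connections
    ]
    cmd.extend(scan_rules)                # scan rules such as -*logout*
    return cmd


if __name__ == "__main__":
    # Hypothetical values; pass the list to subprocess.run() to execute it.
    command = build_httrack_command("https://forum.example.org/",
                                    "/var/backups/forum-mirror")
    print(shlex.join(command))
```

Keeping the rate and connection count as parameters makes it easy to stay gentle on the forum's bandwidth.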
Vesp
- Administrator
- Foundress Queen





- Posts: 3,130
The more I think about it, the less I believe one can mirror an HTTPS forum - if that is possible, why the hell hasn't someone mirrored hyperlab, set up the archive - blocking the bots and spiders that would index it - and then kept it as plain HTTP, which would allow Google to translate it into English? Do you know how much that would be appreciated? Especially if it could then be saved in English, and turned back into HTML to be viewed easily without the aid of Google...
Then, in return, one could do the same thing for the Russians -- say, archive this site, and translate the whole thing into Russian so that they can read the content here, or on another site. It really is a shame that we let communication barriers get in the way like that, yeah?
If it were possible, you'd think shroomedalice wouldn't be going through the trouble he is, trying to translate it.
jboogie
- Larvae

- Posts: 30
But it's not something that is as simple as clicking a few checkboxes and letting it ride out...
You're a good sport! Hahah! That was actually pretty funny, plus I'm a sucker for ASCII art.
I can't tell you exactly what to do because your system and setup are not identical to mine. Your network settings will vary, plus we probably aren't using the same version of Windows. From what I gather just from playing with this software, the OS makes a difference with this program. Mostly it has to do with the network configuration and the addition of IPv6 support in both Vista and Win7...
The filters are another essential part of making this program work right. You will need to set up a good number of filters and have a decent understanding of the Vespiary's file system and setup. Use wildcard patterns (the scan rules support the * wildcard) to maximize your results.
The first advice I will give you is to set up a filter to EXCLUDE any links with the word "logout"... with your * wildcards in the right place. If you UsedTheirFSE you would already know how to write that one. It looks like ' -*logout* ' in your 'Scan Rules' in the 'Set options' box.
Then you have the rest of the shit you don't want... for example, stuff like the 'help' screen and other garbage like theme info. If you don't exclude the theme directory, then an assload of shit will get downloaded. So you'll want to have something like ' -*/*theme*/* '. Again, notice the placement of your wildcards. See also that I didn't use the word 'themes', as the trailing wildcard already covers 'themes'.
You will have to decide what you do and don't want to mirror, so it's impossible for me to create your filters for you without knowing what you want downloaded... plus I don't know what the folders/threads are named for everything here. I'm relatively new here and I don't work on this site. I have admin privileges at WD, so there I can openly browse the site's folders and I know where and how things are configured.
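To get a feel for what the two exclude patterns above would catch before starting a long crawl, here is a small approximation using Python's `fnmatch` wildcards. It is not HTTrack's own matching engine: the case-insensitive comparison, the function name and the example URLs are all assumptions made for illustration.

```python
from fnmatch import fnmatchcase

# The exclude rules from the post, without the leading "-" that marks
# them as exclusions in HTTrack's scan rules.
EXCLUDE_PATTERNS = ["*logout*", "*/*theme*/*"]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the URL matches any exclude pattern (case-insensitive)."""
    lowered = url.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)


if __name__ == "__main__":
    # Hypothetical URLs in the style of an SMF forum.
    for candidate in [
        "https://forum.example.org/index.php?action=logout;sesc=abc",
        "https://forum.example.org/Themes/default/style.css",
        "https://forum.example.org/index.php?topic=42.0",
    ]:
        print(candidate, "->", "skip" if is_excluded(candidate) else "keep")
```

Running a handful of real links from the board through a check like this shows quickly whether a pattern is too broad or too narrow.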
And the last thing that I will spoon-feed you is the configuration of the one-time proxy setup... though you should have been able to figure it out by reading the tutorial I told you to look at. So again, I will tell you to read this link:
http://httrack.kauler.com/help/CatchURL_tutorial
Now, once you have read and understood that page, you may find that you cannot set up the 'Capture URL!' proxy. This is hard to explain without knowing what your OS is and its version. The problem is with the IPv6 protocol... the address that HTTrack gives you will be in IPv6 format if your system is using IPv6. This isn't really bad, but an IPv6 address is not something that you can use with Firefox or IE. If you're using XP pre-SP3 then you won't have that problem unless you have installed that specific hotfix yourself or something else did. So with XP, you need to go to the network connection's properties, disable or uninstall IPv6 in the adapter properties, then reboot. You must reboot or the changes will not take effect.
Now, Vista and Win7 are another story. I don't know whether they can function properly on a WAN without IPv6, because I'm not using either atm.
Hope that helps more... but as they say around these parts, 'your mileage may vary'.
JB
Vesp
- Administrator
- Foundress Queen





- Posts: 3,130
Ah, that does look useful!
Anyone get it figured out with those instructions?
Dongle
- Larvae

- Posts: 47
Thanks for the spoonfeeding, as you put it, Jboogie. I will go ahead and give this a whirl at some point and make sure to not let you have the archive it creates.
Thanks again. Vesp, if it works, I'll upload an archive to whatever FTP you want me to.

Vesp
- Administrator
- Foundress Queen





- Posts: 3,130
Awesome, it will be very much appreciated.
I've been trying to figure out how to convert it into an HTML archive for a while; however, it wasn't very clear to me how to do it.
mumbles
- Larvae

- Posts: 42
I dream of a translated search-able hyperlab... someday. There is a thread on psychonaut on the settings required to mirror the forum but then I realised there wasn't much worth saving from the eventual deletions.
Vesp
- Administrator
- Foundress Queen





- Posts: 3,130
Quote
I dream of a translated search-able hyperlab... someday.
What we ought to do is make a copy of this site, translate it into Russian, and then give it to them as a gift. Then perhaps they would be more willing to do the same for us, as there is, from how I understand it, a decent amount of conversation that is for the "active/contributing" members, which due to the language barrier, we often cannot get into...
Ha just an idea
But an English hyperlab would be great!
Dongle
- Larvae

- Posts: 47
I can't get this to work guys. JBoogie, seeing as you were able to do so, could you kindly provide us with a copy of the site? I know Vesp would be really happy to get one.
