|
|
| Author |
Message |
IndoleAmine
Dreamreader Deluxe
|
| Joined: 09 Feb 2005 |
| Posts: 681 |
| Location: Bahamas |
18717.10 Points
|
|
|
no prob
Wed Mar 09, 2005 7:32 pm |
|
|
No - whether you view an HTML document or make a backup on your hd, you always have to download the whole file and save it, either temporarily or permanently.
Downloading the whole archive is equivalent to viewing the whole archive, and we have enough bandwidth to allow a good number of users to view the whole archive every month, so this shouldn't be a problem.
And I don't think many bees will have the patience to d/l every single file by hand, and those who do have the endurance hopefully won't visit our server as often once they have all the goodies on their hd - so we can in fact spare even more bandwidth for others when you d/l it...
(at least I hope so!? )
i_a |
|
Novalis
|
| Joined: 28 Feb 2005 |
| Posts: 3 |
| Location: Europe |
0.00 Points
|
|
|
Wed Mar 09, 2005 8:10 pm |
|
|
I hope so too.
But you certainly know that it's possible to mirror such an archive easily with tools like WinHTTrack, etc. And since Rhodium's PDFs are quite popular and were not available for a long time, I think some bees will try to download the whole 1000 MB. |
|
java
Consumer
|
| Joined: 07 Feb 2005 |
| Posts: 736 |
| Location: The Mexican Republic |
21796.14 Points
|
|
|
Re: Picproxie-doc
Wed Mar 09, 2005 10:08 pm |
|
|
| Quote: |
|
java ......where are all the fucking picproxie-doc
|
3base....I've been offline for a week, hence I was unable to respond to your inquiry.... As stated, we have tried to recover as much as we can of the Hive archives and the associated articles. This is what we have, and you're welcome to search and read to your heart's content.......I can't fulfill your request since we have no direct access to the Hive archives; we are simply trying to paste together what the contributing bees have pulled together, so don't be an ingrate, and stop your unkind demands, as no one owes you anything........java |
|
|
|
|
novalis, good point
Thu Mar 10, 2005 11:31 am |
|
|
Yes, we don't really want people downloading the archive,
As hypocritical as that may sound, we do, and we don't,
And we will have to look into this,
We want everyone to have access to it, but complete downloads, no, not yet, anyway,
Since we only have 50 GB of bandwidth per month,
This would kill us,
I would say that a public mirror will have to be set up,
I am open to suggestions.
syn |
|
Polverone
|
| Joined: 12 Feb 2005 |
| Posts: 28 |
|
846.64 Points
|
|
|
distribution
Fri Mar 11, 2005 8:07 am |
|
|
The plain HTML takes up much less space than all the picproxie stuff, right? Distribution of the HTML as a zip or rar file from the web site might be acceptable with 50 GB a month, depending on how many people want it. You might even be able to get away with distributing it in one or two pieces using rapidshare.de.
The much more scalable solution, which could easily distribute everything and not just the HTML, is to set up a .torrent for it. The efficiency will depend on people who download staying online long enough to share, of course.
Another possible solution: create a new Gmail account, mail a multipart .rar archive of everything to this account as multiple attachments, then share the password to the account here on the forum. As long as nobody decides to be a smartass and delete the files, people should then be able to rapidly download the files from the shared account. If you don't want Google to have a clue about file contents, encrypt all the attachments with a freshly generated public key and share the private key here on the forum along with the login information. |
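For the multipart route, here is a rough sketch of cutting one big archive into attachment-sized pieces with the standard split tool and checking that the pieces rejoin cleanly. The function names, the ".part-" suffix, the file name hive.rar and the chunk size are all placeholders made up for illustration; pick a chunk size that fits whatever attachment limit the mail provider enforces.

```shell
#!/bin/sh
# Sketch: cut an archive into fixed-size parts and verify they rejoin.
# split_archive, join_archive and the ".part-" prefix are hypothetical names.

# split_archive FILE BYTES -> writes FILE.part-aa, FILE.part-ab, ...
split_archive() {
    split -b "$2" "$1" "$1.part-"
}

# join_archive FILE OUT -> concatenates FILE.part-* (the glob sorts them in order)
join_archive() {
    cat "$1".part-* > "$2"
}

# Usage when run as a script, e.g.: ./split.sh hive.rar 10000000
if [ "$#" -eq 2 ]; then
    split_archive "$1" "$2"
    join_archive "$1" "$1.rejoined"
    # cmp exits 0 only if the rejoined copy is byte-identical to the original
    cmp "$1" "$1.rejoined" && echo "parts rejoin cleanly"
    rm -f "$1.rejoined"
fi
```

Downloaders only need the join step: fetch all the parts into one directory and concatenate them in name order. With the default two-letter suffixes, split stops after 676 parts, so very small chunk sizes on a large archive would need a longer suffix.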
|
jackoozzi
specialist
|
| Joined: 10 Feb 2005 |
| Posts: 135 |
| Location: Australia |
39384.40 Points
|
|
|
Fri Mar 11, 2005 8:48 am |
|
|
yousendit.com is probably the best option for this sort of thing. I think each link is limited to 25 downloads or 7 days, but then you just upload it again and change the link. It will take files up to 1 GB.
http://s23.yousendit.com/ |
|
nubee
Master Archiver
|
| Joined: 18 Feb 2005 |
| Posts: 215 |
| Location: homeless |
18648.26 Points
|
|
|
Fri Mar 11, 2005 10:40 am |
|
|
I started an HTTrack run last night and got 30 MB through it on dialup...
When I zipped it up it was like 3-6 MB!!
And of course if it was plain text it would be even smaller.
We need a packaged d/l for sure, as it makes an important reference to have available without having to get entangled in the whole loss of anonymity involved in connecting to an internet account each time you want to check something out....  |
|
mind
|
| Joined: 10 Feb 2005 |
| Posts: 39 |
|
362.48 Points
|
|
|
Fri Mar 11, 2005 3:37 pm |
|
|
What about a torrent?
Of course, paranoid people should not connect..
Or a more secure filesharing program like MUTE or Filetopia |
|
IndoleAmine
Dreamreader Deluxe
|
| Joined: 09 Feb 2005 |
| Posts: 681 |
| Location: Bahamas |
18717.10 Points
|
|
|
Polverone
|
| Joined: 12 Feb 2005 |
| Posts: 28 |
|
846.64 Points
|
|
|
clarification
Sat Mar 12, 2005 8:36 am |
|
|
The hive HTML is much smaller than HTML + all images and documents. Distributing a compressed archive of the HTML alone might be possible even without creative solutions.
Actually, I believe that there is a way to easily shrink a lot of the Rhodium/Hive PDFs, now that I think of it. Most journal articles (from JACS and other sources) are stored as PDFs with OCR text underneath bitonal page images. All JACS archives and other journals that I know of use only G4 compression for these bitonal images. JBIG2 compression can do considerably better, like 1/2 to 1/4 the size, and of course the images take up most of the space in the files. The Xerox "Silx" PDF compressor is a command line tool that will convert bitonal images in PDFs to JBIG2 and leave everything else alone. A cracked version of this compressor has been available for a while. It can be provided if you can't find it on your own.
Then you just apply the tool to all of your PDFs. Under Linux, you might do it something like:
for k in *.pdf; do Silx "$k" smaller.pdf && mv smaller.pdf "$k"; done
(Quoting "$k" keeps filenames with spaces intact, and the && means the original is only replaced if the compressor succeeded.)
You could do something similar with a batch file in a command shell on Windows.
This will compress all images that can be compressed and leave other images alone, for all PDF files in the directory. I think this would yield quite a bit of space savings. |
|
|
|
|
Polverone
Sat Mar 12, 2005 4:41 pm |
|
|
polverone,
I like the Gmail idea,
Especially because of its fast bandwidth,
You can get a good 50k/s from them,
I have seen those webhosting places that will host it, but make you wait 20s on a page of ads,
We've just installed the imageshack file upload function, which stores linked images for free, off our server, and it came with PHP integration code,
How sweet, except it had a few bugs in the PHP code,
But since Nazlfrag, our coder, was near, it was quickly repaired,
With the Gmail, it would be a real trust thing, but I think that would work here, for regular users etc,
This is a good chance for admins of boards with similar themes to discuss board related problems,
We have noticed a dedicated server host with 120 GB space and 500 GB bandwidth for $60 US, that's pretty cheap,
syn |
|
Polverone
|
| Joined: 12 Feb 2005 |
| Posts: 28 |
|
846.64 Points
|
|
|
A problem for search engines
Mon Mar 14, 2005 12:10 pm |
|
|
| I noticed that the posts stored in the Hive archive still have <meta name="robots" content="noindex,nofollow"> near the top of each thread. This means that Google and other mainstream search engines will never index these files. If you want them to be more accessible and searchable, you need to remove these directives from the HTML files. It's just a big search-and-replace across all the HTML. |
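A minimal sketch of that search-and-replace, assuming the tag is spelled in every file exactly as quoted above and that the archive is a directory tree of .html files. The function name strip_robots is made up for illustration. Work on a backup copy, and if the tag turns out to vary in case or spacing between files, the sed pattern will need loosening.

```shell
#!/bin/sh
# Sketch: delete the robots meta tag from every .html file under a directory.
# Assumes the tag is spelled exactly as quoted in the post; strip_robots is a hypothetical name.

strip_robots() {
    find "$1" -type f -name '*.html' | while IFS= read -r f; do
        # Write to a temp file first so a failed sed never truncates the original
        sed 's|<meta name="robots" content="noindex,nofollow">||g' "$f" > "$f.tmp" \
            && mv "$f.tmp" "$f"
    done
}

# Usage when run as a script: ./strip_robots.sh /path/to/archive
if [ "$#" -eq 1 ]; then
    strip_robots "$1"
fi
```

Going through a temp file instead of sed's in-place option keeps the sketch portable, since the in-place flag behaves differently between GNU and BSD sed. Files that never contained the tag are rewritten with identical contents.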
|
nubee
Master Archiver
|
| Joined: 18 Feb 2005 |
| Posts: 215 |
| Location: homeless |
18648.26 Points
|
|
|
Mon Mar 14, 2005 12:25 pm |
|
|
Let me admit I downloaded the Hive files with HTTrack:
It's over 8,500 files and 260+ MB, but when zipped it's only 88 MB,
I'm currently creating an offline search page/index for it, then it's done... |
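Putting the numbers from this thread together gives a rough feel for what the 50 GB monthly quota can carry. This is only a back-of-the-envelope sketch: it takes 1 GB as 1024 MB, ignores ordinary browsing traffic, and the helper name downloads_per_month is made up for illustration.

```shell
#!/bin/sh
# Sketch: how many complete downloads fit in a monthly bandwidth quota.
# downloads_per_month is a hypothetical helper; 1 GB is taken as 1024 MB.

# downloads_per_month QUOTA_GB ARCHIVE_MB -> prints whole downloads per month
downloads_per_month() {
    echo $(( $1 * 1024 / $2 ))
}

downloads_per_month 50 88     # zipped HTML only      -> 581
downloads_per_month 50 260    # unzipped HTML         -> 196
downloads_per_month 50 1000   # everything, ~1000 MB  -> 51
```

So a zipped HTML-only package could be served a few hundred times a month, while the full 1000 MB set would eat the whole quota after about fifty downloads.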
|
IndoleAmine
Dreamreader Deluxe
|
| Joined: 09 Feb 2005 |
| Posts: 681 |
| Location: Bahamas |
18717.10 Points
|
|
|
nubee
Master Archiver
|
| Joined: 18 Feb 2005 |
| Posts: 215 |
| Location: homeless |
18648.26 Points
|
|
|
Mon Mar 14, 2005 1:15 pm |
|
|
| I didn't know that, I'll have a look. Where do I find the reference to change, and in which file??? |
|