Keep personal info out of your uploads


Rhadon
April 20th, 2004, 10:05 AM
An increasing number of people are publishing files as attachments or uploads to the Forum FTP which contain personal information (in many cases the uploader's real name). I noticed this problem in DOC files, but certain other file types may be affected as well. Even if you remove the personal information from the respective edit fields in Word, it can still be visible when the file is viewed as plain text.

For that reason, please don't upload DOC files anymore. Use PDF (which is to be preferred anyway), or if you can't because you don't have Acrobat, upload them in another, safer format such as RTF (Rich Text Format). Note that PDF files may still contain some personal information; perhaps someone who uses Acrobat in combination with Microsoft Word (I don't) can find out more details.

Before uploading any file that you're not sure is free of personal information, it is advisable to view the file as plain text, search it for your name, etc., and take a closer look at the beginning and end of the file.
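
A minimal sketch of that check in Python, for anyone who would rather not scroll through a text editor. The function names are made up for illustration. It searches the raw bytes for each string both as 8-bit text and as UTF-16, on the assumption that Word may store text either way, and it shows the printable part of the beginning and end of the file. A clean result only means these exact strings were not found, not that the file is safe.

```python
def find_personal_strings(data: bytes, needles: list[str]) -> list[tuple[str, str, int]]:
    """Return (needle, encoding, byte offset) for every match in data.

    Matching ignores case for ASCII letters only.
    """
    lowered = data.lower()
    hits = []
    for needle in needles:
        if not needle:
            continue  # an empty string would match everywhere
        # Try 8-bit text and UTF-16 (assumption: the file may use either).
        for encoding in ("latin-1", "utf-16-le"):
            try:
                pattern = needle.lower().encode(encoding)
            except UnicodeEncodeError:
                continue  # needle has characters this encoding cannot hold
            start = lowered.find(pattern)
            while start != -1:
                hits.append((needle, encoding, start))
                start = lowered.find(pattern, start + 1)
    return hits


def printable_edges(data: bytes, size: int = 200) -> tuple[str, str]:
    """Return the first and last `size` bytes, non-printable bytes shown as dots."""
    def clean(chunk: bytes) -> str:
        return "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return clean(data[:size]), clean(data[-size:])
```

Typical use would be to read the file with `open("upload.doc", "rb").read()` (the filename is a placeholder) and pass it in together with your name, company, and user name.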

Members who have already uploaded DOC files needn't be concerned: our FTP's host plans to convert them to PDF format, which should let us get around the problem.

Boomer
April 20th, 2004, 11:22 AM
I just read this SECONDS after uploading an attachment! It is a pdf of a schematic, exported from a CAD system. It was only 11kB, but in a text editor I saw again (you remember?) my full name, company etc. :mad:

Deleting these passages destroys the file. But we found a solution: I export as DXF, import into CorelDRAW, save as PDF and get an anonymous PDF file. Unfortunately it was 50kB, 2% too big to upload.

Exporting in "draft" mode and converting to "PDF for WWW" solved that problem too. :)

Does anyone know if saving a doc as RTF destroys formatted tables like saving as TXT does? My last explosives table looked good in ASCII, but when I put it on this website it was corrupted. :(

Rhadon
April 20th, 2004, 01:24 PM
The best solution is of course to not enter real data when you're asked for your name, address and company by Microsoft programs. They can't hide something in files if they don't know anything :).

As for the attachment, I've deleted it from the moderation queue, so there's no reason to worry. If you want to edit out the information, it is always better to overwrite the information you want to hide (not delete), e.g. with spaces. Unfortunately this doesn't work with PDFs either.
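
A rough sketch of the overwrite-instead-of-delete idea in Python. The function name and the `pad` parameter are my own, not from any tool. The point is that the output has exactly the same length as the input, so anything in the file that depends on byte positions stays where it was. As said above, don't expect this to work on PDFs.

```python
def blank_out(data: bytes, secret: bytes, pad: bytes = b" ") -> bytes:
    """Replace every occurrence of secret with padding of the same length."""
    if not secret or not pad:
        return data
    # Repeat the padding and cut it to size so the file length never changes.
    filler = (pad * len(secret))[: len(secret)]
    return data.replace(secret, filler)
```

For text stored as UTF-16, pass the secret encoded with `"utf-16-le"` and `pad=" ".encode("utf-16-le")`, so that the replacement still reads as spaces.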

If you can't get the file any smaller, just compress it and upload it in ZIP or RAR format.

As for the corrupt tables: You probably typed it up using a font in which all characters have the same width. But the default font of The Forum has a variable character width. If you put your ASCII table between [ code ] and [ /code ] tags (without the spaces inside the brackets), it should display correctly.
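
If you'd rather not line the columns up by hand, here is a small Python sketch that pads each column with spaces and wraps the result in the code tags. The function name and the sample rows are placeholders, and it assumes every row has the same number of cells.

```python
def format_table(rows: list[list[str]]) -> str:
    """Pad each column to its widest cell and wrap the table in code tags."""
    # Width of each column = length of the longest cell in that column.
    widths = [max(len(row[i]) for row in rows) for i in range(len(rows[0]))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in rows
    ]
    return "[code]\n" + "\n".join(lines) + "\n[/code]"


print(format_table([["File", "Size"], ["a.pdf", "11 kB"], ["b.pdf", "50 kB"]]))
```

The columns only line up because the code tags switch the display to a fixed-width font; pasted without the tags, the same text will look ragged again.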

nbk2000
April 20th, 2004, 06:21 PM
NEVER put anything that could possibly identify you on your computer, either at home or at work, as there are so many ways that the information could be compromised.

When you install your O/S, if it asks for a name, make one up. I use "Jesus Christ" for all my programs. :)

Business? "God INC." :D

And so on and so forth. Bogus phone numbers, nonexistent zip codes, etc. Anything else, like files, that has to have real information that could be connected to me, gets PGP'd. :p

Your business should know about these risks and understand the need for obscuring such information and keeping it out of the hands of insecure programs.

Text files should be exactly that, a .TXT file, not RTF or DOC. I found a floppy at Kinko's that someone had their resume on, and it was in .DOC format. Upon opening it with a text viewer, not only did I get their resume, but also the names of all the previous files saved to the floppy, their company name, the Windows OS used, the local LAN name, on and on! :rolleyes:

And this fool works for a big corporation, thus giving me potential access to their internal network if I decided to social engineer it, using the insider names and other info I found on the floppy. :D

ProdigyChild
October 26th, 2004, 04:31 PM
Linux systems often contain programs pdf2ps and ps2pdf.
Convert your PDF to PS. PostScript is mostly text, as it's a programming language. You can use a text editor to find your name and personal data (mostly in comment lines starting with %...). Delete these and convert back to PDF. More advanced handling would be a sed script that automatically deletes the lines containing your personal data.
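
The automatic variant could look roughly like this in Python instead of sed. The list of comment keywords is my assumption about which header comments usually carry a name or program; check your own PS file and adjust it. It only drops whole comment lines, so personal data stored elsewhere in the file is not touched.

```python
import re

# Header comments assumed to carry identifying text; extend the list as needed.
PERSONAL_COMMENT = re.compile(rb"^%%(For|Creator|Title|Routing)\b", re.IGNORECASE)


def strip_personal_comments(ps: bytes) -> bytes:
    """Drop comment lines whose keyword is on the list above; keep all others."""
    kept = [
        line
        for line in ps.splitlines(keepends=True)
        if not PERSONAL_COMMENT.match(line)
    ]
    return b"".join(kept)
```

Run it on the output of pdf2ps, write the result to a new file, and feed that to ps2pdf. Afterwards, search the new PDF for your name again to make sure nothing survived.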

malzraa
March 26th, 2006, 05:36 PM
I would recommend OpenOffice, just don't use any actual personal info. It can export directly to PDF. www.openoffice.org
PS- Plus, it is open-source!

nbk2000
May 12th, 2006, 12:25 AM
I saw a story on TV at work about a local kid who's now fucked for life.

School cop is surfing one of those emo-blog sites when he sees a picture of a kid he knows from the school smoking a bong.

This gives school reason to search his locker and backpack.

They find marijuana.

This gives probable cause for a search warrant of his house.

There, they find a scale, baggies, knives, chemicals, and THE ANARCHIST COOKBOOK! :eek:

I love the way the reporter says:

"...was there something more sinister planned?"

*Once they say that, you already know where this case is going...*

Then comes the spiel from the cops about how "Building smoke bombs or small explosive devices pose a great hazard to the community".

About how the kid (18 y.o.) "May not have had any immediate plans for a school assault (WTF?!), but we'll be investigating this more carefully now."

Oh shit...

This kid is fucked, because they've got him charged with possession of narcotics for sale, possession of explosive precursors, and several other felony charges.

Oh, and a quarter million ($250,000) for bail.

All because he thought it'd be cool to post a picture of him smoking pot on the internet. :rolleyes:

And just think what he'd be in for if he had posted pictures of him holding a BOMB and not a BONG.

And all you newbies who've posted such pictures here in the past wonder why we immediately ban you for doing so. :)

At least the emo-blog site is going to catch the flak on this one, not us. :p

ShadowMyGeekSpace
May 12th, 2006, 01:30 AM
Hey, I'm all for letting people smoke up if they are responsible about it; part of that responsibility is not advertising it like a fricken moron, all over myspace or something. If the kid DID do that, he deserved to get caught... at least for the weed. It is sort of stupid, the smoke bomb propaganda charges and stuff... although the kid would probably have killed himself trying anything from the AACB, so I guess they probably saved his life or something indirectly.

nbk2000
May 12th, 2006, 01:53 AM
He did, so he does.

And they may have saved his life, but no one's going to save his ass in prison! :p

ShadowMyGeekSpace
May 12th, 2006, 02:12 AM
Depends, maybe he can woo the fellow residents with his kewl stories, nbk2000.

Pubocyno
May 15th, 2006, 06:29 PM
Here's a free tool (for personal use) that can analyze and remove involuntary metadata from .doc files.

DocScrubber - http://www.docscrubber.com/

Even if your doc is scrubbed of any personal information, it should still be considered an "unsafe" format, as anyone can edit it and upload it as his own, thereby possibly "tainting" the contents.

The PDF Format is by comparison designed not to be edited after it is compiled.
After installing PDFCreator (also free), it will be available in every program that supports printing.

PDFCreator - http://sector7g.wurzel6.de/pdfcreator/index_en.htm
Direct Download - http://sourceforge.net/projects/pdfcreator/


As a sidenote, do all the new people here get paranoid about being banned whenever they post something, or is it just me?

megalomania
May 16th, 2006, 02:06 PM
Before the (un)Official Forum FTP is reborn I will see to it that PDF metadata is scrubbed of any personally identifying information. Verypdf has a tool, PDF Metadata - Advanced PDF Tools, that can change all metadata in a PDF. It even works in batch to change multiple files at once.

I have attached a few files that may be of interest. The US government has even made one just this Xmas. Better late than never, but it is simple to follow. See "Redacting with Confidence: How to Safely Publish Sanitized Reports Converted From Word to PDF" for the how-to.

From this article at PlanetPDF, http://www.planetpdf.com/enterprise/article.asp?ContentID=6877, using PDF files may not be so bad, but they are still not totally secure if converted wrong.

sprocket
May 16th, 2006, 03:51 PM
I was just wondering about journal articles in PDF-format downloaded from Wiley, ACS, etc. Do they add any user/download specific information to the PDF to identify people spreading the articles?

If we're uncertain it might be a good idea to have two different people download the same articles and compare them. That way you could tell if there's such a fingerprint in place right now.
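
The comparison itself is easy to script. A minimal Python sketch (the function name is made up): hash both copies, and if they differ, list the first byte positions where they disagree so you know where to look. A difference by itself proves nothing, since it could just as well be a timestamp as a fingerprint.

```python
import hashlib


def compare_downloads(a: bytes, b: bytes, limit: int = 20) -> dict:
    """Compare two copies of the same article, byte for byte."""
    identical = hashlib.sha256(a).digest() == hashlib.sha256(b).digest()
    # Positions where the two copies disagree, over their common length.
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return {
        "identical": identical,
        "length_a": len(a),
        "length_b": len(b),
        "first_diffs": diffs[:limit],
    }
```

Each person reads their copy with `open(path, "rb").read()` and the two byte strings go into the function; the offsets in `first_diffs` tell you which part of the file to inspect in a hex or text viewer.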

nbk2000
May 16th, 2006, 07:47 PM
I wouldn't put it past them to do such a thing. Of course, if you export all pages as TIFF images and reconvert them to PDF, there goes any hidden 'fingerprints'. :p

megalomania
May 17th, 2006, 06:12 PM
I just checked the metadata for a few PDF journals from ACS and Elsevier. I didn't find any kind of useful metadata at all. There was a "modified date" metadata category that may be useful if they keep logs of exactly what time a file was downloaded.

I do most of my journal downloading at a university computer terminal that is open to the public. No signing in, no cameras, just a truly anonymous terminal. The best they could find out is what university it was downloaded from.

FUTI
May 18th, 2006, 04:09 PM
In my case, Mega, that would narrow the search from 10,000,000 to approximately 5,000, and then to about 700 chemists. If they find a way to narrow the search to those who might be interested in the subject, I would end up on a very short list. Not that I have to be scared of that, since I haven't done anything wrong (and I don't plan to).

This is OT, but... search is the key to those activities. The NSA has started logging all phone calls. Why? Well, let's see... It is reasonable to assume they always have a list of guys that they should watch closely. A person can have contact with only a limited number of other people. By finding out who those are (and how frequently they are in contact, to exclude *sorry, wrong number* calls, if it is true that they don't record the voice conversations completely), they can start to profile his activity and possible courses of action, and estimate the threat he poses to *security*. I guess they can try to expand the search to find the *area of influence* of those people out to some reasonable number, like 3 *hyperlinks* in length (I just coined the term off the top of my head... please propose a more suitable one). Person A called person B, who called person C, who at the end of the search phoned person D. I think it is reasonable to assume they don't make larger searches since... well, theory says there are only 6 people between me and Mega or NBK, but we have never spoken on the phone (or relayed voice messages to each other), so I guess it would be a BIG overestimate to expand that number to 6-person *chains*, as it would include about 66.7% of mankind and would lose its purpose.
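
To make the A, B, C, D chain concrete, here is a toy Python sketch of such a search: a breadth-first walk over a who-called-whom table that stops after 3 hops. The data layout and the names are invented for illustration, and I have no idea how such searches are really done.

```python
from collections import deque


def within_hops(calls: dict[str, set[str]], start: str, max_hops: int = 3) -> dict[str, int]:
    """Return everyone reachable from start in at most max_hops calls, with hop counts."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if distance[person] == max_hops:
            continue  # do not follow the chain any further
        for contact in calls.get(person, ()):
            if contact not in distance:
                distance[contact] = distance[person] + 1
                queue.append(contact)
    return distance


calls = {"A": {"B"}, "B": {"C"}, "C": {"D"}, "D": {"E"}}
print(within_hops(calls, "A"))
```

With the table above, person E is 4 calls away from A and therefore falls outside the search.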

hammer
February 25th, 2007, 12:49 AM
You should also right-click all .doc and .pdf files and go to Properties; there you can see some details about the author, etc. Also, in Word you can change your registered name under Settings or Tools.

Ubermensch
January 15th, 2008, 02:54 AM
Check out "the revisionist".
"The Revisionist is a tool for extracting and indexing hidden metadata (such as deleted or modified text) from large collections of MS Word files. It can operate whole Web sites or SMB or NFS directories. It is handy for pen-testing, or it can be used just to spot embarrassing secrets."-Darknet

http://lcamtuf.coredump.cx/strikeout/

It is similar to Metagoofil

http://www.edge-security.com/metagoofil.php

Good luck and happy mining!

parmegianno
February 2nd, 2008, 10:52 AM
Ubermensch, the problem is that you can never know what kind of data is hidden in binary files. There could be a lot of hidden information which nobody has been able to identify yet (think about storing data in a more or less encrypted form).

syntaxnero
September 18th, 2008, 04:49 PM
I use Doc Scrubber myself.


Features

Analyze Word Documents
And discover hidden or potentially embarrassing data they may contain.

Scrub Word Documents
Remove hidden or potentially embarrassing data from your documents.

Scrub Multiple Documents at a Time
Scrub selected Word documents in a folder, or all documents in a folder, all at once - saving you time and effort.

Tested Compatibility with Word 97, 2000, and XP documents
Doc Scrubber can clean documents from multiple versions of Word.

Free for personal and educational use
Doc Scrubber is completely free for personal and educational use! (Business users must purchase an appropriate license.)

http://www.javacoolsoftware.com/docscrubber/index.html



Some other good things to have on hand are:
http://www.javacoolsoftware.com/products.html