Re: searching non plain text files




On Sun, 2018-12-16 at 14:33 -0500, Tedd Sperling wrote:
> > On Dec 14, 2018, at 11:19 PM, Jeffry Killen <jekillen@xxxxxxxxxxx> wrote:
> > 
> > 
> > Can anyone point me to instruction/advice about
> > opening and reading files that are not plain text:
> 
> Jeffry:
> 
> I don’t know if this will help you, but most “honest” files have a header that
> states what they are.
> 
> Try using a hex-editor app and looking at the first 10 characters of the
> header (i.e., the start of the file). For example, a PDF file will state ‘PDF’,
> a jpg will state ‘JFIF’, an rtf file will state ‘rtf’, a zip file will state
> ‘PK’, a png will state ‘PNG’, and so on.
> 
> From that observation, you might try working with bin2hex or dechex and other
> such bin/hex functions PHP provides to compare the file type given in the
> header with the one implied by the extension.
> 
> Additionally, you can always Google it.
> 
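The header check Tedd describes can be sketched in a few lines. This is only
an illustration under stated assumptions: it is written in Python rather than
PHP, the names SIGNATURES, sniff_bytes and matches_extension are made up for
this example, and the table covers only the formats mentioned above. Note that
the bytes on disk differ a little from the readable text: a PDF starts with
'%PDF', a PNG has byte 0x89 before 'PNG', an RTF starts with '{\rtf', and a
JPEG starts with bytes FF D8 FF, with 'JFIF' appearing a few bytes later in
JFIF files.

```python
import os
from typing import Optional

# Illustrative table only: (leading bytes, short type name).
SIGNATURES = [
    (b"%PDF", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpg"),   # 'JFIF' follows a few bytes later in JFIF files
    (b"{\\rtf", "rtf"),
    (b"PK", "zip"),             # also matches other zip-based containers
]

# Extensions that name the same format under a different spelling.
ALIASES = {"jpeg": "jpg"}


def sniff_bytes(head: bytes) -> Optional[str]:
    """Return a short type name for the leading bytes, or None if unknown."""
    for magic, name in SIGNATURES:
        if head.startswith(magic):
            return name
    return None


def matches_extension(filename: str, head: bytes) -> bool:
    """Compare the type implied by the extension with the sniffed type."""
    ext = os.path.splitext(filename)[1].lower().lstrip(".")
    ext = ALIASES.get(ext, ext)
    return sniff_bytes(head) == ext


def sniff_file(path: str, nbytes: int = 16) -> Optional[str]:
    """Read only the first few bytes of a file and sniff them."""
    with open(path, "rb") as fh:
        return sniff_bytes(fh.read(nbytes))
```

Reading only the first handful of bytes keeps the check cheap even for large
files. A mismatch between header and extension is a hint, not proof, that the
file is mislabelled.
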
Just to confuse the issue, both zip files and odt (OpenDocument) files have the
'PK' header id. Looking around for some other formats, I found that most tgz
files (tar, gzipped) have 'vV' followed by the file name (but not all of them!).
So Jeffry will have to do some sort of a secondary test to be sure.
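
One possible secondary test, sketched in Python rather than PHP and resting
on stated assumptions: an OpenDocument file is a zip archive that carries an
entry named 'mimetype' whose content starts with
'application/vnd.oasis.opendocument.', so opening the archive and reading that
entry separates odt from plain zip. The function names classify_pk and
looks_gzipped are hypothetical. For gzipped data, the gzip format itself
begins with the two bytes 1F 8B, which may be a steadier marker than any
readable text near the start of the file.

```python
import io
import zipfile

ODF_PREFIX = "application/vnd.oasis.opendocument."


def classify_pk(data: bytes) -> str:
    """Secondary test for data whose header starts with 'PK'."""
    if not data.startswith(b"PK"):
        return "not-zip"
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            if "mimetype" in zf.namelist():
                mime = zf.read("mimetype").decode("ascii", "replace").strip()
                if mime.startswith(ODF_PREFIX):
                    # e.g. application/vnd.oasis.opendocument.text for odt
                    return mime
    except zipfile.BadZipFile:
        # Starts with 'PK' but is not a readable zip archive.
        return "not-zip"
    return "application/zip"


def looks_gzipped(data: bytes) -> bool:
    """True if the data begins with the gzip magic bytes 1F 8B."""
    return data[:2] == b"\x1f\x8b"
```

This sketch reads the whole file into memory for simplicity; for large
archives, passing a file path to zipfile.ZipFile directly would avoid that.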

John
=============================
> Cheers,
> 
> Tedd
> 
> _______________
> tedd sperling
> tedd@xxxxxxxxxxxx