
Re: Microsoft Does It Again




	Hi.

On Tue, Aug 21, 2018 at 06:28:57PM +0200, tomas@xxxxxxxxxx wrote:
> On Tue, Aug 21, 2018 at 07:02:32PM +0300, Reco wrote:
> > On Tue, Aug 21, 2018 at 05:48:31PM +0200, tomas@xxxxxxxxxx wrote:
> 
> [...]
> 
> > >   tomas@trotzki:~$ apt search ooxml
> > >   Sorting... Done
> > >   Full Text Search... Done
> > >   docx2txt/stable,stable,stable 1.4-0.1 all
> > >     Convert Microsoft OOXML files to plain text
> > 
> > Not relevant. Input is xlsx.
> 
> Well, xlsx *is* OOXML (I like to call it "MOOXML" as in
> "Microsoft's..." -- you get the idea :)

That's like saying that apples and oranges are both fruits.
I.e. it's true, but one does not usually compare apples to oranges.

Both docx and xlsx are zip archives with xml inside. Their parsing is
different, and applying parsing rules from one to another yields no
useful result.
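
To illustrate the point, here is a rough Python sketch that tells the two
apart by looking inside the zip. It assumes the conventional main part
names (word/document.xml for docx, xl/workbook.xml for xlsx); strictly
speaking the real location is declared in _rels/.rels, so treat this as a
heuristic. The names ooxml_kind and fake_package are made up for this
example, and the archives are toy stand-ins, not real documents.

```python
import io
import zipfile

# Conventional main-part names (an assumption; a package may declare
# a different location in _rels/.rels).
MAIN_PARTS = {
    "word/document.xml": "docx",
    "xl/workbook.xml": "xlsx",
}


def ooxml_kind(data: bytes) -> str:
    """Guess the OOXML flavour of a zip archive from its member names."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
    for part, kind in MAIN_PARTS.items():
        if part in names:
            return kind
    return "unknown"


def fake_package(main_part: str) -> bytes:
    """Build a toy zip with a single XML member, standing in for a real file."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(main_part, "<placeholder/>")
    return buf.getvalue()


if __name__ == "__main__":
    print(ooxml_kind(fake_package("word/document.xml")))
    print(ooxml_kind(fake_package("xl/workbook.xml")))
```

Same container, different contents: a tool written to walk
word/document.xml has nothing to work with in an xlsx.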

Parsing docx is easy, even I can do it (and did it, actually).
Parsing xlsx with all its gross formulas (sp?), macros and arcane date
formats is the definition of pain. I gave up on it and became a happy
xlsx2csv user.
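
As a taste of the date problem, here is a minimal Python sketch of
converting a spreadsheet serial number to a date. It assumes the workbook
uses the 1900 date system (a workbook can also be set to the 1904 system,
which this ignores), and it only helps once you already know that a cell
holds a date, which is a separate problem. The function name
excel_serial_to_datetime is made up for this example.

```python
from datetime import datetime, timedelta


def excel_serial_to_datetime(serial: float) -> datetime:
    """Convert a 1900-date-system serial number to a datetime.

    Serial 1 is 1900-01-01. Serial 60 stands for 1900-02-29, a day that
    never existed, so everything from serial 61 onward is shifted by one.
    """
    if serial < 1:
        raise ValueError("serials below 1 are not handled in this sketch")
    if 60 <= serial < 61:
        raise ValueError("serial 60 is the nonexistent 1900-02-29")
    # Pick the epoch so that the phantom leap day is skipped.
    epoch = datetime(1899, 12, 31) if serial < 60 else datetime(1899, 12, 30)
    return epoch + timedelta(days=serial)


if __name__ == "__main__":
    print(excel_serial_to_datetime(43333).date())
```

The fractional part of the serial is the time of day, so 43333.5 would be
noon on the same date.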

Reco