Is it possible to import dates and times directly as AbsoluteTime and by pass DateLists?

The data in the file test.xls are

03/Jan/2000   45.46
  04/Jan/2000   43.92
  05/Jan/2000   44.38
  06/Jan/2000   42.9
  07/Jan/2000   43.46
  10/Jan/2000   43.78
  11/Jan/2000   42.65
  12/Jan/2000   41.26
  13/Jan/2000   42.04
  14/Jan/2000   43.78

An alternative approach is to exploit the fact that Office documents are zipped collections of XML files. So,

Step 1: rename the source file by adding .zip to the file name: test.xlsx.zip.

Step 2: Import the appropriate xml file in the zip file, extract the data elements and re-format:

 Cases[Import["C:\\ your directory \\test.xlsx.zip", {"ZIP", "xl\\worksheets\\sheet1.xml"}], 
  XMLElement["v", {}, {value_}] :> value, Infinity] 
  // Partition[#, 2] &

This gives:

{{"36528", "45.46"}, {"36529", "43.92"}, {"36530", "44.38"}, {"36531", "42.9"}, {"36532", "43.46"}, {"36535", "43.78"}, {"36536", "42.65"}, {"36537", "41.26"}, {"36538", "42.04"}, {"36539", "43.78"}}

where the first entry in each sublist is Excel's DATEVALUE (serial date number that counts the number of days from 1/1/1900).

Puzzle: I would expect that converting Excel's DATEVALUE to Mathematica's AbsoluteTime (number of seconds from 1/1/1900) would be as simple as multiplying the former by 24*60*60. But doing that with:

  excelDateValues = {"36528", "36529", "36530", "36531", "36532", "36535", "36536", "36537", "36538", "36539"}

and

  DateList /@ (24*60*60*ToExpression@excelDateValues)

gives

{{2000, 1, 5, 0, 0, 0.}, {2000, 1, 6, 0, 0, 0.}, {2000, 1, 7, 0, 0, 0.}, 
  {2000, 1, 8, 0, 0, 0.}, {2000, 1, 9, 0, 0, 0.}, {2000, 1, 12,  0, 0, 0.}, 
  {2000, 1, 13, 0, 0, 0.}, {2000, 1, 14, 0, 0, 0.}, {2000, 1, 15, 0, 0, 0.}, 
  {2000, 1, 16, 0, 0, 0.}}

which is off by two days. Hopefully, there is a less naive approach to the get the right conversion factor to go from excel Datevalues to Mma AbsoluteTime so that a modified version of Cases[] above gives the desired result.

Puzzle resolved: Thanks to Mr.Wizard's reference, the historical background to the two-day discrepancy is explained beautifully in Joel Spolsky's great story . So, unless your data does contain dates going back early 1900's for most cases just subtracting 2 from final output dates should be ok. But ... things can get more complicated considering possible excel date system settings and varying defaults accross OSs. (see XL 1900 and 1904 date systems)

EDIT: Import uses the filename extension if no format is provided as the second argument. For zip files it returns the filenames in the zipped archive. For the example case

   Import["C:\\ your directory \\test.xlsx.zip"]

returns

 {"[Content_Types].xml", "_rels\\.rels","xl\\_rels\\workbook.xml.rels", "xl\\workbook.xml", "xl\\styles.xml",  "xl\\worksheets\\sheet1.xml", "xl\\theme\\theme1.xml", "customXml\\item1.xml", "customXml\\_rels\\item1.xml.rels", "customXml\\_rels\\item2.xml.rels", "docProps\\app.xml", "customXml\\itemProps2.xml", "customXml\\item2.xml", "customXml\\itemProps1.xml", "docProps\\core.xml"}

Unfortunately, according to the documentation, Import automatically converts things that look like dates in an Excel spreadsheet to DateList format.

I tried messing about with alternative ways of importing such as realrawdata = ("Data" /. Import["Data for graphs.xlsx", "Rules"])[[1]], but no luck.

It might be possible to develop your own import converter, but this seems like an awful lot of work.


I have not yet figured out how to change or configure the J/Link modules that do the actual translation, but there is a wasteful (from your perspective) application of ToDate which we can bypass, thereby returning dates in AbsoluteTime format and saving a bit of time during the Import. (specifying the format is a bit faster than letting Mathematica detect it.)

Block[{ToDate = Identity},
  Import["test.xls", "XLS"]
]

The low level conversion function appears to be inside:

\SystemFiles\Converters\Java\Convert.jar\System\Convert\ExcellDump.class

I don't know how to decompile a .class file much less how to write one, but it appears to be quite simple.