[openstack-dev] [nova] Timestamp formats in the REST API
Mark McLoughlin
markmc at redhat.com
Tue Apr 29 13:48:37 UTC 2014
Hey
In this patch:
https://review.openstack.org/83681
by Ghanshyam Mann, we encountered an unusual situation where a timestamp
in the returned XML looked like this:
2014-04-08 09:00:14.399708+00:00
What appeared to be unusual was that the timestamp had both sub-second
time resolution and timezone information. It was felt that this wasn't a
valid timestamp format, and there was then some debate about how to 'fix' it:
https://review.openstack.org/87563
Anyway, this led me down a bit of a rabbit hole, so I'm going to
attempt to document some findings.
Firstly, some definitions:
- Python's datetime module talks about datetime objects being 'naive'
or 'aware'
https://docs.python.org/2.7/library/datetime.html
"A datetime object d is aware if d.tzinfo is not None and
d.tzinfo.utcoffset(d) does not return None. If d.tzinfo is None,
or if d.tzinfo is not None but d.tzinfo.utcoffset(d) returns
None, d is naive."
(Most people will have encountered this already, but I'm including
it for completeness)
- The ISO8601 time and date format specifies timestamps like this:
2014-04-29T11:37:00Z
with many variations. One distinguishing aspect of the ISO8601
format is the 'T' separating date and time. RFC3339 is very closely
related and serves as easily accessible documentation of the format:
http://www.ietf.org/rfc/rfc3339.txt
- The Python iso8601 library allows parsing this time format, but
also allows subtle variations that don't conform to the standard
like omitting the 'T' separator:
>>> import iso8601
>>> iso8601.parse_date('2014-04-29 11:37:00Z')
datetime.datetime(2014, 4, 29, 11, 37, tzinfo=<iso8601.iso8601.Utc object at 0x214b050>)
Presumably this is for the pragmatic reason that when you stringify
a datetime object, the resulting string uses ' ' as a separator:
>>> import datetime
>>> str(datetime.datetime(2014, 4, 29, 11, 37))
'2014-04-29 11:37:00'
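As a rough illustration of the naive/aware distinction, and of how str() ends up producing the timestamp from the review, here is a minimal sketch. It uses the standard library's timezone.utc rather than the iso8601 library's Utc class, which is an assumption made only to keep the example self-contained; is_aware() is a hypothetical helper that restates the rule quoted from the Python docs.

```python
from datetime import datetime, timezone


def is_aware(d):
    """Apply the rule quoted from the datetime documentation."""
    return d.tzinfo is not None and d.tzinfo.utcoffset(d) is not None


# Same instant as the timestamp seen in the review, first without tzinfo.
naive = datetime(2014, 4, 8, 9, 0, 14, 399708)
# Attaching UTC makes the object aware without changing the clock values.
aware = naive.replace(tzinfo=timezone.utc)

print(is_aware(naive))  # False
print(is_aware(aware))  # True
print(str(naive))       # 2014-04-08 09:00:14.399708
print(str(aware))       # 2014-04-08 09:00:14.399708+00:00
```

The last line matches the string that showed up in the XML: an aware datetime with non-zero microseconds, stringified with str().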
And now some observations on what's going on in Nova:
- We don't store timezone information in the database, but all our
timestamps are relative to UTC nonetheless.
- The objects code automatically adds the UTC timezone to naive datetime
objects:
if value.utcoffset() is None:
value = value.replace(tzinfo=iso8601.iso8601.Utc())
so code that is ported to objects may now be using aware datetime
objects where it was previously using naive objects.
- Whether we store sub-second resolution timestamps in the database
appears to be database specific. In my quick tests, we store that
information in sqlite but not MySQL.
- However, timestamps added by SQLAlchemy when you do e.g. save() do
include sub-second information, so some DB API calls may return
sub-second timestamps even when that information isn't stored in
the database.
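To make the combined effect of these observations concrete, here is a small sketch. coerce_to_utc() mirrors the snippet from the objects code, but with the standard library's timezone.utc swapped in for iso8601.iso8601.Utc() (an assumption for self-containment). drop_subseconds() is a hypothetical stand-in for a round trip through a database that doesn't keep sub-second resolution; it is not how any real backend is implemented.

```python
from datetime import datetime, timezone


def coerce_to_utc(value):
    """Attach UTC to a naive datetime; leave aware datetimes untouched."""
    if value.utcoffset() is None:
        value = value.replace(tzinfo=timezone.utc)
    return value


def drop_subseconds(value):
    """Hypothetical model of a column that stores whole seconds only."""
    return value.replace(microsecond=0)


# A timestamp as it might look in memory right after a save().
in_memory = datetime(2014, 4, 8, 9, 0, 14, 399708)
# The same timestamp after being re-read from a whole-second column.
reloaded = drop_subseconds(in_memory)

print(str(coerce_to_utc(in_memory)))  # 2014-04-08 09:00:14.399708+00:00
print(str(coerce_to_utc(reloaded)))   # 2014-04-08 09:00:14+00:00
```

The same logical timestamp can therefore stringify differently depending on whether it came straight from a save() or from a later read.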
In our REST APIs, you'll essentially see one of three time formats. I'm
calling them 'isotime', 'strtime' and 'xmltime':
- 'isotime' - this is the result from timeutils.isotime(). It
includes timezone information (i.e. a 'Z' suffix) but not
microseconds. You'll see this in places where we stringify the
datetime objects in the API layer using isotime() before passing
them to the JSON/XML serializers.
- 'strtime' - this is the result from timeutils.strtime(). It doesn't
include timezone information but does include decimal seconds. This
is what jsonutils.dumps() uses when we're serializing API responses.
- 'xmltime' or 'str(datetime)' format - this is just what you get
when you stringify a datetime using str(). If the datetime is tz
aware or includes non-zero microseconds, then that information will
be included in the result. This is a significant difference versus
the other two formats where it is clear whether tz and microsecond
information is included in the string.
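The three formats can be approximated side by side as follows. The *_like functions are hypothetical stand-ins, not the real timeutils code, and the strftime patterns are assumptions chosen to match the descriptions above ('Z' suffix and no microseconds for isotime, decimal seconds and no timezone for strtime). isotime_like() also assumes its input is already in UTC.

```python
from datetime import datetime, timezone


def isotime_like(at):
    """Approximate 'isotime': whole seconds plus a 'Z' suffix (assumes UTC)."""
    return at.strftime('%Y-%m-%dT%H:%M:%S') + 'Z'


def strtime_like(at):
    """Approximate 'strtime': decimal seconds, no timezone information."""
    return at.strftime('%Y-%m-%dT%H:%M:%S.%f')


def xmltime_like(at):
    """'xmltime': whatever str() produces for this particular object."""
    return str(at)


aware = datetime(2014, 4, 8, 9, 0, 14, 399708, tzinfo=timezone.utc)
naive_whole = datetime(2014, 4, 8, 9, 0, 14)

print(isotime_like(aware))        # 2014-04-08T09:00:14Z
print(strtime_like(aware))        # 2014-04-08T09:00:14.399708
print(xmltime_like(aware))        # 2014-04-08 09:00:14.399708+00:00
print(xmltime_like(naive_whole))  # 2014-04-08 09:00:14
```

The first two always have the same shape regardless of input, while the str() form changes with the object's tzinfo and microsecond values.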
There are some caveats, though:
- I don't know how significant it is these days, but timestamps will
be serialized to strtime format when going over RPC, but won't be
de-serialized on the remote end. This could lead to a situation
where the API layer tries to stringify a strtime formatted string
using timeutils.isotime(). (See above for a description of those
formats.)
- In at least one place - e.g. the 'updated' timestamp for v2
extensions - we hardcode the timestamps as strings in the code and
don't currently use one of the formats above.
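The RPC caveat can be sketched like this. The names and the format string are assumptions for illustration: isotime_like() stands in for timeutils.isotime(), STRTIME_FORMAT is an assumed pattern for the strtime format, and ensure_datetime() is a hypothetical helper showing one way the receiving side could cope. It is not something Nova is known to do.

```python
from datetime import datetime

# Assumed pattern for the 'strtime' format (decimal seconds, no timezone).
STRTIME_FORMAT = '%Y-%m-%dT%H:%M:%S.%f'


def isotime_like(at):
    """Stand-in for an isotime-style formatter; expects a datetime."""
    return at.strftime('%Y-%m-%dT%H:%M:%S') + 'Z'


def ensure_datetime(value):
    """Hypothetical helper: parse strtime-style strings back to datetimes."""
    if isinstance(value, str):
        return datetime.strptime(value, STRTIME_FORMAT)
    return value


# What arrives on the remote end if nothing de-serializes the timestamp.
over_rpc = datetime(2014, 4, 8, 9, 0, 14, 399708).strftime(STRTIME_FORMAT)

try:
    isotime_like(over_rpc)
except AttributeError as exc:
    # A plain string has no strftime(), so formatting it fails outright.
    print('cannot format a plain string:', exc)

print(isotime_like(ensure_datetime(over_rpc)))  # 2014-04-08T09:00:14Z
```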
My conclusions from all that:
1) This sucks
2) At the very least, we should be clear in our API samples tests
which of the three formats we expect - we should only change the
format used in a given part of the API after weighing the
compatibility implications.
3) We should unify on a single format in the v3 API - IMHO, we should
be explicit about use of the UTC timezone and we should avoid
including microseconds unless there's a clear use case. In other
words, we should use the 'isotime' format.
4) The 'xmltime' format is just a dumb historical mistake and since
XML support is now firmly out of favor, let's not waste time
improving the timestamp situation in XML.
5) We should at least consider moving to a single format in the v2
(JSON) API. IMHO, moving from strtime to isotime for fields like
created_at and updated_at would be highly unlikely to cause any
real issues for API users.
(Following up this email with some patches that I'll link to, but I want
to link to this email from the patches themselves)
Mark.