How to use Python dateutil with Unix timestamps

When trying to make things work with the datetime module, most Python users have reached a point where they resort to guess-and-check until the errors go away. datetime is one of those APIs that seems easy to use, but it requires the developer to have a deep understanding of what a few things actually mean. Otherwise, given the complexity of date- and time-related issues, introducing unexpected bugs is easy.

  • Time standards
    • UT1
    • TAI
    • UTC
    • How all this plays together on your computer
  • Time zones
  • DST
  • Serializing your datetime objects
    • String
    • Integer
    • Objects
  • Wall times
  • Differences when working with pytz
  • Quick tips
  • Libraries worth mentioning

Time standards

The first concept to grasp when working with time is a standard that defines how we measure units of time. In the same way that we have standards that define the kilogram or the meter for measuring weight or length, we need an accurate way to define what a "second" means. We can then build other time references (such as days, weeks, or years) as multiples of the second by using a calendar standard (see the Gregorian calendar as an example).

UT1

One of the simplest ways to measure a second is as a fraction of the day, given that we can reliably expect the sun to rise and set every day (in most places). This gave birth to Universal Time (UT1), the successor of GMT (Greenwich Mean Time). Today, we use stars and quasars to measure how long it takes the Earth to perform a full rotation on its axis. Even if this seems precise enough, it still has issues: because of the gravitational pull of the moon, tides, and earthquakes, the length of the day changes throughout the year. Although this is not an issue for most applications, it becomes a non-trivial problem when we require really precise measurements. GPS triangulation is a good example of a time-sensitive process, in which being a second off results in a completely different location on the globe.

TAI

As a result, International Atomic Time (TAI) was designed to be as precise as possible. Using atomic clocks in multiple laboratories across the earth, we get the most accurate and constant measure of the second, which allows us to compute time intervals with the highest accuracy. This precision is both a blessing and a curse, as TAI is so precise that it deviates from UT1 (or what we call civil time). This means that our clock noon would eventually deviate substantially from solar noon.

UTC

That led to the development of Coordinated Universal Time (UTC), which brings together the best of both standards. UTC uses the second as defined by TAI, which allows for accurate measurement of time, and introduces leap seconds to ensure that it never deviates from UT1 by more than 0.9 seconds.

How all this plays together on your computer

With all this background, you should now be able to understand how the operating system serves the time at any given moment. Computers do not have an atomic clock inside; instead, they use an internal clock that is synchronized with the rest of the world via the Network Time Protocol (NTP).

In Unix-like systems, the most common way to measure time is POSIX time, which is defined as the number of seconds that have passed since the Unix epoch (Thursday, January 1, 1970, 00:00:00 UTC), without taking leap seconds into account. As POSIX time does not handle leap seconds (and neither does Python), some companies have defined their own way of handling time by smearing the leap second across the time around it through their NTP servers (see Google time as an example).
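
As a small illustration of that definition, the sketch below uses only the standard library to show that Unix time 0 maps to the epoch and that POSIX time counts every day as exactly 86,400 seconds, even a day that contained a leap second. The helper name `posix_seconds` is made up for this example.

```python
import datetime as dt

UTC = dt.timezone.utc
EPOCH = dt.datetime(1970, 1, 1, tzinfo=UTC)


def posix_seconds(moment):
    """Seconds since the Unix epoch for an aware datetime, ignoring leap seconds."""
    return (moment - EPOCH).total_seconds()


# Unix time 0 is the epoch itself.
print(dt.datetime.fromtimestamp(0, UTC).isoformat())  # 1970-01-01T00:00:00+00:00

# A leap second was inserted at the end of 2016-12-31, yet POSIX time
# (and Python) still count that day as exactly 86,400 seconds.
day_start = dt.datetime(2016, 12, 31, tzinfo=UTC)
day_end = dt.datetime(2017, 1, 1, tzinfo=UTC)
print(posix_seconds(day_end) - posix_seconds(day_start))  # 86400.0
```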

Time zones

I have explained what UTC is and how it allows us to define dates and times, but countries like their wall-clock noon to match solar noon, so that the sun is at the top of the sky at 12 PM. That is why UTC defines offsets, so we can have 12 PM with an offset of +4 hours from UTC. This effectively means that the same instant without the offset, expressed in UTC, is 8 AM.
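
The offset example can be reproduced with the standard library alone. This is a minimal sketch using a fixed +4 hour offset; the date is arbitrary and chosen only for illustration.

```python
import datetime as dt

# A fixed offset of +4 hours from UTC (no DST rules, just an offset).
plus_four = dt.timezone(dt.timedelta(hours=4))

# 12 PM on the wall clock at UTC+4 ...
local_noon = dt.datetime(2017, 4, 15, 12, 0, tzinfo=plus_four)

# ... is the same instant as 8 AM in UTC.
as_utc = local_noon.astimezone(dt.timezone.utc)
print(local_noon.isoformat())  # 2017-04-15T12:00:00+04:00
print(as_utc.isoformat())      # 2017-04-15T08:00:00+00:00
```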

Governments define the standard offset from UTC that a geographical region follows, effectively creating a time zone. The most common database of time zones is known as the Olson database (also called the IANA time zone database). It can be accessed in Python using dateutil.tz:

>>> from dateutil.tz import gettz
>>> gettz("Europe/Madrid")

The result of gettz gives us an object that we can use to create time-zone-aware dates in Python:

>>> import datetime as dt
>>> dt.datetime.now().isoformat()
'2017-04-15T14:16:56.551778'  # This is a naive datetime
>>> dt.datetime.now(gettz("Europe/Madrid")).isoformat()
'2017-04-15T14:17:01.256587+02:00'  # This is a tz aware datetime, always prefer these

We can see how to get the current time via the now function of datetime. On the second call we pass a tzinfo object which sets the time zone and displays the offset in the ISO string representation of that datetime.

Should we want to use just plain UTC in Python 3, we don’t need any external libraries:

>>> dt.datetime.now(dt.timezone.utc).isoformat()
'2017-04-15T12:22:06.637355+00:00'

DST

Once we grasp all this knowledge, we might feel prepared to work with time zones, but we must be aware of one more thing that happens in some time zones: Daylight Saving Time (DST).

The countries that follow DST will move their clocks one hour forward in spring, and one hour backward in autumn to return to the standard time of the time zone. This effectively implies that a single time zone can have multiple offsets, as we can see in the following example:

>>> dt.datetime(2017, 7, 1, tzinfo=dt.timezone.utc).astimezone(gettz("Europe/Madrid")).isoformat()
'2017-07-01T02:00:00+02:00'
>>> dt.datetime(2017, 1, 1, tzinfo=dt.timezone.utc).astimezone(gettz("Europe/Madrid")).isoformat()
'2017-01-01T01:00:00+01:00'

This gives us days that are made of 23 or 25 hours, resulting in really interesting time arithmetic. Depending on the time and the time zone, adding a day does not necessarily mean adding 24 hours:

>>> today = dt.datetime(2017, 10, 29, tzinfo=gettz("Europe/Madrid"))
>>> tomorrow = today + dt.timedelta(days=1)
>>> tomorrow.astimezone(dt.timezone.utc) - today.astimezone(dt.timezone.utc)
datetime.timedelta(1, 3600)  # We've added 25 hours

When working with timestamps, the best strategy is to use time zones that are not affected by DST (ideally UTC+00:00).
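
One way to apply this advice is to do the arithmetic in UTC and convert back to the local zone only for display. The sketch below assumes dateutil is installed; `add_absolute` is a hypothetical helper name, not part of any library.

```python
import datetime as dt
from dateutil.tz import gettz

MADRID = gettz("Europe/Madrid")


def add_absolute(moment, delta):
    """Add elapsed time by going through UTC, then return to the original zone."""
    shifted = moment.astimezone(dt.timezone.utc) + delta
    return shifted.astimezone(moment.tzinfo)


today = dt.datetime(2017, 10, 29, tzinfo=MADRID)  # the day DST ended in Spain

wall = today + dt.timedelta(days=1)                    # same wall-clock time, next day
absolute = add_absolute(today, dt.timedelta(days=1))   # exactly 24 elapsed hours

print(wall.isoformat())      # 2017-10-30T00:00:00+01:00
print(absolute.isoformat())  # 2017-10-29T23:00:00+01:00
```

The first result is what a person reading a calendar would expect, while the second is what a timer measuring elapsed time would expect. Which one is right depends on the use case.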

Serializing your datetime objects

The day will come when you need to send your datetime objects as JSON, and you will get the following:

>>> import json
>>> now = dt.datetime.now(dt.timezone.utc)
>>> json.dumps(now)
TypeError: Object of type 'datetime' is not JSON serializable

There are three main ways to serialize datetime in JSON:

String

datetime has two main methods to convert to and from a string given a specific format: strftime and strptime. The best way is to use the ISO 8601 standard for serializing time-related objects as strings, which is done by calling isoformat on the datetime object:

>>> now = dt.datetime.now(gettz("Europe/London"))
>>> now.isoformat()
'2017-04-19T22:47:36.585205+01:00'

To get a datetime object from a string that was formatted using isoformat with a UTC time zone, we can rely on strptime:

>>> now_str = '2017-04-19T21:49:05.542320+00:00'
>>> dt.datetime.strptime(now_str, "%Y-%m-%dT%H:%M:%S.%f+00:00").replace(tzinfo=dt.timezone.utc)
datetime.datetime(2017, 4, 19, 21, 49, 5, 542320, tzinfo=datetime.timezone.utc)

In this example, we are hard-coding the offset to be UTC and then setting it once the datetime object has been created. A better way to fully parse the string, including the offset, is to use the external library dateutil:

>>> from dateutil.parser import parse
>>> parse('2017-04-19T21:49:05.542320+00:00')
datetime.datetime(2017, 4, 19, 21, 49, 5, 542320, tzinfo=tzutc())
>>> parse('2017-04-19T21:49:05.542320+01:00')
datetime.datetime(2017, 4, 19, 21, 49, 5, 542320, tzinfo=tzoffset(None, 3600))

Note that once we serialize and deserialize, we lose the time zone information and keep only the offset.
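
The loss of the zone name can be seen without any external library. The sketch below assumes Python 3.7 or later, where the standard library offers `datetime.fromisoformat` as a parser for strings produced by `isoformat`. The fixed offset labelled "BST" is only a stand-in for a real zone-aware value.

```python
import datetime as dt

# A fixed +01:00 offset labelled "BST", standing in for a zone-aware value.
bst = dt.timezone(dt.timedelta(hours=1), "BST")
original = dt.datetime(2017, 4, 19, 22, 47, 36, 585205, tzinfo=bst)

text = original.isoformat()
restored = dt.datetime.fromisoformat(text)

print(text)                  # 2017-04-19T22:47:36.585205+01:00
print(restored == original)  # True: both describe the same instant
print(original.tzname())     # BST
print(restored.tzname())     # UTC+01:00 (the name is gone, only the offset survived)
```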

Integer

We are able to store a datetime as an integer by using the number of seconds that have passed since a specific epoch (reference date). As I mentioned earlier, the best-known epoch in computer systems is the Unix epoch, which counts seconds from the start of January 1, 1970 (UTC). This means that 5 represents the fifth second of January 1, 1970.

The Python standard library provides us with tools to get the current time as Unix time and to transform between datetime objects and their int representations as Unix time.

Getting the current Unix time (time.time() returns a float; wrap it in int() if whole seconds are enough):

>>> import datetime as dt
>>> from dateutil.tz import gettz
>>> import time
>>> unix_time = time.time()

Unix time to datetime:

>>> unix_time
1492636231.597816
>>> datetime = dt.datetime.fromtimestamp(unix_time, gettz("Europe/London"))
>>> datetime.isoformat()
'2017-04-19T22:10:31.597816+01:00'

Getting the Unix time given a datetime:

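>>> time.mktime(datetime.timetuple())  # correct only if the machine's local zone is Europe/London
1492636231.0
>>> # or, independent of the local zone, using the calendar library
>>> import calendar
>>> calendar.timegm(datetime.utctimetuple())
1492636231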
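
The sketch below shows an alternative round trip through `datetime.timestamp()` (available since Python 3.3), which takes the offset of an aware datetime into account and does not depend on the machine's local time zone. The fixed +01:00 offset is used only to keep the example free of external libraries.

```python
import datetime as dt

london_summer = dt.timezone(dt.timedelta(hours=1))  # fixed offset, for illustration only
moment = dt.datetime(2017, 4, 19, 22, 10, 31, tzinfo=london_summer)

# Aware datetime -> Unix time; the offset is taken into account.
unix_time = int(moment.timestamp())
print(unix_time)  # 1492636231

# Unix time -> aware datetime; always pass a tz to avoid a naive local result.
restored = dt.datetime.fromtimestamp(unix_time, dt.timezone.utc)
print(restored.isoformat())  # 2017-04-19T21:10:31+00:00
print(restored == moment)    # True
```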

Objects

The last option is to serialize the datetime as a JSON object that is given special meaning at decoding time:

import datetime as dt
from dateutil.tz import gettz, tzoffset

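def json_to_dt(obj):
    # object_hook: runs on every decoded JSON object; leave other objects untouched
    if obj.get("__type__") != "datetime":
        return obj
    fields = {key: value for key, value in obj.items() if key not in ("__type__", "tz")}
    zone, offset = obj["tz"]  # stored as [name, offset in seconds]
    fields["tzinfo"] = tzoffset(zone, offset)
    return dt.datetime(**fields)

def dt_to_json(obj):
    # default hook: expects an aware datetime (obj.tzinfo must not be None)
    if isinstance(obj, dt.datetime):
        return {
            "__type__": "datetime",
            "year": obj.year,
            "month": obj.month,
            "day": obj.day,
            "hour": obj.hour,
            "minute": obj.minute,
            "second": obj.second,
            "microsecond": obj.microsecond,
            "tz": (obj.tzinfo.tzname(obj), obj.utcoffset().total_seconds())
        }
    else:
        raise TypeError("Can't serialize {}".format(obj))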

Now we can encode JSON:

>>> import json
>>> now = dt.datetime.now(dt.timezone.utc)
>>> json.dumps(now, default=dt_to_json)  # From datetime
'{"__type__": "datetime", "year": 2017, "month": 4, "day": 19, "hour": 22, "minute": 32, "second": 44, "microsecond": 778735, "tz": ["UTC", 0.0]}'
>>> # Also works with timezones
>>> now = dt.datetime.now(gettz("Europe/London"))
>>> json.dumps(now, default=dt_to_json)
'{"__type__": "datetime", "year": 2017, "month": 4, "day": 19, "hour": 23, "minute": 33, "second": 46, "microsecond": 681533, "tz": ["BST", 3600.0]}'

And decode:

>>> input_json='{"__type__": "datetime", "year": 2017, "month": 4, "day": 19, "hour": 23, "minute": 33, "second": 46, "microsecond": 681533, "tz": ["BST", 3600.0]}'
>>> json.loads(input_json, object_hook=json_to_dt)
datetime.datetime(2017, 4, 19, 23, 33, 46, 681533, tzinfo=tzoffset('BST', 3600))
>>> input_json='{"__type__": "datetime", "year": 2017, "month": 4, "day": 19, "hour": 23, "minute": 33, "second": 46, "microsecond": 681533, "tz": ["EST", -18000.0]}'
>>> json.loads(input_json, object_hook=json_to_dt)
datetime.datetime(2017, 4, 19, 23, 33, 46, 681533, tzinfo=tzoffset('EST', -18000))
>>> json.loads(input_json, object_hook=json_to_dt).isoformat()
'2017-04-19T23:33:46.681533-05:00'

Wall times

After this, you might be tempted to convert all datetime objects to UTC and work only with UTC datetimes and fixed offsets. Even if this is by far the best approach for timestamps, it quickly breaks for future wall times.

We can distinguish two main types of time points: wall times and timestamps. Timestamps are universal points in time not related to anywhere in particular. Examples include the time a star is born or when a line is logged to a file. Things change when we speak about the time “we read on the wall clock.” When we say “see you tomorrow at 2,” we are not referring to UTC offsets, but to tomorrow at 2 PM in our local time zone, no matter what the offset is at this point. We cannot just map those wall times to timestamps (although we can for past ones) because, for future occurrences, countries might change their offset, which happens more frequently than you might think.

For those situations, we need to save the datetime with the time zone to which it refers, and not the offset.
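
A minimal sketch of that idea, assuming dateutil is installed: the appointment is stored as a naive local time plus the zone name, and the offset is resolved only when the value is loaded. The field names `local` and `zone` and both helper functions are invented for this example.

```python
import datetime as dt
import json
from dateutil.tz import gettz


def dump_wall_time(naive_local, zone_name):
    """Serialize a wall time as local ISO text plus the zone name (no offset)."""
    return json.dumps({"local": naive_local.isoformat(), "zone": zone_name})


def load_wall_time(payload):
    """Rebuild an aware datetime, resolving the offset with the current tz database."""
    data = json.loads(payload)
    # Assumes whole seconds; isoformat() adds a fraction when microseconds are set.
    naive = dt.datetime.strptime(data["local"], "%Y-%m-%dT%H:%M:%S")
    return naive.replace(tzinfo=gettz(data["zone"]))


stored = dump_wall_time(dt.datetime(2017, 7, 1, 14, 0), "Europe/Madrid")
print(stored)  # {"local": "2017-07-01T14:00:00", "zone": "Europe/Madrid"}

meeting = load_wall_time(stored)
print(meeting.isoformat())  # 2017-07-01T14:00:00+02:00
```

Because the offset is never written to storage, an updated time zone database changes the resolved instant without any change to the stored data.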

Differences when working with pytz

Since Python 3.6, the recommended library to get the Olson database is dateutil.tz, but it used to be pytz.

They might seem similar, but in some situations their approaches to handling time zones are quite different. Getting the current time is simple with pytz as well:

>>> import pytz
>>> dt.datetime.now(pytz.timezone("Europe/London"))
datetime.datetime(2017, 4, 20, 0, 13, 26, 469264, tzinfo=<DstTzInfo 'Europe/London' BST+1:00:00 DST>)

A common pitfall with pytz is to pass a pytz time zone as the tzinfo attribute of a datetime:

>>> dt.datetime(2017, 5, 1, tzinfo=pytz.timezone("Europe/Helsinki"))
datetime.datetime(2017, 5, 1, 0, 0, tzinfo=<DstTzInfo 'Europe/Helsinki' LMT+1:40:00 STD>)
>>> pytz.timezone("Europe/Helsinki").localize(dt.datetime(2017, 5, 1), is_dst=None)
datetime.datetime(2017, 5, 1, 0, 0, tzinfo=<DstTzInfo 'Europe/Helsinki' EEST+3:00:00 DST>)

We should always call localize on the datetime objects we build. Otherwise, pytz will assign the first offset it finds for the time zone, which in the example above is the historical local mean time (LMT) offset of +1:40.

Another major difference can be found when performing time arithmetic. While we saw that the additions worked in dateutil as if we were adding wall time in the specified time zone, when the datetime has a pytz tzinfo instance, absolute hours are added and the caller must call normalize after the operation, as it won’t handle DST changes. For example:

>>> today = dt.datetime(2017, 10, 29)
>>> tz = pytz.timezone("Europe/Madrid")
>>> today = tz.localize(dt.datetime(2017, 10, 29), is_dst=None)
>>> tomorrow = today + dt.timedelta(days=1)
>>> tomorrow
datetime.datetime(2017, 10, 30, 0, 0, tzinfo=<DstTzInfo 'Europe/Madrid' CEST+2:00:00 DST>)
>>> tz.normalize(tomorrow)
datetime.datetime(2017, 10, 29, 23, 0, tzinfo=<DstTzInfo 'Europe/Madrid' CET+1:00:00 STD>)

Note that with the pytz tzinfo, it has added 24 absolute hours (23 hours on the wall time).

The following table summarizes how to perform wall-time and absolute-time arithmetic with both pytz and dateutil:

              | pytz                                                                     | dateutil
wall time     | obj.tzinfo.localize(obj.replace(tzinfo=None) + timedelta, is_dst=is_dst) | obj + timedelta
absolute time | obj.tzinfo.normalize(obj + timedelta)                                    | (obj.astimezone(pytz.utc) + timedelta).astimezone(obj.tzinfo)

Note that adding wall times can lead to unexpected results when DST changes occur.

Finally, dateutil plays nicely with the fold attribute added in PEP 495 and provides backward compatibility if you are using earlier versions of Python.
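
As a hedged illustration of the fold attribute, the sketch below assumes Python 3.6 or later and a dateutil version that provides `tz.datetime_ambiguous` (2.6.0 or newer). It builds the two occurrences of the same wall time on the night DST ended in Spain.

```python
import datetime as dt
from dateutil import tz

MADRID = tz.gettz("Europe/Madrid")

# On 2017-10-29 clocks in Spain went back from 03:00 to 02:00,
# so 02:30 happened twice on the wall clock.
first = dt.datetime(2017, 10, 29, 2, 30, tzinfo=MADRID)           # fold=0 by default
second = dt.datetime(2017, 10, 29, 2, 30, fold=1, tzinfo=MADRID)  # second occurrence

print(tz.datetime_ambiguous(first))  # True
print(first.isoformat())             # 2017-10-29T02:30:00+02:00
print(second.isoformat())            # 2017-10-29T02:30:00+01:00
```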

Quick tips

After all this, how should we avoid the common issues when working with time?

  • Always use time zones. Don’t rely on implicit local time zones.
  • Use dateutil/pytz to handle time zones.
  • Always use UTC when working with timestamps.
  • Remember that, for some time zones, a day is not always made of 24 hours.
  • Keep your time zone database up to date.
  • Always test your code against situations such as DST changes.

Libraries worth mentioning

  • dateutil: Multiple utilities to work with time
  • freezegun: Easier testing of time-related applications
  • arrow/pendulum: Drop-in replacement of the standard datetime module
  • astropy: Useful for astronomical times and working with leap seconds

Mario Corchero will be speaking at PyCon 2017, delivering his talk, It’s time for datetime, in Portland, Oregon.