Arguments for using the yyyy-mm-dd format

2010-04-25

The common way to write a date in the United States is to put the month first, followed by a slash and the day, e.g. 4/25. In some scenarios it is advisable to prefix the month with a leading zero, e.g. 04/25, or 04-25 in filenames, since slashes are typically used as pathname delimiters. This is useful because the most common filename sorting algorithm is alphabetic string comparison, which ignores the numerical value of the expression and merely compares the ASCII (or Unicode) value of each character in turn. As a result, a directory containing files for an entire year will sort improperly when transitioning from 9-30 to 10-01 (or even worse, 10-1). Observe:
1-01
1-02
1-03
10-01
11-24
12-31
9-29
9-30
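
To see where that order comes from, here is a minimal Python sketch. It assumes the filenames are nothing more than the bare month-day strings, and it uses a few names from the listing above, shuffled on purpose. The built-in sorted() compares strings character by character, which is the same kind of comparison a typical file listing performs.

```python
# A few names from the listing above, deliberately shuffled.
names = ["9-30", "10-01", "1-02", "12-31", "1-01", "11-24", "1-03"]

# sorted() on strings compares code points one character at a time,
# so "10-01" lands before "9-30" simply because "1" < "9".
by_string = sorted(names)
print(by_string)
```

October through December end up ahead of September, just as in the listing.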

If, however, we use leading zeroes so that the number of characters used to describe the date remains constant, the files will sort themselves chronologically:
01-01
01-02
01-03
09-29
09-30
10-01
11-24
12-31
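
The padding itself is easy to automate. The sketch below is one possible approach, not the only one; the helper name pad is made up for illustration, and it assumes each name is a bare month-day string with a single hyphen.

```python
def pad(name):
    """Zero-pad a 'month-day' string, e.g. '9-30' -> '09-30'.

    Hypothetical helper: assumes exactly one hyphen and numeric parts.
    """
    month, day = name.split("-")
    return f"{int(month):02d}-{int(day):02d}"


names = ["9-30", "10-01", "1-02", "12-31", "1-01", "11-24", "1-03"]

# Once every name has the same width, plain string order matches
# calendar order within a single year.
padded = sorted(pad(n) for n in names)
print(padded)
```

The same helper also repairs the worst case mentioned earlier: pad("10-1") gives "10-01".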

If we add the year to the expression, the American convention is to put it at the end, e.g. 4/25/10 or 04-25-10. For greater clarity, four digits are used: 04-25-2010. Under this convention, names sort by month first, so files from different years are interleaved rather than appearing in chronological order:
02-24-2008
02-28-2007
04-19-2006
10-31-2009
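
If renaming the files is not an option, a program can still recover the chronological order by parsing each name as a date and sorting on that. The sketch below assumes the names are exactly the mm-dd-yyyy strings from the listing above, and uses Python's standard datetime.strptime for the parsing.

```python
from datetime import datetime

names = ["02-24-2008", "02-28-2007", "04-19-2006", "10-31-2009"]

# Plain string order groups by month, as in the listing above.
by_string = sorted(names)

# Sorting on the parsed date recovers the chronological order.
by_date = sorted(names, key=lambda n: datetime.strptime(n, "%m-%d-%Y"))

print(by_string)
print(by_date)
```

The two results differ, which is the whole problem: the string order is only correct by accident.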

I’d like to digress for a moment and mention that the Russian standard is quite different: dd.mm.yyyy, e.g. 25.04.2010, or 25.4 for short. This isn’t very useful for sorting filenames, but it does demonstrate that the orderings we use typically reflect the syntax of the spoken language, e.g. April the twenty-fifth as opposed to двадцать пятого апреля or 4月25日.

Ah, the Japanese system. Much like our own, it puts the month before the day, although it is also common to see the numerals written in kanji, e.g. 四月二十五日. It differs from ours in that the year comes first, following the spoken tradition of starting with general terms and then following with more specific ones. The date format yyyy-mm-dd is useful to us as programmers because filenames will sort in strict chronological order using this convention:
2007-11-23
2008-11-23
2009-04-25
2009-12-31
2010-01-01
2010-04-25
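
Converting existing names is straightforward. As a rough sketch, the hypothetical helper to_iso below parses an mm-dd-yyyy string with the standard datetime module and writes it back out as yyyy-mm-dd. It assumes the name is nothing but the date; a real filename with an extension or a prefix would need extra handling.

```python
from datetime import datetime


def to_iso(name):
    """Convert an 'mm-dd-yyyy' string to 'yyyy-mm-dd' (hypothetical helper)."""
    return datetime.strptime(name, "%m-%d-%Y").date().isoformat()


american = ["02-24-2008", "02-28-2007", "04-19-2006", "10-31-2009"]

# After conversion, plain string order is chronological order.
iso = sorted(to_iso(n) for n in american)
print(iso)
```

No sort key is needed this time: ordinary string comparison is enough.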

In fact, it is common practice in Linux to specify dates in this format, going from broad to specific, and the time of day can be included, e.g. 2010-04-25 01:55 EDT (GMT-4). Good night.
