On the use of Dates (including Y2K)

Mark Pottenger

Many computer programs and databases deal with dates and times. Horoscope programs are one example, but the range includes telephone billing, credit card charges and payments, invoices, accounts receivable and payable, inventory, loan tracking, reimbursement schedules, student records, tax records, social security records, medical insurance systems, etc. There are also computer-like chips built into many modern devices, including televisions, VCRs, telephones, faxes, cars, airplanes, traffic lights, cash registers, etc. Many of these devices include some date logic.

In manual systems, people are used to using two digits for the year, leaving the century assumed, implied or preprinted (like blank checks until recently). Most programmers, when setting up a computer system modeled on or intended to replace a manual system, try to follow the manual model closely. This provides helpful familiarity for users making the transition from manual to computer systems. Sometimes the manual model is followed too closely. This is the case with many computer systems using two-digit years. Unless extra logic is incorporated, any computer system still using two-digit years for dates after 1999 will act as if it is working with data for 1900, 1901, etc. This can produce negative ages, misposted payments, incorrect interest, etc.
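A minimal Python sketch of the failure just described. The helper names are hypothetical; the point is only that subtracting bare two-digit years gives a sensible age through 1999 and a negative one in 2000.

```python
def age_two_digit(birth_yy: int, current_yy: int) -> int:
    """Age from two-digit years with no century logic (hypothetical helper)."""
    return current_yy - birth_yy


def age_full(birth_year: int, current_year: int) -> int:
    """The same subtraction with full four-digit years."""
    return current_year - birth_year


# Someone born in 1935, evaluated in 1999 and again in 2000.
print(age_two_digit(35, 99))   # 1999 stored as 99
print(age_two_digit(35, 0))    # 2000 stored as 00: the age goes negative
print(age_full(1935, 2000))    # full years are unaffected
```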

Another reason, aside from familiarity, that many programs accept user input of two-digit years is a desire to keep down keystrokes. Many programmers try to write programs requiring as little typing as possible by offering shortcuts, with two-digit years as one example. In the 1992 CCRS Horoscope Program, two-digit years are accepted for dates in the 1900s, but full years are also accepted for this or any other century. If the user types a two-digit year it is immediately converted to a full year for all uses: display (screen or printer), internal use and storage. (Other examples of keystroke conservation in CCRS occur wherever a default option takes just the <enter> key, making North/South and East/West input optional for most latitudes and longitudes, etc.)

Thirty to forty years ago computer memory was so expensive that every bit of savings in working program size and storage was viewed as desirable, but that attitude has failed to change to match the decreasing cost of computer working memory and storage. The term “bloatware” is a pejorative for some overly large programs. The attitude that smaller is better has contributed to some use of short dates.

Many programs still provide users the convenience of typing two-digit years while using full years internally. The easiest way to do this is through a “pivot year”. These programs are set to add the century to what the user types, with the century determined by whether the typed year is before or after the pivot year. For example, at my day job, where we deal with postsecondary student records, most of the programs have a built-in pivot year of 20. Typed two-digit years less than 20 have 2000 added to them for all calculations and storage, and typed two-digit years greater than or equal to 20 have 1900 added to them. Since the data is all concerned with students and schools, there are very few cases where we actually have to ask the user to type the century digits. In a few years (whenever 2020 starts to look close) we can move the pivot year and recompile our programs. Several programs available for PCs have built-in pivot years in their date-related logic, and some of them let you set the pivot year. Programs with a hard-wired pivot year that is inappropriate to your use can be a problem.
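The pivot rule above can be sketched in a few lines of Python (used here instead of BASIC so the example is easy to run). The function name `expand_year` and the treatment of any value of 100 or more as an already-full year are assumptions of this sketch, not taken from any particular program.

```python
def expand_year(typed_year: int, pivot: int = 20) -> int:
    """Expand a typed two-digit year using a pivot year.

    Years below the pivot get 2000 added; years at or above it get 1900.
    Values of 100 or more are assumed to be full years and pass through.
    """
    if typed_year < 0:
        raise ValueError("negative years are not handled in this sketch")
    if typed_year >= 100:
        return typed_year
    if typed_year < pivot:
        return 2000 + typed_year
    return 1900 + typed_year


print(expand_year(19), expand_year(20), expand_year(1998))
```

Making the pivot a parameter rather than a constant is the design choice that avoids the hard-wired pivot problem: the same routine serves different kinds of data with different pivots.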

In my own programming I almost never use built-in date-time data types provided by the OS or language except to get the current date and time. I almost always set up my own storage of dates and times because then I have full control over their use and am not relying on unknown hidden features in the OS or language.

How much memory a program requires to store dates depends on the range of dates the program will work with. In the CCRS Horoscope Program, I keep month, day and year as three separate integer variables (two bytes each, values from -32,768 to +32,767). This uses one byte each more than needed for the month and day, since they could be stored in a single byte (0 to 255) each. I realized that an integer isn’t enough for the year when I wanted dates before -32,768 for a project a few years ago. For most programs, much shorter ranges of dates are enough. At my day job, all dates are stored in long integers (four bytes each, values from -2,147,483,648 to +2,147,483,647). Since we know all the data we want is in the 20th or 21st century, we just pack the digits of all dates as CCYYMMDD (century year month day), which fits nicely in a long integer. This format can handle any A.D. date through the year 9999 (99999 if we use the remaining available digit to expand the century), and could even handle B.C. (negative) dates with extra work in the packing and unpacking routines. This format also has the virtue of sorting naturally into proper date order for positive dates.
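As an illustration of the CCYYMMDD packing, here is a small Python sketch. The function names are hypothetical, and it handles positive years only, since B.C. dates would need the extra packing and unpacking work mentioned above.

```python
LONG_MAX = 2**31 - 1  # largest value of a four-byte signed long integer


def pack_date(year: int, month: int, day: int) -> int:
    """Pack a positive-year date into a single CCYYMMDD integer."""
    if year < 1:
        raise ValueError("this sketch handles positive years only")
    return year * 10000 + month * 100 + day


def unpack_date(packed: int) -> tuple:
    """Split a CCYYMMDD integer back into (year, month, day)."""
    year, rest = divmod(packed, 10000)
    month, day = divmod(rest, 100)
    return year, month, day


packed = pack_date(1998, 7, 4)
print(packed, unpack_date(packed))

# Packed values sort into date order with an ordinary numeric sort.
print(sorted([pack_date(2001, 1, 1), pack_date(1998, 7, 4), pack_date(1999, 12, 31)]))
```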

In my work in progress rewriting the CCRS Horoscope Program for Windows, I have upgraded the year to a long integer. I have also incorporated pivot year logic for people who want to type dates with two-digit years. There is one pivot year for dates of birth and a separate pivot based on the year of birth for dates for progressions, directions and transits. This lets you type 1/1/80 and have the program understand 1/1/1980, then ask for a progression for 1/1/1 and have the program understand 1/1/2001. (Of course, people who don’t mind the extra keystrokes can still type full years.)
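One possible reading of the birth-year-based pivot, sketched in Python. The exact rule is not spelled out above, so this sketch assumes that a two-digit year typed for a progression, direction or transit means the first matching year on or after the year of birth. The function name is hypothetical.

```python
def expand_event_year(typed_year: int, birth_year: int) -> int:
    """Expand a two-digit event year relative to a full birth year.

    Assumption of this sketch: the event falls in the 100 years that
    start with the birth year. Values of 100 or more pass through.
    """
    if typed_year >= 100:
        return typed_year
    century = birth_year - birth_year % 100
    year = century + typed_year
    if year < birth_year:
        year += 100  # the typed digits belong to the following century
    return year


# Birth typed as 1/1/80 and already expanded to 1980 by the birth-date pivot.
print(expand_event_year(1, 1980))    # progression date typed as 1/1/1
print(expand_event_year(85, 1980))   # progression date typed as 1/1/85
```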

Standard calendar dates are sloppy measures. A year can be 365 or 366 days long. A month can be 28, 29, 30 or 31 days long. This sort of variability forces programs working with dates to include a lot of extra logic. When doing calculations involving dates, some form of continuous measure is usually more desirable than months, days and years. A standard form from the astronomical community is the Julian Day Number. This is a single number that expresses date (before the decimal) and time (after the decimal). The zero point for JDs is January 1, -4712 (4713 B.C.) (Julian proleptic calendar) at noon in Greenwich. (JDs start at noon instead of midnight so a set of overnight astronomical observations will all be in the same “day”.) January 1, 1900 started at JD 2415020.5. The Julian Ephemeris Date (including delta t) of my birth was 2435147.932639. January 1, 2000 will start at JD 2451544.5. Julian Day Numbers have 7 digits to the left of the decimal from November -1975 (1976 B.C.) through November 22666. Quite a few computer systems use continuous counts of days with different starting epochs and misapply the name Julian Days to those measures. The real thing has the ancient epoch chosen by astronomers as likely to be before any reasonably accurate series of astronomical observations. (Some cycles were also involved in the choice of the starting epoch.) A double-precision number (8 bytes) will hold a JD with date and time with enough accuracy for most uses. (I have seen one application—JPL’s ephemeris—which uses two double-precision numbers for date and time.) If you don’t need the time portion and are only working with dates, the date portion of a JD fits in a long integer.

Working with Julian Day Numbers instead of months, days and years lets you get exact numbers of days between dates with a simple subtraction or scan through a range of dates with a simple loop. For example, the timed transit printout in the CCRS Horoscope Program gets the JDs of the first and last days requested and then simply scans between those JDs one day at a time. Almost all astronomical/astrological planetary calculations work from the JD or JED (JD for Ephemeris Time, incorporating the delta t correction from UT). Formulae that calculate positions from elements and perturbations get values for a particular date by calculating motion since the epoch and adding that to the starting epoch positions. Motion since the epoch is calculated by multiplying motion in one day (or century) by the number of days (or centuries) since the epoch. The epochs for most astronomical formulae are 1900, 1950 and 2000. Calculations using a disk ephemeris look up the date in the ephemeris based on a JED.
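The subtraction and the day-by-day scan can be tried quickly in Python. This sketch leans on Python's own continuous day count (`date.toordinal`, which counts days in the proleptic Gregorian calendar) and converts it to a JD with a fixed offset; it therefore covers only Gregorian dates from year 1 onward. The function names are hypothetical.

```python
from datetime import date

# JD at 0h of a proleptic Gregorian date equals its ordinal plus this offset.
JD_OFFSET = 1721424.5


def jd_at_midnight(d: date) -> float:
    """Julian Day Number at the start (0h) of the given date."""
    return d.toordinal() + JD_OFFSET


def days_between(first: date, last: date) -> int:
    """Exact number of days between two dates by simple subtraction."""
    return last.toordinal() - first.toordinal()


def scan_days(first: date, last: date):
    """Yield every date from first through last, one day at a time."""
    for ordinal in range(first.toordinal(), last.toordinal() + 1):
        yield date.fromordinal(ordinal)


print(jd_at_midnight(date(2000, 1, 1)))
print(days_between(date(1900, 1, 1), date(2000, 1, 1)))
print([d.isoformat() for d in scan_days(date(1998, 2, 27), date(1998, 3, 2))])
```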

Here are examples of BASIC functions converting from date to JD and from JD to date. I realized early in my work on astrological computing how important JDs are and researched the issue until I felt I had the best concise solution available (these routines require no tables of lengths of months). My versions expand on a more limited formula I found during my research. I sent a heavily commented version to an astrological computing magazine in the late 1970s, though I haven’t been able to find the original magazine recently. Over the years, I have written versions in several dialects of BASIC, in C, in COBOL, and the version shown here in Visual BASIC. Simpler versions can be created when time is not needed. Since the BASIC INT function is actually a FLOOR function, these routines work even for negative JDs (before -4712). If you translate to a language like C, be sure to replace INT with FLOOR. A way to test an implementation of the functions is to do a loop through all the dates a program could deal with, first calling one function, then calling the other function with the results of the first and comparing that result to the original value.

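Function DateToJD(DT As DateTime) As Double
    Rem JULIAN DAY (PASS Date, Time & Calendar flag: Gregorian=0,Julian=1)
    Rem DT.Time is in hours; the result includes the fraction of a day.
    Dim lngY As Long
    Dim intM As Integer
    Dim lngEpochY As Long
    Dim dblJD As Double

    lngY = DT.Year
    intM = DT.Month
    Rem Count January and February as the last two months of the previous year.
    If intM < 3 Then lngY = lngY - 1
    lngEpochY = lngY + 4712
    intM = intM + 1
    If intM < 4 Then intM = intM + 12
    Rem Day count in the Julian calendar.
    dblJD = Int(lngEpochY * 365.25) + Int(30.6 * intM) + DT.DAY + DT.Time / 24 - 63.5
    If DT.Calendar = 0 Then
        Rem Gregorian correction for century years that are not leap years.
        dblJD = dblJD - (Int(Abs(lngY) / 100) - Int(Abs(lngY) / 400)) * Sgn(lngY) + 2
        Rem The Abs/Sgn form is off by one day for negative century years not divisible by 400.
        If lngY < 0 And lngY / 100 = Int(lngY / 100) And lngY / 400 <> Int(lngY / 400) Then dblJD = dblJD - 1
    End If
    DateToJD = dblJD
End Function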

Sub JDToDate(ByVal JD As Double, ByRef DT As DateTime)
    Rem JULIAN DAY NUMBER to Date & Time (DT.Calendar must be set before call: Gregorian=0,Julian=1)
    Dim dblTemp0 As Double
    Dim dblTemp1 As Double
    Dim dblTemp2 As Double
    Dim dblTemp3 As Double

    dblTemp0 = JD + 32082.5
    If DT.Calendar = 0 Then
        Rem Gregorian correction for century leap days, worked out in two passes.
        dblTemp1 = dblTemp0 + Int(dblTemp0 / 36525#) - Int(dblTemp0 / 146100#) - 38
        If JD >= 1830691.5 Then dblTemp1 = dblTemp1 + 1
        dblTemp0 = dblTemp0 + Int(dblTemp1 / 36525#) - Int(dblTemp1 / 146100#) - 38
    End If
    Rem dblTemp1 = whole day count, dblTemp2 = year count, dblTemp3 = shifted month index.
    dblTemp1 = Int(dblTemp0 + 123)
    dblTemp2 = Int((dblTemp1 - 122.2) / 365.25)
    dblTemp3 = Int((dblTemp1 - Int(365.25 * dblTemp2)) / 30.6001)
    DT.Month = dblTemp3 - 1
    If DT.Month > 12 Then DT.Month = DT.Month - 12
    DT.DAY = dblTemp1 - Int(365.25 * dblTemp2) - Int(30.6001 * dblTemp3)
    DT.Year = dblTemp2 + Int((dblTemp3 - 2) / 12) - 4800
    Rem Fraction of the day since midnight, expressed in hours.
    DT.Time = (JD - Int(JD + 0.5) + 0.5) * 24
End Sub

A pair of routines like this can even be used to clean up some bad date/time input—convert the bad date to a JD, then convert back to a valid date (e.g. February 29, 1998 to March 1, 1998 or the 25th hour of a day to the 1st hour on the next day).
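For readers without a BASIC compiler, here is a line-for-line Python translation of the two routines, offered as a sketch rather than a definitive port. The function names and the tuple return value are choices of this sketch; `math.floor` stands in for the BASIC INT function. It also shows the clean-up trick and the round-trip test described above.

```python
import math

GREGORIAN, JULIAN = 0, 1


def date_to_jd(year, month, day, hours=0.0, calendar=GREGORIAN):
    """Translate DateToJD: calendar date and hours to a Julian Day Number."""
    y, m = year, month
    if m < 3:
        y -= 1
    epoch_y = y + 4712
    m += 1
    if m < 4:
        m += 12
    jd = math.floor(epoch_y * 365.25) + math.floor(30.6 * m) + day + hours / 24 - 63.5
    if calendar == GREGORIAN:
        sign = (y > 0) - (y < 0)
        jd = jd - (abs(y) // 100 - abs(y) // 400) * sign + 2
        if y < 0 and y % 100 == 0 and y % 400 != 0:
            jd -= 1
    return jd


def jd_to_date(jd, calendar=GREGORIAN):
    """Translate JDToDate: returns (year, month, day, hours)."""
    t0 = jd + 32082.5
    if calendar == GREGORIAN:
        t1 = t0 + math.floor(t0 / 36525) - math.floor(t0 / 146100) - 38
        if jd >= 1830691.5:
            t1 += 1
        t0 = t0 + math.floor(t1 / 36525) - math.floor(t1 / 146100) - 38
    t1 = math.floor(t0 + 123)
    t2 = math.floor((t1 - 122.2) / 365.25)
    t3 = math.floor((t1 - math.floor(365.25 * t2)) / 30.6001)
    month = t3 - 1
    if month > 12:
        month -= 12
    day = t1 - math.floor(365.25 * t2) - math.floor(30.6001 * t3)
    year = t2 + math.floor((t3 - 2) / 12) - 4800
    hours = (jd - math.floor(jd + 0.5) + 0.5) * 24
    return year, month, day, hours


def round_trip_ok(first_jd, number_of_days):
    """Convert each JD to a date and back; True if every value survives."""
    for i in range(number_of_days):
        jd = first_jd + i
        year, month, day, _ = jd_to_date(jd)
        if date_to_jd(year, month, day) != jd:
            return False
    return True


# Cleaning up an impossible date: February 29, 1998.
print(jd_to_date(date_to_jd(1998, 2, 29))[:3])
# Round trip over the 1900s and 2000s, starting at January 1, 1900.
print(round_trip_ok(2415020.5, 73049))
```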

Here is part of a table I created in 1982 when I was working on some of the earliest conversions of the CCRS Horoscope Program between different machines and languages. I was concerned with how many years I could cover with analytic epoch-based planetary formulae. I had been using 14-decimal-digit math on our NorthStar system. CBASIC also used 14-decimal-digit math, but it was painfully slow. AppleSoft (the main language on the Apple II), with 32-bit numbers, would have been unable to maintain to-the-second accuracy more than 136 years from the epoch. Microsoft BASIC at that time offered 24-bit single precision and 56-bit double precision. This was one of the reasons the 1983 version of the CCRS Horoscope Program was converted to Microsoft BASIC.

Time Range for To-The-Second Timing for Different Precisions

Binary Math Languages

BITS                SECONDS               YEARS
  16                 65,535            0.002077
  24             16,777,215            0.531638
  32          4,294,967,295          136.099301
  40      1,099,511,627,775       34,841.421013
  48    281,474,976,710,655    8,919,403.779459

Decimal Math Languages

DIGITS              SECONDS               YEARS
   4                  9,999            0.000317
   5                 99,999            0.003169
   7              9,999,999            0.316881
   8             99,999,999            3.168809
   9            999,999,999           31.688088
  14     99,999,999,999,999    3,168,808.781403
  15    999,999,999,999,999   31,688,087.814029

A similar table can be constructed for any project planning to use continuously measured dates and times. First decide the accuracy you need: millionths of seconds, thousandths of seconds, hundredths of seconds, tenths of seconds, seconds, minutes, hours, or days. Next decide the range of dates you need: 1900s, 1900s and 2000s, A.D. 0 to 9999, 10000 B.C. to 10000 A.D., etc. Next decide if you will use only positive numbers from beginning to end of your range or if you will use positive and negative numbers from the middle of your range. Use the above three decisions to determine how many bits or digits you need. Unless there are very strong reasons to change, I tend to use the existing astronomical JD system for most date work.
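The arithmetic behind such a table is short enough to sketch in Python. This sketch assumes a year of 365.25 days (31,557,600 seconds), which reproduces the figures in the table above; the function names and the `ticks_per_second` parameter (1 for seconds, 1000 for thousandths, and so on) are hypothetical.

```python
import math

SECONDS_PER_YEAR = 365.25 * 86400  # assumption: 365.25-day year


def years_covered_binary(bits: int, ticks_per_second: int = 1) -> float:
    """Years spanned by an unsigned count of the given number of bits."""
    return (2**bits - 1) / ticks_per_second / SECONDS_PER_YEAR


def years_covered_decimal(digits: int, ticks_per_second: int = 1) -> float:
    """Years spanned by a count of the given number of decimal digits."""
    return (10**digits - 1) / ticks_per_second / SECONDS_PER_YEAR


def bits_needed(years: float, ticks_per_second: int = 1) -> int:
    """Smallest number of bits whose unsigned range covers the span."""
    ticks = years * SECONDS_PER_YEAR * ticks_per_second
    return math.ceil(math.log2(ticks + 1))


for bits in (16, 24, 32, 40, 48):
    print(bits, round(years_covered_binary(bits), 6))

# Two centuries, to the second and then to the thousandth of a second.
print(bits_needed(200), bits_needed(200, ticks_per_second=1000))
```

If positive and negative numbers are counted from the middle of the range, one of the bits goes to the sign, so the same span needs one bit more than an unsigned count measured from one end.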

If I’ve looked it up right, the Real Time Clock in PCs keeps the date as days since January 1, 1980 in two bytes, giving it a potential range of 179 years (1980-2159). Other references indicate that the PC RTC date range is 1980-2099 because system calls to get or set the date use one byte for a century of 19 or 20 and one byte for a year. The system calls use the bytes in BCD (binary coded decimal) format, in which a byte is read as two decimal digits (in 4 bits each). RTC times are kept to 1/100th of a second. DOS file date stamps in directories give the range of 1980-2107 by using 7 bits (values 0 through 127) for years from 1980. Windows 95/98 and Windows NT support file time stamps using 64 bits to count 100-nanosecond intervals since January 1, 1601 (which will handle dates into the year 59934), and system date information with a 16-bit year (which will handle years up to 65535). Many other computer systems that use pseudo-Julian Days with varying numbers of bits or digits will run out at different years, depending on the zero point and range chosen during design.
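As a concrete case of a limit fixed by a bit count, here is a Python sketch of the 16-bit DOS directory date stamp. It uses the usual layout of 7 bits for years since 1980, 4 bits for the month and 5 bits for the day; the function names are hypothetical.

```python
from datetime import date

DOS_EPOCH_YEAR = 1980
LAST_DOS_YEAR = DOS_EPOCH_YEAR + 0x7F  # largest offset that fits in 7 bits


def pack_dos_date(d: date) -> int:
    """Pack a date into a 16-bit DOS date stamp."""
    offset = d.year - DOS_EPOCH_YEAR
    if not 0 <= offset <= 0x7F:
        raise ValueError("year does not fit in 7 bits from 1980")
    return (offset << 9) | (d.month << 5) | d.day


def unpack_dos_date(stamp: int) -> date:
    """Unpack a 16-bit DOS date stamp into a date."""
    year = DOS_EPOCH_YEAR + ((stamp >> 9) & 0x7F)
    month = (stamp >> 5) & 0x0F
    day = stamp & 0x1F
    return date(year, month, day)


stamp = pack_dos_date(date(1998, 8, 17))
print(hex(stamp), unpack_dos_date(stamp), LAST_DOS_YEAR)
```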

Another example I had not heard of before was described in the August 17, 1998 Los Angeles Times. The Global Positioning System satellite network incorporates a timekeeping system with a cycle of 1024 weeks. The current cycle ends August 22, 1999, and there could be several days of confusion depending on how receiver software is written. The Times also mentioned that the date 9/9/99 (September 9, 1999) could be a problem for some software that treats a value of 9999 as a special flag or end marker. Like the DOS dating described above, 32-bit Unix systems use a signed 32-bit count of seconds from January 1, 1970. This runs out in January 2038.
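Both limits can be checked with a few lines of Python. This sketch assumes the GPS week count starts on January 6, 1980 (a detail not given above), treats the Unix count as a signed 32-bit number, and ignores leap seconds.

```python
from datetime import date, datetime, timedelta, timezone

# Unix: seconds since January 1, 1970 in a signed 32-bit number.
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
last_32bit_second = UNIX_EPOCH + timedelta(seconds=2**31 - 1)

# GPS: week number kept in 10 bits, so it wraps after 1024 weeks.
GPS_EPOCH = date(1980, 1, 6)  # assumption of this sketch
first_gps_rollover = GPS_EPOCH + timedelta(weeks=1024)

print(last_32bit_second.isoformat())
print(first_gps_rollover.isoformat())
```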

Another example of date manipulation is squeezing date information into fewer characters. At my day job some file names record the date of the file in 3 characters by using one letter of the alphabet for years from a 1980 epoch, one letter for month, and one number or letter for day. This system is useful for getting more information in the limits of DOS 8 character + 3 character file names, but will run into problems if we are still using it in another 8 years when we finish running through the alphabet in years since the epoch. Also, the routines the original programmers of these date codes wrote would have failed in 2000 because they used two-digit years with no pivot logic.
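A three-character date code of this kind might look like the following Python sketch. The exact alphabet is not given above, so the mapping here is an assumption: A through Z for the years 1980 through 2005, A through L for the months, and 1 through 9 followed by A through V for the days.

```python
import string

LETTERS = string.ascii_uppercase
DAY_CHARS = "123456789" + LETTERS[:22]  # 1-9, then A (day 10) through V (day 31)
EPOCH_YEAR = 1980


def encode_date(year: int, month: int, day: int) -> str:
    """Encode a date as three characters (assumed mapping, see lead-in)."""
    offset = year - EPOCH_YEAR
    if not 0 <= offset < len(LETTERS):
        raise ValueError("year is outside the 26 years the alphabet covers")
    return LETTERS[offset] + LETTERS[month - 1] + DAY_CHARS[day - 1]


def decode_date(code: str) -> tuple:
    """Decode a three-character code back into (year, month, day)."""
    year = EPOCH_YEAR + LETTERS.index(code[0])
    month = LETTERS.index(code[1]) + 1
    day = DAY_CHARS.index(code[2]) + 1
    return year, month, day


code = encode_date(1998, 8, 17)
print(code, decode_date(code))
```

Under this mapping the year letter runs out after 2005, which is the limit described above.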

Subscribers who have renewed more than one year ahead can see an example of a date issue on the mailing label on the envelope The Mutable Dilemma was mailed in. We started this journal in 1977, and we are still maintaining the mailing list with a program Rique wrote back then. At the time we set up the mailing list, we used two-digit years in the last issue codes. When we first received a renewal that extended past 1999, I took a shortcut. I looked at the source code in the address list program and saw that the last issue comparison was done alphabetically rather than numerically. Knowing this, I used A as the next digit after 9 in last issue dates. Some subscribers now have last issue codes like VA0, SA1, PA2, etc. This shortcut adds another 260 years to the usability of two-character dates without having to rework old data, but can only be used when the dates are treated as alphanumeric characters rather than numbers.
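The trick works because letters sort after digits in a character-by-character comparison. A Python sketch of a two-character year code on that principle follows; the function name and the 1900 base are assumptions, and the sketch covers only the year part of a last issue code.

```python
import string

TENS = string.digits + string.ascii_uppercase  # 0-9, then A-Z


def year_code(year: int, base: int = 1900) -> str:
    """Two-character year code: a decade character plus a year digit."""
    tens, ones = divmod(year - base, 10)
    if not 0 <= tens < len(TENS):
        raise ValueError("year is outside the range of a two-character code")
    return TENS[tens] + str(ones)


codes = [year_code(y) for y in (2001, 1998, 2000, 1999)]
print(codes)
print(sorted(codes))  # a plain alphabetic sort keeps them in year order

# Treating the same codes as numbers fails outright.
try:
    int(year_code(2000))
except ValueError as error:
    print("not a number:", error)
```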

Another area where there is room for error in date work concerns leap years. In our current Gregorian calendar, every year evenly divisible by 4 includes the leap day of February 29th, with one exception: century years (divisible by 100) are not leap years unless they are also evenly divisible by 400. 1900 was not a leap year, but 2000 will be. Someone who programs leap years by the simple evenly divisible by 4 rule will be correct from 1901 through 2099. Ignoring the 100s exception would produce wrong answers in 1900 and 2100. Including the 100s exception and ignoring the 400s exception to it would produce the wrong answer in 2000. This is a case where a middle level of knowledge is more dangerous than more or less.
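The three levels of knowledge can be put side by side in a short Python sketch (the function names are hypothetical):

```python
def is_leap_gregorian(year: int) -> bool:
    """Full Gregorian rule: 4s, except 100s, except 400s."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def is_leap_fours_only(year: int) -> bool:
    """Simple rule: every year divisible by 4."""
    return year % 4 == 0


def is_leap_without_400(year: int) -> bool:
    """Middle level of knowledge: 100s exception but no 400s exception."""
    return year % 4 == 0 and year % 100 != 0


for year in (1900, 1996, 2000, 2100):
    print(year, is_leap_gregorian(year), is_leap_fours_only(year), is_leap_without_400(year))
```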

Another area of concern is the use of dates in spreadsheets. Many people who do no other programming do a form of programming by using formulae in spreadsheets. Any spreadsheet date created with two-digit years or using built-in functions with date limits or inappropriate pivots could become a problem when the year 2000 (or the limiting year of the built-in functions) arrives.

Another issue with dates is data interchange. The use of integers, long integers and double precision I described above applies within working memory or within a program’s private data structures, but is less appropriate for exchanging data between systems. Computers and operating systems do not all use the same internal formats to store integers, long integers and double precision, so data expected to move between systems is frequently converted to ASCII text. If you are sending data as text, you have to have very clear agreement at both ends what the format of the data will be. If CCYYMMDD (e.g., 19980704) is sent to someone expecting MMDDCCYY (e.g., 07041998), or any other mismatch, there will be problems.
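A short Python sketch of why the agreement matters. It uses the standard `strptime` and `strftime` format codes; the layout names and the `FORMATS` table are assumptions of this sketch. Some mismatches fail loudly, while others quietly produce a different valid date.

```python
from datetime import date, datetime

FORMATS = {
    "CCYYMMDD": "%Y%m%d",
    "MMDDCCYY": "%m%d%Y",
    "DDMMCCYY": "%d%m%Y",
}


def format_date(d: date, layout: str) -> str:
    """Write a date as eight text digits in the agreed layout."""
    return d.strftime(FORMATS[layout])


def parse_date(text: str, layout: str) -> date:
    """Read eight text digits in the agreed layout; raises ValueError if invalid."""
    return datetime.strptime(text, FORMATS[layout]).date()


sent = format_date(date(1998, 7, 4), "CCYYMMDD")
print(sent, parse_date(sent, "CCYYMMDD"))

# Loud failure: CCYYMMDD text read by a receiver expecting MMDDCCYY.
try:
    parse_date(sent, "MMDDCCYY")
except ValueError as error:
    print("rejected:", error)

# Silent failure: day-first text read as month-first is still a valid date.
print(parse_date(format_date(date(1998, 7, 4), "DDMMCCYY"), "MMDDCCYY"))
```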

The Year 2000 Problem (nicknamed Y2K—Kilo is the metric prefix for 1,000) is getting a lot of attention and will get more for the next couple of years. It is often described as a technical problem, which is misleading. It includes all of the potential problems and limits described above, but they all have known solutions. Any article that talks about “the solution” to “the” Y2K “bug” is written by someone who doesn’t understand the topic. Y2K is actually a mind-set and management problem. The mind-set is a pattern of thinking I suspect is common to many programmers: because the systems and tools they use change very rapidly, many programmers subconsciously expect that the programs they write will come in and out of use as rapidly. With that kind of thinking, it is very easy to write something that will work “now” without worrying if it will work 5, 10, 20, 30 or more years from now. Sometimes a program written as a “quick and dirty” solution to one problem grows into a major project without ever being reviewed for design problems. The management aspect of the problem includes at least two parts. Much management is so focused on short-term goals that programmers are pushed to achieve fast results even if taking a little longer to design for the long term would avoid problems down the road. Even when problems are brought up, many managers are reluctant to allocate resources to fix problems because that takes away resources wanted for new projects. Also, many people still try to hide problems or don’t admit that they need to be fixed now.

The Y2K problem also includes two components unrelated to the actual technicalities of date calculations discussed above: surveys and laws. Any company that exchanges date data with other companies needs to know if the other companies have any date handling problems, so many companies are sending inquiries to each other to find out. Any company or government agency that knows it has date handling problems and hasn’t fixed them, doesn’t plan to fix them, or fails to fix them might be vulnerable to being sued, especially if they lie about their problems. I have even seen mention in some computer magazines that there are now lawyers specializing in Y2K issues!

Using my day job as an example again, we are using a set of computer programs for which the basic design was established in 1986. In the early 1990s we noticed problems with two-digit years in some printouts running 10 years into the future. When one of the programmers then on staff planned to move out of state in 1992, we made the last project before he left a database and program conversion to convert all dates in our databases to four-digit years. Some program logic changes were missed in the big conversion, but they have mostly been found in the 5+ years since. I have still occasionally run into a problem with two-digit year logic that other programmers embedded in screen handling code because the screen language programs are in a format I can’t search through easily. After starting this article, I recently spent a couple weeks finding places in programs where two-digit years were converted to four digits without using a pivot year or were used directly in addition, subtraction or comparisons. I also took the time to go through all the data entry screens looking for any years. There are also a lot of DOS-based data entry programs using two-digit years. We are mostly handling those with a pivot year, as described above. In our application, we exchange a lot of school and student data with the U.S. Department of Education. All of their dates used two-digit years up until two years ago. In the last two years, they have upgraded to four-digit years in all the data interchange formats we deal with.

How prepared any company is for Y2K or any other date problem depends on whether their programmers wrote for the short-term or for the long-term in the past (often determined by personal or management attitudes) and how their management has allocated resources in the last few years and for the next 1 1/2 years. Based on what I am seeing in the computer trade press these days, I suspect the worldwide extent of unreadiness will be considerably more than I expected from my own experience.

Copyright © 1998 Los Angeles Community Church of Religious Science, Inc.
