Apache's Common Log Format Datetime converted to Unix Timestamp with C++
The datetime in Apache's log format looks like this: day/month/year:hour:minute:second zone. It usually has wrapping brackets, but I'm assuming those have already been stripped. The format has a standard name but I don't remember it right now. An example would be "04/Apr/2012:10:37:29 -0500". This is great for displaying to humans but annoying to pass around to computers, so let's convert it to a Unix timestamp, which is simply the number of seconds since the Unix epoch, i.e. 1970-01-01 00:00:00 +0000. Note that since a plain seconds timestamp carries no time zone information, the original zone is lost: the offset is used to work out the instant in UTC and is then discarded.
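Since the sign of the zone is easy to get backwards, here is a small sketch of just that arithmetic. The names `zoneToSeconds` and `localToUtc` are made up for illustration, and the helper only accepts the "+hhmm" / "-hhmm" shape described above.

```cpp
#include <string>

// Hypothetical helper: turns "+hhmm" or "-hhmm" into seconds east of UTC.
// Returns false if the text does not have that shape.
bool zoneToSeconds(const std::string& zone, long& secs) {
    if (zone.size() != 5 || (zone[0] != '+' && zone[0] != '-'))
        return false;
    for (int i = 1; i < 5; ++i)
        if (zone[i] < '0' || zone[i] > '9')
            return false;
    long hours = 10 * (zone[1] - '0') + (zone[2] - '0');
    long mins  = 10 * (zone[3] - '0') + (zone[4] - '0');
    secs = 60 * 60 * hours + 60 * mins;
    if (zone[0] == '-')
        secs = -secs;
    return true;
}

// A clock at offset zone_secs runs zone_secs ahead of UTC,
// so going from local time to UTC means subtracting the offset.
long long localToUtc(long long local_secs, long zone_secs) {
    return local_secs - zone_secs;
}
```

For "-0500" the helper gives -18000, so 10:37:29 local (38249 seconds into the day) becomes 56249 seconds, i.e. 15:37:29 UTC. After that subtraction the zone itself is gone, which is the information loss mentioned above.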
The code, released into the public domain (I don't think one could assert copyright over this anyway):
#include <string>
#include <time.h>
#include <sstream> // for converting time_t to str
using std::string;
using std::stringstream;
/*
 * Parses apache logtime into tm, converts to time_t, and reformats to str.
 * logtime should be the format: day/month/year:hour:minute:second zone
 * day = 2*digit
 * month = 3*letter
 * year = 4*digit
 * hour = 2*digit
 * minute = 2*digit
 * second = 2*digit
 * zone = (`+' | `-') 4*digit
 *
 * e.g. 04/Apr/2012:10:37:29 -0500
 */
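string logtimeToUnix(const string& logtime) {
    // The sign at idx 21 plus four digits means at least 26 characters.
    if (logtime.size() < 26)
        return "-";

    struct tm tm = {}; // Zero first; strptime only sets the fields it parses
    // The zone is parsed by hand below, so the format stops before it.
    if (strptime(logtime.c_str(), "%d/%b/%Y:%H:%M:%S", &tm) == NULL)
        return "-";
    tm.tm_isdst = 0; // Force dst off

    // Parse the timezone: the sign is at idx 21, followed by four digits.
    int hours = 10*(logtime[22] - '0') + logtime[23] - '0';
    int mins = 10*(logtime[24] - '0') + logtime[25] - '0';
    int off_secs = 60*60*hours + 60*mins;
    if (logtime[21] == '-')
        off_secs *= -1;

    time_t t = mktime(&tm); // Interprets tm as the program's local time
    if (t == -1)
        return "-";
    t -= timezone; // Undo the local timezone: t is now the fields read as UTC
    t -= off_secs; // The log's clock is off_secs ahead of UTC, so subtract

    std::stringstream stream;
    stream << t;
    return stream.str();
}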
The annoying parts of this code are knowing to use strptime combined with mktime, knowing to subtract off the program's local timezone (mktime always interprets the tm as local time), and handling the string's timezone yourself. On Linux with glibc, struct tm has additional fields to hold the zone (tm_gmtoff and tm_zone), but it's good practice to be cross-platform compatible, and the zone is easy to handle yourself anyway.
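If the dance with mktime and the local timezone feels too fragile, one alternative is to skip both and count the days yourself. The following is only a sketch under a few assumptions: it swaps strptime for sscanf, uses the commonly seen "days from civil" arithmetic for the Gregorian calendar, validates the input only loosely (no range checks on the fields), and the names `daysFromCivil` and `logtimeToEpoch` are hypothetical rather than part of the code above.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Days from 1970-01-01 to y-m-d (proleptic Gregorian; m is 1..12, d is 1..31).
long long daysFromCivil(long long y, int m, int d) {
    y -= (m <= 2);                                   // count Jan/Feb with the previous year
    const long long era = (y >= 0 ? y : y - 399) / 400;
    const long long yoe = y - era * 400;             // year of era, 0..399
    const long long doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1; // 0..365
    const long long doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;          // 0..146096
    return era * 146097 + doe - 719468;
}

// Hypothetical name. Returns true and stores seconds since the epoch in `out`.
// Never touches the program's local timezone.
bool logtimeToEpoch(const std::string& logtime, long long& out) {
    int day, year, hour, min, sec, zh, zm;
    char mon[4] = {0};
    char sign = 0;
    if (std::sscanf(logtime.c_str(), "%2d/%3s/%4d:%2d:%2d:%2d %c%2d%2d",
                    &day, mon, &year, &hour, &min, &sec, &sign, &zh, &zm) != 9)
        return false;
    if (sign != '+' && sign != '-')
        return false;

    // Look the month up in a packed table; it must match on a 3-char boundary.
    static const char* const names = "JanFebMarAprMayJunJulAugSepOctNovDec";
    const char* p = std::strstr(names, mon);
    if (std::strlen(mon) != 3 || p == NULL || (p - names) % 3 != 0)
        return false;
    const int month = static_cast<int>((p - names) / 3) + 1;

    // Seconds as if the fields were UTC, then subtract the zone offset.
    const long long as_utc = daysFromCivil(year, month, day) * 86400LL
                           + hour * 3600LL + min * 60LL + sec;
    long long offset = zh * 3600LL + zm * 60LL;
    if (sign == '-')
        offset = -offset;
    out = as_utc - offset;
    return true;
}
```

For the example string "04/Apr/2012:10:37:29 -0500" this gives 1333553849, which is 2012-04-04 15:37:29 UTC. The trade-off is that you own the calendar arithmetic yourself instead of leaning on the C library.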
This code may contain bugs; it's your responsibility to test it. (If you find bugs I'd like to know, of course!) Time handling code is a pain in the butt with C or C++.
Posted on 2012-07-24 by Jach
Tags: c, c++, programming, tips
Permalink: https://www.thejach.com/view/id/257