Standardized Logging

Before we can build a logging architecture, we need a standardized way of logging. The standard fields in a logged message and their meanings are:

{
  message
  timestamp
  level
  error
  service
  namespace
  context_id
}

service:

It indicates the logical name of the service that is logging to stdout / stderr. For example, when the render service logs, it would log with service: bundler

timestamp:

This field indicates the time at which the log was created. It must be formatted according to RFC 3339 as YYYY-MM-DDThh:mm:ssZ, and the time must always be in UTC.
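As an illustration of the timestamp rule, here is a minimal Python sketch. The helper name rfc3339_utc is hypothetical and not part of any existing library; it only shows one way to produce the required format from a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def rfc3339_utc(moment: datetime) -> str:
    """Render an aware datetime as YYYY-MM-DDThh:mm:ssZ in UTC."""
    if moment.tzinfo is None:
        # A naive datetime has no offset, so it cannot be converted safely.
        raise ValueError("naive datetime: attach a timezone before logging")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# A local time at UTC+02:00 is converted to UTC before formatting.
local = datetime(2024, 1, 15, 12, 30, 45, tzinfo=timezone(timedelta(hours=2)))
print(rfc3339_utc(local))  # 2024-01-15T10:30:45Z
```

Rejecting naive datetimes is a design choice made for this sketch: it prevents a local time from being silently logged as if it were UTC.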

level:

Level indicates the severity of the message being logged. Its possible values are INFO, DEBUG, ERROR, and WARN. A log should not be created with any other log level.
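One way to enforce the allowed levels is to validate them before a log is emitted. The sketch below is only an assumption about how this could be done; ALLOWED_LEVELS and validate_level are hypothetical names.

```python
# The four levels permitted by this standard.
ALLOWED_LEVELS = frozenset({"INFO", "DEBUG", "ERROR", "WARN"})


def validate_level(level: str) -> str:
    """Return the level unchanged, or raise if it is not permitted."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"unsupported log level: {level!r}")
    return level
```

The check is case-sensitive on purpose in this sketch, so "warn" or "WARNING" would be rejected rather than silently normalized.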

error:

It contains human-readable information about the exception that occurred in the system.

message:

It contains human-readable information about the event that occurred in the system.

namespace:

It indicates the logical namespace under which the logger is created to push out logs.

context_id:

In a microservices architecture, a group of machines works together to carry out a single task. This means that a single request will potentially hit many services in order to carry out its task. Sometimes it becomes imperative to trace the whole request lifecycle. The context_id in the log is therefore used to correlate logs from different services that belong to the same request.
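The correlation idea can be sketched in Python with the standard contextvars module. This is a minimal sketch under assumptions: the function names are hypothetical, and generating the identifier with uuid4 is a choice made here, not something this standard prescribes. How the identifier travels between services (for example in a request header) is left out.

```python
import contextvars
import uuid

# Holds the context_id for the request currently being handled.
_context_id = contextvars.ContextVar("context_id", default=None)


def start_request(incoming_context_id=None):
    """Reuse the caller's context_id if one was passed, else create one."""
    context_id = incoming_context_id or str(uuid.uuid4())
    _context_id.set(context_id)
    return context_id


def current_context_id():
    """Return the context_id to attach to every log of this request."""
    return _context_id.get()
```

The first service in the chain would call start_request() with no argument, and every downstream service would call it with the identifier it received, so all logs for the request share one context_id.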

All of the above fields are reserved. A log should therefore not use any of them with a meaning different from the one described here.
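To show how the reserved fields fit together, the following sketch builds one log line. It assumes logs are serialized as JSON, which this section does not state explicitly. The function build_record, the extra parameter, and the example values are all hypothetical.

```python
import json
from datetime import datetime, timezone

RESERVED_FIELDS = frozenset(
    {"message", "timestamp", "level", "error", "service", "namespace", "context_id"}
)


def build_record(service, namespace, level, message, context_id,
                 error=None, extra=None):
    """Build one log line; extra data may not reuse a reserved field name."""
    extra = dict(extra or {})
    clash = RESERVED_FIELDS.intersection(extra)
    if clash:
        raise ValueError(f"reserved field(s) used as extra data: {sorted(clash)}")
    record = {
        "message": message,
        # UTC timestamp in the YYYY-MM-DDThh:mm:ssZ format.
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "level": level,
        "error": error,
        "service": service,
        "namespace": namespace,
        "context_id": context_id,
    }
    record.update(extra)
    return json.dumps(record)


# Hypothetical values, for illustration only.
print(build_record("bundler", "render.pipeline", "INFO",
                   "render finished", "abc-123",
                   extra={"request_path": "/render"}))
```

Rejecting clashes up front keeps a reserved name such as level from being overwritten by unrelated data.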

INFORMATION NOT TO BE LOGGED

As developers, we want to log as much information as we can. It gives us a better understanding of our program and lets us be proactive about the status of the system. It also helps us discover bugs by going through exception tracebacks.

It also makes us prone to logging sensitive information about the user that might not be needed at all. This creates a problem, because anyone going through past logs can identify the users in our system. This is not acceptable from a security and compliance perspective. The following fields should therefore be kept out of the logs:

email:

This represents the user's email address.

username:

This field represents the name of the user. In this context, it is the actual name of the account holder.

ipaddress / ip:

The IP address from which the request originates. It can be used to narrow down the user's geographical location and, in sparsely populated areas, may even identify the user's exact address.

cardId:

The card PAN (primary account number) from which the payment is being made. This is extremely sensitive information and must not be logged under any circumstances.

youtubeChannelId:

Many people use music from our platform. Most, if not all, of them have some form of YouTube channel under which they purchase our license. That channel identifier is directly related to their online identity and therefore should not be logged in our systems.
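A simple safeguard is to drop the fields listed above before anything is written to a log. The sketch below is one possible approach, not a prescribed one. The name strip_sensitive and the example keys order_id and status are hypothetical. It matches top-level keys exactly, so nested data or differently spelled keys would need extra handling.

```python
# Field names that must never appear in a log, as listed in this section.
FORBIDDEN_FIELDS = frozenset(
    {"email", "username", "ipaddress", "ip", "cardId", "youtubeChannelId"}
)


def strip_sensitive(fields: dict) -> dict:
    """Return a copy of fields without any forbidden top-level key."""
    return {key: value for key, value in fields.items()
            if key not in FORBIDDEN_FIELDS}


# Hypothetical payload: only the non-sensitive keys survive.
payload = {"email": "user@example.com", "ip": "203.0.113.7",
           "order_id": 7, "status": "paid"}
print(strip_sensitive(payload))  # {'order_id': 7, 'status': 'paid'}
```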
