The Toilet Paper

There is a time and place for log statements, but when and where?

Logging statements can be very valuable for various reasons, but aren’t entirely free of cost. What’s the best way to use them?

Professor Oak
There’s a time and place for everything, but not now.

Logging statements in source code can collect valuable runtime information. Often they’re the only available resource for diagnosing failures in production. This week’s paper discusses the costs and benefits that developers consider when making decisions about what, where, and how to log. These insights come from a two-part study:

  1. A qualitative study in the form of a survey that directly collects answers from developers about their logging considerations, and

  2. A case study that provides more detailed information about real-life scenarios.

Benefits

The study revealed eight main categories of logging benefits:

  • Diagnosing runtime failures: Logs are often an important (and sometimes the only) source to diagnose a system when something went wrong.

  • Using logs as a debugger: Logs can be used as a debugger to help developers narrow down execution paths and find the root cause of a software failure.

  • User/customer support: Logs that are directly exposed to users and customers can help support them.

  • Tracking execution progress: Logs help track the status or progress of an execution, such as the start or end of an event. This is especially important for long-running processes.

  • Monitoring & alerting: Logs can alert developers and users about problems or anomalies.

  • System comprehension: Logs help developers get familiar with the source code and the runtime behaviour under normal conditions.

  • Assisting in developing software: Developers can gain insights from logs to evaluate execution flows and make better design and implementation decisions.

  • Bookkeeping: Log messages can be used to record important transactions or operations, such as logins and remote requests.
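The "tracking execution progress" benefit can be illustrated with a minimal sketch. It uses Python's standard `logging` module; the `log_progress` helper and the `progress_demo` logger name are made up for this example and do not come from the paper.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("progress_demo")  # hypothetical logger name


@contextmanager
def log_progress(step):
    """Log the start and the end (or the abort) of a named step."""
    started = time.monotonic()
    logger.info("started %s", step)
    try:
        yield
    except Exception:
        # Only note that the step was aborted. The traceback is left to
        # whoever finally handles the exception, so it is not logged twice.
        logger.warning("aborted %s after %.2fs", step, time.monotonic() - started)
        raise
    else:
        logger.info("finished %s in %.2fs", step, time.monotonic() - started)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    with log_progress("nightly import"):
        time.sleep(0.1)  # stand-in for the real long-running work
```

For a long-running process, the gap between the "started" and "finished" lines is often the first thing you look for when something appears to hang.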

Costs

The study also revealed nine different categories of logging costs:

  • Storage cost: Excessive logging leads to excessive storage costs, especially when entire stack traces are logged.

  • Producing noise that hides important information: Excessive logging makes it hard to find the things that are actually important to you. For instance, logging normal events or properly handled problems at warn or error levels can make it hard to identify real problems.

  • Effort on log collection, processing, and management: Excessive logging increases the effort for collecting, processing, and managing logs.

  • Performance overhead: Writing log messages into files may involve expensive I/O operations. Repetitive printing of similar log messages (e.g. in loops) may cause a noticeable reduction in performance!

  • Perturbing system behaviours: Logging normally shouldn’t affect the functional behaviour of a system. However, bugs in logging code can cause runtime failures, for instance when a log statement triggers a NullPointerException.

  • Confusing users: Inappropriate log messages may confuse or mislead users, e.g. when warn or error messages are used for successful operations or when messages are logged inconsistently (logging the creation of objects, but not their destruction).

  • Exposing sensitive information: Sensitive information like usernames and passwords should not be printed in log files, especially when those files might be archived for years and cannot be tampered with due to legal regulations.

  • Logging code development and maintenance cost: Writing logs isn’t free. You’ll have to set up and configure a logging library, be prepared for constant bikeshedding, and keep logging statements up-to-date with source code.

  • Decreasing code readability: Logging can help developers better understand source code. At the same time, having too many log statements can make the code hard to read.
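Two of these costs, performance overhead and exposure of sensitive information, can be reduced in code. The sketch below uses Python's standard `logging` module. It assumes that secrets show up in messages as `password=<value>`; the `RedactSecrets` filter, the `costs_demo` logger and the helper functions are hypothetical names chosen for this example.

```python
import logging
import re

logger = logging.getLogger("costs_demo")  # hypothetical logger name

# Assumption: secrets appear in messages as "password=<value>".
SECRET = re.compile(r"(password=)\S+")


class RedactSecrets(logging.Filter):
    """Mask password values before a record reaches any handler."""

    def filter(self, record):
        record.msg = SECRET.sub(r"\1***", record.getMessage())
        record.args = ()  # the message is already fully formatted
        return True  # keep the record, only its text has changed


# A filter on a logger only sees records logged directly on that logger.
logger.addFilter(RedactSecrets())


def expensive_summary(items):
    """Stand-in for a costly computation that exists only for logging."""
    return ", ".join(sorted(str(item) for item in items))


def process(items):
    # Guard, so that the summary is only computed when DEBUG is enabled.
    if logger.isEnabledFor(logging.DEBUG):
        logger.debug("processing: %s", expensive_summary(items))
    return len(items)
```

Passing arguments separately (`"%s", value`) rather than pre-formatting the string already defers formatting until a record is actually emitted. The explicit `isEnabledFor` guard is only worth the extra line when computing the argument itself is expensive.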

A balancing act

Clearly, writing logs comes with many different benefits and costs. Developers balance these benefits and costs in several ways:

  • Appropriate log levels: The most common approach is to assign appropriate log levels to log statements, so that logs are not created unnecessarily and users have the ability to filter by level.

  • Differentiating logging purposes: Different loggers can be used for different purposes. For example, performance information can be logged using standardised performance loggers to allow for better performance analysis.

  • Proactively determining appropriate scope/focus of logging: It’s important to log uncommon/unexpected situations (e.g. when errors happen) and important events/state changes. Logging should also follow project or team-based styles.

  • Reactive logging: Log statements need to be updated over time, as it’s hard to determine exactly what you need beforehand. However, make sure that logging follows a budget and that unnecessary logs are removed.

  • Minimising repeated logging: Inserting logging statements in tight loops is considered bad logging practice. Highly repetitive log messages are best aggregated. Moreover, logging an exception that is then thrown (or rethrown) may lead to duplicated logging when the same exception is logged again elsewhere.

  • Considering logging impact: Logging may negatively impact performance in some cases. This should be taken into account when making logging decisions.

  • Ensuring quality of logging code: Logging statements without sufficient contextual information (thread, session, query or user IDs) may hinder their usefulness.

  • Using advanced tooling to support logging: Use libraries like slf4j and processing tools like Splunk to improve the way logs are generated, processed and analysed.

  • High configurability of logging: Ideally, it should be possible to enable and disable certain logs and stack traces independently from each other.
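Several of these strategies fit in one small sketch: log levels, separate loggers per purpose, contextual information, and aggregation instead of logging inside a loop. It uses Python's standard `logging` module rather than slf4j. The logger names, the `SessionAdapter` class and the order format are assumptions made for this example, not something the paper prescribes.

```python
import logging
import time

# Hypothetical logger names: one logger per purpose, so that each can be
# enabled or disabled independently of the other.
app_log = logging.getLogger("shop.orders")
perf_log = logging.getLogger("shop.perf")


class SessionAdapter(logging.LoggerAdapter):
    """Prefix every message with the session it belongs to."""

    def process(self, msg, kwargs):
        return "[session %s] %s" % (self.extra["session_id"], msg), kwargs


def configure():
    """Quiet by default, with performance logging switched on separately."""
    logging.basicConfig(
        level=logging.WARNING,
        format="%(asctime)s %(levelname)s %(name)s %(message)s",
    )
    perf_log.setLevel(logging.INFO)


def import_orders(orders, session_id):
    """Validate orders and log one summary instead of one line per order."""
    log = SessionAdapter(app_log, {"session_id": session_id})
    started = time.perf_counter()
    rejected = []
    for order in orders:
        if order.get("total", 0) < 0:  # stand-in for real validation
            rejected.append(order["id"])  # collect, do not log in the loop
    if rejected:
        # Unexpected situation: a single warning for the whole batch.
        log.warning("rejected %d of %d orders: %s",
                    len(rejected), len(orders), rejected)
    else:
        # Normal event: debug level, filtered out by the default config.
        log.debug("imported %d orders", len(orders))
    perf_log.info("import_orders handled %d orders in %.3fs",
                  len(orders), time.perf_counter() - started)
    return rejected
```

After calling `configure()`, a successful import only produces the performance line, while a batch with invalid orders also produces exactly one warning that carries the session ID.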

Summary

  1. There are many considerations to be made when deciding whether and how to log.