Logging HTTP requests in Go
If you’re running HTTP servers you probably want to log HTTP requests.
Logs are helpful for diagnosing problems (which requests fail? how many requests per day do we serve? which requests are slow?).
They are necessary for analytics (which pages are popular? where do website visitors come from?).
This article describes everything you need to know to log HTTP requests in your own Go web servers.
This is not about a re-usable library but about the things you need to know to implement your own solution; it’s a description of my logging choices.
You can see how it all comes together in a sample app: https://github.com/essentialbooks/books/tree/master/code/go/logging_http_requests

What information to log

To show what information is typically logged, here’s a sample log entry in Apache’s Extended Log File Format:
111.222.333.123 HOME - [01/Feb/1998:01:08:39 -0800] "GET /bannerad/ad.htm HTTP/1.0" 200 198 "http://www.referrer.com/bannerad/ba_intro.htm" "Mozilla/4.01 (Macintosh; I; PPC)"
We have:
- the IP address of the client (111.222.333.123)
- the identity of the user, rarely used these days (HOME -)
- the timestamp of the request (01/Feb/1998:01:08:39 -0800)
- the request line: method, path and HTTP version ("GET /bannerad/ad.htm HTTP/1.0")
- the HTTP status code of the response (200)
- the size of the response in bytes (198)
- the referer ("http://www.referrer.com/bannerad/ba_intro.htm")
- the user agent ("Mozilla/4.01 (Macintosh; I; PPC)")
We can log more information or choose to not log some of the above information.
Personally, I log the method, URI, client IP address, response status code, response size, request duration, referer and user agent; I skip the user identity and let the log format take care of the timestamp.
The signature of Go’s standard HTTP serving function is:
func(w http.ResponseWriter, r *http.Request)
We’ll add logging as a so-called middleware, which is a way to add re-usable pieces of functionality into the HTTP serving pipeline.
We have a logRequestHandler function that takes an http.Handler interface as an argument and returns another http.Handler that wraps the original handler and adds logging:
func logRequestHandler(h http.Handler) http.Handler {
	fn := func(w http.ResponseWriter, r *http.Request) {

		// call the original http.Handler we're wrapping
		h.ServeHTTP(w, r)

		// gather information about request and log it
		uri := r.URL.String()
		method := r.Method
		// ... more information
		logHTTPReq(uri, method, ...)
	}

	// http.HandlerFunc wraps a function so that it
	// implements http.Handler interface
	return http.HandlerFunc(fn)
}
We can nest middleware handlers on top of each other, each adding some functionality.
Here’s how we would use it to add logging to all request handlers:
func makeHTTPServer() *http.Server {
	mux := &http.ServeMux{}
	mux.HandleFunc("/", handleIndex)
	// ... potentially add more handlers

	var handler http.Handler = mux
	// wrap mux with our logger so that every request gets logged
	handler = logRequestHandler(handler)
	// ... potentially add more middleware handlers

	srv := &http.Server{
		ReadTimeout:  120 * time.Second,
		WriteTimeout: 120 * time.Second,
		IdleTimeout:  120 * time.Second, // introduced in Go 1.8
		Handler:      handler,
	}
	return srv
}
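For completeness, here’s a minimal sketch of how the returned server might be started (the :8080 address is an arbitrary choice for this example):
func main() {
	srv := makeHTTPServer()
	srv.Addr = ":8080"
	err := srv.ListenAndServe()
	if err != nil && err != http.ErrServerClosed {
		log.Fatalf("ListenAndServe() failed with %s\n", err)
	}
}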
First, let’s define a struct to package all the logged information:
// HTTPReqInfo describes info about HTTP request
type HTTPReqInfo struct {
	// GET etc.
	method string
	uri string
	referer string
	ipaddr string
	// response code, like 200, 404
	code int
	// number of bytes of the response sent
	size int64
	// how long did it take to serve the request
	duration time.Duration
	userAgent string
}
Here’s the full implementation of logRequestHandler:
func logRequestHandler(h http.Handler) http.Handler {
	fn := func(w http.ResponseWriter, r *http.Request) {
		ri := &HTTPReqInfo{
			method: r.Method,
			uri: r.URL.String(),
			referer: r.Header.Get("Referer"),
			userAgent: r.Header.Get("User-Agent"),
		}

		ri.ipaddr = requestGetRemoteAddress(r)

		// this runs handler h and captures information about
		// HTTP request
		m := httpsnoop.CaptureMetrics(h, w, r)

		ri.code = m.Code
		ri.size = m.BytesWritten
		ri.duration = m.Duration
		logHTTPReq(ri)
	}
	return http.HandlerFunc(fn)
}
Let’s go over the easy cases first: method, uri, referer and userAgent come straight from the request, via r.Method, r.URL.String() and the Referer and User-Agent headers.
Other information is harder to get.
The problem with getting the IP address of the client is that there could be an HTTP proxy involved. The client makes a request to the proxy and the proxy makes a request to us. As a result, we see the IP address of the proxy, not of the client.
Because of that, proxies usually augment the request with information about the client’s real IP address, using the X-Real-Ip or X-Forwarded-For HTTP headers.
Here’s how to extract this info:
// Request.RemoteAddr contains the port, which we want to remove i.e.:
// "[::1]:58292" => "[::1]"
func ipAddrFromRemoteAddr(s string) string {
	idx := strings.LastIndex(s, ":")
	if idx == -1 {
		return s
	}
	return s[:idx]
}

// requestGetRemoteAddress returns ip address of the client making the request,
// taking into account http proxies
func requestGetRemoteAddress(r *http.Request) string {
	hdr := r.Header
	hdrRealIP := hdr.Get("X-Real-Ip")
	hdrForwardedFor := hdr.Get("X-Forwarded-For")
	if hdrRealIP == "" && hdrForwardedFor == "" {
		return ipAddrFromRemoteAddr(r.RemoteAddr)
	}
	if hdrForwardedFor != "" {
		// X-Forwarded-For is potentially a list of addresses separated with ","
		parts := strings.Split(hdrForwardedFor, ",")
		for i, p := range parts {
			parts[i] = strings.TrimSpace(p)
		}
		// TODO: should return first non-local address
		return parts[0]
	}
	return hdrRealIP
}
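As a side note, the standard library’s net.SplitHostPort can do the port-stripping for us. Here’s a minimal alternative sketch (note that, unlike the version above, it also strips the brackets from IPv6 addresses, returning ::1 rather than [::1]):
// alternative to ipAddrFromRemoteAddr using the standard library
func ipAddrFromRemoteAddr2(s string) string {
	host, _, err := net.SplitHostPort(s)
	if err != nil {
		// not in "host:port" form, return as-is
		return s
	}
	return host
}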
Capturing the status code and the size of the response is the trickiest part.
http.ResponseWriter doesn’t give us this information. Luckily, it’s a simple interface:
type ResponseWriter interface {
    Header() Header
    Write([]byte) (int, error)
    WriteHeader(statusCode int)
}
It’s possible to write an implementation of this interface which wraps the original and records the information we want to know. Lucky for us, someone already did it in the httpsnoop package (github.com/felixge/httpsnoop).
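To illustrate the idea, here’s a minimal sketch of such a wrapper (the statusRecorder type is hypothetical; a real implementation also has to deal with optional interfaces like http.Flusher and http.Hijacker, which is exactly what httpsnoop handles for us):
// statusRecorder wraps http.ResponseWriter and records
// the status code and the number of bytes written
type statusRecorder struct {
	http.ResponseWriter
	code int
	size int64
}

func (rec *statusRecorder) WriteHeader(code int) {
	rec.code = code
	rec.ResponseWriter.WriteHeader(code)
}

func (rec *statusRecorder) Write(d []byte) (int, error) {
	if rec.code == 0 {
		// Write without an explicit WriteHeader means an implicit 200
		rec.code = http.StatusOK
	}
	n, err := rec.ResponseWriter.Write(d)
	rec.size += int64(n)
	return n, err
}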

Format of log file

The Apache log format is compact and human-readable but rather hard to parse.
At some point we’ll need to read the logs back for analysis and I don’t relish the idea of implementing a parser for this format.
From the implementation point of view, one easy option would be to log each record as JSON, one record per line.
What I don’t like about this approach: JSON is not easy for humans to read.
As a middle-ground, I’ve created siser library which implements a serialization format that is extensible, easy to implement and human readable. It’s well suited for logging structured information and I use it in multiple projects.
Here’s how a single request is serialized:
171 1567185903788 httplog
method: GET
uri: /favicon.ico
ipaddr: 204.14.239.58
code: 404
size: 758
duration: 0
ua: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:68.0) Gecko/20100101 Firefox/68.0
The first line of each record contains: the size of the data that follows in bytes (171), the timestamp of the record as Unix time in milliseconds (1567185903788) and the name of the record (httplog).
What follows is data in a key: value format.
Here’s how we serialize a record and write it to log file:
var (
	muLogHTTP sync.Mutex
)

func logHTTPReq(ri *HTTPReqInfo) {
	var rec siser.Record
	rec.Name = "httplog"
	rec.Append("method", ri.method)
	rec.Append("uri", ri.uri)
	if ri.referer != "" {
		rec.Append("referer", ri.referer)
	}
	rec.Append("ipaddr", ri.ipaddr)
	rec.Append("code", strconv.Itoa(ri.code))
	rec.Append("size", strconv.FormatInt(ri.size, 10))
	durMs := ri.duration / time.Millisecond
	rec.Append("duration", strconv.FormatInt(int64(durMs), 10))
	rec.Append("ua", ri.userAgent)

	muLogHTTP.Lock()
	defer muLogHTTP.Unlock()
	_, _ = httpLogSiser.WriteRecord(&rec)
}

Rotating logs daily

I usually deploy my services to Ubuntu servers and place logs in the /data/<service-name>/log directory.
We can’t write to the same log file forever. Eventually we would run out of space.
For long term logs I usually have a log file per day, named by date e.g. 2019-09-23.txt, 2019-09-24.txt etc.
This is sometimes called log rotation.
To avoid re-implementing this functionality multiple times, I wrote a library dailyrotate.
It implements Write, Close and Flush methods, so it’s easy to plug into existing code that uses io.Writer.
You specify a directory to use and a format for the name of the log file. The format is given to Go’s time formatting function. I usually use 2006-01-02.txt, as this generates a unique name for each day, sorts by day, and .txt is a hint to tools that this is a text, and not binary, file.
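For example, Go’s reference-date formatting turns that pattern into a concrete file name:
name := time.Now().UTC().Format("2006-01-02.txt")
// e.g. "2019-09-23.txt"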
You then write to it as to a regular file and the code takes care of creating a new file for each day.
You can also provide a callback to be notified when rotation happens so that you can, for example, upload the just closed file to online storage for archival or run analytics on it.
Here’s the code:
pathFormat := filepath.Join("dir", "2006-01-02.txt")
func onClose(path string, didRotate bool) {
  fmt.Printf("we just closed a file '%s', didRotate: %v\n", path, didRotate)
  if !didRotate {
    return
  }
  // process just closed file e.g. upload to backblaze storage for backup
  go func() {
    // if processing takes a long time, do it in a background goroutine
  }()
}

w, err := dailyrotate.NewFile(pathFormat, onClose)
panicIfErr(err)

_, err = io.WriteString(w, "hello\n")
panicIfErr(err)

err = w.Close()
panicIfErr(err)
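The earlier logHTTPReq function writes to an httpLogSiser variable whose setup wasn’t shown. Putting the two libraries together, here’s one plausible way to initialize it (a sketch: the openHTTPLog helper and the directory path are illustrative, and I’m assuming siser.NewWriter, which wraps an io.Writer):
var httpLogSiser *siser.Writer

// openHTTPLog opens a daily-rotating log file and wraps it
// with a siser writer for structured records
func openHTTPLog() error {
	pathFormat := filepath.Join("/data/myservice/log", "2006-01-02.txt")
	// nil onClose callback for brevity
	f, err := dailyrotate.NewFile(pathFormat, nil)
	if err != nil {
		return err
	}
	httpLogSiser = siser.NewWriter(f)
	return nil
}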

Long term storage and analytics

For long-term storage I compress the log files with gzip and upload them to online storage. There are plenty of options: S3, Google Storage, Digital Ocean Spaces, BackBlaze.
I tend to use Digital Ocean Spaces or BackBlaze, as they are significantly cheaper (both in storage costs and bandwidth costs).
They both support the S3 protocol, so I use the minio-go library.
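The compression step needs nothing beyond the standard library’s compress/gzip. A minimal sketch, with hypothetical file names:
// gzipFile compresses srcPath into dstPath,
// e.g. "2019-09-23.txt" => "2019-09-23.txt.gz"
func gzipFile(srcPath, dstPath string) error {
	src, err := os.Open(srcPath)
	if err != nil {
		return err
	}
	defer src.Close()

	dst, err := os.Create(dstPath)
	if err != nil {
		return err
	}
	defer dst.Close()

	gw := gzip.NewWriter(dst)
	if _, err := io.Copy(gw, src); err != nil {
		return err
	}
	// Close flushes any remaining compressed data to dst
	return gw.Close()
}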
For analytics I do daily processing using my own code to generate basic summaries of the most useful information.
Another option would be to import the data into a system like BigQuery.

Written on Aug 16 2021.