Analyzing Cloud Foundry Access Logs
The gorouter component of Cloud Foundry routes all incoming HTTP requests to their target containers and writes every request to an access log. Each access log entry produced by the gorouters contains much of the information you would find in other web server access logs as well. This blog post shows how to obtain the logs with either the BOSH CLI or the Cloud Foundry CLI and how to analyze them with goaccess.
Obtaining Cloud Foundry Access Logs
If you have administrative access to Cloud Foundry, you can use the BOSH CLI to obtain the access logs for all apps running on Cloud Foundry. If you are only interested in the access logs of a single application, you can use the CF CLI instead.
Obtaining Access Logs with the BOSH CLI
Using the BOSH CLI you need to download the logs of each router instance separately:
# put all access logs into a separate folder
mkdir ./access-logs
INDEX=0
bosh logs JOB $INDEX --dir ./access-logs
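Since every router instance has to be downloaded separately, a small loop can save some typing. The following is only a sketch: the function name `download_router_logs`, the job name and the instance count are assumptions that you need to adapt to your deployment, and it reuses the `bosh logs JOB INDEX --dir` call shown above.

```shell
# Hypothetical helper: download the logs of several router instances.
# Arguments: job name, number of instances, target directory (optional).
download_router_logs() (
  job="$1"
  count="$2"
  dir="${3:-./access-logs}"
  mkdir -p "$dir"
  index=0
  while [ "$index" -lt "$count" ]; do
    echo "Downloading logs of $job/$index"
    # Same call as above, once per instance index
    bosh logs "$job" "$index" --dir "$dir"
    index=$((index + 1))
  done
)

# Example call, assuming a job named "router" with three instances:
# download_router_logs router 3
```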
Put all the obtained router logs into the same directory so that you can easily unpack and concatenate them with the following script:
for archive in $(find . -name 'router*.job.tgz'); do
  # extract all the logs
  destination=./logs/logs-$(basename "$archive")
  echo "Unpacking $archive to $destination"
  mkdir -p "$destination"
  tar -xzvf "$archive" -C "$destination"
  # gunzip all individual access logs
  for accessLog in $(find "$destination" -name 'access*.gz'); do
    echo "Unpacking $accessLog in place"
    gunzip "$accessLog"
  done
  # concatenate all access logs of the router
  accessLogCombined=$destination-combined.log
  for accessLog in $(find "$destination" -name 'access.log*'); do
    echo "Concatenating $accessLog to $accessLogCombined"
    cat "$accessLog" >> "$accessLogCombined"
  done
done
The above script creates a combined access log for each router instance in ./logs/logs-<ARCHIVE_NAME>-combined.log, where <ARCHIVE_NAME> is the file name of the downloaded router archive.
The combined log files serve as input for the next step, the report generation with goaccess.
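If you prefer a single report across all routers, the per-router files can be merged into one input file first. This is a minimal sketch, not part of the original workflow: the function name `merge_combined_logs` is made up, and the default output name `access-log.log` is an assumption chosen to match the goaccess call further down.

```shell
# Hypothetical helper: merge all per-router combined logs into one file.
# Arguments: directory with the *-combined.log files, output file.
# Prints the number of lines in the merged file.
merge_combined_logs() (
  log_dir="${1:-./logs}"
  output="${2:-access-log.log}"
  : > "$output"   # start with an empty output file
  for combined in "$log_dir"/*-combined.log; do
    [ -e "$combined" ] || continue   # no match: the glob stays literal
    cat "$combined" >> "$output"
  done
  wc -l < "$output"
)

# Example call:
# merge_combined_logs ./logs access-log.log
```

The entries are not sorted by time after merging. Each line carries its own timestamp, so this should not matter for an aggregated report.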
Obtaining Access Logs with the CF CLI
You can use the CF CLI to access the individual logs of an application. As the application logs contain various types of log messages, we need to make sure to grab only those that are produced by the router component:
cf logs <APP_NAME> --recent \
| grep RTR \
| sed -E 's/^.*(OUT|ERR)[[:space:]]*(.*)$/\2/' \
| awk 'NF' \
> access-log-<APP_NAME>.log
The grep part keeps only logs produced by the routers, the sed part removes the log timestamp and keeps only the plain access log, while the awk part removes empty lines.
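To see what the pipeline does to a single line, it can be run against a small sample instead of live logs. The sketch below wraps the three filters in a function with the made-up name `extract_access_log`. The two sample lines are invented for illustration and only mimic the shape of `cf logs` output. Note that with `sed -E` the alternation is written as a plain `|`, and `[[:space:]]` is used as a portable replacement for `\s`.

```shell
# Hypothetical wrapper around the grep/sed/awk filters; reads from stdin.
extract_access_log() {
  grep RTR \
    | sed -E 's/^.*(OUT|ERR)[[:space:]]*(.*)$/\2/' \
    | awk 'NF'
}

# Invented sample input: one application line and one router line.
sample='2018-03-01T10:15:30.12+0100 [APP/PROC/WEB/0] OUT hello from the app
2018-03-01T10:15:31.45+0100 [RTR/0] OUT myapp.example.com - [2018-03-01T09:15:31.450+0000] "GET / HTTP/1.1" 200 0 12 "-" "curl/7.54.0"
'

printf '%s\n' "$sample" | extract_access_log
# prints: myapp.example.com - [2018-03-01T09:15:31.450+0000] "GET / HTTP/1.1" 200 0 12 "-" "curl/7.54.0"
```

Because the leading `.*` is greedy, everything up to the last `OUT` or `ERR` on a line is removed. A request whose URL or user agent contains one of these strings would therefore be cut in the wrong place.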
The access-log-<APP_NAME>.log file can now be used in the next step to generate a report with goaccess.
Generating a Report with goaccess
goaccess is a handy CLI tool to analyze access logs and provides a graphical HTML report (or ncurses if you want!) of the access logs. It ships with a set of predefined access log formats, but provides CLI options to customize the format. To analyze the access logs generated in the previous steps, the following goaccess options are used:
../goaccess-1.2/goaccess.exe \
access-log.log \
--log-format '%v - [%dT%t.%^] "%m %U %H" %s %^ %b "%R" "%u" %^ %^ x_forwarded_for:~h{," } %^ %^ response_time:%T %^ %^ %^ %^ %^' \
--time-format '%H:%M:%S' \
--date-format '%Y-%m-%d' \
--date-spec hr \
--hour-spec min \
--invalid-requests invalid-requests.log \
-o report.html
The command writes the interactive report to report.html.
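A custom log format is easy to get slightly wrong, so it is worth checking how many lines ended up in invalid-requests.log. The following sketch compares its line count with the input file. The function name `invalid_request_share` is made up, and the assumption that goaccess writes one line per rejected request should be verified against your goaccess version.

```shell
# Hypothetical helper: percentage of input lines that goaccess rejected.
# Arguments: access log used as input, file written by --invalid-requests.
invalid_request_share() (
  access_log="$1"
  invalid_log="$2"
  total=$(wc -l < "$access_log")
  invalid=$(wc -l < "$invalid_log")
  awk -v t="$total" -v i="$invalid" 'BEGIN {
    if (t == 0) { print "0.0"; exit }   # avoid division by zero
    printf "%.1f\n", 100 * i / t
  }'
)

# Example call:
# invalid_request_share access-log.log invalid-requests.log
```

A high percentage usually means that the log format does not match the entries and needs adjusting.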
Depending on the time range you are analyzing, it may be useful to adjust the date-spec or hour-spec settings (as per the docs):
--hour-spec=<hour|min>: Set the time specificity to either hour (default) or min to display the tenth of an hour appended to the hour. This is used in the time distribution panel. It's useful for tracking peaks of traffic on your server at specific times.
--date-spec=<date|hr>: Set the date specificity to either date (default) or hr to display hours appended to the date. This is used in the visitors panel. It's useful for tracking visitors at the hour level. For instance, an hour specificity would yield to display traffic as 18/Dec/2010:19
And now, explore the interactive report with all its charts and options.