Simple logging with rotation and custom format in Python

Sometimes I find myself having to write a short Python script with proper logging capabilities.

This code snippet demonstrates how to use Python’s official logging module in very few lines while still covering all of these features:

  • basicConfig
  • log rotation
  • formatting

import logging
from logging.handlers import RotatingFileHandler
logging.basicConfig(
	handlers=[
		RotatingFileHandler(
			'/path/to/log/file.log',
			maxBytes=10240000,
			backupCount=5
		)
	],
	level=logging.INFO,
	format='%(asctime)s %(levelname)s PID_%(process)d %(message)s'
)
logging.info('Hello world')
logging.error('Oh no, an error occurred!')

The following lines are interesting:

  • '/path/to/log/file.log' -> where the log file is written
  • maxBytes=10240000 -> maximum size in bytes (roughly 10 MB) a log file reaches before it is rotated
  • backupCount=5 -> how many rotated files to keep
  • format='%(asctime)s %(levelname)s PID_%(process)d %(message)s' -> format of each log line; pay attention to the PID, which is very useful if your script can be invoked several times simultaneously, because it lets you separate the log lines of parallel runs
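To see the rotation settings described above in action, here is a minimal sketch that attaches the same handler and format to a named logger and shrinks maxBytes so that files rotate after a handful of lines. The logger name rotation_demo, the helper names build_logger and demo, the temporary directory and the tiny size limits are all assumptions made for this demonstration; they are not part of the original snippet.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler

LOG_FORMAT = '%(asctime)s %(levelname)s PID_%(process)d %(message)s'


def build_logger(log_path, max_bytes=200, backup_count=2):
    """Return a logger that writes to log_path and rotates at max_bytes."""
    handler = RotatingFileHandler(
        log_path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger = logging.getLogger('rotation_demo')  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.addHandler(handler)
    logger.propagate = False  # keep the demo output out of the root logger
    return logger


def demo():
    """Write enough lines to force rotation and list the resulting files."""
    log_dir = tempfile.mkdtemp()
    logger = build_logger(os.path.join(log_dir, 'demo.log'))
    for number in range(50):
        logger.info('message number %d', number)
    for handler in logger.handlers:
        handler.close()
    return sorted(os.listdir(log_dir))


if __name__ == '__main__':
    print(demo())
```

With backupCount=2 the directory should never hold more than the active file and two rotated copies, no matter how many lines are written.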

Implementing Apache’s force proxy flag for rewrite rules under NGINX

NGINX’s default behavior for rewrite rules (at least up to version 0.7.65) is to redirect if the replacement part begins with ‘http://’. Let me quote some info from the NGINX wiki:

rewrite

syntax: rewrite regex replacement flag

[…]

If the replacement string begins with http:// then the client will be redirected, and any further rewrite directives are terminated.

It is important to take this into account when designing new rules, because the behavior of a rule depends on where its replacement points: a replacement that begins with ‘http://’ triggers a redirect instead of an internal rewrite.

Apache does this differently since it offers a flag that can be set per rule which instructs the web server to just “proxy” the request (i.e. do not redirect, just get the response of that request in the background and send it back to the client). Taken from Apache’s site, this flag is:

‘proxy|P’ (force proxy)
This flag forces the substitution part to be internally sent as a proxy request and immediately (rewrite processing stops here) put through the proxy module. You must make sure that the substitution string is a valid URI (typically starting with http://hostname) which can be handled by the Apache proxy module. If not, you will get an error from the proxy module. Use this flag to achieve a more powerful implementation of the ProxyPass directive, to map remote content into the namespace of the local server.

Note: mod_proxy must be enabled in order to use this flag.

How to implement this under NGINX

I have been able to get similar behavior under NGINX using ‘rewrite’ and ‘proxy_pass’ directives.

The following example implements a regular-expression-based rewrite rule that serves content from domain2 in response to client requests on domain1.

server {
  listen 1.2.3.4:80;
  server_name domain1.com;

  location / {
    rewrite ^/([0-9][0-9]/[0-9][0-9]/.+)$ /example/?t=$1 last;
    proxy_pass http://domain2.com;
  }
}

Using that NGINX configuration, the client can request:

http://domain1.com/12/34/test

which will be proxied to:

http://domain2.com/example/?t=12/34/test

and served back to her “apparently” from http://domain1.com/12/34/test (i.e. there won’t be any URL redirection).
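The mapping performed by that rewrite rule can be modelled in a few lines of Python. This is only a rough sketch of the URL translation, not of how NGINX works internally; the function name upstream_url and its upstream parameter are made up for the illustration, and query strings on the incoming request are ignored.

```python
import re

# Same regular expression as in the rewrite rule of the server block.
REWRITE_RULE = re.compile(r'^/([0-9][0-9]/[0-9][0-9]/.+)$')


def upstream_url(path, upstream='http://domain2.com'):
    """Return the URL that would be fetched in the background for path."""
    match = REWRITE_RULE.match(path)
    if match:
        # Equivalent of the replacement part: /example/?t=$1
        path = '/example/?t=' + match.group(1)
    # Paths that do not match the rule are proxied unchanged.
    return upstream + path


if __name__ == '__main__':
    print(upstream_url('/12/34/test'))
    print(upstream_url('/about'))
```

The second call uses a hypothetical path that does not match the rule, to show that such requests are still proxied, just without being rewritten.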

Quick and dirty fix for VMware Linux guests losing clock accuracy

I covered in a previous post how to keep the clock synchronized on VMware Linux guests. Well, this seems not to work, at least for recent versions of VMware Server 2 (i.e. the one with the web-based management console). For now, the quick & dirty solution I am using is a cron job that runs ntpdate pretty often…

My cron job looks like this:

#
# Temporary fix for the time getting lost
#
0-59/10 * * * * /usr/sbin/ntpdate north-america.pool.ntp.org > /dev/null 2>&1

Yes, this fix requires ntpdate to be installed (apt-get install ntpdate under Debian).
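If the 0-59/10 minute field of that cron line looks cryptic, the following sketch expands it into the minutes at which the job fires. It is a simplified reading of the range/step syntax only (no comma-separated lists), and the function name expand_minute_field is made up for the illustration.

```python
def expand_minute_field(field):
    """Expand a cron minute field such as '0-59/10' into a list of minutes."""
    range_part, _, step_part = field.partition('/')
    step = int(step_part) if step_part else 1
    if range_part == '*':
        start, end = 0, 59
    elif '-' in range_part:
        start, end = (int(value) for value in range_part.split('-'))
    else:
        start = end = int(range_part)
    return list(range(start, end + 1, step))


if __name__ == '__main__':
    print(expand_minute_field('0-59/10'))
```

In other words, ntpdate runs every ten minutes, starting on the hour.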

Dumping a MySQL database excluding one or several tables

If we need to dump a MySQL database and want to exclude table(s) we should use the option:

--ignore-table=db_name.tbl_name

Do not dump the given table, which must be specified using both the database and table names. To ignore multiple tables, use this option multiple times. This option also can be used to ignore views. (Quoted from the MySQL documentation for mysqldump.)

Example:

mysqldump --ignore-table=cars.brands cars > cars.dump
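When several tables have to be skipped, the option is simply repeated once per table. Here is a small sketch that builds such a command line from a list of table names; the helper name mysqldump_command and the second table name (models) are hypothetical, and the function only builds the argument list without running anything.

```python
def mysqldump_command(database, ignored_tables):
    """Build a mysqldump argument list that skips the given tables."""
    command = ['mysqldump']
    for table in ignored_tables:
        # The table must be qualified with the database name.
        command.append('--ignore-table={}.{}'.format(database, table))
    command.append(database)
    return command


if __name__ == '__main__':
    # 'models' is a made-up table name used only for this example.
    print(' '.join(mysqldump_command('cars', ['brands', 'models'])))
```

The resulting list could be handed to subprocess.run with stdout redirected to the dump file.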

http_load man page

http_load(1)                                                                                                      http_load(1)

NAME
       http_load - multiprocessing http test client

SYNOPSIS
       http_load [-checksum] [-throttle] [-proxy host:port] [-verbose] [-timeout secs] [-sip sip_file] [-cipher str] ( -paral-
       lel N | -rate N [-jitter] ) ( -fetches N | -seconds N ) url_file

DESCRIPTION
       http_load runs multiple http fetches in parallel, to test the throughput of a web server.   However  unlike  most  such
       test clients, it runs in a single process, so it doesn't bog down the client machine.  It can be configured to do https
       fetches as well.

       The -checksum flag tells http_load to do checksums on the files fetched, to make sure they came across ok.  The  check-
       sums  are  computed  the  first  time each URL gets fetched, and then recomputed and compared on each subsequent fetch.
       Without the -checksum flag only the byte count is checked.

       The -throttle flag tells http_load to throttle its consumption of data to 33.6Kbps, to simulate access by modem  users.

       The -proxy flag lets you run http_load through a web proxy.

       The -verbose flag tells http_load to put out progress reports every minute on stderr.

       The -timeout flag specifies how long to wait on idle connections before giving up.  The default is 60 seconds.

       The  -sip  flag  lets you specify a file containing numeric IP addresses (not hostnames), one per line.  These get used
       randomly as the *source* address of connections.  They must be real routable addresses on your  machine,  created  with
       ifconfig, in order for this to work.  The advantage of using this option is you can make one client machine look like a
       whole bank of machines, as far as the server knows.

       The -cipher flag is only available if you have SSL support compiled in.  It specifies a cipher set to use.  By default,
       http_load  will  negotiate  the highest security that the server has available, which is often higher (and slower) than
       typical browsers will negotiate.  An example of a cipher set might be "RC4-MD5" - this  will  run  considerably  faster
       than  the  default.   In addition to specifying a raw cipher string, there are three built-in cipher sets accessible by
       keywords:
         * fastsec - fast security - RC4-MD5
         * highsec - high security - DES-CBC3-SHA
         * paranoid - ultra high security - AES256-SHA
       Of course, not all servers are guaranteed to implement these combinations.

       One start specifier, either -parallel or -rate, is required.  -parallel tells http_load  to  keep  that  many  parallel
       fetches  going  simultaneously.   -rate tells http_load to start that many new connections each second.  If you use the
       -rate start specifier, you can also give the -jitter flag, telling http_load to vary the rate randomly by about 10%.

       One end specifier, either -fetches or -seconds, is required.  -fetches tells http_load to quit when that  many  fetches
       have been completed.  -seconds tells http_load to quit after that many seconds have elapsed.

       The url_file is just a list of URLs, one per line.  The URLs that get fetched are chosen randomly from this file.

       All flags may be abbreviated to a single letter.

       Note  that  while the end specifier is obeyed precisely, the start specifier is only approximate.  If you use the -rate
       flag, http_load will make its best effort to start connections at that rate, but may not succeed.  And if you  use  the
       -parallel flag, http_load will attempt to keep that many simultaneous connections going, but may fail to keep up if the
       server is very fast.

       Sample run:
           % http_load -rate 2 -seconds 300 urls
           591 fetches, 8 max parallel, 5.33606e+06 bytes, in 300 seconds
           9028.87 mean bytes/connection
           1.97 fetches/sec, 17786.9 bytes/sec
           msecs/connect: 28.8932 mean, 44.243 max, 24.488 min
           msecs/first-response: 63.5362 mean, 81.624 max, 57.803 min
           HTTP response codes:
             code 200 -- 591

SEE ALSO
       http_ping(1)

AUTHOR
       Copyright (C) 1998,1999,2001 by Jef Poskanzer.  All rights reserved.

                                                       15 November 2001                                           http_load(1)
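The derived figures in the sample run can be recomputed from its first summary line. The sketch below parses that line and works out the rates. The regular expression and the function name summarize are my own assumptions based on the sample output shown in the man page, not an interface offered by http_load.

```python
import re

# Pattern for the first summary line of the sample run in the man page.
SUMMARY = re.compile(
    r'(?P<fetches>\d+) fetches, (?P<parallel>\d+) max parallel, '
    r'(?P<bytes>[0-9.e+]+) bytes, in (?P<seconds>[0-9.]+) seconds'
)


def summarize(line):
    """Recompute the per-second and per-connection figures from a summary line."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError('not an http_load summary line: %r' % line)
    fetches = int(match.group('fetches'))
    total_bytes = float(match.group('bytes'))
    seconds = float(match.group('seconds'))
    return {
        'fetches_per_sec': fetches / seconds,
        'bytes_per_sec': total_bytes / seconds,
        'mean_bytes_per_connection': total_bytes / fetches,
    }


if __name__ == '__main__':
    sample = '591 fetches, 8 max parallel, 5.33606e+06 bytes, in 300 seconds'
    for name, value in summarize(sample).items():
        print(name, round(value, 2))
```

Because the byte count is printed with limited precision, the recomputed values only match the reported ones after rounding.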

How to install MySQL Server on Debian Linux

When installing MySQL Server, keep in mind that the log and data folders can grow very large. By default, they live under ‘/var’ (‘/var/log/mysql’ and ‘/var/lib/mysql’), which is usually part of the root filesystem (‘/’). That may fill up the system disk of your database server, which is never a good idea.

This article describes how to move these two folders to ‘/home’, which is ideally mounted on a separate disk with enough space for your database data and logs.

First, I install the required package with apt-get:

apt-get update
apt-get install mysql-server

To check the status:

/etc/init.d/mysql status

/usr/bin/mysqladmin  Ver 8.41 Distrib 5.0.51a, for debian-linux-gnu on i486
Copyright (C) 2000-2006 MySQL AB
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL license

Server version          5.0.51a-24
Protocol version        10
Connection              Localhost via UNIX socket
UNIX socket             /var/run/mysqld/mysqld.sock
Uptime:                 3 sec

Threads: 1  Questions: 78  Slow queries: 0  Opens: 23  Flush tables: 1
Open tables: 17  Queries per second avg: 26.000.

Now, stop MySQL, move the folders to the right location, reconfigure MySQL and start again:

# Stop MySQL
/etc/init.d/mysql stop

# Move and reconfigure data
mkdir /home/mysql
mv /var/lib/mysql /home/mysql/mysql-data
ln -s /home/mysql/mysql-data/ /var/lib/mysql

# Move and reconfigure logs
mv /var/log/mysql/ /home/mysql/mysql-logs
ln -s /home/mysql/mysql-logs/ /var/log/mysql

# Start MySQL and check that everything is OK
/etc/init.d/mysql start
/etc/init.d/mysql status
/usr/bin/mysqladmin  Ver 8.41 Distrib 5.0.51a, for debian-linux-gnu on i486
Copyright (C) 2000-2006 MySQL AB
This software comes with ABSOLUTELY NO WARRANTY. This is free software,
and you are welcome to modify and redistribute it under the GPL license

Server version          5.0.51a-24
Protocol version        10
Connection              Localhost via UNIX socket
UNIX socket             /var/run/mysqld/mysqld.sock
Uptime:                 14 sec

Threads: 1  Questions: 78  Slow queries: 0  Opens: 23  Flush tables: 1
Open tables: 17  Queries per second avg: 5.571.
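The move-then-symlink trick used for both folders can be tried safely on a scratch directory first. The sketch below mimics the mv and ln -s pair from the shell commands above inside a temporary directory. The function names and the placeholder file are assumptions for the demonstration; on a real server the same steps require root and a stopped MySQL.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(old_path, new_path):
    """Move old_path to new_path and leave a symlink at the old location."""
    os.makedirs(os.path.dirname(new_path), exist_ok=True)
    shutil.move(old_path, new_path)   # equivalent of: mv old new
    os.symlink(new_path, old_path)    # equivalent of: ln -s new old


def demo():
    """Relocate a fake data folder and read a file through the old path."""
    root = tempfile.mkdtemp()
    old_path = os.path.join(root, 'var', 'lib', 'mysql')
    new_path = os.path.join(root, 'home', 'mysql', 'mysql-data')
    os.makedirs(old_path)
    with open(os.path.join(old_path, 'ibdata1'), 'w') as handle:
        handle.write('placeholder')   # stand-in for real data files
    relocate_with_symlink(old_path, new_path)
    with open(os.path.join(old_path, 'ibdata1')) as handle:
        content = handle.read()
    return os.path.islink(old_path), content


if __name__ == '__main__':
    print(demo())
```

The point of the symlink is that anything still configured with the old path keeps working without further changes.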

These are some settings that I usually put in the /etc/mysql/my.cnf configuration file:

# Here you can see queries with especially long duration
log_slow_queries        = /var/log/mysql/mysql-slow.log
long_query_time         = 1
log-queries-not-using-indexes

# A unique server-id
server-id                = 177
log-bin                  = /var/log/mysql/mysql-bin.log
log-bin-index            = /var/log/mysql/mysql-bin.log
innodb_file_per_table
# Unique log names (this prevents replication breaking upon hostname change :-)
relay-log                = iamalsounique98127-relay-bin
relay-log-index          = iamalsounique98127-relay-bin

# Taking care of the auto-increment values (for multi-master replication)
auto_increment_increment      = 10
auto_increment_offset         = 1

For these changes to take effect, you need to restart MySQL:

/etc/init.d/mysql restart
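The reason for the auto_increment_increment and auto_increment_offset settings is easier to see with numbers. The sketch below lists the ids each master would generate, assuming every master shares the same increment and gets its own offset. The function name and the second master with offset 2 are assumptions for the illustration.

```python
def auto_increment_values(offset, increment, count):
    """Return the first count ids generated for the given offset and increment."""
    return [offset + step * increment for step in range(count)]


if __name__ == '__main__':
    # This server: auto_increment_offset = 1, auto_increment_increment = 10
    print(auto_increment_values(1, 10, 5))
    # A hypothetical second master configured with offset 2
    print(auto_increment_values(2, 10, 5))
```

Since the two sequences never overlap, rows inserted on different masters cannot collide on their auto-increment keys, and an increment of 10 leaves room for up to ten masters.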

If you want to ignore databases or tables you may use the following options:

binlog_ignore_db        = information_schema
replicate_ignore_db     = information_schema
binlog_ignore_db        = mysql
replicate_ignore_db     = mysql

# Ignore all the cache* tables which have caused DUPLICATE
# ENTRY issues. Unai.
replicate_wild_ignore_table = exampledb.cache%

Having ‘binlog_ignore_db’ is enough to exclude databases from replication, but adding ‘replicate_ignore_db’ as well makes things clearer, since the ignored databases will then appear in the output of both ‘SHOW SLAVE STATUS\G’ and ‘SHOW MASTER STATUS\G’.
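As for the replicate_wild_ignore_table pattern used above, its % and _ wildcards behave like those of the SQL LIKE operator. The following sketch approximates that matching so a pattern can be checked against a few table names before it goes into my.cnf. It is a simplified model (no escape characters, no case-sensitivity rules), and the function names and the table names in the example are made up.

```python
import re


def wild_pattern_to_regex(pattern):
    """Translate a LIKE-style pattern (% and _) into a compiled regex."""
    parts = []
    for char in pattern:
        if char == '%':
            parts.append('.*')   # % matches any sequence of characters
        elif char == '_':
            parts.append('.')    # _ matches exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile('^' + ''.join(parts) + '$')


def is_ignored(pattern, qualified_table):
    """Return True if db_name.tbl_name matches the ignore pattern."""
    return wild_pattern_to_regex(pattern).match(qualified_table) is not None


if __name__ == '__main__':
    # Hypothetical table names, for illustration only.
    for name in ('exampledb.cache_page', 'exampledb.users'):
        print(name, is_ignored('exampledb.cache%', name))
```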