Python Developer and Educator
2009-11-05
Here's another utility script that I use to check on replication to one of our slave db servers. This probably bears some explanation - a few months ago, replication stopped for no apparent reason (at least no reason that was apparent to the developers - our SA either didn't know or didn't tell us). We didn't discover it until users started complaining that they couldn't log in to their newly created accounts - records being written to the master couldn't be found and read from the slave, and by that time the slave was already two days behind. So we started running this simple check on a nightly basis - the results are mailed out to the entire engineering team.
This obviously has a lot more to do with MySQL than it does Python - as a script, it's dead-simple elementary-school stuff. To get the update times on each database on the slave, I do a select against INFORMATION_SCHEMA, a database containing metadata about all the other dbs on the server. INFORMATION_SCHEMA has a table called, conveniently enough, 'tables', which in turn has a column 'update_time'. Simple, no?
I should point out that this is run against MySQL5 dbs. I have a separate script for the MySQL4 servers (yes, I know, don't ask - we deal with a lot of legacy data), which does not use INFORMATION_SCHEMA. These are also MyISAM dbs, not that it makes a difference - INFORMATION_SCHEMA runs in memory, so is engine-agnostic. For more detailed information about INFORMATION_SCHEMA, its permissions and the metadata it holds, check out the official MySQL documentation.
The script:
#!/usr/bin/python
# usage: ./dbwalk5.py
import os, os.path, sys, MySQLdb as Database, smtplib
emailto = ['barbara.shaurette@gmail.com']
allupdatetimes = []
def report_update_times():
"""get the update_time value for a specific table in each db"""
update_times = ''
cursor = db.cursor()
sql = """SELECT update_time, table_schema FROM information_schema.tables WHERE table_name = 'user_accounts'"""
# in this case, I'm selecting update_time on the user accounts table in each db,
# as that's the one that's written to most frequently
cursor.execute(sql)
status_rows = cursor.fetchall()
for row in status_rows:
update_times = update_times + str(row[1]) + ', ' + str(row[0]) + '\n'
cursor.close()
return update_times
def send_mail(text):
fromaddr = "Barbara Shaurette "
toaddrs = emailto
msg = ("From: %s\r\nTo: %s\r\n\r\n" % (fromaddr, ", ".join(toaddrs)))
for item in text:
line = item
msg = msg + line
server = smtplib.SMTP('localhost')
server.sendmail(fromaddr, toaddrs, msg)
server.quit()
db = Database.connect(host="host", port="port", user="username", passwd="password", db="")
updates = report_update_times()
allupdatetimes.append(updates)
db.close()
sendMail(allupdatetimes)
Then the list that's sent out in the body of the email looks a little like this:
db1, 2009-11-05 12:59:52
db2, 2009-08-19 10:21:50
db3, 2009-11-05 12:46:44
db4, 2009-08-19 10:22:21
db5, 2009-11-05 13:01:31
db6, 2009-11-05 12:58:43
db7, 2009-08-19 10:25:55
db8, 2009-08-19 10:18:07
db9, 2009-08-19 10:20:38
db10, 2009-08-19 10:26:17
db11, 2009-11-05 13:01:52
Contact: barbara@mechanicalgirl.com