Monday, May 17, 2010

Poor Man's Profiler using Solaris' pstack

Recently I was working with the output of pstack from a hung MySQL server and wanted to use Poor Man's Profiler in order to combine stack traces. Unfortunately, the awk magic expects the output from gdb's thread apply all bt output.

gdb output:

Thread 10 (Thread 0xa644db90 (LWP 26275)):
#0 0xb7f47410 in __kernel_vsyscall ()
#1 0xb7f33b1a in do_sigwait () from /lib/tls/i686/cmov/libpthread.so.0
#2 0xb7f33bbf in sigwait () from /lib/tls/i686/cmov/libpthread.so.0
#3 0x081cc4fc in signal_hand ()
#4 0xb7f2b4fb in start_thread () from /lib/tls/i686/cmov/libpthread.so.0
#5 0xb7d25e5e in clone () from /lib/tls/i686/cmov/libc.so.6

Thread 9 (Thread 0xa641cb90 (LWP 26273)):
#0 0xb7f47410 in __kernel_vsyscall ()
#1 0xb7d1e881 in select () from /lib/tls/i686/cmov/libc.so.6
#2 0x081d1190 in handle_connections_sockets ()
#3 0x081d1ef3 in main ()
...

pstack output:

----------------- lwp# 56 / thread# 56 --------------------
fffffd7ffed7bb7a sigtimedwait (fffffd7ffe3aee10, fffffd7ffe3aee20, 0)
fffffd7ffed6aced sigwait () + d
fffffd7ffed62740 __posix_sigwait () + 40
0000000000712dcd signal_hand () + 12d
fffffd7ffed7704b _thr_setup () + 5b
fffffd7ffed77280 _lwp_start ()
----------------- lwp# 53 / thread# 53 --------------------
fffffd7ffed7c6ca pollsys (fffffd732fe31dd0, 0, fffffd732fe31e60, 0)
fffffd7ffed234c4 pselect () + 154
fffffd7ffed23792 select () + 72
fffffd7ffe866008 os_thread_sleep () + 50
fffffd7ffea6dafb srv_lock_timeout_and_monitor_thread () + 1b3
fffffd7ffed7704b _thr_setup () + 5b
fffffd7ffed77280 _lwp_start ()
...

However, we can see how powerful and flexible PMP is by making a small change to the awk script and poof, we had the same output:


awk '
BEGIN { s = ""; }
/thread#/ { print s; s = ""; }
/^ [0-9a-f]/ { if (s != "" ) { s = s "," $2} else { s = $2 } }
END { print s }' | sort | uniq -c | sort -r -n -k 1,1


Yet another win for PMP.

Tuesday, March 2, 2010

Semi-sync Replication Testing

I have recently been trying out semisynchronous replication. This is a new feature added in MySQL 5.5 based on the original google patch.

Installing was very simple and has been covered elsewhere in detail. Suffice to say it was relatively simple to do.

While test I was a bit surprised by some behavior I saw that turned out correct. What I wanted to examine what was semi-synchronous actually does and use cases for it.

The manual defines this feature correctly in very careful language:
If semisynchronous replication is enabled on the master side and there is at least one semisynchronous slave, a thread that performs a transaction commit on the master blocks after the commit is done and waits until at least one semisynchronous slave acknowledges that it has received all events for the transaction, or until a timeout occurs.


There is a subtle difference to how this is described in other places, for example from Guiseppe's blog (not picking on him, just an example since I have seen in many places):
That is, before committing, the master waits until at least one slave has acknowledged that it has received the portion of binary log necessary to reproduce the transaction.


What the difference is the fact that the blocking and relay to the remote slave occurs after the transaction is actually committed to disk.

This means that if the master crashes it is possible that it may have committed transactions on disk that are not on the slave. However, the client will get back an error saying that the commit failed.

What is the use case for this then?

For failover purposes, this is generally exactly what you need. The client can know it has failed and can redo the transaction on the slave. There is no data loss and everything is nicely guaranteed.

What doesn't it help?

Where it isn't useful is for recovery after the crash. When you finally get your master restarted, it may already have some transactions that were later replayed on the slave. This will naturally cause replication to break and things fail.

Thursday, January 14, 2010

State of MySQL Search

Recently I took part in the first meeting of the MySQL Search Team. You can read more about the team here.

My task was to represent the customer interests regarding fulltext search. My report had a few main points which I made:

  • MySQL Fulltext Search (FTS) has several big issues with it compared to other solutions. I have listed these in the order of importance that I see. Numbers 1 and 2 are very close in importance however, so I can see those going either way.

    1. MyISAM only: Limits the usefulness in many very busy environments due to table level locking and crash safety

    2. Performance: The design of FTS makes performance hurt as the index size exceeds RAM. Combining multiple second queries with MyISAM table level locking is a recipe for failure.

    3. Features: There are a lot of features lacking in FTS that exist in other products. Stemming, synonyms, dictionary usage, CJK support, etc... are all potential avenues of investigation.

  • Third-party solutions are commonly used for searching data in MySQL, but does require extra work to deploy or implement. Tools like Lucene, Sphinx, etc... are very good at what they do, and are normally faster and more fully featured compared to MySQL FTS. The big issue is that normally there is some effort involved in implementing these, so the ease of use in implementing and maintaining is lower than MySQL FTS.



So the forward looking question is how to combine the best attributes of these two solutions to get one easy to deploy and use robust solution. Personally, I can see a few possible routes forward:
  • Improve MySQL FTS! There is a lot of research and known best practices that exist for search. This would leverage those in order to make the MySQL solution more robust and featureful. From the current state of FTS, it would be a pretty decent amount of work, but could have good returns.
  • Improve the glue code between the third parties and MySQL. This could take a variety of forms. One idea I had was in the the form of index plugins where as rows are inserted/updated/deleted, the external datastores can be automatically updated as well. This is very similar to triggers, but is a bit different since there can be more integration and easier deployment.
It's possible the best solution may be a mix of the two options to allow for more flexibility or even something completely new that I haven't thought of!

Regardless, I am happy that people at MySQL are finally taking a look at the poor state of the MySQL fulltext search environment and trying to figure out new good solutions for this space.