Dec 28, 2011

GTM fourth improvement

Now I'm tackling the fourth GTM improvement: correcting the backup algorithm.

The current implementation just proxies begin-transaction/get-GXID type commands directly to the backup.   The backup then assigns the GXID and transaction handle independently, so depending on the order in which threads execute, a transaction can be assigned a different GXID at the GTM standby than at the active GTM, which may cause serious problems when the standby takes over.

This improvement corrects that.   Now both the handle and the GXID are backed up.   The GTM standby assigns the transaction struct slot based on the backed-up handle and uses the backed-up GXID to keep everything consistent.
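A minimal sketch of the idea in C, under simplified assumptions. StandbyState, TxnSlot and standby_apply_begin() are hypothetical names made up for illustration, not the actual GTM structures, and GXID wraparound is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; the real GTM structures differ. */
typedef uint32_t Gxid;
#define MAX_TXN_SLOTS 8
#define INVALID_GXID  0

typedef struct {
    bool in_use;
    Gxid gxid;
} TxnSlot;

typedef struct {
    TxnSlot slots[MAX_TXN_SLOTS];
    Gxid    next_gxid;  /* next GXID to hand out if this standby takes over */
} StandbyState;

/*
 * Apply a backed-up "begin transaction" on the standby.
 * The standby does not choose its own slot or GXID; it reuses the handle
 * and GXID chosen by the active GTM, so the result does not depend on
 * the order in which the standby's threads run.
 * Returns false if the handle or GXID is invalid or the slot is taken.
 */
static bool
standby_apply_begin(StandbyState *st, int handle, Gxid gxid)
{
    assert(st != NULL);
    if (handle < 0 || handle >= MAX_TXN_SLOTS || gxid == INVALID_GXID)
        return false;
    if (st->slots[handle].in_use)
        return false;

    st->slots[handle].in_use = true;
    st->slots[handle].gxid = gxid;

    /* Keep the generator ahead of every GXID seen so far. */
    if (gxid >= st->next_gxid)
        st->next_gxid = gxid + 1;
    return true;
}
```

Because the slot index and GXID both come from the message, two backups arriving in either order leave the standby in the same state.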

One thought about sequences.   Because it is not practical to back up the current sequence value, sequence commands are essentially proxied to the backup (standby).   The history of sequence values has very little meaning, and the requirement is only to start from the correct next value after failover, so I think the current mechanism should work well.

Moreover, I eliminated the needless response from the standby to ACT.   When synchronous backup is specified, an acknowledgement is exchanged to make sure that the backup has reached the standby.

Most of this is skipped when a message comes through GTM-Proxy.   I need to find where I should insert the code to synchronize in this case.

----
(Addenda: 30th Dec., 2011)

1. Add "Backup_synchronously" to GTM_conn or GTM_ThreadInfo

2. Check Backup_synchronously when flushing to the client and in "send_something".   Send to the backup before sending to the client.

3. Check whether I'm running as a standby or not when accepting commands.
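The ordering in these three items could be sketched roughly as below. This is only an illustration: Conn, Event and reply_to_client() are hypothetical names, real network I/O is replaced by an event log, and treating "running as standby" as "nothing to forward" is my assumption about item 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Steps recorded instead of doing real network I/O (illustration only). */
typedef enum { EV_SEND_BACKUP, EV_WAIT_ACK, EV_SEND_CLIENT } Event;

#define MAX_EVENTS 8

typedef struct {
    bool   backup_synchronously;  /* per-connection flag (item 1) */
    bool   is_standby;            /* assumed: a standby forwards nothing (item 3) */
    Event  log[MAX_EVENTS];
    size_t nlog;
} Conn;

static void
record(Conn *c, Event e)
{
    assert(c->nlog < MAX_EVENTS);
    c->log[c->nlog++] = e;
}

/*
 * Reply to a client.  The backup is sent first; in synchronous mode the
 * acknowledgement is awaited before anything goes back to the client
 * (item 2).
 */
static void
reply_to_client(Conn *c)
{
    if (!c->is_standby) {
        record(c, EV_SEND_BACKUP);
        if (c->backup_synchronously)
            record(c, EV_WAIT_ACK);
    }
    record(c, EV_SEND_CLIENT);
}
```

With this ordering the client never sees a result that the standby does not already have when synchronous backup is on.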

Mac, Linux, Windows in the office

Bought a new LCD display with 1080p resolution and am now running Mac, Linux and Windows.   The MacBook Air runs pretty quick.   More than that, thanks to KeyRemap4, the key bindings are set to emacs!   This is done at a very low level and makes typing really fun.

One tweak for using a Mac (or other system) through HDMI.   The HDMI picture is originally tuned for movies or digital TV.   You should set the picture mode to PC or still picture, which makes the screen really beautiful.

My keyboards:

Happy Hacking Keyboard Professional for Linux and Windows.   The coolest keyboard I've ever met.

Apple wireless keyboard for the MacBook Air in clamshell mode.   I didn't expect much but found this keyboard is really as cool as the HHK Pro.

Happy typing!

Dec 20, 2011

GTM-standby third patch

Now the third patch is done.   What I did today:

  1. Added a missing option definition.
  2. Corrected a wrong option description.
  3. Added log messages for error detection, GTM connection retry and reconnect to GTM-standby.

Then I can test this code tomorrow!

Dec 19, 2011

GTM-Proxy fails with SEG-V

Sudo-san reported to me that GTM-Proxy fails with SEG-V.   It runs normally on Ubuntu, and without the O2 build option it also runs on CentOS.   In the end, I found that it only fails with the O2 option on CentOS.   It looks like the entry to the memory allocation handler is corrupted.   I will look into it tomorrow.

---
It was caused by an uninitialized thrinfo, which points to all the memory contexts.   Just adding memset() fixed the problem.

I checked all the other malloc() calls in GTM-related code and found that all of them either come with proper initialization or are written before being read.
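For the record, the pattern of the fix looks roughly like this. ThreadInfoSketch and its fields are hypothetical stand-ins, not the real thrinfo layout; the point is only that malloc() returns indeterminate bytes, so stale pointer values can differ between builds and platforms.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-thread info struct. */
typedef struct {
    void *top_context;
    void *message_context;
    void *error_context;
} ThreadInfoSketch;

/* Allocate a thread info struct with every field cleared. */
static ThreadInfoSketch *
thrinfo_create(void)
{
    ThreadInfoSketch *thrinfo = malloc(sizeof(*thrinfo));

    if (thrinfo == NULL)
        return NULL;

    /*
     * Clear the whole struct so no context pointer holds garbage.
     * On mainstream platforms all-zero bytes read back as NULL pointers.
     */
    memset(thrinfo, 0, sizeof(*thrinfo));
    return thrinfo;
}
```

calloc() would do the allocation and the clearing in one call; malloc() plus memset() is shown here because that is the fix described above.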

Dec 16, 2011

GTM-proxy error handling for reconnect

Added code to allow GTM-Proxy to do the following:

1) Optionally retry the connection to the current GTM.   Specify count, idle and interval.
2) Optionally wait for a reconnect command.   Specify count, idle and interval.

To reduce the number of options, I'm not willing to introduce an "option" flag to ask yes or no.   Instead, maybe we should consider that all zeros means no retry or no reconnect.   This code has not been done yet.   Maybe next Monday.

Anyway, I need to log this activity.

----
Gee! After I made the first commit of this feature, I found there are a couple of issues to be fixed before testing.

1) Description of new option GTM_OPTNAME_RETRY_IDLE ... This is wrong!!
2) GTM_OPTNAME_ERR_WAIT_IDLE definition is missing!!

I need to fix them as well as the documentation.

Now GTM_OPTNAME_ERR_WAIT_OPT is removed.   So, for both connection retry and wait for reconnect, if idle, count and interval are all zero, then no such action will be taken.

If a communication error is detected and neither retry nor reconnect wait is specified, this will cause a FATAL error because GTM-Proxy cannot continue service.
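The decision could be sketched as below. RetrySpec, ErrAction and on_comm_error() are hypothetical names for illustration, not the actual GTM-Proxy code, and trying retry before waiting for reconnect when both are configured is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One count/idle/interval triple, as given on the command line. */
typedef struct {
    int count;
    int idle;
    int interval;
} RetrySpec;

typedef enum { ACT_RETRY_CONNECT, ACT_WAIT_RECONNECT, ACT_FATAL } ErrAction;

/* All zeros means "disabled", so no separate yes/no option is needed. */
static bool
spec_enabled(const RetrySpec *s)
{
    return s->count != 0 || s->idle != 0 || s->interval != 0;
}

/*
 * Pick the first action to take on a communication error.
 * Assumed order: retry the current GTM first, then wait for reconnect.
 */
static ErrAction
on_comm_error(const RetrySpec *retry, const RetrySpec *wait)
{
    assert(retry != NULL && wait != NULL);
    if (spec_enabled(retry))
        return ACT_RETRY_CONNECT;
    if (spec_enabled(wait))
        return ACT_WAIT_RECONNECT;
    return ACT_FATAL;  /* nothing configured: the proxy cannot continue */
}
```

Encoding "off" as all zeros keeps the option list short, at the cost that a partially filled triple (say, only interval set) still counts as enabled in this sketch.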

---
Koichi

Dec 1, 2011

GTM Standalone connectivity improvement (2)

Today, I finished the first code for this.   Tested the following:

  1. start gtm
  2. start gtm_proxies
  3. start coordinators/datanodes and run some sessions through psql.
  4. start gtm standby and run more psql sessions.
They worked okay.   Promote and reconnect seem to work fine.

However, when gtm standby is shut down and tries to connect again, gtm/gtm_proxies seem to stall.   Maybe the disconnect sequence doesn't work well.   Need to fix this before commit.