Aug 2, 2012

WAL Filtering for table rebalancing

The XC team has now finished the table rebalancing feature, invoked with the ALTER TABLE ... DISTRIBUTE BY ... statement.   This runs with all the rows of the table locked.    The next step is to do this concurrently: without locking rows, running everything in the background.   Unlike other ALTER TABLE operations, this one doesn't affect the logical view of the table, so it is safe to run concurrently, as in the case of CREATE INDEX CONCURRENTLY.

To implement this, I thought of using a kind of log shipping for a specific table: choose the WAL records for the target table and then send them to a handler.   This looks very simple, but we have to consider TOAST as well.    Some say it's much simpler to implement this as part of walsender.  Yes, it could be, if we didn't have to worry about full page writes.

Full page writes are not simple to deal with: we have to extract the logical operation from them.   We can avoid this difficulty if the WAL filter is part of XLogInsert(), which runs in each backend process, though then we have to handle that per-backend context as well.
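As a toy sketch of the filtering idea (illustration only, not XC code; the record layout and names are invented), a per-table filter can pass through logical records for the target relation, but full-page-image records need special treatment, since they carry a raw page image rather than a logical operation:

```python
# Toy model of per-table WAL filtering. Record layout and field names
# are invented for illustration; they are not Postgres-XC internals.
from dataclasses import dataclass

@dataclass
class WalRecord:
    rel_oid: int          # relation the record belongs to
    op: str               # 'insert', 'update', 'delete', or 'fpi'
    payload: bytes        # logical tuple data, or a raw page image for 'fpi'

def filter_for_table(records, target_oid):
    """Pick records for the target table; full-page images need extra work."""
    for rec in records:
        if rec.rel_oid != target_oid:
            continue
        if rec.op == 'fpi':
            # A full-page write replaced the logical record: the page image
            # must be decoded back into row-level operations before it can
            # be replayed against the rebalanced table.
            yield ('needs-decoding', rec)
        else:
            yield ('ship', rec)
```

The `'needs-decoding'` branch is exactly the difficulty described above: a filter sitting outside the backend sees only the page image, while a filter inside XLogInsert() could capture the logical operation before the full-page write replaces it.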

Which will be better?

Mar 7, 2012

XC conversation

The XC developers' mailing list received a comment on table rebalancing from a newcomer.   His comment shows that he is really using XC.   He is from the Shanghai Linux User Group. Considering the number of XC downloads, many people in China must be using or testing XC.  Very interesting.

Mar 2, 2012

Hint clause in XC?

I came up with this idea, though it could be a wicked one.   In the future, we will see many statements for which we cannot generate the best plan.   To handle them, we may need to accept a hint indicating which distribution to rely on, as in:

SELECT * FROM A,B,C WHERE A.A=B.B AND B.B=C.C AND A.A IN (SELECT Z FROM X WHERE Y=100) /*+ USE_PUSHDOWN TO ALL */

Looks tempting ....

Feb 28, 2012

XC at PGCon2012

The PGCon 2012 schedule is now out.   Postgres-XC will have a tutorial on building, installing, configuring, and running a Postgres-XC database cluster.   It is aimed at both users and developers.    Developers will learn about XC's source code configuration and documentation internals, although the tutorial will be best tuned for users.
The tutorial will include everything needed to use XC, including node configuration and table design.
Scheduled for Wednesday, May 16th, 1:00 PM to 4:00 PM.

Jan 31, 2012

Testing XC toward 1.0

We're now nearing the goal of this quarter's development.   Depending upon each member's schedule, we will go into dedicated testing for 1.0.
So far, from the discussion among the core members, we have agreed that testing should consider at least three points of view.

  1. How many bugs are hidden, and how many should be found.    Considering that the written code is around 100,000 lines, I think we should expect to find around 500 bugs.    I do hope the code is much better than that.   We need many more test cases to run.
  2. SQL functionality check: what is supported and what is not.    I've been discussing this with the sponsor.   I don't know whether the development team should (or can) do this.    From my experience, it will take at least five to six man-months to write a test-case document for this purpose.
  3. Code test coverage.   There may be a handy tool to measure this.   Although coverage doesn't tell us about the conditions under which each piece of code runs, Pavan says it will be good for checking the coverage of error handling.   I agree.
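The estimate in item 1 is simple defect-density arithmetic (the 5 bugs per thousand lines figure is what the post's numbers imply, not a measured value):

```python
# Rough defect-density estimate implied by item 1 above.
lines_of_code = 100_000
bugs_per_kloc = 5                       # assumed density implied by the estimate
expected_bugs = lines_of_code // 1000 * bugs_per_kloc
print(expected_bugs)                    # 500
```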
I should write the first draft of the XC test schedule by next Tuesday, to be discussed in the core members' teleconference.

After consideration...
Maybe increasing code coverage makes the most sense for the current development team.   Feature tests and performance tests are already being done for each development project.

Jan 29, 2012

Cluster Summit and PGCon

Discussion about the Cluster Summit at PGCon has now begun with Josh.   Josh still likes the idea of holding CHAR in Europe, Japan, and the US in turn.    Yes, I think this is quite a good idea, but it seems to me that we need some independent organization (within the PostgreSQL community) to manage it, as well as local organizations to help with preparation.     In February, we may be able to raise this with Simon in Paris through Michael.    Should the Japanese local organization be within JPUG?

Jan 27, 2012

pgxc_clean

Finished the initial pgxc_clean code.  It is not yet tested.

At first, I thought I needed a direct connection to GTM to clean up transaction status, but I finally found that I just need to tell the coordinator to commit or abort each prepared transaction.  With this, the code is simple enough.   It is just a libpq application.   If I wished, I could have written it as an ECPG application or even a JDBC application.
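The cleanup decision itself can be sketched as follows (a toy model, not the actual pgxc_clean code; the resolution rule shown, commit everywhere once any node has committed, is one plausible policy for two-phase commit cleanup):

```python
def resolve_prepared_xact(node_states):
    """Decide what a cleanup tool should tell the coordinator to do with
    one prepared transaction, given its observed state on each node.

    node_states maps node name -> 'committed', 'aborted', or 'prepared'.
    Toy rule: if any node already committed, the only safe choice is to
    commit the rest; if states contradict each other, flag it; otherwise
    abort, since nothing is known to have committed.
    """
    states = set(node_states.values())
    if 'committed' in states and 'aborted' in states:
        return 'inconsistent'            # needs manual inspection
    if 'committed' in states:
        return 'COMMIT PREPARED'
    return 'ROLLBACK PREPARED'
```

A libpq (or ECPG/JDBC) application would then simply issue the returned statement through the coordinator for each leftover prepared transaction.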

Anyway, because this depends entirely on XC's internal catalogs and proprietary function calls, it is local to XC and not portable at all.

Jan 12, 2012

GTM HA for 0.9.7

Committed the patch.   Now GTM's transaction backup to the standby is accurate, even under a heavy workload.   The remaining issues are:

  1. The tcp_keepalives feature works for monitoring client connections at the server.   Some more work is needed to enable monitoring of the server connection at the clients.
  2. Need to add a "status" feature.   I once considered adding this to gtm_ctl.   That may work, but a new command could be okay too.    In any case, this feature should be implemented using normal communication with gtm/gtm_proxy, not signals.
  3. "gtminit", associated with XC cluster bootstrap.

Jan 10, 2012

GTM standby test

Found that GTM synchronous backup does not work well.
What I found:

  1. GTM tries to sync with the GTM standby via gtm_sync_standby().
  2. The call does not return, for some reason.
Before these steps, bkup_node_register_internal() was called to back up the command to the GTM standby, and that looked successful.

Maybe the GTM standby does not handle backup messages correctly and is waiting for some more information.

Jan 6, 2012

GTM standby

GTM standby work is now at the last stage.
  1. Error handling in GTM-Proxy is done: when a GTM communication error occurs, the proxy retries a few times and, if that fails, waits for a reconnect operation.
  2. The GTM standby can now connect to GTM at any moment.
  3. After a reconnect and a shutdown of the whole cluster, the GTM standby can start as GTM.   Do not forget to rearrange the configuration files properly.
  4. GTM backup has been corrected so that transaction handles and GXIDs are backed up to the standby.
  5. Asynchronous backup works fine.
  6. Synchronous backup still has some issues, maybe in some minor protocol handling.   It will be tested next.
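The difference between items 5 and 6 can be modeled with a toy primary/standby pair (illustration only; the real GTM protocol is more involved): asynchronous backup fires and forgets, while synchronous backup blocks on the standby's acknowledgement, so a standby that mishandles a backup message and never acks will hang the primary:

```python
class ToyStandby:
    def __init__(self):
        self.log = []
    def handle(self, msg):
        self.log.append(msg)
        return 'ack'                  # acknowledge every backup message

class ToyPrimary:
    def __init__(self, standby):
        self.standby = standby
    def backup_async(self, msg):
        self.standby.handle(msg)      # send, but ignore the reply
    def backup_sync(self, msg):
        reply = self.standby.handle(msg)
        if reply != 'ack':            # a missing ack stalls the primary here
            raise RuntimeError('standby did not acknowledge')
```

This matches the symptom in the January 10 entry below: the asynchronous path looked fine while the synchronous path never returned.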

Jan 3, 2012

XC bootstrap

Pavan wrote his proposal on XC bootstrap.   Here are some of my ideas/comments:

  1. It's nice to run initdb as independently as possible and register each node after initdb.   It would be even nicer if initdb ran VACUUM FREEZE, so that any node could begin with any GXID.   That would make adding nodes safer.
  2. He is right that the XC configuration makes sense with at least one coordinator registered.   The issues are:
    1. Should a coordinator be registered first?  The order of registration could be more flexible.   Because clients connect to a coordinator, there is no problem with having only datanodes initialized and registered in the initial phase of the bootstrap.
    2. Should a datanode be registered with GTM?   What does that registration accomplish?
  3. The coordinator was made a separate node because we thought coordinator and datanode should be different binaries.  Now that they share the same binary, what happens if a node is both a coordinator and a datanode?   If they're the same, the configuration may look simpler.