[Postgres-xl-developers] Recovering GTM after failure
anicoara at uwaterloo.ca
Mon Jan 26 15:39:26 PST 2015
On Mon, Jan 26, 2015 at 5:30 PM, Mason Sharp <msharp at translattice.com> wrote:
> On Sun, Jan 25, 2015 at 11:01 AM, Adrian Nicoara <anicoara at uwaterloo.ca> wrote:
>> What about establishing a valid snapshot for new transactions?
>> Without the set of running transactions, a new transaction would
>> assume that everything with a smaller ID is committed / aborted? Then,
>> couldn't you have a race condition where:
>> 1. A transaction started after GTM recovery (ID X+1000) attempts to
>> read two items a and b, that are present at two data nodes A and B.
>> 2. An old transaction started before GTM failure (ID X) is writing
>> data items a and b.
>> 3. The read from (ID X+1000) at A determines that the write for a by
>> (ID X) isn't committed yet, from the hint bits and committed log.
>> 4. The read from (ID X+1000) at B determines that the write for b by
>> (ID X) is committed, from the hint bits and committed log.
> If it committed at one node but not the other, it must have been a two-node
> transaction, and two-phase commit would have been used. The transaction must
> have been fully prepared on both nodes for it to have been committed on
> at least one. If the transaction committed on one node, we must commit on
> all to be consistent. On node B it committed. On node A, the transaction is
> in a prepared but not committed state. Doing a read would return data
> from B but not A until the prepared transaction is committed. There is a
> utility, pgxc_clean, that tries to clean up such cases.
> So, there is a theoretical window: if a node goes down just after it has
> prepared but not committed a 2PC transaction, other nodes have committed,
> and the node is made available to users before the prepared transaction is
> manually committed or the pgxc_clean utility is run, then one could get
> a result like the one you describe.
I see - that makes sense for the provided example.
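To check my understanding, the resolution rule described above can be sketched roughly as follows. This is only a toy model: the names (TxState, resolve_in_doubt) are made up for illustration, and it is not meant to reflect how pgxc_clean is actually implemented.

```python
from enum import Enum


class TxState(Enum):
    """Hypothetical per-node state of one 2PC transaction."""
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


def resolve_in_doubt(states):
    """Decide what to do with a 2PC transaction, given its state on each node.

    states: dict mapping node name -> TxState (illustrative only).
    Returns "commit", "abort", or "undecided".
    """
    seen = set(states.values())
    if TxState.COMMITTED in seen and TxState.ABORTED in seen:
        # 2PC should never allow this; flag it rather than guess.
        raise ValueError("inconsistent 2PC outcome across nodes")
    if TxState.COMMITTED in seen:
        # Committed somewhere, so it was prepared everywhere:
        # the remaining prepared nodes must commit too.
        return "commit"
    if TxState.ABORTED in seen:
        return "abort"
    # Prepared everywhere, no outcome recorded on any node.
    return "undecided"


# The example from the thread: committed on B, still prepared on A.
print(resolve_in_doubt({"A": TxState.PREPARED, "B": TxState.COMMITTED}))
```

Until the "commit" decision is actually applied on A, a reader can still see the new data on B but not on A, which is the window described above.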
But there is still one more execution that requires the running set of
transactions, that isn't covered above:
1. When transaction (ID X+1000) starts at node A, transaction (ID X)
has made no modifications to data item a (consider that it hasn't even
reached that data node, assuming a fast enough GTM recovery). Thus,
(ID X+1000) is not aware of the existence of (ID X).
2. Transaction (ID X+1000) does some work at some other node C, to keep it busy.
3. In the meantime, transaction (ID X) finishes its work at nodes A and B.
4. Transaction (ID X+1000) goes to B, and reads data item b as
modified by transaction (ID X).
Thus, when transaction (ID X+1000) reaches B, it is not aware that
transaction (ID X) spanned two data nodes.
I think that for such an execution, the set of running transactions is
a requirement, to establish a valid snapshot, but I am not sure how
the new GTM would establish this set.
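
To make the execution above concrete, here is a small simulation. It is a simplified model of snapshot visibility, not the actual Postgres-XL code: Snapshot, Node, run_schedule and the concrete IDs (X = 100, reader = 1100) are assumptions chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Toy snapshot: IDs >= xmax, or listed in running, are invisible."""
    xmax: int
    running: frozenset = frozenset()


@dataclass
class Node:
    """Toy data node holding row versions and a local commit log."""
    versions: dict = field(default_factory=dict)  # key -> [(xid, value), ...]
    committed: set = field(default_factory=set)

    def write(self, xid, key, value):
        self.versions.setdefault(key, []).append((xid, value))

    def commit(self, xid):
        self.committed.add(xid)

    def read(self, key, snap):
        # A version is visible if its writer is older than the snapshot,
        # was not running when the snapshot was taken, and has committed here.
        visible = [value for xid, value in self.versions.get(key, [])
                   if xid < snap.xmax
                   and xid not in snap.running
                   and xid in self.committed]
        return visible[-1] if visible else None


def run_schedule(snap, old_xid=100):
    """Replay steps 1-4 for a reader using snapshot `snap`."""
    a, b = Node(), Node()
    a.write(1, "a", "old"); a.commit(1)
    b.write(1, "b", "old"); b.commit(1)

    seen_a = a.read("a", snap)           # step 1: X has not touched a yet
    # step 2: the reader is busy at node C (no effect on A or B)
    a.write(old_xid, "a", "new")         # step 3: X finishes at A and B
    b.write(old_xid, "b", "new")
    a.commit(old_xid)
    b.commit(old_xid)
    seen_b = b.read("b", snap)           # step 4: reader arrives at B
    return seen_a, seen_b


# Snapshot taken after GTM recovery, with the running set lost.
print(run_schedule(Snapshot(xmax=1100)))
# Snapshot that still knows transaction 100 was running.
print(run_schedule(Snapshot(xmax=1100, running=frozenset({100}))))
```

In this model the first snapshot returns the old value of a together with the new value of b, while the second returns the old value of both. Only the second is a consistent view, which is why I think the running set has to be recovered somehow.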