Permanent Message Handling

As described in Permanent Message Handling, messages are marked permanent if they contain database modifications that should be committed at the replica. DB's replication code decides whether it must flush its transaction logs to disk depending on whether it receives sufficient permanent message acknowledgments from the participating replicas. More importantly, the thread performing the transaction commit blocks until it either receives enough acknowledgments or the acknowledgment timeout expires.

The replication framework can manage permanent messages for you if your application requires it (most do), and it handles almost all of the details. However, you do have to set some policies that tell the replication framework how to handle permanent messages.

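There are two things that you have to do: identify the permanent message policy, and set the permanent message timeout. Each is described in the sections that follow.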

Identifying Permanent Message Policies

You identify permanent message policies using the DbEnv::repmgr_set_ack_policy() method. Note that you can set permanent message policies at any time during the life of the application.

The following permanent message policies are available when you use the replication framework:

  • DB_REPMGR_ACKS_NONE

    No permanent message acknowledgments are required. If this policy is selected, permanent message handling is essentially "turned off." That is, the master will never wait for replica acknowledgments. In this case, transaction log data is either flushed or not strictly depending on the type of commit that is being performed (synchronous or asynchronous).

  • DB_REPMGR_ACKS_ONE

    At least one replica must acknowledge the permanent message within the timeout period.

  • DB_REPMGR_ACKS_ONE_PEER

    At least one electable peer must acknowledge the permanent message within the timeout period. Note that an electable peer is simply another environment that can be elected to be a master (that is, it has a priority greater than 0). Do not confuse this with the concept of a peer as used for client to client transfers. See Client to Client Transfer for more information on client to client transfers.

  • DB_REPMGR_ACKS_ALL

    All environments must acknowledge the message within the timeout period. This policy should be selected only if your replication group has a small number of replicas, and those replicas are on extremely reliable networks and servers.

    When this flag is used, the actual number of environments that must respond is determined by the value set for DbEnv::rep_set_nsites().

  • DB_REPMGR_ACKS_ALL_PEERS

    All electable peers must acknowledge the message within the timeout period. This policy should be selected only if your replication group is small, and its various environments are on extremely reliable networks and servers.

    Note that an electable peer is simply another environment that can be elected to be a master (that is, it has a priority greater than 0). Do not confuse this with the concept of a peer as used for client to client transfers. See Client to Client Transfer for more information on client to client transfers.

  • DB_REPMGR_ACKS_QUORUM

    A quorum of electable peers must acknowledge the message within the timeout period. A quorum is reached when acknowledgments are received from the minimum number of environments needed to ensure that the record remains durable if an election is held. That is, the master wants to hear from enough electable replicas that they have committed the record so that if an election is held, the master knows the record will exist even if a new master is selected.

    Note that an electable peer is simply another environment that can be elected to be a master (that is, it has a priority greater than 0). Do not confuse this with the concept of a peer as used for client to client transfers. See Client to Client Transfer for more information on client to client transfers.

By default, a quorum of sites must acknowledge a permanent message in order for it to be considered successfully transmitted. The actual number of environments that must respond is calculated using the value set with DbEnv::rep_set_nsites().
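To make the relationship between a policy and the group size concrete, the following is a minimal sketch of how many acknowledgments the master would wait for under each policy. It is an illustrative model only, not DB's internal calculation. The AckPolicy enum and the acks_required() function are hypothetical names, the peer variants (which count only electable sites) are not modeled, and a quorum is assumed to be a simple majority of the group with the master counting toward it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the DB_REPMGR_ACKS_* policies. These are not
// the library's own constants.
enum class AckPolicy { None, One, All, Quorum };

// Number of replica acknowledgments the master waits for, given the total
// group size (the value passed to rep_set_nsites(), master included).
// Assumption: a quorum is a simple majority of the group, and the master
// itself counts toward that majority.
inline std::uint32_t acks_required(AckPolicy policy, std::uint32_t nsites) {
    assert(nsites >= 1);  // the group always contains at least the master
    switch (policy) {
    case AckPolicy::None:   return 0;                   // never wait
    case AckPolicy::One:    return nsites > 1 ? 1 : 0;  // any one replica
    case AckPolicy::All:    return nsites - 1;          // every replica
    case AckPolicy::Quorum: return nsites / 2;          // majority minus master
    }
    return 0;
}
```

Under these assumptions, a five-site group would need two replica acknowledgments for a quorum and four for the "all" policy, which is why the stricter policies are only practical for small, reliable groups.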

Setting the Permanent Message Timeout

The permanent message timeout represents the maximum amount of time the committing thread will block waiting for message acknowledgments. If sufficient acknowledgments arrive before this timeout has expired, the thread continues operations as normal. However, if this timeout expires, the committing thread flushes its transaction log buffer before continuing with normal operations.
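The behavior described above can be modeled without any replication code at all. The sketch below is a deterministic simulation, not part of DB: given the times at which acknowledgments would arrive, it reports whether the committing thread received enough of them and how long it blocked. The names CommitWait and simulate_commit_wait() are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using std::chrono::microseconds;

// Result of one simulated commit (hypothetical type).
struct CommitWait {
    bool enough_acks;      // false means the timeout expired first
    microseconds blocked;  // how long the committing thread waited
};

// Simulates a committing thread that needs `needed` acknowledgments.
// `arrivals` holds the arrival time of each ack, measured from the moment
// the commit started.
inline CommitWait simulate_commit_wait(std::vector<microseconds> arrivals,
                                       std::size_t needed,
                                       microseconds timeout) {
    assert(timeout.count() >= 0);
    if (needed == 0)
        return {true, microseconds(0)};  // nothing to wait for

    std::sort(arrivals.begin(), arrivals.end());
    // The thread unblocks as soon as the needed-th ack arrives, provided
    // that happens before the timeout expires.
    if (arrivals.size() >= needed && arrivals[needed - 1] <= timeout)
        return {true, arrivals[needed - 1]};

    // Timeout expired: in DB, this is the case in which the thread
    // flushes its transaction log buffer before continuing.
    return {false, timeout};
}
```

For instance, with one acknowledgment required and a 500 microsecond timeout, an ack arriving at 300 microseconds releases the thread early, while an ack arriving at 900 microseconds is too late and the thread blocks for the full timeout.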

You set the timeout value using the DbEnv::rep_set_timeout() method. When you do this, you provide the DB_REP_ACK_TIMEOUT flag to the which parameter, and the timeout value in microseconds to the timeout parameter.

For example:

    dbenv->rep_set_timeout(DB_REP_ACK_TIMEOUT, 100); 

This timeout value can be set at any time during the life of the application.
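Because the timeout is expressed in microseconds, a value such as 100 means one tenth of a millisecond, which is easy to misread. One way to keep the units explicit is a small conversion helper built on std::chrono. The sketch below is not part of the DB API; ack_timeout_micros() is a hypothetical name, and it assumes the timeout argument is an unsigned 32-bit microsecond count.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

// Hypothetical helper: converts any std::chrono duration to a microsecond
// count suitable for passing as the timeout argument.
// Assumption: the timeout argument is an unsigned 32-bit value.
template <class Rep, class Period>
std::uint32_t ack_timeout_micros(std::chrono::duration<Rep, Period> d) {
    auto us = std::chrono::duration_cast<std::chrono::microseconds>(d).count();
    // Reject negative values and values that do not fit in 32 bits.
    assert(us >= 0 && us <= std::numeric_limits<std::uint32_t>::max());
    return static_cast<std::uint32_t>(us);
}

// Possible use with an environment handle (not compiled here):
//   dbenv.rep_set_timeout(DB_REP_ACK_TIMEOUT,
//       ack_timeout_micros(std::chrono::milliseconds(5)));
```

Writing the timeout as a typed duration makes it harder to pass a millisecond value where microseconds are expected.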

Adding a Permanent Message Policy to RepMgr

For illustration purposes, we will now update RepMgr such that it requires only one acknowledgment from a replica on transactional commits. Also, we will give this acknowledgment a 500 microsecond timeout value. This means that our application's main thread will block for up to 500 microseconds waiting for an acknowledgment. If it does not receive at least one acknowledgment in that amount of time, DB will flush the transaction logs to disk before continuing on.

This is a very simple update. We can perform the entire thing in RepMgr::init() immediately after we set the application's priority and before we open our environment handle.

int RepMgr::init(RepConfigInfo *config)
{
    int ret = 0;

    app_config = config;

    dbenv.set_errfile(stderr);
    dbenv.set_errpfx(progname);

    if ((ret = dbenv.repmgr_set_local_site(app_config->this_host.host,
        app_config->this_host.port, 0)) != 0) {
        cerr << "Could not set listen address to host:port "
             << app_config->this_host.host << ":"
             << app_config->this_host.port
             << " error: " << ret << endl;
    }

    for ( REP_HOST_INFO *cur = app_config->other_hosts; cur != NULL;
        cur = cur->next) {
        if ((ret = dbenv.repmgr_add_remote_site(cur->host, cur->port,
                                                0)) != 0) {
                cerr << "could not add site." << endl;
        }
    }

    if (app_config->totalsites > 0) {
        try {
            if ((ret = dbenv.rep_set_nsites(app_config->totalsites)) != 0)
                dbenv.err(ret, "set_nsites");
        } catch (DbException &dbe) {
            cerr << "rep_set_nsites call failed. Continuing." << endl;
        }
    }

    dbenv.rep_set_priority(app_config->priority);

    /* Permanent messages require at least one ack */
    dbenv.repmgr_set_ack_policy(DB_REPMGR_ACKS_ONE);
    /* Give 500 microseconds to receive the ack */
    dbenv.rep_set_timeout(DB_REP_ACK_TIMEOUT, 500);

    dbenv.set_cachesize(0, CACHESIZE, 0);
    dbenv.set_flags(DB_TXN_NOSYNC, 1);

    ...