1.36 00:23:35 HA now supports sets of replicas. When there is a set of replicas, replicas
will monitor each other.
Also changed "Primaries" to "MonitoredNodes". There are different kinds
of nodes we monitor: Primaries, Replicas, and Providers.
2004/10/21 mbalazin Got the AMNESIA mode running. I'm using that mode for my other fault-tolerance stuff
2004/06/03 jhhwang In passive standby, if a backup node takes over, it becomes primary.
2004/06/02 jhhwang changes for passive standby
2004/05/31 mbalazin Small bug fix in the computation of the last tuple on a stream.
2004/05/31 mbalazin When starting, the HA module tells the QueryProcessor module:
- the recovery method to use
- the primary status (true/false)... this status can change later
- the list of secondaries
2004/05/31 mbalazin Decoupling such that QueryProcessor/DataPath do not invoke methods directly on the HA module but rather send the HA module a message on its input queue.
I agree that this is extra overhead but this is the current architecture.
Given the tight coupling between DataPath, QueryProcessor, and HA we should probably put all these three classes in a single module under a single BasicComponent.
This code is not tested but it compiles... I'm going to test it now
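
A minimal sketch of the decoupling described in this entry, not the actual Borealis code: callers post a message onto the HA module's input queue instead of invoking its methods, and the HA thread drains the queue later. The names HAMessage, HAInputQueue, post, and drain are hypothetical.

```cpp
#include <cassert>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical message type; the real message format is not shown in this log.
struct HAMessage {
    std::string kind;     // e.g. a request name chosen by the sender
    std::string payload;  // opaque argument for the HA module
};

// Hypothetical input queue owned by the HA module.
class HAInputQueue {
public:
    // Called by other modules (QueryProcessor, DataPath) from their own threads.
    void post(HAMessage msg) {
        assert(!msg.kind.empty());
        std::lock_guard<std::mutex> lock(m_mutex);
        m_pending.push(std::move(msg));
    }

    // Called by the HA module's thread; returns pending messages in arrival order.
    std::vector<HAMessage> drain() {
        std::lock_guard<std::mutex> lock(m_mutex);
        std::vector<HAMessage> out;
        while (!m_pending.empty()) {
            out.push_back(std::move(m_pending.front()));
            m_pending.pop();
        }
        return out;
    }

private:
    std::mutex m_mutex;
    std::queue<HAMessage> m_pending;
};
```

The extra copy and lock per message is the overhead mentioned above; the benefit is that senders never run HA code on their own threads.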
2004/05/29 jhhwang minor fixes
plan to work on queue trimming, then duplicate elimination.
2004/05/29 jhhwang minor fix
2004/05/29 jhhwang replicas for active-standby also forward backup tuples when the replicas take over.
2004/05/29 jhhwang HA switches from secondary to primary when it takes over the failed one.
This is for active standby
2004/05/29 mbalazin Bug fix in the method that finds a tuple in a circular buffer. There was a condition where the method wouldn't find the tuple although it was there.
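
The entry above does not say which condition was missed. As an illustration only, the sketch below assumes the usual pitfall in such lookups, a search that does not follow the wraparound from the last slot back to the first. TupleRing and its members are hypothetical and are not the Borealis buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical ring buffer of tuple ids.
class TupleRing {
public:
    explicit TupleRing(std::size_t capacity) : m_slots(capacity) {
        assert(capacity > 0);
    }

    // Append a tuple id, overwriting the oldest entry when the buffer is full.
    void push(int tuple_id) {
        std::size_t tail = (m_head + m_size) % m_slots.size();
        m_slots[tail] = tuple_id;
        if (m_size < m_slots.size()) {
            ++m_size;
        } else {
            m_head = (m_head + 1) % m_slots.size();  // oldest entry was overwritten
        }
    }

    // Return the logical offset (0 = oldest) of tuple_id, or -1 if it is absent.
    // The scan walks m_size entries starting at m_head and wraps with modulo,
    // so entries stored before m_head in the array are still visited.
    long find(int tuple_id) const {
        for (std::size_t i = 0; i < m_size; ++i) {
            if (m_slots[(m_head + i) % m_slots.size()] == tuple_id) {
                return static_cast<long>(i);
            }
        }
        return -1;
    }

private:
    std::vector<int> m_slots;
    std::size_t m_head = 0;  // physical index of the oldest entry
    std::size_t m_size = 0;  // number of valid entries
};
```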
2004/05/29 mbalazin Just a precaution: when a network partitions, don't switch to another replica with the same owner ;-)
2004/05/28 jhhwang Now an HA module can figure out if it's primary or not from medusa_config.xml.
The default is primary (i.e., when the XML config file does not specify whether the node is primary).
2004/05/28 mbalazin - Replicas are now set up and created up front rather than after failure.
- Moved the RecoveryType enumeration into common/Recovery.h because both the QueryProcessor and Admin need to use that data structure.
- Changed the names of streams in the query to be shorter
2004/05/28 mbalazin Bug fix in addition to my earlier bug fix... until the primary comes up, we also need to get a fresh endpoint every time we send a keep-alive message.
2004/05/27 jhhwang Updates for references to secondaries.
2004/05/27 jhhwang updates for references to secondaries
2004/05/27 mbalazin Bug fix: For Primary class, added a field called m_started that is initialized to false and then set to true when the primary responds to the first keep_alive message. That way, the secondary doesn't try to recover for the primary when it simply gets started first.
2004/05/26 mbalazin Small bug fix. The primary version was also in the list of replicas.
2004/05/24 mbalazin Avoid trying to find a replica when none exists
2004/05/24 mbalazin Bug fix: failures caused by network partitions should also be handled outside of the monitoring loop...
2004/05/24 mbalazin Simplest version of upstream backup is working
2004/05/21 mbalazin Network partition demo version 1.0: Detecting partition and hot-swapping replica!!! (Gap recovery though)
2004/05/20 mbalazin Updated query processor to support hot-swapping of input stream replicas
Updated HA module to keep track of available replicas
2004/05/20 mbalazin Bug fix. HA was not starting when a node had no statically assigned primaries... but it needs to run to monitor upstream nodes
2004/05/20 mbalazin Quite minor but need to synch laptops
2004/05/18 mbalazin - Added a new client application for demoing HA and network partitions
- Modified failure detection to work for network partitions too. When there is a partition, RPC calls don't fail. They simply never return.
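
Because a call that never returns produces no error to react to, one way to cover the partition case is to put a deadline on each outstanding call. The sketch below shows that idea under stated assumptions; it is not the actual detection code, and PendingCall, CallState, and classify are hypothetical names. The current time is passed in so the logic can be checked without waiting.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical record of one outstanding keep_alive RPC.
struct PendingCall {
    Clock::time_point sent_at;  // when the call was issued
    Clock::duration timeout;    // how long to wait before giving up
    bool replied;               // set when a response arrives
};

enum class CallState { Waiting, Replied, TimedOut };

// Classify a call at time `now`. A call with no reply after `timeout`
// is treated like a failure, which covers calls that never return.
CallState classify(const PendingCall& call, Clock::time_point now) {
    assert(call.timeout > Clock::duration::zero());
    if (call.replied) {
        return CallState::Replied;
    }
    return (now - call.sent_at >= call.timeout) ? CallState::TimedOut
                                                : CallState::Waiting;
}
```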
2004/05/18 mbalazin small oops... HA was not getting started after last bug fix ;-)
2004/05/18 mbalazin Bug fix!
The XML parser is not thread safe.
- Moved code that reads and parses XML config files from thread initialization into constructors. This affected HA and NHOptimizer.
- This move required that node identifiers be set directly in the constructors rather than before starting the component's thread. I modified BasicComponent's constructor to take the node identifier as an argument. I had to modify the constructors of all the Borealis components to support this additional argument.
2004/05/17 mbalazin Just tweaks of constants and messages that appear in traces
2004/05/17 mbalazin - Modified primary monitoring: 3 consecutive keep_alive messages must fail before we declare a failure of the primary
- Adjusted a few constants for load management
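
The three-consecutive-failures rule can be sketched as a small counter that resets on any success. This is an illustration under assumptions, not the HA module's code; KeepAliveMonitor and record are hypothetical names, and only the threshold of 3 comes from the entry above.

```cpp
#include <cassert>

// Hypothetical helper that counts consecutive keep_alive failures.
class KeepAliveMonitor {
public:
    explicit KeepAliveMonitor(int threshold = 3) : m_threshold(threshold) {
        assert(threshold > 0);
    }

    // Record the outcome of one keep_alive attempt.
    // Returns true once the monitored node should be declared failed.
    bool record(bool succeeded) {
        if (succeeded) {
            m_consecutive_failures = 0;  // any success resets the count
            return false;
        }
        ++m_consecutive_failures;
        return m_consecutive_failures >= m_threshold;
    }

private:
    int m_threshold;
    int m_consecutive_failures = 0;
};
```

With the default threshold, two failures followed by a success do not trigger recovery; only three failures in a row do.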
2004/05/13 mbalazin Adding start and stop to HA module
2004/05/12 mbalazin Distinguishing different types of primaries
2004/05/06 mbalazin - Added observable/observer pattern for HA module to monitor all input streams through the Admin module
- Primary adds the owners of these input streams to its list of primaries
2004/04/14 mbalazin New dir structure