[Solved] What is the current practice in handling connection errors when the service “offers” multiple IP addresses?

Alexis Wilke Asks: What is the current practice in handling connection errors when the service “offers” multiple IP addresses?
With the current implementation of getaddrinfo(), I’m not given any information about the timeout (that is, the DNS TTL) of the IP address(es) it returns. The library implementing that function has this information, but I haven’t yet seen a function to retrieve it.

What I’m wondering is how people currently implement connecting to a service when they are given multiple addresses and the connection either never happens or fails after a while.

I can think of multiple algorithms and I’m wondering which one is currently used:

  1. User gathers the host name (say from a .conf file)
  2. User creates a “connection object”
  3. User says “connect” on that object
  4. Object transforms the host name into a list of IP addresses
  5. Object tries to connect to first IP
    1. Connection in (5) fails, try with next IP
    2. Connection in (5.1) fails, repeat until all IPs have been tried
    3. All IPs are exhausted, sleep and try again from (4)
  6. Connection in (5) succeeds, run with that connection until we lose it
  7. Connection is lost, try again from (4) or from (5.1)?
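
To make steps (4) through (5.3) concrete, here is a minimal Python sketch of that loop. It is only an illustration under stated assumptions, not an established practice: the names `resolve` and `connect_any`, and the parameters `per_ip_timeout`, `rounds`, `pause` and `resolver`, are hypothetical and do not come from any library. Only `socket.getaddrinfo()` and the basic socket calls are real APIs.

```python
import socket
import time


def resolve(host, port):
    """Step 4: turn the host name into a list of (family, sockaddr) pairs."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return [(family, sockaddr) for family, _type, _proto, _canon, sockaddr in infos]


def connect_any(host, port, per_ip_timeout=5.0, rounds=3, pause=1.0,
                resolver=resolve):
    """Steps 4 to 5.3: try each address in turn, re-resolving between rounds."""
    last_error = None
    for attempt in range(rounds):
        try:
            candidates = resolver(host, port)       # step 4
        except OSError as exc:                      # includes socket.gaierror
            last_error = exc
            candidates = []
        for family, sockaddr in candidates:         # steps 5, 5.1 and 5.2
            sock = socket.socket(family, socket.SOCK_STREAM)
            sock.settimeout(per_ip_timeout)         # bound the wait per address
            try:
                sock.connect(sockaddr)
            except OSError as exc:                  # refused, unreachable, timed out
                last_error = exc
                sock.close()
                continue                            # try the next address
            sock.settimeout(None)                   # back to blocking mode (step 6)
            return sock
        if attempt + 1 < rounds:
            time.sleep(pause)                       # step 5.3: sleep, then resolve again
    raise ConnectionError(f"could not connect to {host}:{port}") from last_error
```

The per-address timeout matters here because the default connect timeout of the operating system can be long enough that the later addresses are effectively never tried.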

My main problem is what happens in step (7). Should we try again from (4), which means we are likely to retry the same IP address, or should we try from (5.1), in which case we may be testing with an out-of-date IP address… (it could be days between step (6) and step (7)). I think that if I had the timeout for the IP address, I could make a smart decision, since I would know whether I should go back to step (4). Without that timeout, I’m kind of stuck…
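
To illustrate the decision in step (7), the sketch below keeps the resolved list together with the time it was obtained. It continues from (5.1) while the list is fresh and goes back to (4) once the list is too old or exhausted. Since getaddrinfo() does not expose the real timeout, `max_age` is an assumed, caller-chosen stand-in for it. The class `AddressCache` and all of its parameters are hypothetical names used only for illustration.

```python
import time


class AddressCache:
    """Holds one resolution result and decides when it is too old to reuse."""

    def __init__(self, resolver, max_age=60.0, clock=time.monotonic):
        self._resolver = resolver      # callable returning a list of addresses
        self._max_age = max_age        # assumed stand-in for the unknown timeout
        self._clock = clock            # injectable so the logic can be tested
        self._addresses = []
        self._resolved_at = None

    def is_stale(self):
        """True if nothing was resolved yet or the result is older than max_age."""
        if self._resolved_at is None:
            return True
        return self._clock() - self._resolved_at >= self._max_age

    def next_address(self):
        """Return the next address to try, re-resolving when stale or exhausted."""
        if self.is_stale() or not self._addresses:
            # Back to step 4: the list is too old or every address was used.
            self._addresses = list(self._resolver())
            self._resolved_at = self._clock()
        if not self._addresses:
            raise LookupError("resolver returned no addresses")
        # Otherwise this is step 5.1: continue with the next remaining address.
        return self._addresses.pop(0)
```

When the connection is lost, the caller would simply ask for `next_address()` again. Whether a fixed `max_age` is an acceptable substitute for the real timeout is exactly the open question.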

What is the current practice?

