Configuring HAProxy 1.4 to do host-based reverse proxying

Foreword: the new(er) HAProxy 1.5 supports map-based host routing, which is the recommended approach. But here’s a guide for those of us who are stuck with the older version.

So, you’ve got several application servers, with ports all over the place. How do you organize this mess so it is easier to reach? By using DNS and a reverse proxy that is aware of the Host header. HAProxy is perfect for this: high performance and a low footprint. As a bonus, HAProxy 1.5 and later can also terminate HTTPS, so that even your dumbest services can benefit from encryption! (1.4 has no native SSL support, so there you would need something like stunnel in front.)

For setting up the proxying, here’s a handy little list:

  1. Install; on Debian, run apt-get install haproxy
  2. Configure; take a look at this paste and copy the contents to /etc/haproxy/haproxy.cfg
  3. Enable; to mark that you’ve actually edited the configuration, go to /etc/default/haproxy and set ENABLED=1.
  4. Run; service haproxy restart.

And that is it! Now you have a basic HAProxy installation that reverse proxies requests to two different hosts/ports based on the Host header. Simply add more backends and acl/use_backend pairs to introduce new services.
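
The paste itself is not reproduced here, so as a rough idea of what such a configuration tends to look like, here is a minimal sketch for HAProxy 1.4. The hostnames (app1.example.com, app2.example.com), the backend names and the ports 8080/8081 are all placeholders made up for illustration, and the timeouts are arbitrary; adjust everything to your own setup.

```
global
    log /dev/log local0
    maxconn 2048
    user haproxy
    group haproxy
    daemon

defaults
    mode http
    log global
    option httplog
    # Add an X-Forwarded-For header carrying the client's address
    option forwardfor
    # Close the server side after each response so every request is routed on its own
    option http-server-close
    timeout connect 5s
    timeout client 50s
    timeout server 50s

frontend http-in
    bind *:80
    # One acl per hostname (placeholder names), matched case-insensitively
    acl host_app1 hdr(host) -i app1.example.com
    acl host_app2 hdr(host) -i app2.example.com
    use_backend app1 if host_app1
    use_backend app2 if host_app2
    # Requests with an unknown Host header end up here
    default_backend app1

backend app1
    server app1 127.0.0.1:8080

backend app2
    server app2 127.0.0.1:8081
```

Each new service is then one more acl line, one more use_backend line and one more backend section.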

But the fun doesn’t end here! Now you have a bunch of backend servers whose requests all appear to originate from a single host (the proxy), and that breaks IP-based ACLs, logging and everything! To fix this you’ll have to go through each and every service manually and configure it to read the client address from the header set by the proxy.

For example, to make HFS trust the X-Forwarded-For header set by HAProxy, you’ll have to edit its configuration file manually, as per this guide.

For Apache there exists a whole module for this: mod_remoteip. Simply include the module and set RemoteIPHeader X-Forwarded-For and RemoteIPInternalProxy proxy_ip_here. You may also need to change %h to %a in LogFormat to get the logging to work correctly.
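
Put together, the Apache side might look roughly like the fragment below. This is only a sketch: 192.0.2.10 is a placeholder for your proxy's address, and the LogFormat line is just the stock "common" format with %h swapped for %a.

```
# On Debian the module can be enabled with: a2enmod remoteip
# Elsewhere, load it explicitly:
# LoadModule remoteip_module modules/mod_remoteip.so

# Read the client address from the header set by HAProxy
RemoteIPHeader X-Forwarded-For
# Only trust the header when the request comes from the proxy (placeholder IP)
RemoteIPInternalProxy 192.0.2.10

# %a logs the client address as resolved by mod_remoteip
LogFormat "%a %l %u %t \"%r\" %>s %b" common
```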

No matter what you are using, the common step is to mark your proxy machine as trusted, so that the real remote IP can be read from the header. Be aware that the header can contain a comma-separated list of addresses (or arrive as several consecutive headers of the same name), and only the last entry is the one added by your proxy. The rest can be freely set by the client and cannot be trusted.