First you need to determine what your requirements are, i.e., what level of fault tolerance you need. (On load sharing, see below.) How long can you afford the system to be down, and how often?
Note that whatever you provide for fault tolerance protects you only against (a certain kind of) hardware failure. It will hardly ever catch failures caused by software bugs, misconfiguration, or operational errors. In systems like yours, which consist of only one or two machines, such hardware failures are very rare. With proper hardware, they will occur perhaps once in the lifetime of a server, and possibly not at all. So you need to decide how much effort you really want to spend on this point.
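To put rough numbers on "how much downtime, how often", a back-of-the-envelope calculation can help. The sketch below is only an illustration: the failure rate (one hardware failure every 3 years) and the recovery times per scenario are assumptions made up for the example, not measurements, and the function names are hypothetical. Plug in your own estimates.

```python
HOURS_PER_YEAR = 365 * 24


def expected_downtime_hours_per_year(failures_per_year: float,
                                     hours_to_recover: float) -> float:
    """Average yearly downtime from hardware failures alone."""
    return failures_per_year * hours_to_recover


def availability(downtime_hours_per_year: float) -> float:
    """Fraction of the year the system is up, given yearly downtime."""
    return 1.0 - downtime_hours_per_year / HOURS_PER_YEAR


# Assumption: one hardware failure every 3 years.
FAILURES_PER_YEAR = 1 / 3

# Assumption: hours needed to get back online in each scenario.
RECOVERY_HOURS = {
    "wait for parts, restore from backup": 24.0,
    "on-site engineer via maintenance contract": 8.0,
    "prepared spare server": 2.0,
}

if __name__ == "__main__":
    for scenario, hours in RECOVERY_HOURS.items():
        downtime = expected_downtime_hours_per_year(FAILURES_PER_YEAR, hours)
        print(f"{scenario}: {downtime:.1f} h/year, "
              f"availability {availability(downtime):.3%}")
```

If the yearly downtime you get for a cheap option is already below what you can tolerate, the more expensive options are probably not worth the effort.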
That said, your options are approximately the following, in order of both increasing cost and increasing complexity.
1. Run the system on one server. Replace the machine on a reasonable schedule (say, every 3 years).
2. Get a good hardware maintenance contract for the server (one that entitles you to an engineer on-site for swapping parts, if needed).
3. Get an even better maintenance contract, which will make the engineer arrive on-site in even shorter time.
4. Buy a spare server, identical to the first one, and keep it off-line (i.e., not productive). If the first server fails, swap the second one in, and restore your data from backups.
5. The previous option can be refined by preparing the spare server with OS/web server/database/configurations ready, by copying your production data over to the spare server at frequent intervals, etc.
6. Build an automatic failover to the second server.
By the way, the first implementation step for option 6 (automatic failover) would be to buy another two servers to serve as a test system.
Now a remark on load sharing: Strictly speaking, you can't have load sharing and fault tolerance with only two servers. If your load is so high that it needs to be shared between two servers, then a failure of one server will bring the system down, so there is no fault tolerance. This might seem a bit nit-picky; but, if your Moodle system is really so critical that you can't afford a few hours of downtime every couple of years (and that's why you would need automatic failover), then such details become relevant.
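The load-sharing argument comes down to one inequality: the servers that remain after a failure must be able to carry the peak load on their own. Here is a minimal sketch of that check. The function name and the load figures are hypothetical, and load is expressed in "units of one server's capacity" purely for illustration.

```python
def survives_single_failure(peak_load: float,
                            capacity_per_server: float,
                            servers: int) -> bool:
    """True if the remaining servers can carry the peak load after one fails."""
    if servers < 1:
        raise ValueError("need at least one server")
    remaining_capacity = (servers - 1) * capacity_per_server
    return remaining_capacity >= peak_load


if __name__ == "__main__":
    # Load really needs both servers: losing one brings the system down.
    print(survives_single_failure(peak_load=1.5, capacity_per_server=1.0, servers=2))  # False
    # Load fits on one server: fault tolerant, but then nothing needs sharing.
    print(survives_single_failure(peak_load=0.8, capacity_per_server=1.0, servers=2))  # True
    # Load needs two servers and a third is available as headroom.
    print(survives_single_failure(peak_load=1.5, capacity_per_server=1.0, servers=3))  # True
```

In other words, with two servers you get either load sharing or fault tolerance; to have both you need at least one server more than the load itself requires.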
Hope that helps