Implementation

Apache Module

The core of the redirector is mod_zrkadlo, a module for the Apache HTTP server, written in C, and designed for high performance and scalability, with security in mind.

mod_zrkadlo is pronounced "mod zurrcat low". Zrkadlo is Slovak for mirror, a word that I learnt while travelling in Slovakia in 2007.

The module issues only one database query per client request, using the database connection pools provided by the Apache DBD framework.

To cope with file trees that are huge and change frequently, the redirector doesn't simply choose one mirror for a client once. Instead, it decides at the level of individual files, because mirrors are known to be incomplete, especially if content changes often. To achieve this, the redirector is backed by an SQL database which knows the exact contents of each mirror. The database is updated periodically by a scanner program that scans all mirrors. In addition, a probing program regularly checks each mirror for responsiveness and can disable or pause redirection to a mirror that fails.

This page shows pseudocode that outlines how the redirection module works.
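
As a rough illustration of the per-file decision described above, here is a minimal Python sketch. It is not the module's actual C code: the function names, the in-memory dictionaries standing in for the database, the random choice among candidates, and the fallback of returning None when no mirror has the file are all assumptions made for the example.

```python
import random

def candidate_mirrors(path, mirror_files, enabled):
    """Return the enabled mirrors known to hold `path`, sorted for stable output."""
    return sorted(
        mirror for mirror, files in mirror_files.items()
        if mirror in enabled and path in files
    )

def redirect_target(path, mirror_files, enabled, base_urls, rng=random):
    """Build a redirect URL for `path`, or return None if no mirror has it."""
    candidates = candidate_mirrors(path, mirror_files, enabled)
    if not candidates:
        # Assumption: the caller then delivers the file itself.
        return None
    # Assumption: any candidate is as good as another; pick one at random.
    return base_urls[rng.choice(candidates)] + path
```

The point of the sketch is that the decision is made per requested file, so an incomplete mirror is simply skipped for the files it lacks and still used for the files it has.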

Mirror Database

The mirror database is a MySQL database. Its main purpose is to store data about:

  • mirrors (their location, base URL, ...)
  • files that were seen while scanning the mirrors
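
To make the two kinds of data more concrete, the following sketch shows one possible layout and the kind of single lookup the redirector could run per request. The table and column names are hypothetical, and an in-memory SQLite database stands in for MySQL so that the example runs without a server; the real schema may look quite different.

```python
import sqlite3

# Hypothetical schema: one row per mirror, one row per (mirror, file) pair.
SCHEMA = """
CREATE TABLE mirror (
    id       INTEGER PRIMARY KEY,
    base_url TEXT NOT NULL,
    country  TEXT,
    enabled  INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE file (
    mirror_id INTEGER NOT NULL REFERENCES mirror(id),
    path      TEXT NOT NULL,
    PRIMARY KEY (mirror_id, path)
);
"""

def open_db():
    """Create an empty in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def mirrors_for(conn, path):
    """Return base URLs of enabled mirrors holding `path`, using one query."""
    rows = conn.execute(
        "SELECT m.base_url FROM mirror m "
        "JOIN file f ON f.mirror_id = m.id "
        "WHERE f.path = ? AND m.enabled = 1 "
        "ORDER BY m.id",
        (path,),
    )
    return [row[0] for row in rows]
```

Usage: after inserting a few mirrors and their files, `mirrors_for(conn, "/x.iso")` returns only the base URLs of mirrors that are enabled and were seen to carry that file.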

Mirror Scanner

The mirror scanner is a program which crawls the mirrors via rsync, FTP, or HTTP. It updates the database with the file list found on each mirror, and checks whether the mirrors deliver large files correctly.
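
The database update step can be sketched as follows. This is only an illustration under stated assumptions: the crawl itself (rsync, FTP, or HTTP) is left out and represented by a plain list of paths, the `file` table and its columns are hypothetical, and SQLite stands in for MySQL. The large-file check is not covered here.

```python
import sqlite3

# Hypothetical table holding one row per file seen on a mirror.
FILE_TABLE = """
CREATE TABLE file (
    mirror_id INTEGER NOT NULL,
    path      TEXT NOT NULL,
    PRIMARY KEY (mirror_id, path)
)
"""

def sync_file_list(conn, mirror_id, seen_paths):
    """Make the stored file list of one mirror match the latest scan.

    Returns (added, removed) as two sorted lists of paths.
    """
    known = {
        row[0]
        for row in conn.execute(
            "SELECT path FROM file WHERE mirror_id = ?", (mirror_id,)
        )
    }
    seen = set(seen_paths)
    added = sorted(seen - known)
    removed = sorted(known - seen)
    with conn:  # apply both changes in a single transaction
        conn.executemany(
            "INSERT INTO file (mirror_id, path) VALUES (?, ?)",
            [(mirror_id, p) for p in added],
        )
        conn.executemany(
            "DELETE FROM file WHERE mirror_id = ? AND path = ?",
            [(mirror_id, p) for p in removed],
        )
    return added, removed
```

Writing only the differences, rather than deleting and re-inserting the whole list, is a design choice of this sketch; it keeps the update small when a mirror changes little between scans.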

Mirror Probe

The mirror probe runs at short intervals and checks whether each mirror is alive. If a mirror doesn't reply, redirection to it is disabled until it comes back.
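
A single probe run might look roughly like the sketch below. The function names are hypothetical, and treating "alive" as "answers an HTTP HEAD request with a 2xx or 3xx status within a timeout" is an assumption; the actual probe may use a different test.

```python
import urllib.request

def http_alive(base_url, timeout=5.0):
    """Return True if the mirror answers a HEAD request successfully in time."""
    req = urllib.request.Request(base_url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except OSError:
        # Covers connection failures, timeouts, and HTTP error responses.
        return False

def probe_all(base_urls, check=http_alive):
    """Map each mirror id to True (keep redirecting) or False (disable)."""
    return {mirror_id: check(url) for mirror_id, url in base_urls.items()}
```

Because the status is recomputed on every run, a mirror that was disabled is enabled again as soon as it replies. The `check` parameter lets the liveness test be swapped out, for example with a stub when no network is available.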