Mirrors

From SME Server
Revision as of 19:35, 13 July 2011 by Cactus (talk | contribs) (Why we prefer push monitoring)


Status of the contribs.org mirrors

To view the status of contribs.org mirrors, look at http://mirror.contribs.org/mirrors.

Accessing the contribs.org mirrors

To access the contribs.org download mirrors, always use the URL http://mirror.contribs.org. The current releases are at http://mirror.contribs.org/smeserver/releases/ and the contribs section is at http://mirror.contribs.org/smeserver/contribs/

  Tip:
If you use the mirror.contribs.org URL, you will automatically be redirected to a mirror that has been updated within the last 8 hours.


Figures

Hard disk size

The contribs.org data takes up about 30 GB of disk space in total. The /releases tree, which holds the ISO images and the repositories, accounts for about 6 GB, and the /contribs tree for a little over 1 GB. The obsolete directory accounts for 16 GB, and the testing directory, which includes the next version, SME 8, for 7 GB.

If you are short on space, you can save 16 GB with --exclude="obsolete/" and a further 7 GB with --exclude="testing/"
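
The savings add up as follows. This is a minimal sketch in bash that uses only the approximate figures quoted in this section; the variable names are illustrative, and where the exclude options end up depends on how your copy of ftpsync passes options to rsync.

```shell
#!/bin/bash
# Approximate sizes in GB, as quoted in this section.
total_gb=30
obsolete_gb=16
testing_gb=7

# The two rsync exclude options mentioned above.
exclude_opts=(--exclude="obsolete/" --exclude="testing/")

# Space still needed once both directories are skipped.
remaining_gb=$(( total_gb - obsolete_gb - testing_gb ))

echo "options: ${exclude_opts[*]}"
echo "approximate size with both excludes: ${remaining_gb} GB"
```

With both excludes in place, the releases and contribs trees are what remain, roughly 7 GB.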

Bandwidth

Because the number of mirrors is small, the traffic routed to your server will be on the order of 200 GB/month on average, which translates to about 80 kB/s.
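
The conversion from monthly volume to average rate can be checked with a few lines of shell arithmetic. The sketch below takes the monthly figure as 200 gigabytes (200 × 10^9 bytes) and assumes a 30-day month; both are simplifications.

```shell
#!/bin/bash
# 200 GB per month, taken as 200 * 10^9 bytes over a 30-day month.
bytes_per_month=$(( 200 * 1000 * 1000 * 1000 ))
seconds_per_month=$(( 30 * 24 * 3600 ))

# Integer division: shell arithmetic has no fractions, so the result is rounded down.
avg_kb_per_s=$(( bytes_per_month / seconds_per_month / 1000 ))

echo "average rate: about ${avg_kb_per_s} kB/s"
```

The result is 77 kB/s, in line with the rounded figure of about 80 kB/s. This is an average; actual traffic will come in bursts.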

Requirements for mirrors

  • Static IP address
  • Dedicated user for syncing
  • Allow SSH from internet to static IP (port doesn't matter)

How to become a mirror site?

If you or your company has some spare bandwidth and would like to be included in mirror.contribs.org, become a mirror by following these steps:

Preparing your system

  1. Create a storage location for the mirror:
    mkdir -p {/path/to/your/data/store/}
  2. Create a new user to perform the sync. If you are running SME Server, you can create the user through the server-manager panel.
  3. Download the ftpsync script and all the files it requires:
    wget http://contribs.org/ftpsync.tgz
  4. Extract the tarball in the user's home directory:
    tar zxof ftpsync.tgz
  5. Change the ownership of the directories to the new user:
    chown -R {user} bin etc log .ssh {/path/to/your/data/store/}
  6. Everything is now installed, but the configuration file (etc/ftpsync.conf) still needs to point to the storage location of the data. Set the TO option to that location using your favorite text editor.
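
If you prefer to script the last step rather than use an editor, something like the following could work. It is a sketch only: set_store_path is a hypothetical helper, and it assumes the configuration file holds the setting on a line beginning with TO= (optionally commented out with #). Check your own etc/ftpsync.conf before relying on it.

```shell
#!/bin/bash
# Hypothetical helper: point the TO setting of an ftpsync config file at a new path.
# Assumes the file already has a line starting with TO= (possibly commented out).
# If no such line exists, the file is left unchanged.
set_store_path() {
  local conf=$1 store=$2
  # '|' is used as the sed delimiter because paths contain '/'.
  # Paths containing '|' or '&' would need escaping first.
  sed -i -E "s|^#?[[:space:]]*TO=.*|TO=\"${store}\"|" "$conf"
}
```

It would be called as set_store_path ~/etc/ftpsync.conf {/path/to/your/data/store/}, run as the sync user.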

Testing your setup

  1. Perform the initial sync, which also tests that the script does what it needs to:
    su - {user} -s /bin/bash
    ~/bin/ftpsync
  2. Check the logs to see if there are any errors. Since the initial sync will take a lot of time, this is best done in a second terminal window:
    cd ~/log
    cat rsync-ftpsync.error.0
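
The log check can also be wrapped in a small function, for example to repeat it while the initial sync is running. This is a sketch: check_sync_log is a hypothetical helper, and it assumes that an empty error log means rsync reported no errors.

```shell
#!/bin/bash
# Hypothetical helper: report whether an rsync error log contains anything.
# The default path follows the log file named in the step above.
check_sync_log() {
  local log=${1:-$HOME/log/rsync-ftpsync.error.0}
  if [ ! -e "$log" ]; then
    echo "no log file at $log yet"
    return 2
  elif [ -s "$log" ]; then
    # -s is true when the file exists and is not empty.
    echo "errors logged in $log:"
    cat "$log"
    return 1
  fi
  echo "no errors logged in $log"
}
```

Run check_sync_log as the sync user; a return status of 0 means the error log exists and is empty.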

Configuring the web server

You will need to configure your web server to make the files available to the public.

For that you need to enable the FollowSymLinks option in the Apache configuration file.

If your mirror is hosted on a SME Server 7.x (or higher version), in an ibay, you should issue the following commands after creating the ibay:

db accounts setprop {ibayname} FollowSymLinks enabled
signal-event ibay-modify {ibayname}

Keeping your mirror up-to-date

The ftpsync script allows for two sync methods: push or pull.

  Note:
We prefer that you set up your mirror as a push mirror because:
  • Sync only happens when there are changes
  • Changes are propagated as close to real-time as possible
  • Changes can be staged (sync data first, repodata second)
  • Fewer out-of-sync mirrors for yum


Why we prefer push

First some background on ssh. Ssh allows people to connect to accounts on different machines in a secure way. Not only are passwords never passed in the clear, once you connect to a machine you are basically guaranteed that future connections will be to the same machine. This prevents many man-in-the-middle attacks.

One capability ssh has is the ability for a user to take the public identity key for a user on another machine and add it to a file of authorized keys on your machine. By default, the user on the other machine (who has the private identity key associated with the public identity key given to you) then has login privileges to your account. It is possible, though, to add text to an authorized key restricting the type of access a person accessing your account using that key has.

So to protect the downstream mirror, the key provided by the upstream mirror has text added to it to limit it to only give the person accessing your account permission to do one thing — start the program on your machine that updates your mirror. Even if someone (an evil third party) was able to break the key, the most they could do is to start the mirror program on your machine. You do not even have to worry about multiple copies of the program being started as a lockfile is used.

On the upstream end, rsync can be configured to restrict who can mirror a given area by username and password. These are totally separate from /etc/passwd so a push server doesn't have to worry about giving others access to their machine. As it is set up, the username and password are passed in the clear. This shouldn't be a problem though, as the worst that can happen is that a third party gains the ability to mirror the Debian pages from that site.

(source: Debian: Push mirroring)
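
To illustrate what "text added to the key" looks like, the sketch below builds a restricted authorized_keys entry. The option names (command=, no-port-forwarding, no-X11-forwarding, no-agent-forwarding, no-pty) are standard OpenSSH authorized_keys options. The helper restrict_key, the command path and the placeholder key are assumptions for illustration; the keys shipped in the ftpsync tarball may already carry their own restrictions.

```shell
#!/bin/bash
# Hypothetical helper: prefix a public key with authorized_keys options so
# that it can only trigger one forced command and nothing else.
restrict_key() {
  local forced_command=$1 public_key=$2
  printf 'command="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty %s\n' \
    "$forced_command" "$public_key"
}

# Example with a placeholder key (not a real key):
restrict_key '~/bin/ftpsync' 'ssh-rsa AAAA...placeholder... pushmirror@example'
```

With an entry of this shape, whatever command the connecting side requests, sshd runs only the forced command.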

How push works

Below is a short description of the push process:

  1. Master mirror updates timestamp file
  2. Master initiates ssh into tier 1 mirrors to start stage 1 sync (wait)
  3. Tier 1 mirrors rsync everything but repodata from designated targets (no delete)
  4. Tier 1 mirrors initiate ssh into tier 2 mirrors to start stage 1 sync (wait)
  5. Repeat the prior two steps for each tier below tier 2
  6. Master initiates ssh into tier 1 mirrors to start stage 2 sync
  7. Tier 1 mirrors rsync everything from designated targets (with delete)
  8. Tier 1 mirrors initiate ssh into tier 2 mirrors to start stage 2 sync
  9. Repeat the prior two steps for each tier below tier 2
  10. Master mirror checks freshness of mirrors and generates mirrorlists
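
The difference between the two stages can be sketched as two sets of rsync options. This is not the actual ftpsync code: sync_stage is a hypothetical function, the exclude pattern for repodata is an assumption, and the real script may use different options. The RSYNC variable is only there so the function can be dry-run with another command.

```shell
#!/bin/bash
# Hypothetical sketch of the two-stage sync described above.
#   stage 1: copy everything except repodata, delete nothing
#   stage 2: copy everything, and delete files that are gone upstream
sync_stage() {
  local stage=$1 src=$2 dest=$3
  local -a opts=(-a)
  case "$stage" in
    1) opts+=(--exclude='repodata/') ;;
    2) opts+=(--delete) ;;
    *) echo "unknown stage: $stage" >&2; return 1 ;;
  esac
  # RSYNC can be overridden (e.g. RSYNC=echo) to see the command instead of running it.
  "${RSYNC:-rsync}" "${opts[@]}" "$src" "$dest"
}
```

Staging the sync this way means package files arrive on a mirror before the repodata that refers to them, so yum clients should not see metadata pointing at files that are not there yet.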

Configuring for push

The push system uses public/private key pairs for communication. This requires some additional configuration steps:

  1. First and foremost, SSH must be configured and running on your server. If you are using SME Server for your mirror, enable remote access through the server-manager.
  2. Enable bash as the shell for the sync user. If you are running SME Server, you can do that like this:
    db accounts setprop {user} Shell /bin/bash
    signal-event user-modify {user}
  3. Append the push keys to the authorized_keys file of the user:
    su - {user} -s /bin/bash
    cat .ssh/pushmirror-*.pub >> .ssh/authorized_keys
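
Appending the keys by hand adds them again each time the command is repeated. The sketch below is one way to make the step repeatable and to set the file permissions that sshd expects by default (with StrictModes enabled it ignores key files that other users can write to). install_push_keys is a hypothetical helper, not part of ftpsync.

```shell
#!/bin/bash
# Hypothetical helper: add every pushmirror-*.pub key to authorized_keys once,
# then tighten permissions on the .ssh directory and the key file.
install_push_keys() {
  local ssh_dir=${1:-$HOME/.ssh}
  local auth="$ssh_dir/authorized_keys" key
  touch "$auth"
  for key in "$ssh_dir"/pushmirror-*.pub; do
    [ -e "$key" ] || continue          # no matching key files
    # Append only if the exact key line is not already present.
    grep -qxF -f "$key" "$auth" || cat "$key" >> "$auth"
  done
  chmod 700 "$ssh_dir"
  chmod 600 "$auth"
}
```

Run install_push_keys as the sync user; running it a second time leaves authorized_keys unchanged.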

Configuring for pull

  Note:
We prefer that you set up your mirror as a push mirror, but if you cannot do so or have other reasons for not doing so, you can also configure your mirror to pull.


Configuring a pull-based mirror is easy. Schedule a cron job that runs every 2 hours and executes the same sync command you used to populate the mirror in the first place, for example with a crontab entry like the one below:

1 */2  * * *  {user} ~/bin/ftpsync

Advertising your mirror

After your mirror is synced and working properly, the last thing you need to do is let us know by filing a bug report on bugs.contribs.org under the website category. Please include the following in the bug report:

  • name of site
  • primary contact name/email
  • location/country
  • bandwidth available to mirror
  • URL to site (for freshness checks and yum)
  • hostname to connect to (for ssh)
  • port to connect to (for ssh)
  • username to connect with (for ssh)

Configuration options

The ftpsync configuration file (etc/ftpsync.conf) has a number of options you can configure. You may have already seen some of them when you adjusted the storage location during setup. The configuration file is well documented, but we will discuss some of the features here.

  Incomplete:
This article or section needs to be expanded. Please help to fill the gaps or discuss the issue on the talk page