Using Git to sync a website

November 24, 2008 at 12:00 PM in: software

Git is an extremely powerful tool for file sharing and version control. Unfortunately, it is too flexible for most uses: you need to construct a workflow for your particular use case, and if you screw things up you may need an expert to help you recover.

I decided that Git was the right tool for maintaining my website, and ran straight into an undocumented use case. So here I describe the setup and workflow I use for the site, in the hope that it may be useful to somebody. I assume basic familiarity with Git and SSH.

The basic setup is that I have two "non-bare" repositories, one for the live site and one for development at home. My home machine is behind a firewall, so I need to use git-push to move changes to the live site after they are tested.

The obvious thing to do would be to push from my local master branch to the site's master branch. Unfortunately, this operation will screw up the repository on the live site, and you will have to find a Git expert to help you recover. The people who develop Git are aware of this problem but do not plan to fix it, which is one example of why Git is not ready for production use unless you have an expert on staff.

The only safe way to push to a "non-bare" repository is hinted at by Junio Hamano in an email message. You need to use git-push to transfer the changes from the home repository to a "remote" branch on the production website, then do a separate git-merge on the site to merge the changes into the live tree.

Here's how I set up the live site and home development site. All commands below are executed on the home (development) machine.

First, set up some variables that will be used frequently:

    # $remoteacct is your account on the production website
    # (placeholder value; substitute your own).
    remoteacct=USER@REMOTEHOST
    # $homesite is the path to the development copy of
    # the site on the local machine (placeholder value).
    homesite=~/site
    # Now test that ssh and git work on the remote machine.
    # This command may not print anything but it should succeed before
    # going any further.
    ssh $remoteacct git config --list

Now, initialize the remote git repository. Here I assume it's in ~/public_html on REMOTEHOST. You probably want to hide the ~/public_html/.git directory using .htaccess too:

    ssh $remoteacct 'cd ~/public_html && git init && git add .'
    ssh $remoteacct 'cd ~/public_html && git commit -a -m "initial contents"'
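
For the .htaccess part, here is a minimal sketch. It assumes the site is served by Apache with mod_alias enabled and .htaccess overrides allowed; `hide_git_dir` is a hypothetical helper name, not part of Git. It appends a rule that makes any URL containing /.git return 404:

```shell
# hide_git_dir DIR: append a rule to DIR/.htaccess so that Apache
# answers 404 for any URL containing "/.git".
# Assumption: Apache with mod_alias, and AllowOverride permits this directive.
hide_git_dir() {
    printf '%s\n' 'RedirectMatch 404 /\.git' >> "$1/.htaccess"
}
```

Run it in a shell on the server as `hide_git_dir ~/public_html`. The new .htaccess stays untracked until you `git add` it.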

Create the development repository by cloning the site to your home machine.  This git-clone command also creates the remotes/site branch on the home machine to track the state of the production repository, and populates the development area with a copy of the live site. Usually the branch would be called remotes/origin, but since we're usually going to be pushing rather than pulling, the name "site" is less ambiguous:

    git clone --origin site ssh://$remoteacct/~/public_html $homesite

Here's the unusual step: add a git-config option to allow git-push to work from the home master to the live site's remotes/home area, in addition to the pull setup created by cloning. Note that "site" in the command corresponds to "--origin site" in the git-clone command:

    cd $homesite
    git config remote.site.push '+refs/heads/*:refs/remotes/home/*'
    git push site
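
If you want to see what this push configuration does before pointing it at a real site, the following sketch rehearses it with two throwaway repositories on the local disk. The directory names `live` and `home`, the scratch directory, and the demo identity are all made up for the illustration; it also assumes a Git new enough to support `git -C`.

```shell
#!/bin/sh
# Local rehearsal of the push setup; nothing here touches a real site.
set -e
sandbox=$(mktemp -d)          # scratch directory (hypothetical)
cd "$sandbox"

# Identity for the throwaway commits (placeholder values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# "live" plays the role of ~/public_html on the server.
git init -q live
git -C live symbolic-ref HEAD refs/heads/master   # use the branch name from this post
echo hello > live/index.html
git -C live add .
git -C live commit -q -m "initial contents"

# "home" plays the role of $homesite.
git clone -q --origin site live home
cd home
git config remote.site.push '+refs/heads/*:refs/remotes/home/*'

echo update >> index.html
git commit -q -a -m "commit updates in development area"
git push -q site

# The push landed in refs/remotes/home/master, not in the checked-out branch,
# so the live working tree is untouched until the merge below.
git -C ../live show-ref refs/remotes/home/master
git -C ../live merge -q home/master
```

After the final merge, the `live` working tree contains the updated index.html, which is the same sequence of events the real site goes through.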

Now, whenever you need to update the site, you can use git-push to send the latest home commits to the remotes/home branch on the live site, and run git-merge on the live site to bring the changes online. The "home" in the git-merge matches the remotes/home in the git-config command above:

    git commit -a -m "commit updates in development area"
    git push site
    ssh $remoteacct 'cd public_html ; git merge home/master'
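
Since these three commands always go together, they can be wrapped in a small shell function. This is only a sketch: `deploy` is a hypothetical name, and it assumes `$remoteacct`, the "site" remote, and the ~/public_html location described earlier.

```shell
# deploy [MESSAGE]: commit local changes, push them to the live site's
# remotes/home area, then merge them into the live tree.
# Assumes $remoteacct is set and you are inside the home repository.
deploy() {
    msg=${1:-"commit updates in development area"}
    # Commit only when a tracked file differs from HEAD; a plain
    # "git commit -a" exits non-zero when there is nothing to commit.
    if ! git diff --quiet HEAD; then
        git commit -a -m "$msg" || return 1
    fi
    git push site || return 1
    ssh "$remoteacct" 'cd ~/public_html && git merge home/master'
}
```

Call it from inside `$homesite`, for example `deploy "fix typo on front page"`. Each step stops the function if it fails, so a failed push never triggers a merge on the live site.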

To bring changes from the live site to the home repository, they must first be committed on the live site using git-commit. Then use git-pull to fetch the committed changes and merge them into the checked out home repository. You don't have to name the remote branch to use in git-pull because it is remembered from "git-clone --origin site ...".

    ssh $remoteacct 'cd public_html ; git commit -a -m "commit live site"'
    git pull
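
If you would rather look before you merge, a fetch followed by a log of the missing commits shows what the pull is about to bring in. A small sketch, with `incoming` as a hypothetical helper name and the branch assumed to be master on both sides:

```shell
# incoming: list commits on the live site that the home master lacks.
# Fetches from the "site" remote but does not change the working tree.
incoming() {
    git fetch -q site || return 1
    git log --oneline master..site/master
}
```

Run `incoming` inside `$homesite` after committing on the live site; an empty listing means there is nothing new to pull.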

That's it! Use at your own risk - I'm not a Git expert at this point, but so far it's working for me.
