git

Drupal Deployment with Git Submodules

(There are screencasts illustrating these concepts at the bottom.)

I've been using git submodules for Drupal site deployment for quite awhile now, and I wanted to share my experience (for good and bad) with others who might be interested in using submodules (and to solicit comments from those who know how to do it better).

What are git submodules?

First "git submodules" have nothing to do with Drupal modules. They're a way of including one git repository inside another, in its entirety, and managing it from within the parent repository.

In a regular git repository, you have one .git directory in the root directory of the project that contains the entire history and all the revisions for that project. Every change in the project is tracked in that one .git repository.

Using git submodules you have one or more sub-repositories of the main one. They look exactly like a regular git repository, but the parent repository knows just enough about them to manipulate them.

For example, if we're using the regular Drupal checkout as our main repository and checking out Drupal modules as submodules, we might do this:

Simpler Rebasing (avoiding unintentional merge commits)

Executive summary: Set up to automatically use a git pull --rebase

Please just do this if you do nothing else:

git config --global branch.autosetuprebase always

About rebasing and pulling

I've written a couple of articles on rebasing and why to do it, but it does seem that the approaches can be too complex for some workflows. So I'd like to propose a simpler rebase approach and reiterate why rebasing is important in a number of situations.

Here's a simple way to avoid evil merge commits but not do the fancier topic branch approaches:

  • Go ahead and work on the branch you commit on (say 7.x-1.x)
  • Make sure that when you pull you do it with git pull --rebase
  • Push when you need to.

That's it.

Reasons to rebase to keep your history clean

There are two major reasons not to go with the default "git pull" merge technique.

Handling the new .gitignore file in D7 and D8

Drupal 7 and Drupal 8 recently added a default (and sensible) .gitignore file to the standard repository, and while this solves some problems, it has also caused some confusion. (issue)

Here's a link to the actual new .gitignore. Essentially, it excludes the sites/default/files and sites/default/settings.php files from git source control.

Playing around with Git

Even if you've just arrived into the Gitworld, you've already noticed that things are really fast and flexible. Some people claim that the most important thing about the distributed nature of Git is that you can work on an airplane, but I claim that's totally bogus. The coolest thing about the distributed nature of Git is that you can fool around all you want, and try a million things, and even mess up your repo as much as you want... and still recover. Here's how, with a screencast at the bottom.

With CVS or Subversion or any number of other VCS's, when you commit or branch, you have the potential of causing yourself and others pain forever. And there's no way around it. You just have to do your best and hope it comes out OK. But with Git, you can try anything and rehearse it in advance, in a total no-consequence environment. Here are some suggestions of where to start.

Drupal Patching, Committing, and Squashing with Git

Back in the bad old days (like 2 weeks ago) there was exactly one way to create patches and exactly one way to apply them. Now we have so many formats and so many ways to work it can be mind boggling. As a community, we're going to have to come to a consensus on which of these many options we prefer. I'm going to take a quick look at the options and how to work among them when necessary.

Ways to create patches

My preference when making changes is always to commit my changes whether or not the commits will be exposed later. That lets me keep track of what I'm doing, and make commit comments about my work. So in each if these cases we'll start with a feature branch based on origin/7.x-1.x with:

# Create a feature branch by branching from 7.x-1.x
git checkout -b my_feature_99999 origin/7.x-1.x
[edit A.txt]
git add .
git commit -m "Added A.txt"
[edit B.txt]
git add .
git commit -m "Added B.txt"

First, I may want to know what commits and changes I'm going to be including in this patch.

Pushing and Deleting Git Topic Branches (Screencast)

Creating topic branches (also called "feature branches") is easy in Git and
is one of the best things about Git. Sometimes we also want to get those pushed
from our local repo for various reasons:

  1. To make sure it's safe on another server (for backup).
  2. To let others review it.
  3. To let others build upon it.

Here we'll just be dealing with #1 and #2, and not talking about how to collaborate
on a shared branch. We'll be assuming that only the original author will push to it,
and that nobody else will be doing work based on that branch.

Update: See the drush implementation of this approach in the comment below.

DamZ taught me a great new piece of git trivia today. You can use a local repository as a kind of cache for a git clone.

Let's create a reference repository for Drupal (and it will be bare, because we don't need any files checked out)
git clone --mirror git://git.drupal.org/project/drupal.git ~/gitcaches/drupal.reference

That makes a complete clone of Drupal's full history in ~/gitcaches/drupal.reference

Now when I need to clone the Drupal project's entire history (as I might do often in testing) I can

git clone --reference ~/gitcaches/drupal.reference git://git.drupal.org/project/drupal.git

And the clone time is on the order of 2 seconds instead of several minutes. And yes, it picks up new changes that may have happened in the real remote repository.

To go beyond this (again from DamZ) we can have a reference repository that has many projects referenced within it.

A Rebase Workflow for Git

In this post I'm going to try to get you to adopt a specific rebase-based workflow, and to avoid (mostly) the merge workflow.

What is the Merge Workflow?

The merge workflow consists of:
git commit -m "something"
git pull   # this does a merge from origin and may add a merge commit
git push   # Push back both my commit and the (possible) merge commit

Note that you normally are forced to do the pull unless you're the only committer and you committed the last commit.

Why Don't I Want the Merge Workflow?

As we saw in Avoiding Git Disasters, the multiple-committer merge workflow has very specific perils due to the fact that every committer for a time has responsibility for what the other committers have committed.

These are the problems with the merge workflow:

Avoiding Git Disasters: A Gory Story

I learned the hard way recently that there are some unexpectedly horrible things that can happen to a project in the Git source control management system due to its distributed nature... that I never would have thought of.

There is one huge difference between Git and older server-based systems like Subversion and CVS. That difference is that there's no server. There's (usually) an authoritative repository, but it's really fundamentally just a peer repository that gets stuff sent to it. OK, we all knew that. But that has some implications that aren't obvious at first. In Subversion, when you make a change, you just push that change up to the server, and the server handles applying just that change to the master copy of the project. However, in Git, and especially when using the default "merge workflow" (I'll write about merge workflow versus rebase workflow in another article), there are times when a single developer may be in charge of (and able to unintentionally break) the entire codebase all at once. So here I'm going to describe two ways that I know of that this can happen.

Git over an ssh tunnel (like through a firewall or VPN)

It's a treasured geek secret that ssh can tunnel TCP connections like ssh all over the internet. What does that mean? It means that you can access machines and ports from your local machine that you never thought you could, including git repositories that are behind firewalls or inside VPNs.

There are two steps to this. First, we have to set up an ssh tunnel. We have 3 machines we're interacting with here:

  • The local machine where we want to be able to do a git clone or git pull or whatever. I'll call it localhost.
  • The internet or VPN host that has access to your git repository. Let's call it proxy.example.com.
  • The host that has the git repository on it. We'll call it git.example.com, and we'll assume that access to git is via ssh on port 22, which is very common for git repos with commit access.

Step 1: Set up a tunnel (in one window). ssh -L3333:git.example.com:22 you@proxy.example.com

Topics: 

Pages

Subscribe to git
Drupal theme by Kiwi Themes.