git filter-branch command:

    git filter-branch -f --env-filter '
      export GIT_AUTHOR_NAME="`echo 798h23l$GIT_AUTHOR_NAME | md5sum`";
      export GIT_AUTHOR_EMAIL="`echo 2i3fsdf$GIT_AUTHOR_EMAIL | md5sum`";
      export GIT_COMMITTER_EMAIL="`echo d0af09$GIT_COMMITTER_EMAIL | md5sum`";
      export GIT_COMMITTER_NAME="`echo 0sdfk9$GIT_COMMITTER_NAME | md5sum`";
    ' --msg-filter '' -- --all

This blanks out the commit message and sets the author and committer name and email fields to salted, hashed values, so that authors can still be compared across commits but are not explicitly identified.
Specifically, the --env-filter flag takes a chunk of shell code that sets the environment variables git uses for the author and committer name and email of each commit. I've set each variable to the md5sum of a salt followed by its original value. The --msg-filter flag also takes shell code, which rewrites the commit message; I've set it to the empty string, so every message comes out blank. The -- --all at the end applies the filters to every commit reachable from any ref.

The git log now contains entries that look like the following:

    commit a0951d56c0d31ef99e0cf9366d0b53cd75bb1073
    Author: d62ac7eb9aa769878de91bcd0459b5c5 - <07bd53e7c6c874dfb93d69158135ca9b >
    Date:   Wed Jul 12 16:01:32 2000 +0000

    commit b2d8ae2baedc388727e39918af2741e5ba308ebd
    Author: 7e521d4c41fa5a518201fb2278acb3c3 - <86a9c9f7d22d7c47f281dc4a68dee371 >
    Date:   Tue Jul 11 23:50:22 2000 +0000

A clone of this repository will contain none of the original commit messages or author/committer data, and so will be fine to distribute.
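
The salted-hash step used inside the --env-filter can be tried on its own, outside git, to see why the same author always maps to the same opaque value. What follows is only a minimal sketch: the pseudonymize function name and the sample author names are made up for illustration, and it assumes a GNU-style md5sum is on the PATH. Unlike the command above, it uses printf (so no trailing newline is hashed) and cut (to drop the "-" that md5sum prints for standard input), so its digests will not match the ones in the log shown earlier.

```shell
#!/bin/sh
# Hypothetical helper, not part of git: print the MD5 hex digest of salt+value.
pseudonymize() {
    # $1 = salt, $2 = value to hide.
    # md5sum prints the digest followed by "-" when reading stdin;
    # cut keeps only the first space-separated field, i.e. the digest.
    printf '%s%s' "$1" "$2" | md5sum | cut -d ' ' -f 1
}

# Illustrative inputs only; the salt is the one used for author names above.
a=$(pseudonymize 798h23l "Alice Example")
b=$(pseudonymize 798h23l "Alice Example")
c=$(pseudonymize 798h23l "Bob Example")

# Same input and salt give the same pseudonym; a different name gives another.
[ "$a" = "$b" ] && echo "same author, same pseudonym"
[ "$a" != "$c" ] && echo "different authors, different pseudonyms"
```

Because each field in the filter-branch command uses its own salt, the hash of a name cannot be matched against the hash of an email just by comparing strings, while two commits by the same person still carry identical values in the same field.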