Forked from guss77/
Last active August 9, 2024 16:34
#1) register a project, enable the Groups Migration API, and create OAuth2 credentials.
#2) set the client_id and client_secret variables below to the values provided when creating the OAuth2 credentials.
#3) make copies of the Client ID and Client Secret for the OAuth2 credentials; they are used in steps #4 and #5
#4) get an authorization code at the following link using a web browser
# (make sure you sign in with an account that has access to the Google Groups you are importing to).
# you will need to replace [CLIENT_ID] with the previously obtained Client ID
#5) use the following command to obtain an OAuth 2.0 Refresh Token
# you will need to replace [AUTH_CODE] with the value obtained in step #4 and
# replace [CLIENT_ID] and [CLIENT_SECRET] with the values obtained in step #1
# curl --request POST --data "code=[AUTH_CODE]&client_id=[CLIENT_ID]&client_secret=[CLIENT_SECRET]&redirect_uri=urn:ietf:wg:oauth:2.0:oob&grant_type=authorization_code"
#6) set the refresh_token variable below to the value returned
#7) you may now run the import. sometimes the process will stop. if that
# happens, just wait 5 seconds or so and then run it again. it will resume
# where it left off.
# for long running imports, the access token will be refreshed when there is
# less than 5 minutes (300 seconds) left until its expiration. if the
# messages you are importing are really large or your connection is really slow,
# such that they take longer than 5 minutes to import, you may want to raise
# the 300-second threshold in the code below.
# finally it should be mentioned that Google does not provide a particularly
# fast way to accomplish this since it does not support parallel uploads
# to the same Google Groups archive. if parallel uploads were allowed the
# code could be optimized to take advantage. the only thing that would need
# to be accounted for is that the messages within a thread must be uploaded in
# order. this means that only separate topic threads could be run in parallel.
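# the resume behaviour described in step #7 comes from moving each finished
# message out of the input directory, so a rerun only sees what is left.
# a minimal sketch of that pattern follows. process_one and run_resumable are
# hypothetical names used for illustration, and process_one merely stands in
# for the real upload call:

```shell
#!/usr/bin/env bash
# Sketch only: process_one is a stand-in for the real upload.
process_one() {
  # hypothetical worker: fail on any file whose name contains "bad"
  case "$1" in
    *bad*) return 1 ;;
    *) return 0 ;;
  esac
}

# Process every regular file in $1, moving each finished file to $2.
# Stops at the first failure so that a rerun picks up at the same file.
run_resumable() {
  local src="$1" done_dir="$2" file
  mkdir -p "$done_dir"
  for file in "$src"/*; do
    [ -f "$file" ] || continue   # skip directories and an unmatched glob
    if ! process_one "$file"; then
      echo "failed on $file; rerun to resume" >&2
      return 1
    fi
    mv "$file" "$done_dir"/
  done
}
```

# because files are visited in glob (sorted) order, messages named so that
# they sort chronologically keep their order across reruns.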
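function usage() {
  echo "usage: $0 <group-address> <mbox-dir>" >&2
  exit 5
}

# positional arguments, in the order given by the usage message
GROUP="$1"
MBOX_DIR="$2"
if [ -z "$GROUP" ] || [ -z "$MBOX_DIR" ]; then usage; fi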
token=$(curl -s --request POST --data "client_id=$client_id&client_secret=$client_secret&refresh_token=$refresh_token&grant_type=refresh_token" | sed -n "s/^\s*\"access_token\":\s*\"\([^\"]*\)\",$/\1/p")
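# the sed expression above only matches when the response is pretty-printed
# with one field per line and a comma at the end of that line. a hedged sketch
# of the same idea wrapped in a helper that also accepts a field on the last
# line (no trailing comma). json_string_field is a hypothetical name, not part
# of the original script, and this is not a general JSON parser:

```shell
#!/usr/bin/env bash
# Print the value of a string field from pretty-printed JSON read on stdin
# (one "key": "value" pair per line). Prints nothing if the key is absent.
json_string_field() {
  local key="$1"
  sed -n "s/^[[:space:]]*\"$key\":[[:space:]]*\"\([^\"]*\)\",\{0,1\}[[:space:]]*\$/\1/p"
}
```

# usage, assuming the token response is piped in:
#   token=$(curl -s ... | json_string_field access_token)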
# create done folder if it doesn't already exist
mkdir -p "$DONE_FOLDER"

# upload counter; its initialisation and increment are missing from this copy
# of the script and are assumed here
i=0
for file in "$MBOX_DIR"/*; do
  echo "importing $file"
  response=$(curl -s -H"Authorization: Bearer $token" -H'Content-Type: message/rfc822' -X POST "$GROUP/archive?uploadType=media" --data-binary @"${file}")
  result=$(echo "$response" | grep -c "SUCCESS")
  # check to see if it worked
  if [[ $result -eq 0 ]]; then
    echo "upload failed on file $file. please run command again to resume."
    exit 1
  fi
  # it worked! move message to the done folder
  mv "$file" "$DONE_FOLDER"/
  i=$((i + 1))
  # after every 10 uploads, check how long the access token has left
  if [[ $i -gt 9 ]]; then
    i=0
    expires_in=$(curl -s "$token" | sed -n "s/^\s*\"expires_in\":\s*\([0-9]*\),$/\1/p")
    if [[ $expires_in -lt 300 ]]; then
      # refresh token
      echo "Refreshing token..."
      token=$(curl -s --request POST --data "client_id=$client_id&client_secret=$client_secret&refresh_token=$refresh_token&grant_type=refresh_token" | sed -n "s/^\s*\"access_token\":\s*\"\([^\"]*\)\",$/\1/p")
    fi
  fi
done
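# the refresh decision above could be pulled into a small helper so the
# threshold is easy to adjust. a minimal sketch, assuming the remaining
# lifetime is passed in as a number of seconds. needs_refresh is a
# hypothetical name, and treating an empty or non-numeric value as
# "refresh now" is a choice made for this sketch:

```shell
#!/usr/bin/env bash
# Succeed (exit status 0) when the token should be refreshed: the remaining
# lifetime is empty, non-numeric, or below the threshold (default 300 seconds).
needs_refresh() {
  local expires_in="$1" threshold="${2:-300}"
  case "$expires_in" in
    ''|*[!0-9]*) return 0 ;;   # unknown lifetime: refresh to be safe
  esac
  [ "$expires_in" -lt "$threshold" ]
}
```

# usage: if needs_refresh "$expires_in" 600; then ...refresh the token...; fi
# raising the second argument gives slow uploads more headroom.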