BackUps
Setting Up Backups of Supercomputer Data
The /compute folder on the BYU supercomputer is not backed up, so you will need to back up your data regularly. This tutorial explains how to set up an automatic backup that regularly copies your supercomputer /compute folder to your cloud Box account.
Explanation of Variables Found Below
- localusername is your username on the local machine
- fslusername is your username on the supercomputer
- CAEDMusername is your CAEDM username
- computebkp is the name of a folder created on your Box account
- testfile is the name of a small file on the supercomputer used for testing
Set Up Rclone to Work With Box
Rclone is a program that allows you to move files to cloud storage through the command line. You need to do two things: 1) create a folder on Box to contain the backup files and 2) configure Rclone to access your Box account without a password. The latter is done by creating a unique token and placing this token in your supercomputer account.
- Create a Folder on your Box Account
- Navigate to box.byu.edu in a web browser.
- Supply your BYU ID and password and authenticate through DUO.
- Create a folder on Box by clicking on "New." You can name it anything you desire. The examples below use the name computebkp.
- Don't close the browser. Stay logged into Box.
- Configure Rclone
- Install Rclone on your local machine if it is not already installed.
- This can be done through YAST.
- Select Software Management.
- Search for rclone.
- Run the following command in the command line of your local machine
rclone authorize box
- Click on the big button "Grant Access to Box" that appears in your browser.
- If you logged into box.byu.edu in a browser before running rclone authorize box, a new tab should appear in your browser with a big blue button.
- Once you click "Grant Access to Box," the webpage should change to show "Success!" and instruct you to return to Rclone.
- Return to the window where you ran rclone authorize box on your local machine and copy the access token for later.
- The token will be between the symbols ---> and <----. It should be something like {"access_token":"...."} where "...." is a long string with several things in it.
- Paste this token into a temporary file so that you have it for later.
- Create the Rclone configuration file
- Log into the supercomputer.
- Create a file called rclone.conf and place it at ~/.config/rclone/rclone.conf.
- The file should be the following.
[boxRaw]
type = box
token = PASTE_TOKEN_HERE

[box]
type = chunker
remote = boxRaw:
chunk_size = 30G
hash_type = sha1
- Replace "PASTE_TOKEN_HERE" with the token obtained from running rclone authorize box on your local machine. (Remember that it should be something like {"access_token":"...."}.)
- Make sure to save this file to ~/.config/rclone/rclone.conf.
- Check that Rclone Is Set Up Properly on the Supercomputer
- Load the rclone module on the supercomputer using the command module load rclone.
- Select a small file to copy. For this example the file is named testfile, and the folder created on Box is computebkp.
- Run the command
rclone copy testfile box:computebkp
- Check to see if testfile appears on Box in the folder computebkp. If it doesn't, review each step of "II. Configure Rclone."
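The configuration file described above can also be written from the command line. The following is a minimal sketch, not part of Rclone: the helper name write_rclone_conf is hypothetical, and it assumes you pass in the token you saved from rclone authorize box.

```shell
#!/bin/sh
# Sketch: write the rclone.conf shown above from a saved token.
# write_rclone_conf is a hypothetical helper name, not an rclone command.

write_rclone_conf() {
    token=$1                                        # JSON token saved from `rclone authorize box`
    conf=${2:-"$HOME/.config/rclone/rclone.conf"}   # default location used in this tutorial

    if [ -z "$token" ]; then
        echo "usage: write_rclone_conf TOKEN [CONF_PATH]" >&2
        return 1
    fi

    # Create ~/.config/rclone (or the chosen directory) if it is missing.
    mkdir -p "$(dirname "$conf")" || return 1

    # Write the two remotes: boxRaw talks to Box, box wraps it with the chunker.
    cat > "$conf" <<EOF
[boxRaw]
type = box
token = $token

[box]
type = chunker
remote = boxRaw:
chunk_size = 30G
hash_type = sha1
EOF

    # The file contains a credential, so restrict it to the owner.
    chmod 600 "$conf"
}
```

For example, if the token was saved in a temporary file (here assumed to be ~/box_token.txt), calling write_rclone_conf "$(cat ~/box_token.txt)" writes the file to the default location. After the test copy, rclone lsf box:computebkp should list testfile.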
Set Up Restic on the Supercomputer
Restic is a program that automates many of the tasks needed to create backups. The purpose is to create a series of snapshots of your files so that you can go back in time and review different versions. It can be used with Rclone but serves a different purpose. Rclone is the utility that copies files to cloud storage. Restic is the utility that creates the snapshots and labels them. It uses Rclone to make the copies but adds the additional information to each copy to identify each snapshot and stores all the information in a repository. You will set up the repository on Box.
- Load the Restic Module on the Supercomputer
- Log into the supercomputer.
- Load the Restic and Rclone modules using the following command.
module load restic rclone
- Create a Restic Password File
- Create a file and place it at ~/.restic-password.
- Place a random, unique, secure password in the file.
- Do not forget this password. You will not be able to retrieve backups if you forget it.
- Make the password at least 16 characters long.
- These websites can help you generate strong passwords.
- Do not use a password that you use elsewhere. This password file could be compromised if someone hacks the supercomputer. The supercomputer administrators can also view it at any time.
- Change the permissions on the password file so that only you as owner can view it or write to it by running the following command.
chmod 600 ~/.restic-password
- Initialize the Repository on Box
- Run the following command
restic -p ~/.restic-password -r rclone:box:computebkp init
- computebkp is the name of the folder you created on Box.
- This command will take several seconds to execute.
- If everything works properly, you should see "created restic repository ..." and a note reminding you to not lose your password.
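As one way to follow the password guidance above, the sketch below generates a random password and stores it with owner-only permissions. The helper name make_restic_password is hypothetical, and the use of /dev/urandom with base64 is an assumption about what is available on your system, not something Restic requires.

```shell
#!/bin/sh
# Sketch: create ~/.restic-password with a random 32-character password.
# make_restic_password is a hypothetical helper name.

make_restic_password() {
    pwfile=${1:-"$HOME/.restic-password"}

    # Never overwrite: replacing the password would lock you out of
    # backups already encrypted with the old one.
    if [ -e "$pwfile" ]; then
        echo "refusing to overwrite existing $pwfile" >&2
        return 1
    fi

    (
        # In this subshell, new files are created readable/writable by the owner only.
        umask 077
        # 24 random bytes encode to exactly 32 base64 characters.
        head -c 24 /dev/urandom | base64 > "$pwfile"
    )
}
```

Running make_restic_password with no argument writes to ~/.restic-password. Keep a copy of the generated password somewhere safe outside the supercomputer before initializing the repository, since the backups cannot be retrieved without it.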
Creating a Backup
Once Restic and Rclone are set up as described above, you can create a backup of your compute folder.
CAUTION Creating a backup can take a long time. The first time you run Restic it could take 1-2 days depending on the amount of data you have. Subsequent backups should not take as long because only the files that are changed or new compared to the previous snapshot are backed up.
The following command will create a backup of your compute folder on the supercomputer.
restic -p ~/.restic-password -r rclone:box:computebkp backup ~/compute --tag first_backup
- The flag --tag assigns this snapshot the tag first_backup.
- Tagging is for convenience in searching snapshots later.
- The --tag flag can be omitted.
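To illustrate how a tag helps later, the sketch below wraps the restic snapshots command so that it lists either all snapshots or only those with a given tag. The helper name list_snapshots and the two variables are hypothetical; the repository and password file paths are the ones used in this tutorial.

```shell
#!/bin/sh
# Sketch: list snapshots in the Box repository, optionally filtered by tag.
# list_snapshots and the variable names are hypothetical.

RESTIC_REPO="rclone:box:computebkp"        # repository created with `init` above
RESTIC_PWFILE="$HOME/.restic-password"     # password file created above

list_snapshots() {
    if [ -n "$1" ]; then
        # With a tag argument, show only snapshots carrying that tag.
        restic -p "$RESTIC_PWFILE" -r "$RESTIC_REPO" snapshots --tag "$1"
    else
        # Without an argument, show every snapshot in the repository.
        restic -p "$RESTIC_PWFILE" -r "$RESTIC_REPO" snapshots
    fi
}
```

After the first backup finishes, list_snapshots first_backup should show the snapshot created by the command above, provided the restic and rclone modules are loaded.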
Automate the Backup Process
Once you have Rclone and Restic set up, you can automate the process to regularly back up your data. This is done by: 1) creating a bash script with the Restic backup command, and 2) telling the supercomputer to regularly run this script. The latter is done using cron.
Cron is a utility available on most Linux computers. You create a cron table, crontab, with the script path and the schedule on which to run the script, and cron runs the scripts as indicated in the table. Cron can be used to schedule any job desired; it is not limited to this use for backups.
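Each line of a cron table has five schedule fields followed by the command to run. The sketch below only illustrates that layout by splitting a line into its parts; describe_cron_line is a hypothetical helper and is not part of cron.

```shell
#!/bin/sh
# Sketch: label the parts of a crontab line.
# describe_cron_line is a hypothetical helper, not a cron command.

describe_cron_line() {
    # read splits on whitespace; the last variable receives the rest of the line.
    read -r minute hour dom month dow cmd <<EOF
$1
EOF
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s command=%s\n' \
        "$minute" "$hour" "$dom" "$month" "$dow" "$cmd"
}

describe_cron_line '0 2 * * 6 ~/bin/backup_compute.sh'
# prints: minute=0 hour=2 day-of-month=* month=* day-of-week=6 command=~/bin/backup_compute.sh
```

An asterisk means "every value" for that field, and day-of-week 6 is Saturday, so this example line runs the command at 2:00 AM every Saturday.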
- Create the Backup Script
- Download the file backup_compute.sh.
- This file is a bash script that contains the Restic command to back up the compute folder on the supercomputer to Box.
- Variables are used in this script to define the folder to back up and the Restic repository on Box so that you can easily change them for other purposes. It is currently set up to work with the folder names described in the steps above.
- Log into the supercomputer.
- Save backup_compute.sh in the bin folder in your home directory on the supercomputer. You may need to create this folder using mkdir ~/bin.
- Change the mode of the file so that it is executable using the following command.
chmod 755 ~/bin/backup_compute.sh
- Create the Entry in the Cron Table
- Log into the supercomputer.
- Execute the following command which will open a text editor.
crontab -e
- The file that opens may be blank or may have content.
- The editor will likely be VI.
- Place the following line in the file and save the file as normal.
0 2 * * 6 ~/bin/backup_compute.sh
- This line tells Cron to run the script backup_compute.sh at 2:00 AM every Saturday.
- You can change the frequency of the backups by changing the first five fields in this line.
- Search Google for crontab examples to learn more about the syntax for crontab.
- https://crontab.guru/ is helpful to learn about how to set the syntax of the table.
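For readers who want to see the general shape of such a script, here is a hypothetical sketch of what a script like backup_compute.sh might contain. It is not the downloadable file itself: the variable names, the function name, and the date-based tag are all assumptions, and only the Restic command and paths come from the steps above.

```shell
#!/bin/bash
# Hypothetical sketch of a backup script; NOT the backup_compute.sh linked above.

# Assumed variable names, set to the values used in this tutorial.
BACKUP_DIR="$HOME/compute"                 # folder to back up
RESTIC_REPO="rclone:box:computebkp"        # Restic repository on Box
RESTIC_PWFILE="$HOME/.restic-password"     # Restic password file

backup_compute() {
    # Load the modules if the `module` command is defined in this shell.
    if command -v module >/dev/null 2>&1; then
        module load restic rclone
    fi

    # Tag each snapshot with the date it was taken (auto_YYYY-MM-DD).
    restic -p "$RESTIC_PWFILE" -r "$RESTIC_REPO" \
        backup "$BACKUP_DIR" --tag "auto_$(date +%Y-%m-%d)"
}

# A real script would end by calling the function. The call is left
# commented out so that pasting this sketch does not start a backup.
# backup_compute
```

Cron starts jobs with a minimal environment, so whether module is defined inside a cron job depends on the system. If it is not, starting the script with a login shell (#!/bin/bash -l) is one common workaround. Redirecting the output in the crontab line, for example by appending >> ~/backup_compute.log 2>&1 (a hypothetical log path), keeps a record you can review when checking that backups ran.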
If you have done everything correctly, your compute folder should now be regularly backed up to Box. Check your backups regularly to ensure they are running.
Other Helpful Tutorials
Rclone
Restic [Includes a discussion of how to retrieve backups.]
Backing Up Your Data