Technical How-To’s & Notes
Archiving Your Research Files
- Put the data or project files in a subdirectory.
- Run the make-archive script, which compresses your data.
- You will receive an email confirming the success of the operation.
- Check the data, then burn it to a DVD or copy it to an external drive.
What files should be archived?
- research project-related files after the project is done
- data you are finished with but want to keep
- files you rarely need to use
Example: Jane Doe has three projects. Each project is in a separate subdirectory on research.hbs.edu. In a terminal session:
researchgrid$ pwd
/export/home/faculty/jdoe/data/
researchgrid$ ls -l
drwx------ 5 jdoe faculty 96 Oct 5 2015 AcmeCase
drwx------ 16 jdoe faculty 8192 Oct 3 09:17 airlineDat
drwx------ 2 jdoe faculty 01 Oct 23 13:16 bank_proj
researchgrid$
To see how much data (in kilobytes) is in each directory:
researchgrid$ du -sk *
12976 AcmeCase
340093 airlineDat
880144 bank_proj
The listing above shows that the directories contain about 13, 340 and 880 megabytes, respectively.
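If the kilobyte counts are hard to read, the same listing can be converted to approximate megabytes. The following is a minimal sketch, assuming a POSIX shell with awk available; the helper name kb_to_mb is hypothetical and not part of the grid's tooling. It divides by 1000 and rounds, which matches the rough figures quoted above.

```shell
# Hypothetical helper (not part of the research grid tooling).
# Reads "size-in-KB<TAB>name" lines, as printed by `du -sk`, and
# prints each size as an approximate, rounded number of megabytes.
kb_to_mb() {
    awk -F'\t' '{ printf "%d MB\t%s\n", ($1 + 500) / 1000, $2 }'
}

# Usage, from the parent directory:
#   du -sk * | kb_to_mb
```

For the three directories in the example, this prints 13 MB, 340 MB and 880 MB.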
From the parent directory of the subdirectories you wish to archive, run the script by typing make-archive <dirname>. In this example, we archive the contents of the AcmeCase directory:
researchgrid$ pwd
/export/home/faculty/jdoe/data/
researchgrid$ make-archive AcmeCase
The server shortly responds with this message:
Creating compressed archive file"/export/scratch/archive_jdoe/AcmeCase.tar.bz2" from
all items in directory "AcmeCase"
The total data size (in KB) of the source is: 12976
Compression done.
Now testing archive and making table of contents.
Done.
Mail has been sent to jdoe@hbs.edu with details of this archive operation.
The size of the compressed archive is 2797 KB.
NOTE: If the size of the compressed archive is more than 3,500,000 KB (3.5 GB), the files should be divided across two or more subdirectories. The make-archive script is then run on each subdirectory.
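The size limit in the note can also be checked from the command line once the archive exists. This is only a sketch: the function name archive_within_limit is hypothetical, and it measures the file with wc -c (bytes, rounded up to KB) rather than relying on whatever make-archive reports.

```shell
# Hypothetical helper (not part of make-archive).
# Succeeds if the file named in $1 is no larger than the limit in KB.
# The limit defaults to the 3,500,000 KB quoted in the note above and
# can be overridden with a second argument.
archive_within_limit() {
    bytes=$(wc -c < "$1") || return 2          # size of the file in bytes
    size_kb=$(( (bytes + 1023) / 1024 ))       # round up to whole KB
    [ "$size_kb" -le "${2:-3500000}" ]
}

# Usage:
#   archive_within_limit /export/scratch/archive_jdoe/AcmeCase.tar.bz2 \
#       || echo "Archive too large: split the directory and re-run make-archive"
```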
Here is what takes place when the make-archive script is run:
- The script creates an archive directory named archive_<username> in the /export/scratch directory, where <username> is your HBS intranet username (archive_jdoe in the example above).
- In the above example, all of the files in the AcmeCase directory were compressed and copied to the archive directory.
- You will receive an email confirming the operation.
- A text file named <username>.info.txt, containing a list of the filenames and their uncompressed sizes, is also placed in your archive directory. Instructions for uncompressing the files at a future time are included in that same text file.
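The instructions in the text file are the authoritative guide to restoring your data. For orientation only, a .tar.bz2 file can generally be listed and unpacked with standard tar options, as in the sketch below. The function name restore_archive and the destination argument are hypothetical, and the sketch assumes tar with bzip2 support is installed.

```shell
# Hypothetical helper (an illustration, not the instructions from the
# info text file). Unpacks a .tar.bz2 archive into a destination
# directory, which defaults to the current directory.
restore_archive() {
    archive=$1
    dest=${2:-.}
    mkdir -p "$dest" || return 1
    # Listing the contents first confirms that the archive is readable.
    tar -tjf "$archive" > /dev/null || return 1
    # -x extracts, -j selects bzip2, -C changes to the destination first.
    tar -xjf "$archive" -C "$dest"
}

# Usage:
#   restore_archive AcmeCase.tar.bz2 restored
```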
Burn the compressed file to a DVD, or copy it to an external drive, along with a copy of the text file, since that file contains the instructions for uncompressing your data. Once you have copied the compressed file and the text file, you should delete the original data files and directory from your account area:
researchgrid$ pwd
/export/home/faculty/jdoe/data/
researchgrid$ rm -r AcmeCase
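Because deletion cannot be undone, it may be worth confirming that the archive lists every file in the directory before removing anything. The sketch below is one way to do that. The function name verify_archive is hypothetical, and it assumes the archive stores paths relative to the parent directory (for example AcmeCase/file.txt), which we have not confirmed for make-archive. File names containing unusual characters may be reported as a mismatch even when the archive is complete.

```shell
# Hypothetical helper (not part of make-archive).
# Compares the files listed in a .tar.bz2 archive with the files
# currently in a directory. Run it from the directory's parent.
verify_archive() {
    archive=$1
    dir=$2
    [ -r "$archive" ] || { echo "Cannot read $archive" >&2; return 2; }
    # Entries in the archive, skipping directories (they end in "/").
    in_archive=$(tar -tjf "$archive" | grep -v '/$' | sed 's|^\./||' | sort)
    # Everything on disk that is not a directory.
    on_disk=$(find "$dir" ! -type d | sed 's|^\./||' | sort)
    if [ "$in_archive" = "$on_disk" ]; then
        echo "OK: archive and directory list the same files"
    else
        echo "MISMATCH: do not delete $dir yet" >&2
        return 1
    fi
}

# Usage:
#   verify_archive /export/scratch/archive_jdoe/AcmeCase.tar.bz2 AcmeCase
```

This compares names only, not file contents, so it complements rather than replaces the archive test that make-archive reports in its output.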
Any questions? Please contact Research Computing Services. And thank you for helping manage and conserve space on the research grid!