Skip to Main Content
HBS Home
  • About
  • Academic Programs
  • Alumni
  • Faculty & Research
  • Baker Library
  • Giving
  • Harvard Business Review
  • Initiatives
  • News
  • Recruit
  • Map / Directions
Research Computing Services
  • Online Requests
  • FAQ
  • Blog
  • Contact Us
  • About Us
  • Faculty Projects
  • Training
  • Compute Cluster & Data Storage
  • Data Practices
  • Help
  • …→
  • Harvard Business School→
  • Research Computing Services→
  • Data Practices
    • Data Practices
    • Best Practices
    • Data Tips & Toolkits
    • Database Best Practices
    • Transferring Data
    →
  • Database Best Practices
    • Database Best Practices
    • Delimiters
    • Exporting
    • Import & Export of Text Files
    • Importing
    →
  • Importing→

Database Best Practices

Database Best Practices

  • Delimiters
  • Exporting
  • Import & Export of Text Files
  • Importing

Importing

Importing

  • Best Practices
  • Data Tips & Toolkits
  • Database Best Practices
    • Delimiters
    • Exporting
    • Import & Export of Text Files
    • Importing
  • Transferring Data
7ms

Basics of Importing

The following is a basic overview of the import process. Please see the subsequent example for commands that are to be used. Complete documentation for the LOAD DATA command to be run within MariaDB may be found at [https://mariadb.com/kb/en/library/load-data-infile/].

The general process is as follows:

  • (1) Move your data to the appropriate import folder
  • (2) Within MariaDB, create the database table that will hold the imported data
  • (3) Within MariaDB, import the data
  • (4) Validate the import
  • (5) Remove your data from the import folder

In order to (2) create the database table, you will need to have an understanding of your data. You will need to know the name of all columns, as well as each column’s data type (integer, numeric with decimals, string of characters, etc) and each column’s maximum width. As an example, if one of the columns in your data is US phone numbers of the format 6174953292, then you may opt to use int(10). This tells us that all entries will be integers with up to 10 digits. However, if you suspect some entries have dashes such as 617-495-3292, then you will need to use char(12) which stores the data as a string of characters, up to 12 characters in length.

Please feel free to contact RCS at research@hbs.edu with any questions you may have regarding data import.

Example of Importing

We will use the following to illustrate importing:

  • MariaDB username = jharvard
  • MariaDB database = jharvard_database
  • MariaDB database table = table_import
  • Import filename = SampleData.txt
  • Data:

Column_1

Column_2

Column_3

25

Harvard Way

Boston, MA

  • which as a file appears as:
  • Line 1: Column_1|Column_2|Column_3
  • Line 2: 25|Harvard Way|Boston, MA
  • Before importing, we will need to create our database table. Creating a table includes specifying the maximum size of each column. We will keep this example simple by specifying each column be char and have a maximum length of 20. Please note that you can modify your table at a later time, for instance if you need to example a column from char(20) to char(30).

To begin, prepare the import folder as described above in the section Prevent Others From Accessing Your Data

mkdir /export/mdb_external/import/jharvard
chmod 700 /export/mdb_external/import/jharvard

Move your data to this import folder

mv SampleData.txt /export/mdb_external/import/jharvard

! Note that mv moves the file, as opposed to cp which copies the file!

cp SampleData.txt /export/mdb_external/import/jharvard

Log into MariaDB

mysql -h HOSTNAME -u jharvard -p

Within MariaDB, create table that will hold imported data

use jharvard_database; create table table_import (Column_1 char(20), Column_2 char(20), Column_3 char(20));

Within MariaDB, import data

load data local infile ‘/export/mdb_external/import/jharvard/SampleData.txt’ into table table_import fields terminated by ‘|’ lines terminated by ‘\n’ ignore 1 lines;

Please note our command has 4 sections:

  • (1) load data local infile ‘/export/mdb_external/import/jharvard/SampleData.txt’
  • specify file to import
  • (2) into table table_import
  • specify table that will hold the imported data
  • (3) fields terminated by ‘|’ lines terminated by ‘\n’
  • specify delimiters (Click here for more information)
  • (4) ignore 1 lines
  • include this only if your file includes column header information

Official documentation for this command may be found at [https://mariadb.com/kb/en/mariadb/load-data-infile/].

We can check do a preliminary check on the first 10 rows of our data via

  • select * from table_import limit 10;

Log out of MariaDB

>exit;

Delete import folder

rm -rf /export/mdb_external/import/jharvard
ǁ
Campus Map
Research Computing Services (RCS) 
Harvard Business School
Baker Library, B90, 25 Harvard Way
Boston, MA 02163
Phone: 617.495.6100
Email: research@hbs.edu
→Map & Directions
→More Contact Information
→Terms Of Service
  • Make a Gift
  • Site Map
  • Jobs
  • Harvard University
  • Trademarks
  • Policies
  • Accessibility
  • Digital Accessibility
Copyright © President & Fellows of Harvard College