Load Data from Amazon S3 using COPY
tip
Expected time: 5 minutes ⏱
Databend's COPY command can read object files (in CSV or Parquet format) from Amazon S3 buckets. To understand Amazon S3's core concepts, please read the Amazon S3 documentation.
Before you begin
- AWS Account: This Quickstart uses Amazon S3 and requires an AWS account's access key ID and secret access key.
- Databend: You will connect to the database and use COPY to pull data from your Amazon S3 bucket. For deployment instructions, see How to deploy Databend.
Part 1: Creating an Amazon S3 Bucket and Adding a File
- On your local machine, create a text file with the following CSV contents and name it books.csv:
books.csv
Transaction Processing,Jim Gray,1992
Readings in Database Systems,Michael Stonebraker,2004
In this CSV file, the field delimiter is a comma (,) and the record delimiter is a newline (\n).
- In S3, create a bucket and upload books.csv to it.
Part 2: Creating a Database and Table, and Loading Data using COPY
create database book_db;
use book_db;
create table books
(
    title VARCHAR(255),
    author VARCHAR(255),
    date VARCHAR(255)
);
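As a quick, optional sanity check, you can describe the table before loading. This short sketch assumes your client supports the MySQL-style DESC command:

-- Optional: inspect the schema of the newly created table
DESC books;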
Now that the database and table have been created, you are ready to load the data.
In Part 1 of this Quickstart, you uploaded the books.csv file to your bucket.
To load data with COPY, you will need the following information:
- The S3 URI of the bucket path (s3://bucket/to/path/), such as: s3://databend-bohu/data/
- Your AWS account’s access keys, such as:
  - Access Key ID: your-access-key-id
  - Secret Access Key: your-secret-access-key
 
Using this URI and keys, execute the following statement, replacing the placeholder values with your own:
copy into books
  from 's3://databend-bohu/data/'
  credentials=(aws_key_id='<your-access-key-id>' aws_secret_key='<your-secret-access-key>')
  pattern ='.*[.]csv'
  file_format = (type = 'CSV' field_delimiter = ','  record_delimiter = '\n' skip_header = 0);
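If you prefer to load a single, explicitly named file instead of matching by pattern, a minimal variant is sketched below. It assumes your Databend version supports the FILES option and that books.csv sits directly under the same S3 path:

copy into books
  from 's3://databend-bohu/data/'
  credentials=(aws_key_id='<your-access-key-id>' aws_secret_key='<your-secret-access-key>')
  -- load only the named file instead of every file matching a pattern
  files = ('books.csv')
  file_format = (type = 'CSV' field_delimiter = ','  record_delimiter = '\n' skip_header = 0);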
Now, let's check the data to make sure it has actually loaded:
mysql> select * from books;
+------------------------------+----------------------+-------+
| title                        | author               | date  |
+------------------------------+----------------------+-------+
| Transaction Processing       |  Jim Gray            |  1992 |
| Readings in Database Systems |  Michael Stonebraker |  2004 |
+------------------------------+----------------------+-------+
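COPY can load Parquet files in much the same way. The sketch below is illustrative only: it assumes Parquet files whose schema matches the books table are stored under the same S3 path and end in .parquet:

copy into books
  from 's3://databend-bohu/data/'
  credentials=(aws_key_id='<your-access-key-id>' aws_secret_key='<your-secret-access-key>')
  -- match Parquet files instead of CSV; no delimiters are needed for Parquet
  pattern ='.*[.]parquet'
  file_format = (type = 'Parquet');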
tip
If the file(s) are large and you only want to check whether the file format parses correctly, you can use the SIZE_LIMIT option:
copy into books
  from 's3://databend-bohu/data/'
  credentials=(aws_key_id='<your-access-key-id>' aws_secret_key='<your-secret-access-key>')
  pattern ='.*[.]csv'
  file_format = (type = 'CSV' field_delimiter = ','  record_delimiter = '\n' skip_header = 0)
  size_limit = 1; -- only load 1 row
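Once the trial load confirms the format parses, you would typically clear the partially loaded data and run the full COPY again. A minimal sketch, assuming TRUNCATE TABLE is available in your Databend version:

-- remove the trial row(s) loaded with size_limit
truncate table books;

-- reload everything; depending on your version, you may need to add force = true
-- if COPY skips files it has already loaded
copy into books
  from 's3://databend-bohu/data/'
  credentials=(aws_key_id='<your-access-key-id>' aws_secret_key='<your-secret-access-key>')
  pattern ='.*[.]csv'
  file_format = (type = 'CSV' field_delimiter = ','  record_delimiter = '\n' skip_header = 0);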