List files and folders of AWS S3 bucket using prefix & delimiter

How to use the AWS SDK for Ruby to list the files and folders of an S3 bucket using the prefix and delimiter options. We cover the basics of S3 and the options the Ruby SDK provides to search for files and folders.

Posted by Ameena on 01 Feb 2017

Amazon Simple Storage Service, also known as Amazon S3, is highly scalable, secure object storage in the cloud. It is used to store and retrieve any amount of data at any time, from anywhere on the web. Amazon S3 is mainly used for backups, fast retrieval, and cost reduction, since users pay only for the storage and bandwidth they use.

S3 terminologies

Object

> Every file that is stored in S3 is considered an object. Each Amazon S3 object has file content, a key (file name with path), and metadata.

Bucket

> Buckets are collections of objects (files). Each bucket can have its own configuration and permissions.

Methods required for listing

1) new()

The Aws::S3::Resource class provides a resource-oriented interface for Amazon S3. new() creates an S3 resource object; here it is given the region and the security credentials as arguments.
#aws_objects.rb
require 'aws-sdk-s3' # with version 2 of the SDK, require 'aws-sdk' instead

s3 = Aws::S3::Resource.new({
  region: ENV['AWS_REGION'],
  access_key_id: ENV['AWS_ACCESS_KEY_ID'],
  secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
})

2) bucket()

It returns the bucket whose name is passed as an argument.
#aws_objects.rb
...
s3.bucket("mycollection")

3) objects()

It returns all the objects of the specified bucket. The prefix and delimiter arguments of this method are used to filter the files and folders that are returned.
Prefix should be set to the value that the keys of the files or folders must begin with. Delimiter should be set if you want to leave out everything nested deeper than the prefix: keys that contain the delimiter after the prefix are skipped.
#aws_objects.rb
...
s3.bucket("mycollection").objects(prefix:'', delimiter: '')
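
To make the filtering rule concrete, here is a minimal local sketch in plain Ruby that imitates it on an in-memory array of keys. It does not call AWS. `SAMPLE_KEYS` and `list_keys` are hypothetical names used only for illustration, and the sketch assumes that only keys with no delimiter after the prefix are returned.

```ruby
# Local simulation of prefix/delimiter filtering (no AWS calls).
# SAMPLE_KEYS and list_keys are hypothetical names for illustration only.
SAMPLE_KEYS = [
  'audio/',
  'audio/2010/',
  'audio/2010/one.mp3',
  'audio/random.mp3',
  'random1.jpg'
].freeze

def list_keys(keys, prefix: '', delimiter: '')
  # Step 1: keep only the keys that begin with the prefix.
  matching = keys.select { |key| key.start_with?(prefix) }
  # An empty delimiter means no further filtering.
  return matching if delimiter.empty?

  # Step 2: drop keys that still contain the delimiter after the prefix.
  matching.reject { |key| key[prefix.length..-1].include?(delimiter) }
end

puts list_keys(SAMPLE_KEYS, prefix: 'audio/', delimiter: '/')
```

Thinking of the call as these two steps (match the prefix, then drop anything that still contains the delimiter) is usually enough to predict what a given combination of arguments will return.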

Examples

Assuming that the hierarchy is as below:

mycollection #bucket name

photos
  2017
    image1.jpg
    image2.jpg
  2016
    myphoto.jpg
    image1.jpg
  2010
    image1.jpg
photo
  2010
    image1.jpg
  2016
    image1.jpg
audio
  random.mp3
  2010
    one.mp3
  2016
    two.mp3
  jan
    2016
      two.mp3
      one.mp3
  feb
    2016
      three.mp3
random1.jpg
random2.mp3
random3.jpg
2016_random.jpg
2016_random2.jpg
2016_random1.mp3

Example 1 - List only the files in the bucket

Code

```ruby
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'', delimiter: '/').collect(&:key)
```

Output

The output will be all the files present at the first level of the bucket. As the prefix is empty, all files are considered. The delimiter is set to "/", which means only the keys that contain no "/" are fetched and any key that contains a "/" is ignored. Hence, the output will be

2016_random.jpg
2016_random1.mp3
2016_random2.jpg
random1.jpg
random2.mp3
random3.jpg

Example 2 - List all the files and folders in the bucket

Code

```ruby
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'', delimiter: '').collect(&:key)
```

Output

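The output will be all the files and folders present in the bucket. Both prefix and delimiter are empty, which means keys with any beginning are matched and there is no restriction on the path either. Folders show up as keys ending in "/" only where such placeholder objects exist in the bucket (for example, folders created through the S3 console). So, the output will be
2016_random.jpg
2016_random1.mp3
2016_random2.jpg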
audio/
audio/2010/
audio/2010/one.mp3
audio/2016/
audio/2016/two.mp3
audio/feb/
audio/feb/2016/
audio/feb/2016/three.mp3
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3
audio/random.mp3
photo/
photo/2010/
photo/2010/image1.jpg
photo/2016/
photo/2016/image1.jpg
photos/
photos/2010/
photos/2010/image1.jpg
photos/2016/
photos/2016/image1.jpg
photos/2016/myphoto.jpg
photos/2017/
photos/2017/image1.jpg
photos/2017/image2.jpg
random1.jpg
random2.mp3
random3.jpg
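
The order of the listing above is not arbitrary: S3 returns keys in UTF-8 byte order, so digits come before lowercase letters and "photo/" comes before "photos/". As a small sketch, Ruby's default string sort also compares bytes, so sorting a few of the keys locally reproduces that order. The `keys` array is only illustrative sample data taken from the example bucket.

```ruby
# A few keys from the example bucket, deliberately shuffled.
keys = ['random1.jpg', 'photos/', 'audio/random.mp3', '2016_random.jpg', 'photo/', 'audio/']

# String#<=> compares byte by byte, which matches S3's key ordering
# for plain ASCII keys like these.
sorted = keys.sort
puts sorted
```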

Example 3 - List all the contents of the folder "photos/2017" (specific folder) in the bucket

Code

```ruby
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'photos/2017/', delimiter: '').collect(&:key)
```

Output

Only the folder photos/2017/ and its two files are listed, because the prefix is set to "photos/2017/", which means: display only those files and folders whose keys begin with photos/2017/ and ignore the rest.

photos/2017/
photos/2017/image1.jpg
photos/2017/image2.jpg

Example 4 - List all the files and folders of the folder "audio" in the bucket

Code

```ruby
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'audio/', delimiter: '').collect(&:key)
```

Output

audio/
audio/2010/
audio/2010/one.mp3
audio/2016/
audio/2016/two.mp3
audio/feb/
audio/feb/2016/
audio/feb/2016/three.mp3
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3
audio/random.mp3
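
The listing above mixes files with folder keys (the ones ending in "/"). If only the files under audio/ are wanted, at any depth, one possible approach is to drop the keys that end with "/" after listing. Below is a minimal sketch on a plain array. The `keys` sample is a shortened copy of the output above; with the SDK, the same `reject` could be applied to the result of `collect(&:key)`.

```ruby
# Shortened copy of the listing above (illustrative sample data).
keys = [
  'audio/',
  'audio/2010/',
  'audio/2010/one.mp3',
  'audio/jan/2016/two.mp3',
  'audio/random.mp3'
]

# Folder keys end with "/"; everything else is a file.
files_only = keys.reject { |key| key.end_with?('/') }
puts files_only
```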

Example 5 - List only the files in the folder "audio" in the bucket

Code

```ruby
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'audio/', delimiter: '/').collect(&:key)
```

Output

At first, it selects those files and folders which begin with audio/ out of all the files present in the bucket. The result of this first step is

audio/
audio/2010/
audio/2010/one.mp3
audio/2016/
audio/2016/two.mp3
audio/feb/
audio/feb/2016/
audio/feb/2016/three.mp3
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3
audio/random.mp3

Next, out of all the above keys it keeps those which do not have a "/" after the prefix. There is only one such file, so the result is the folder key itself and that file

audio/
audio/random.mp3

Example 6 - List only the files in the folder "audio/jan" in the bucket

Code

```ruby
#aws_objects.rb
...
puts s3.bucket("mycollection").objects(prefix:'audio/jan/', delimiter: '/').collect(&:key)
```

Output

First, the keys which begin with "audio/jan/" are collected, which gives the result
audio/jan/
audio/jan/2016/
audio/jan/2016/one.mp3
audio/jan/2016/two.mp3

Next, it looks for any key with no "/" after the prefix. There is no such file, so it returns only the folder name

audio/jan/
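
Note that the subfolder audio/jan/2016/ does not appear in this result, because its key has a "/" after the prefix. S3 itself reports such grouped keys as common prefixes. In the Ruby SDK they are available through the lower-level client (the `common_prefixes` field of an `Aws::S3::Client#list_objects_v2` response) rather than through `objects()`. The sketch below only imitates that grouping locally in plain Ruby, without calling AWS; the `common_prefixes` method defined here is a hypothetical helper, not an SDK method.

```ruby
# Local imitation of how a delimiter groups keys into "common prefixes".
# common_prefixes is a hypothetical helper, not an SDK method.
def common_prefixes(keys, prefix:, delimiter:)
  keys.each_with_object([]) do |key, groups|
    next unless key.start_with?(prefix)

    rest = key[prefix.length..-1]
    cut = rest.index(delimiter)
    next if cut.nil? # no delimiter after the prefix: a plain key, not a group

    # Keep everything up to and including the first delimiter after the prefix.
    group = prefix + rest[0, cut + delimiter.length]
    groups << group unless groups.include?(group)
  end
end

# Illustrative sample data: the keys under audio/jan/ in the example bucket.
keys = [
  'audio/jan/',
  'audio/jan/2016/',
  'audio/jan/2016/one.mp3',
  'audio/jan/2016/two.mp3'
]

puts common_prefixes(keys, prefix: 'audio/jan/', delimiter: '/')
```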

I hope this blog helped you understand how to use the S3 prefix and delimiter options in Ruby.

Keep Coding !!!

Ameena

