---
title:
  "Efficient uploading and persistent storage of NeetoRecord videos using AWS S3"
description:
  "How to split a large recording file into parts and upload them
  independently to AWS S3 for persistent storage"
canonical_url: "https://www.bigbinary.com/blog/persistant-storage-for-recordings-in-s3-loom-alternative-part-2"
markdown_url: "https://www.bigbinary.com/blog/persistant-storage-for-recordings-in-s3-loom-alternative-part-2.md"
---

# Efficient uploading and persistent storage of NeetoRecord videos using AWS S3

How to split a large recording file into parts and upload them independently
to AWS S3 for persistent storage.

- Author: Unnikrishnan KP
- Published: March 16, 2024
- Categories: NeetoRecord, s3

This is part 2 of our blog series on how we are building
[NeetoRecord](https://www.neeto.com/neetorecord), a Loom alternative. Here are
[part 1](https://www.bigbinary.com/blog/build-web-based-screen-recorder-loom-alternative-part-1)
and
[part 3](https://www.bigbinary.com/blog/mp4_transmuxing_and_streaming_support-loom-alternative-part-3).

In the previous post, we learned how to use the browser APIs to record the
screen and generate a WebM file. We now need to upload this file to persistent
storage so that we have a URL to share the recording with our audience.

Uploading a large file all at once is time-consuming and prone to failure due to
network errors. The recording is generated in chunks, each pushed to an array
and joined together at the end. So it would be ideal if we could upload these
smaller chunks as and when they are generated, and then join them together in
the backend once the recording is completed. AWS's
[Simple Storage Service (S3)](https://aws.amazon.com/s3/) is a good fit, as it
provides cheap persistent storage along with a
[Multipart Uploads](https://docs.aws.amazon.com/AmazonS3/latest/userguide/mpuoverview.html)
feature.

S3 Multipart Uploads allow us to upload large objects in parts. Rather than
uploading the entire object in a single operation, we break it down into
smaller parts. Each part can be between 5 MiB and 5 GiB (only the last part may
be smaller), and a single upload can have at most 10,000 parts. Once uploaded,
these parts are assembled to form the complete object.
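To make these limits concrete, here is a minimal Ruby sketch that works out the part sizes for a recording of a given size. The `plan_parts` helper and its constants are hypothetical names introduced for illustration. The limits it assumes are a 5 MiB minimum for every part except the last and at most 10,000 parts per upload.

```ruby
MIN_PART_SIZE = 5 * 1024 * 1024 # 5 MiB: minimum for every part except the last
MAX_PARTS     = 10_000          # maximum number of parts in one upload

# Returns the byte size of each part for an object of total_bytes.
# Every part is part_size bytes, except possibly a smaller last part.
def plan_parts(total_bytes, part_size = MIN_PART_SIZE)
  raise ArgumentError, "part_size is below the minimum" if part_size < MIN_PART_SIZE

  full_parts, remainder = total_bytes.divmod(part_size)
  sizes = Array.new(full_parts, part_size)
  sizes << remainder if remainder.positive?
  raise ArgumentError, "too many parts, use a larger part_size" if sizes.size > MAX_PARTS

  sizes
end
```

For a 12 MiB recording, `plan_parts(12 * 1024 * 1024)` gives two 5 MiB parts followed by a 2 MiB final part.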

## Initialization

The process begins with an initiation request to S3, where a unique upload ID is
generated. This upload ID is used to identify and manage the individual parts of
the upload.

```
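require "aws-sdk-s3"

# Region and credentials are picked up from the environment or AWS config
s3 = Aws::S3::Client.new

# bucket_name and object_key are placeholders for the target bucket and key
resp = s3.create_multipart_upload({
  bucket: bucket_name,
  key: object_key
})

# Every later request for this upload must carry this ID
upload_id = resp.upload_id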
```

## Upload Parts

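Once the upload is initiated, we can upload the parts to S3 independently. Each
part is identified by a part number, which fixes its position in the final
object. For every uploaded part, S3 returns an ETag (Entity Tag), a checksum of
the part's data, which we need to keep for the completion step.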

Note that the minimum content size for a part is 5 MB (there is no minimum size
limit on the last part of a multipart upload). So we keep the recording chunks
in local storage until they add up to at least 5 MB. Once we have that much
data, we upload it to S3 as a single part.
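The buffering described above can be sketched as a small accumulator. This is only an illustration in the same Ruby style as the other snippets, not NeetoRecord's actual code; `PartBuffer`, `push`, and `flush` are hypothetical names, and chunks are assumed to arrive as strings of bytes.

```ruby
# Accumulates small recording chunks until they form a part that is
# large enough to upload.
class PartBuffer
  MIN_PART_SIZE = 5 * 1024 * 1024 # 5 MiB

  def initialize(min_size = MIN_PART_SIZE)
    @min_size = min_size
    @buffer = String.new(encoding: Encoding::BINARY)
  end

  # Appends a chunk. Returns the buffered bytes as one part once the
  # buffer has reached min_size, otherwise returns nil.
  def push(chunk)
    @buffer << chunk.b
    return nil if @buffer.bytesize < @min_size

    flush
  end

  # Returns whatever is buffered and resets the buffer, or nil if empty.
  # Call this when the recording stops to get the final (possibly small) part.
  def flush
    return nil if @buffer.empty?

    part = @buffer
    @buffer = String.new(encoding: Encoding::BINARY)
    part
  end
end
```

Each non-nil value returned by `push` is ready to be sent as the next part, and a final `flush` when the recording stops yields the last part, which is allowed to be smaller than 5 MB.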

```
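# Part numbers start at 1 and increase by one for each part
part_number = 1
# Assumption: recorded_chunks is an array of binary strings totalling at least 5 MB
content = recorded_chunks.join

resp = s3.upload_part({
  body: content,
  bucket: bucket_name,
  key: object_key,
  upload_id: upload_id,
  part_number: part_number
})

# Keep the ETag; it is required when completing the upload
puts "ETag for Part #{part_number}: #{resp.etag}"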
```
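Since the completion step needs every part number together with its ETag, it helps to collect them while uploading. The following is a minimal sketch under stated assumptions: `upload_all_parts` is a hypothetical helper, `client` is anything that responds to `upload_part` the way `Aws::S3::Client` does, and `parts` is an array of part bodies already in order.

```ruby
# Uploads each part in order and returns an array of
# { part_number:, etag: } hashes for the completion request.
def upload_all_parts(client, bucket:, key:, upload_id:, parts:)
  parts.each_with_index.map do |body, index|
    part_number = index + 1 # part numbers start at 1
    resp = client.upload_part(
      body: body,
      bucket: bucket,
      key: key,
      upload_id: upload_id,
      part_number: part_number
    )
    { part_number: part_number, etag: resp.etag }
  end
end
```

The returned array can be passed as `multipart_upload: { parts: ... }` when completing the upload.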

## Completion

Once all parts are uploaded, a complete multipart upload request is sent to S3,
specifying the upload ID and the list of uploaded parts along with their ETags
and sequence numbers. S3 then assembles the parts into a single object and
finalizes the upload.

```
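# One entry per uploaded part, in ascending part_number order
completed_parts = [
  { part_number: 1, etag: 'etag_of_part_1' },
  { part_number: 2, etag: 'etag_of_part_2' },
  # ... one entry for each remaining part, up to:
  # { part_number: N, etag: 'etag_of_part_N' }
]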

resp = s3.complete_multipart_upload({
  bucket: bucket_name,
  key: object_key,
  upload_id: upload_id,
  multipart_upload: {
    parts: completed_parts
  }
})
```

## Aborting and Cancelling

At any point before the upload is completed, we can abort it. Aborting deletes
all the parts already uploaded under that upload ID and frees the storage they
occupy.

```
s3.abort_multipart_upload({
  bucket: bucket_name,
  key: object_key,
  upload_id: upload_id
})
```
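One way to make sure a failed upload never leaves stray parts behind is to wrap the whole flow so that any error triggers an abort. This is a sketch rather than a definitive implementation: `with_multipart_upload` is a hypothetical helper, and `client` is assumed to respond to the same three calls used in this post.

```ruby
# Starts a multipart upload, yields the upload ID to the block, and
# completes the upload with the parts list the block returns.
# If anything raises, the upload is aborted and the error is re-raised.
def with_multipart_upload(client, bucket:, key:)
  upload_id = client.create_multipart_upload(bucket: bucket, key: key).upload_id
  begin
    parts = yield upload_id
    client.complete_multipart_upload(
      bucket: bucket,
      key: key,
      upload_id: upload_id,
      multipart_upload: { parts: parts }
    )
  rescue StandardError
    # Discard the parts uploaded so far, then surface the original error
    client.abort_multipart_upload(bucket: bucket, key: key, upload_id: upload_id)
    raise
  end
end
```

The block is where the parts get uploaded; it only has to return the list of part numbers and ETags.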

After a successful completion request, the uploaded file is available at
`s3://bucket_name/object_key`.

S3 Multipart Uploads offer us several advantages:

### Fault tolerance

If the network fails or the upload is interrupted, we can resume from where we
left off: only the parts that did not finish need to be sent again. Also,
uploading large objects in smaller parts reduces the likelihood of timeouts and
connection failures, especially in high-latency or unreliable network
environments.
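Because each part is uploaded on its own, a failed part can simply be retried. Below is a minimal sketch of retrying with exponential backoff. `with_retries` is a hypothetical helper, and rescuing `StandardError` is a simplification; real code would likely rescue only network-related errors.

```ruby
# Runs the block, retrying on failure up to max_attempts times in total.
# Waits base_delay, then 2x, then 4x ... seconds between attempts.
def with_retries(max_attempts: 3, base_delay: 0.5)
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts

    sleep(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage (s3 and the upload parameters are as in the snippets above):
#   resp = with_retries { s3.upload_part(body: content, bucket: bucket_name,
#                                        key: object_key, upload_id: upload_id,
#                                        part_number: part_number) }
```

Only the part that failed is sent again; parts that already succeeded keep their ETags.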

### Upload speed optimization

With multipart uploads, you can parallelize the process by uploading multiple
parts concurrently, optimizing transfer speeds and reducing overall upload time.
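As an illustration of concurrent uploads, here is a sketch that sends parts from a small pool of threads. It assumes all part bodies are already available, that `client` behaves like `Aws::S3::Client` and can be shared across threads, and `upload_parts_concurrently` is a hypothetical name.

```ruby
# Uploads parts using `threads` worker threads and returns the
# { part_number:, etag: } list sorted by part number.
def upload_parts_concurrently(client, bucket:, key:, upload_id:, parts:, threads: 4)
  jobs = Queue.new
  parts.each_with_index { |body, index| jobs << [index + 1, body] }
  jobs.close # workers get nil from pop once the queue is drained

  results = Queue.new
  workers = Array.new(threads) do
    Thread.new do
      while (job = jobs.pop)
        part_number, body = job
        resp = client.upload_part(
          body: body, bucket: bucket, key: key,
          upload_id: upload_id, part_number: part_number
        )
        results << { part_number: part_number, etag: resp.etag }
      end
    end
  end
  workers.each(&:join) # join re-raises any error raised inside a worker

  # Parts may finish in any order; the completion request needs them
  # in ascending part_number order
  Array.new(results.size) { results.pop }.sort_by { |part| part[:part_number] }
end
```

Parts can finish out of order, which is why the result is sorted before it is used to complete the upload.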

