If you are building an application that needs to store customers’ data in the cloud, you have a few options to think about.
In this post I will be comparing two common options we analyzed in the past and list their pros and cons based on our trials. I will be using AWS S3 and DigitalOcean Spaces as reference.
👉 The options we compared were:
A. Separate buckets per customer, where each bucket holds a single customer’s data.
B. Separate customer folders in one bucket (recommended most of the time).
A. Separate buckets
👍 The pros of using separate buckets:
- More organization, security, and data separation, since each customer’s data lives in its own bucket.
- The ability to allow customers to access their data by giving them access to their own buckets.
- Easily block access to, delete, or measure the size of a customer’s bucket.
👎 The cons of using separate buckets:
- If your app relies on a naming convention for customer buckets and a bucket name is taken or unavailable (bucket names are globally unique on S3), you have to break the naming convention.
- Backing up customer data becomes extremely complex (you may have to back up hundreds or thousands of buckets).
- Migrating to another storage provider is almost impossible (moving hundreds or thousands of customers’ buckets from S3 to DigitalOcean Spaces, for example).
- High overhead if you need to make global changes to bucket settings (for example, changing the storage class of every bucket).
B. Separate folders in one bucket
👍 The pros of using separate folders all in one bucket:
- Very easy to manage the bucket settings for all customers, no overhead.
- Easy to migrate data and easy to back up the whole bucket into another bucket or even back it up on another storage provider (back up an S3 bucket into DigitalOcean Spaces).
- You can fully control the naming convention of customer folders, no worries about unavailable names (bucket-name/customer-01, bucket-name/customer-02).
👎 The cons of using separate folders all in one bucket:
- You need to rely on the app layer to block access or measure the size of a customer’s folder. It can be done, but not as easily as with option ‘A’.
- Full access to the bucket means full access to all customers’ folders.
- On AWS S3, you can create credentials that give access to a particular folder in a bucket, making it function as a virtual bucket (a pro for option B if you are using S3).
- On DigitalOcean Spaces, at the time of writing, any credentials you create give access to all Spaces buckets and all their folders (a con for option A as well if you are using DigitalOcean).
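To make the S3 "virtual bucket" idea concrete, here is a minimal sketch of how a folder-scoped IAM policy document could be built. The bucket name `bucket-name` and the prefix `customer-01` are just the illustrative names from the naming-convention example above, not real resources:

```python
import json

def customer_folder_policy(bucket: str, customer_prefix: str) -> dict:
    """Build an IAM policy document that limits access to one
    customer's folder (prefix) inside a shared bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow listing only the keys under the customer's prefix.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{customer_prefix}/*"]}},
            },
            {
                # Allow reading/writing objects only under that prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{customer_prefix}/*",
            },
        ],
    }

policy = customer_folder_policy("bucket-name", "customer-01")
print(json.dumps(policy, indent=2))
```

Attach a policy like this to an IAM user or role and the shared bucket behaves, for that customer, like a bucket of their own.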
Let me know if that’s helpful and what you end up doing!
Just like you diversify income sources, diversify your cloud infrastructure providers!
We are always taught that we “should not put all our eggs in one basket” (🥚🥚🥚 + 🧺 = ☠) to avoid the risk of losing everything at once. This can be applied to many verticals, whether it’s applied to skills, investing, income sources, etc… We always need to diversify.
Should we also apply the rule and diversify when it comes to cloud providers? Let’s find out.
Last December, on Christmas Day (disasters tend to happen on holidays for some reason 🧐, when everyone is off), one of our clients had their AWS account suspended. Because of the suspension (and we had no idea why it happened in the first place), their production servers, databases and storage stopped completely. Connections to the servers and databases were timing out; nothing could be reached.
They were using the compute service (EC2) for multiple load-balanced servers and a central caching server, the relational database service (RDS) as a central database serving all applications, and the storage service (S3) as a CDN plus an object store for everything else. Luckily the DNS was not managed by Route53 — so that gave some hope of restoring backups on another cloud until the issue was resolved…
We wanted to dig deeper into the AWS account suspension issue to see why it happened and whether it was possible to resolve it and get everything up and running quickly. While checking the account billing (since that’s the only thing you can do for a suspended AWS account), we noticed heavy usage of massively large Windows instances that had incurred tremendous charges we knew nothing about.
The server instances that we saw on the bill were the most powerful ones to date (Windows running on p3dn.24xlarge) — these were actually just unveiled by Amazon the same month:
“p3dn.24xlarge has 2.5 GHz (base) and 3.1 GHz (sustained all-core turbo) Intel Xeon P-8175M processors and supports Intel AVX-512.”
Amazon states the following use cases for these machines:
“Machine/Deep learning, high performance computing, computational fluid dynamics, computational finance, seismic analysis, speech recognition, autonomous vehicles, drug discovery.”
The mentioned instances ran for a couple of days on the client’s AWS account before the suspension. What the client knows for sure is that neither they nor anyone with authorized access to the account launched these instances, which leaves two possible scenarios:
- The AWS account was hacked and someone created these server instances
- It could be (though unlikely) a billing error where AWS mistakenly added these charges to the bill
Unfortunately, solving the problem was taking some time, so it made sense to pursue more than one course of action at the same time.
Temporary Solution / Hope
The client has always had regular file and database backups 👍 taken hourly, daily and weekly. We concluded that it was time to temporarily deploy all servers and databases on another cloud provider from the most recent backups.
It all looked good until we realized that all backups were stored on Amazon S3 😱. That was the exact moment the last hope vanished: we could not even restore the backups, because S3 was part of the suspended account. We learned the hard way that we should apply this saying:
“Don’t put all your eggs in one basket”
Having regular backups is simply not enough; you are still not safe!
You need regular backups stored in more than one cloud basket, since a single cloud account can simply disappear for one reason or another.
Looking at some of the possible options to see if they were going to be sufficient for a quick recovery:
- ❌ Raw backups stored on a single provider (S3) only — insufficient if the account is suspended
- ❌ Full server snapshots/images at your cloud provider — insufficient if the account is suspended
- ✅ Parallel cloud (servers, database, CDN, storage) running on another provider in addition to AWS (or on stand-by) — more expensive and higher overhead, but mostly sufficient
- ✅ Raw backups stored on multiple storage providers, say S3 and another storage (Google Cloud Storage, DigitalOcean Spaces, etc…) — sufficient in restoring application files and databases in case one account is suspended
- Expect the worst — be prepared
- Enable two-factor authentication for all your accounts
- Deploy your web infrastructure on more than one cloud if possible
- Always have regular backups of your application files, databases and static resources (assets, user content and uploads) stored on more than one storage provider — or at least on a different provider than your cloud infrastructure
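The "more than one storage provider" rule above can be sketched in a few lines. This is a hypothetical skeleton, not a real backup tool: each provider is represented by an upload callable (in practice that would wrap an S3 or Spaces client), and one provider failing must not stop the others:

```python
from typing import Callable, Dict

def replicate_backup(path: str, uploaders: Dict[str, Callable[[str], None]]) -> Dict[str, bool]:
    """Try to push the same backup file to every configured provider.
    Returns a per-provider success map; a failure at one provider does
    not stop the others (that is the whole point of diversifying)."""
    results = {}
    for name, upload in uploaders.items():
        try:
            upload(path)
            results[name] = True
        except Exception:
            results[name] = False
    return results

# Fake uploaders standing in for real S3 / Spaces clients:
def s3_upload(path):      # pretend this basket just vanished
    raise RuntimeError("account suspended")

def spaces_upload(path):  # the second basket still works
    pass

status = replicate_backup("db-backup.sql.gz", {"s3": s3_upload, "spaces": spaces_upload})
print(status)  # {'s3': False, 'spaces': True}
```

Even in the worst case above, one copy of the backup survives, which is exactly what was missing in the Christmas Day story.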
If setting up backups for every project is too much manual work (and indeed it can be), then give SimpleBackups a try.
SimpleBackups makes it a breeze to schedule automated backups of all your website files and databases in a simple dashboard. You will get alerts if any of your backups fail and you can store your backups on different storage providers like AWS S3, Google Cloud Storage, DigitalOcean Spaces and more.
In this short piece I am going to highlight some of the reasons why backups may fail.
While many of the reasons below are common and apply to a wide range of different backup methods, I am specifically assuming you are using a backup service like SimpleBackups.io to back up your servers and databases.
- Not enough disk space
- Server runs out of memory
- Server has been placed behind a firewall and cannot be accessed
- The backup is taking too long to be created and eventually times out
- Trying to back up a non-existent directory or one that has been deleted
- Trying to back up a directory which you don’t have permissions to read
- Invalid/changed server credentials (host, port, username, ssh key, or password)
- A problem uploading backup to remote storage
- Invalid/changed storage credentials (key, secret, region, or bucket)
- Trying to back up an empty database
- Invalid/changed database credentials (db host, db port, db username, db password, or db name)
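Several of the reasons above (missing directory, unreadable directory, low disk space) can be caught before a backup even starts. Here is a minimal pre-flight sketch; the path and the 1 GiB threshold are illustrative assumptions:

```python
import os
import shutil

def preflight_checks(backup_dir: str, min_free_bytes: int = 1 << 30) -> list:
    """Return a list of problems that would make a backup fail:
    missing directory, no read permission, or low disk space."""
    problems = []
    if not os.path.isdir(backup_dir):
        problems.append("directory does not exist")
        return problems  # the remaining checks need the directory
    if not os.access(backup_dir, os.R_OK):
        problems.append("directory is not readable")
    free = shutil.disk_usage(backup_dir).free
    if free < min_free_bytes:
        problems.append(f"only {free} bytes free, need {min_free_bytes}")
    return problems

print(preflight_checks("/path/that/does/not/exist"))  # ['directory does not exist']
```

Checks like these won’t catch everything (timeouts, changed credentials, or upload failures only surface at run time), but they turn the most common failures into clear, early errors.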