How can Google Workspace backups be optimized for large organizations?
Large organizations may find that it can take a considerable amount of time for CubeBackup to complete the initial backup for all users and data. This is especially true for organizations with thousands of users and a large number of Drive files or Gmail messages in their G Suite domain.
The first backup must download all data from your G Suite domain, but since CubeBackup employs incremental backups, subsequent backups will only download new or modified data. Thus, while it may take a long time to finish the first backup if there are many terabytes of data in your domain, later backups will complete much more quickly.
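To get a rough sense of the difference between the first backup and later incremental ones, a back-of-the-envelope calculation can help. The sketch below is only illustrative: the function name, the 20 TB domain size, the 1% daily change rate, and the 200 Mbps sustained throughput are all assumed figures, not CubeBackup measurements, and real throughput also depends on factors such as API limits and disk speed.

```python
def estimate_backup_hours(data_gb: float, throughput_mbps: float) -> float:
    """Hours needed to transfer data_gb gigabytes at throughput_mbps megabits/s.

    Uses decimal units: 1 GB = 8,000 megabits.
    """
    if throughput_mbps <= 0:
        raise ValueError("throughput_mbps must be positive")
    megabits = data_gb * 8 * 1000
    return megabits / throughput_mbps / 3600


if __name__ == "__main__":
    domain_gb = 20_000          # assumed: 20 TB of data in the domain
    daily_change_ratio = 0.01   # assumed: 1% of data is new or modified per day
    throughput = 200            # assumed: 200 Mbps sustained throughput

    first = estimate_backup_hours(domain_gb, throughput)
    later = estimate_backup_hours(domain_gb * daily_change_ratio, throughput)
    print(f"Initial backup:     about {first:.1f} hours")
    print(f"Incremental backup: about {later:.1f} hours")
```

Under these assumed numbers, the initial backup takes on the order of days, while each incremental backup fits within a few hours.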
Best Practices for Large Organizations:
1. Store the data index on a local SSD.
The data index acts as a cache to speed up the backup process. Placing the data index on a fast local disk can greatly improve backup speed. If you have put the data index on an HDD or on network storage, please consider moving it to a local SSD.
If CubeBackup runs on an AWS EC2 VM, consider using a Provisioned IOPS SSD (io1) volume to store the data index.
2. Make sure your backup server has at least 8GB of memory.
CubeBackup runs backup jobs in parallel. Usually more than 10 backup threads are running simultaneously, so CubeBackup can consume quite a bit of memory. Although 4GB is often sufficient for small organizations, 8GB or more is strongly recommended for large organizations.
3. Run CubeBackup on an AWS EC2 VM and back up G Suite data to an AWS S3 bucket.
Local storage can be fast, but for large organizations, CubeBackup runs even faster on an AWS EC2 instance that backs up data directly to an AWS S3 bucket. AWS S3 simply has greater throughput capability than most local storage. The network bandwidth on AWS EC2 is also usually much higher than that of your local network, and you don’t need to worry about backup jobs consuming the bandwidth of your office network.
4. Ensure that the “Backup files shared with me” option is not enabled.
Enabling this option in CubeBackup may result in many duplicate files in the backup data. When this option is enabled, any file shared among users in the domain will be duplicated and stored separately for each user. This problem is greatly magnified in large organizations with hundreds or thousands of users.
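The effect of this duplication on storage can be estimated with simple arithmetic. The following is a minimal sketch under stated assumptions: the function name and the example figures (a 2 GB file shared with 500 users) are hypothetical, and it assumes each recipient's copy is stored at full size in addition to the owner's copy.

```python
from typing import List, Tuple


def total_stored_gb(files: List[Tuple[float, int]],
                    backup_shared_with_me: bool) -> float:
    """Estimate backup size for a list of (size_gb, number_of_recipients) files.

    Assumption: with the option enabled, each file is stored once for its
    owner plus once for every user it is shared with.
    """
    total = 0.0
    for size_gb, recipients in files:
        copies = 1 + recipients if backup_shared_with_me else 1
        total += size_gb * copies
    return total


if __name__ == "__main__":
    files = [(2.0, 500)]  # assumed: one 2 GB file shared with 500 users
    print("Option disabled:", total_stored_gb(files, False), "GB")
    print("Option enabled: ", total_stored_gb(files, True), "GB")
```

Under this model, a single widely shared file can occupy hundreds of times its original size in the backup.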
5. Consider running multiple CubeBackup instances on separate servers/VMs.
If you have more than two thousand users in your G Suite domain, please consider splitting these users across several CubeBackup instances to speed up the backup process. CubeBackup supports selecting users based on their OUs, so you can easily distribute the users in your domain across different servers.
For example, if you have 3,000 users, you can:
- Run CubeBackup on 2 AWS EC2 instances, each of which backs up 1,500 users. The backup data from both EC2 VMs can be stored in the same S3 bucket (or split into two buckets, if you prefer).
- Or, run CubeBackup on 3 local servers/VMs, each of which backs up 1,000 users. Each server should point to different local disks or different NAS partitions.
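When planning such a split by OU, it can help to balance the user counts before assigning OUs to each instance. The sketch below is one possible planning aid, not part of CubeBackup: the function name, the OU paths, and the user counts are hypothetical, and the balancing uses a simple greedy heuristic (largest OU first, assigned to the least-loaded instance). The resulting OU groups would still need to be selected manually in each CubeBackup instance.

```python
import heapq
from typing import Dict, List


def split_ous(ou_user_counts: Dict[str, int], instances: int) -> List[List[str]]:
    """Assign OUs to backup instances so that user counts are roughly balanced.

    Greedy heuristic: take OUs from largest to smallest and give each one to
    the instance that currently has the fewest users.
    """
    if instances < 1:
        raise ValueError("instances must be at least 1")
    # Min-heap of (current user load, instance index).
    heap = [(0, i) for i in range(instances)]
    heapq.heapify(heap)
    groups: List[List[str]] = [[] for _ in range(instances)]
    ordered = sorted(ou_user_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for ou, count in ordered:
        load, idx = heapq.heappop(heap)
        groups[idx].append(ou)
        heapq.heappush(heap, (load + count, idx))
    return groups


if __name__ == "__main__":
    # Hypothetical OUs totalling 3,000 users.
    ous = {"/Engineering": 1100, "/Sales": 900, "/Support": 600, "/Finance": 400}
    for i, group in enumerate(split_ous(ous, 2), start=1):
        users = sum(ous[ou] for ou in group)
        print(f"Instance {i}: {group} ({users} users)")
```

With these example figures, each of the two instances ends up with 1,500 users. A greedy split is not guaranteed to be optimal, but it is usually close enough for capacity planning.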
Last but not least, do not hesitate to contact us (firstname.lastname@example.org) if you need help.