{"id":126,"date":"2014-11-20T03:00:13","date_gmt":"2014-11-20T03:00:13","guid":{"rendered":"http:\/\/www.cubebackup.com\/blog\/?p=126"},"modified":"2014-12-05T12:00:56","modified_gmt":"2014-12-05T12:00:56","slug":"automatic-backup-linux-using-rsync-crontab","status":"publish","type":"post","link":"https:\/\/www.cubebackup.com\/blog\/automatic-backup-linux-using-rsync-crontab\/","title":{"rendered":"Automatic backup plan for Linux servers using rsync and crontab"},"content":{"rendered":"<p>Linux servers are widely used by companies for hosting websites, databases, and other services. I personally like Linux much more than Windows Server, not only because Linux is free, but also because it offers better performance and more powerful tools. To set up a Linux server, you don&#8217;t necessarily need a physical machine; subscribing to a VPS from Linode, DigitalOcean, or Amazon can be a better choice. Though most cloud server providers offer backup or snapshot services (usually not free), it&#8217;s better to have your own backup plan.<\/p>\n<p>The <strong>rsync<\/strong> command is an ideal tool for copying and synchronizing files and directories to a remote computer, while the <strong>crontab<\/strong> command is used to schedule jobs to be executed periodically. Combining these two commands, we can set up a lightweight and effective backup solution.<\/p>\n<p>Because Linux distributions differ, this article uses a CentOS\/Red Hat system as the example for setting up the backup plan. Before that, however, here are several questions to think about:<\/p>\n<p><strong>First, what data should be backed up?<\/strong><\/p>\n<p>Generally, we only need to back up the data that is important to us, such as website pages, databases, configuration files, and personal data. 
\u00a0It is generally not necessary to back up Linux system files or installed software.<\/p>\n<p>Here are several directories that need to be taken care of:<\/p>\n<ul>\n<li>\/etc directory: Though some files in this directory don&#8217;t need to be backed up, I wouldn&#8217;t bother to pick them out. Since the total size of this directory is usually no more than about 50 megabytes, it does not hurt to back up the whole directory.<\/li>\n<li>\/home directory: This is the location for the personal data of all user accounts (except root), so the backup plan should obviously cover it. However, there is a problem: this directory also holds lots of cache data, log files, downloaded software, and history records, which are pointless to back up. Rather than backing up the whole \/home directory, it is better to put only specific sub-directories, such as \/home\/someone\/gitdata or \/home\/anotherone\/documents, into your backup list.<\/li>\n<li>\/var\/www directory: This is the default directory for website files. (If your web files are located in other directories, find them and add them to your backup list.)<\/li>\n<li>\/var\/spool\/mail: This is where the mail data is located, and it definitely should be backed up.<\/li>\n<li>\/var\/lib\/mysql: This is the directory that holds the database data. Include this directory in your backup list.<\/li>\n<\/ul>\n<p>You may have other utilities or service data scattered in other directories that need to be backed up; think carefully and find them before you take action.<\/p>\n<p><strong>Second, full backup or incremental backup?<\/strong><\/p>\n<p>If you want to make a full backup of all the data listed above every time, you can write a bash script that archives all the files with the Linux <strong>tar<\/strong> command and then sends (<strong>scp<\/strong>) the tarball to the backup location. 
This method works well for backups within a local area network, but it may not be feasible for backing up data from a remote VPS to a local computer, because hundreds or even thousands of megabytes are transferred between the two computers each time. It is a waste of bandwidth and disk storage.<\/p>\n<p>Incremental backup, which employs the rsync utility in Linux, transfers only the data modified since the last run. In most cases, this is the right choice because it saves time, bandwidth, and cost.<\/p>\n<p><strong>Where to store the backup data?<\/strong><\/p>\n<p>Generally, backup data should be stored on a remote computer, either another Linux VPS or a Linux computer inside your company.<\/p>\n<p><strong>How to schedule an automatic backup plan?<\/strong><\/p>\n<p>A cron job is the best choice for running a command periodically; for example, a backup script can be scheduled to run at midnight each day.<\/p>\n<h1 style=\"text-align: justify\">The following are the detailed instructions for making an automatic, incremental backup:<\/h1>\n<p><em>Host A: the active server, running CentOS\/Red Hat.<\/em><\/p>\n<p><em>Host B: the backup server, running CentOS\/Red Hat.<\/em><\/p>\n<p>I.) Make sure rsync is installed on both Host A and Host B. If it is not, install it with this command:<\/p>\n<pre><strong style=\"color: #000000;font-style: normal\">yum install rsync<\/strong><\/pre>\n<p>II.) Log in to Host B with the root account (editing \/etc\/crontab requires root permission, and the backup script in this article connects to Host A as root; a non-root setup is possible in theory, but it takes more work), and create a directory to hold the backup data:<\/p>\n<pre><strong style=\"color: #000000;font-style: normal\">mkdir -p \/var\/ServerBackup<\/strong><\/pre>\n<p>III.) 
Generate the SSH key pair on Host B (leave the passphrase empty so that the cron job can use the key unattended):<\/p>\n<pre><strong style=\"color: #000000;font-style: normal\">ssh-keygen<\/strong><\/pre>\n<p>Two files are generated in the \/root\/.ssh directory: id_rsa is the private key file, while id_rsa.pub is the public key file, which must be copied to Host A:<\/p>\n<pre><strong>scp \/root\/.ssh\/id_rsa.pub root@A.com:\/root\/id_rsa.pub<\/strong><\/pre>\n<p>IV.) Log in to Host A with the root account, and append the content of the id_rsa.pub file to the authorized_keys file. If the \/root\/.ssh\/authorized_keys file doesn&#8217;t exist on Host A, run the following commands to create it first:<\/p>\n<pre><strong><em>mkdir -p \/root\/.ssh<\/em><\/strong>\r\n<strong><em>chmod 700 \/root\/.ssh<\/em><\/strong>\r\n<strong><em>touch \/root\/.ssh\/authorized_keys<\/em><\/strong>\r\n<strong><em>chmod 600 \/root\/.ssh\/authorized_keys<\/em><\/strong><\/pre>\n<p>To append the public key generated on Host B to the authorized_keys file:<\/p>\n<pre><strong>cat \/root\/id_rsa.pub &gt;&gt; \/root\/.ssh\/authorized_keys<\/strong><\/pre>\n<p>Now Host B can use scp or rsync to pull data from Host A without being asked for a password.<\/p>\n<p>V.) On Host B, edit the \/etc\/crontab file to schedule the backup script. Add this line to the end of the file (it assumes the script is saved as \/root\/backup.sh; adjust the path if you keep it elsewhere):<\/p>\n<pre><strong>0 2 * * * root bash \/root\/backup.sh<\/strong>     # the script file backup.sh is scheduled to be executed every day at 2:00 AM.<\/pre>\n<p>The content of the backup.sh script is something like the following. Note that the -i option of ssh takes the private key file (id_rsa), not the public key:<\/p>\n<pre style=\"color: #000000\">#!\/bin\/sh\r\n\r\n<strong>\/usr\/bin\/rsync -avz -e \"ssh -i \/root\/.ssh\/id_rsa\" root@A.com:\/etc \/var\/ServerBackup\r\n\/usr\/bin\/rsync -avz -e \"ssh -i \/root\/.ssh\/id_rsa\" --exclude mysite\/updraft --exclude mysite\/.cache root@A.com:\/var\/www \/var\/ServerBackup\r\n........  (other similar commands)\r\n\/usr\/bin\/rsync -avz -e \"ssh -i \/root\/.ssh\/id_rsa\" root@A.com:\/var\/lib\/mysql \/var\/ServerBackup<\/strong><\/pre>\n<p>This script is just a sample; modify it to suit your needs. See the rsync man page (man rsync) for more options.<\/p>\n<p>Now you can rest easy without worrying about data loss.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Linux servers are widely used by companies for hosting websites, databases, and other services. I personally like Linux much more than Windows Server, not only because Linux is free, but also because it offers better performance and more powerful tools. To set up a Linux server, you don&#8217;t necessarily need [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"zakra_sidebar_layout":"customizer","zakra_remove_content_margin":false,"zakra_sidebar":"customizer","zakra_transparent_header":"customizer","zakra_logo":0,"zakra_main_header_style":"default","zakra_menu_item_color":"","zakra_menu_item_hover_color":"","zakra_menu_item_active_color":"","zakra_menu_active_style":"","zakra_page_header":true,"footnotes":""},"categories":[28],"tags":[7,8,9],"class_list":["post-126","post","type-post","status-publish","format-standard","hentry","category-backup","tag-backup","tag-linux","tag-rsync"],"_links":{"self":[{"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/posts\/126","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/comments?post=126"}],"version-his
tory":[{"count":22,"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/posts\/126\/revisions"}],"predecessor-version":[{"id":149,"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/posts\/126\/revisions\/149"}],"wp:attachment":[{"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/media?parent=126"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/categories?post=126"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.cubebackup.com\/blog\/wp-json\/wp\/v2\/tags?post=126"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}