storage

Configure an External RAID-1 USB Drive for Backups - Ubuntu 12.04

I have a WD USB 3.0 500GB hard drive for my work laptop whose sole purpose in life is to record back-ups of my Ubuntu 12.04 system.  I've never formatted the device and it's always worked very well for me despite my cavalier treatment: mostly tossing it around in my backpack during my commute. I use a program called back-in-time for my backups and I like it for its ease-of-use coupled with a healthy sense of fire-and-forget.

Recently, my back-ups started failing with generic write-fail errors so I committed to re-formatting the device.  Since I was going to nuke it, I decided to implement RAID-1 during its resurrection just to have an "extra" copy of my back-ups.  The partition I back-up is only 100GB -- which is currently about 55% full -- so splitting the device into two logical drives and configuring them under RAID shouldn't be a problem, right?

I mean, I have absolutely NO experience with RAID so how hard can this be?

After a few false starts, I honed down the process and this is what I'm sharing with you today.  Oh, and as a side note -- in the first paragraph, I linked to this particular drive on Amazon as a courtesy -- I'm not getting any referrals or anything like that...just thought you might want to see what we're dealing with.

Step 1 - Installation of Tools

Ok - so, to get started, I needed to install mdadm which was pretty easy:

[cc lang='bash' line_numbers='false']

# sudo apt-get install mdadm

[/cc]

What was somewhat confusing was that mdadm also wants to install a mail agent, which its monitoring tool uses to send alerts.  During the installation I elected not to configure the mail program and everything still went smoothly.

Next, I installed gparted and all of the available options through the Ubuntu Software Center app.

I already had back-in-time installed and I provided the link above.  Use the Ubuntu Software Center to install this app if you don't already have it.

With all the software in place, it's time to configure the device.

Step 2 - Wiping and Configuring the Device

Plug your device into your system and wait for it to automount -- which it will do if it still has the factory-equipped partition installed.  Once it's mounted, go to your command line and unmount the device:

[cc lang='bash' line_numbers='false']

# umount /dev/{your_device_partition_here}    # e.g. /dev/sdb1; the mount-point path works too

[/cc]

Either sudo this command, or sudo over to root.  I chose the latter as I get tired of forgetting to sudo.
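If you're not sure what got automounted where, here is a minimal sketch that lists the mounted partitions of a disk by reading /proc/mounts.  The function name mounts_for_disk and its optional second argument are made up for illustration, and /dev/sdb is only an assumption based on my system; yours may differ.

```shell
# List "device mount-point" pairs for every mounted partition of a disk.
# Usage: mounts_for_disk /dev/sdb [mounts_file]
# mounts_file defaults to /proc/mounts; the argument exists so the
# function can be tried against a sample file.
mounts_for_disk() {
    disk="$1"
    mounts_file="${2:-/proc/mounts}"
    # /proc/mounts fields: device, mount point, fs type, options, dump, pass.
    # index(...) == 1 keeps only devices whose name starts with the disk name.
    awk -v disk="$disk" 'index($1, disk) == 1 { print $1, $2 }' "$mounts_file"
}
```

Running mounts_for_disk /dev/sdb prints one pair per mounted partition; either value can be handed to umount.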

Once the device is unmounted, start up gparted, authenticate, and wait for gparted to stop scanning for devices.  Change your device over to your external drive which was, for me: /dev/sdb.

(Side note: The WD drive comes with its own software, which is just swell if you're a Windows or a Mac user.  Otherwise, the software is just taking up space.  Copy these programs if you want but you can always download replacements from the WD website.)

For me, the drive was formatted into one single, large, partition.  Delete this partition and don't look back.

Next, split the freed space into two smaller partitions, equally dividing the available space between the two.  RAID-1 is mirroring - you're creating two logical partitions on this device but you will mount them (the system will see them) as a single device.  Every write to the array goes to both partitions, so the second partition always holds a mirror of the first.

I chose the ext3 filesystem for my partition simply because:

  • it's better than ext2
  • it's robust
  • I don't believe I'll get any speed benefits from ext4 (USB, eh?)

Make a note of the names of the logical partitions (mine were /dev/sdb1 and /dev/sdb2) as you're going to need to know these later.

Note too that you're not actually doing anything at this point - you're just building a task-list for gparted to execute when you're finished creating tasks.

Next, flag each partition as a RAID partition by right-clicking -> Manage Flags -> Raid.

Once this is done, you're ready to execute your task list which will partition and flag the devices.  When the task list completes successfully, you can quit gparted.
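For the command-line inclined, roughly the same layout can be sketched with parted.  This is not the method used above, just a hedged equivalent: it assumes an msdos partition table and the /dev/sdb device name, and the helper plan_raid_partitions is hypothetical.  It only prints the commands, so nothing gets wiped by accident, and unlike the gparted steps it does not put a filesystem on the partitions.

```shell
# Print (do NOT run) parted commands that split a disk into two equal
# partitions and flag both for RAID.
# Usage: plan_raid_partitions /dev/sdb
plan_raid_partitions() {
    disk="$1"
    echo "parted -s $disk mklabel msdos"          # new, empty partition table
    echo "parted -s $disk mkpart primary 0% 50%"  # first half of the disk
    echo "parted -s $disk mkpart primary 50% 100%" # second half of the disk
    echo "parted -s $disk set 1 raid on"          # same as Manage Flags -> raid
    echo "parted -s $disk set 2 raid on"
}

plan_raid_partitions /dev/sdb
```

Review the printed commands carefully before running any of them as root; the first one destroys the existing partition table.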

Step 3 - mdadm

Next, you need to create the software RAID volume using the mdadm tool.

This is very easy and is done with a single command:

[cc lang='bash' line_numbers='false']

# mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdb2

[/cc]

What this command does is:

  • invokes mdadm with the --create option (useful for creating RAID arrays)
  • specifies the "verbose" flag so you can get meaningful diagnostics should something head south
  • specifies the device node (/dev/md0) for your RAID array
  • specifies your RAID level (--level=1) (remember, 1 = mirroring)
  • specifies the number of devices in your RAID (--raid-devices=2)
  • lists the device links for the logical drives to be used in the RAID (/dev/sdb1, /dev/sdb2)

In the output below, pay attention to the RAID device, /dev/md0, because (a) I kept forgetting it was a required parameter to the command and (b) I am going to need this device name for the next command...
[cc lang='bash' line_numbers='false']

# mdadm --create --verbose /dev/md0 --level=1 --raid-devices=2 /dev/sdb1 /dev/sdb2
mdadm: /dev/sdb1 appears to contain an ext2fs file system
    size=244174848K  mtime=Wed Dec 31 16:00:00 1969
mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
mdadm: /dev/sdb2 appears to contain an ext2fs file system
    size=244177920K  mtime=Wed Dec 31 16:00:00 1969
mdadm: size set to 244173688K
Continue creating array? yes
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.

[/cc]

Press enter to execute the command.  Because both partitions already carry a filesystem from gparted, mdadm asks "Continue creating array?" -- answer yes and you should get the prompt back after a few seconds.

Step 4 - Making and Mounting

This is the step that most of the online and available pages on mdadm leave out.

You still need to format your filesystem!

So, let's do this with mkfs...

[cc lang='bash' line_numbers='false']

# mkfs -t ext3 /dev/md0
mke2fs 1.42 (29-Nov-2011)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
15261696 inodes, 61043422 blocks
3052171 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=4294967296
1863 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks:
    32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
    4096000, 7962624, 11239424, 20480000, 23887872

Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

[/cc]

Note that I use the RAID device, /dev/md0, to create the filesystem -- not the individual partitions.

This will take a few minutes to run so go take a stretch break and play with the dog or annoy your hot receptionist or something.

Once this step completes, you're pretty much done -- all that's left is mounting the device and using it.

[cc lang='bash' line_numbers='false']

# mkdir -p /media/raid    # create the mount point if it doesn't exist yet
# mount /dev/md0 /media/raid

<do stuff like start a back-up>

# df -h
Filesystem                 Size  Used Avail Use% Mounted on
/dev/mapper/PAMCAKES-root   84G  9.4G   70G  12% /
udev                       3.9G  4.0K  3.9G   1% /dev
tmpfs                      1.6G  920K  1.6G   1% /run
none                       5.0M     0  5.0M   0% /run/lock
none                       3.9G   49M  3.9G   2% /run/shm
/dev/sda2                  242M  159M   71M  70% /boot
/dev/mapper/PAMCAKES-user   93G   48G   42G  54% /home
/dev/md0                   230G   22G  196G  11% /media/raid

[/cc]

Note that your available filesystem space is approximately one-half the capacity of the drive.

That's right - you paid a whopping $89 for a 500GB device out of which you can use about 230GB.

But you have two of them.  So if your first partition fails, you've got a back-up of your back-up thanks to the wondrous magic of RAID.  And, you learned how to set-up a RAID device, so, win.

Unless one of your kids (or grandkids) decides that your sleek, new, external device looks better in the fish tank than it does on your desk, you have an additional and available option for recovering lost data.

Keep in mind that this is a software solution -- hardware RAID is still the preferred way to go when dealing with issues of redundant data storage.  But this works, too.
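To see where the "about 230GB" comes from, the size mdadm reported when it created the array (244173688K) can be converted by hand.  A small sketch; kib_to_gib is just a name picked for this example, and it assumes the K in mdadm's output means KiB (1024 bytes).

```shell
# Convert a size in KiB (as printed by mdadm) to GiB with two decimals.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.2f\n", kib / (1024 * 1024) }'
}

kib_to_gib 244173688   # size of each mirror half; prints 232.86
```

The couple of GiB between that figure and the 230G shown by df goes to ext3's own bookkeeping (inode tables, journal and the like).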

Step 5 - Maintenance

To stop your RAID device, unmount it first and then use this command:

[cc lang='bash' line_numbers='false']

# umount /media/raid
# mdadm --stop /dev/md0

[/cc]

To see the state of your RAID, cat /proc/mdstat to your terminal:

[cc lang='bash' line_numbers='false']

# cat /proc/mdstat
Personalities : [raid0] [raid1]
md0 : active raid1 sdb2[1] sdb1[0]
      244173688 blocks super 1.2 [2/2] [UU]
      [>....................]  resync =  1.8% (4474560/244173688) finish=4948.7min speed=806K/sec

unused devices: <none>

[/cc]
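The resync line is the interesting part of that output: with finish=4948.7min the initial mirror was estimated at a little over three days.  If you only want the percentage, something like the sketch below pulls it out.  The function name resync_percent is hypothetical, and the pattern assumes the progress line looks like the one in the output above.

```shell
# Print the resync (or recovery) percentage from mdstat-style text.
# Prints nothing when no resync is in progress.
# Usage: resync_percent [mdstat_file]   (defaults to /proc/mdstat)
resync_percent() {
    sed -nE 's/.*(resync|recovery) *= *([0-9.]+)%.*/\2/p' "${1:-/proc/mdstat}"
}

# Example: resync_percent            # reads /proc/mdstat
```

For a live view, watch -n 60 cat /proc/mdstat does the job without any parsing.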

To get details about your RAID, use mdadm:

[cc lang='bash' line_numbers='false']

# mdadm --detail /dev/md0
/dev/md0:
        Version : 1.2
  Creation Time : Tue Jul 24 14:41:55 2012
     Raid Level : raid1
     Array Size : 244173688 (232.86 GiB 250.03 GB)
  Used Dev Size : 244173688 (232.86 GiB 250.03 GB)
   Raid Devices : 2
  Total Devices : 2
    Persistence : Superblock is persistent

    Update Time : Tue Jul 24 14:42:51 2012
          State : active, resyncing
 Active Devices : 2
Working Devices : 2
 Failed Devices : 0
  Spare Devices : 0

  Resync Status : 2% complete

           Name : pamcakes:0  (local to host pamcakes)
           UUID : b542c27c:1d620c3e:6f07b9e8:53aee08d
         Events : 1

    Number   Major   Minor   RaidDevice State
       0       8       17        0      active sync   /dev/sdb1
       1       8       18        1      active sync   /dev/sdb2

[/cc]

To uncouple the RAID device:

[cc lang='bash' line_numbers='false']

# mdadm --remove /dev/md0

[/cc]

That's it for today...hope this helps...there's a lot of good information already available on the web for using mdadm and software RAID - I just wanted to consolidate everything into a contiguous process.

Later, should sufficient motivation present itself, I'll follow-up with alerts and what not...

 

Online Cloud Storage...Which one?

In a recent article, I wrote about cloud storage to use for my source-code repository.  I chose ZumoDrive as the tool to implement this because it allowed filesystem level access to my files from the desktop.  Or, in other words, my desktop sees the ZumoDrive like another physical device attached to my computer. However, in actually using ZumoDrive, I noticed some ... features ... that I wasn't too pleased with.  Instead of capping on just ZumoDrive, I thought I'd offer a narrow perspective on the capabilities and ... features ... of some of the more popular online cloud-based storage options.

My selection process was based on simply whether or not I could use the storage from my Mac.  Let's get started...

The first system tested was Memopal -- this solution is available on all platforms (Windows, Linux, Mac, Android, iPhone) and offers 2GB of free storage.  It advertises itself as "online backup and storage" as it archives your files in real-time to their remote servers.

You can browse any of your files online, using a web-browser, and you can also share these files with other users.  Memopal allows transfers of files that are larger than 1GB, which makes it a handy way to get files to other users when they're too large to send as email attachments.

What's also amazing is that you can get 200GB for only $49 per year.

What's not so amazing is that it doesn't provide you with desktop-level access to your files, so you can't use the online storage as a real-time disk file system.  For my needs and purposes, I'm going to pass and un-install the product.

I have been a .mac (or .me) account owner since 2007 and have witnessed several upgrades to the service.  I was in the process of dumping my me.com account when Apple suddenly extended my account until June 30, 2012 in anticipation of the release of their iCloud offering.

The iDisk, as it's referred to via your mobile-me account, is slightly more than 15GB of online storage which used to cost you $99/year.  (You got other stuff besides the storage which supposedly made it a "deal", but Apple has been forced to re-price their offerings in order to remain strategically competitive with other cloud vendors.)  The iDisk is configurable from your System Preferences menu and, if you're one of the account holders at the time Apple froze the offering, you can no longer upgrade or increase your storage capacity.  What you had is what you have until the iCloud becomes available.

What's good about this storage is that it's accessible as a mount-point (file system access as a device drive) to your system, which means you can use it in Finder, or through any application, to access your files.  It's totally transparent as a remote device.  I also like that I have to manually mount the device to access it so there's never any background "sync" happening to slow my system down when I don't need it.

The downside is the bandwidth limitation of 200GB of data transfer per month.  If I'm doing a lot of development, I'd imagine I could hit that pretty quickly just checking-in, and then creating and modifying the existing code base.  So I've never tried to use my iDisk for anything other than storing static documents that I don't need clogging up my physical devices.

Because of the bandwidth limitation, Mobile-Me does not satisfy my requirements.

ZumoDrive is the software I initially chose as my cloud storage for source-code control.  ZumoDrive is also the reason why I am writing this article, wishing I had done my due diligence in evaluating the software before committing (svn pun) to it.

ZumoDrive offers you 1GB of free storage which is easily expandable to 2GB once you complete the "belts" in their "dojo".  Cute.  Basically a test-drive through the product, training in the Dojo advances you through the belts until you max out your training at black-belt and you've doubled your storage to 2GB.

ZumoDrive is software you download and install.  It's available on all desktop and mobile platforms.  ZumoDrive mounts on your desktop as a virtual drive, which meets our purposes of remote file-system storage.  Unfortunately, ZumoDrive caches a copy of your file(s) on your local drive and then updates the remote drive when not in use.  Like, when you're playing World-of-Warcraft and need the network bandwidth because, you know, it's not already laggy enough in capital cities.

If you've followed my previous tutorial and created a TrueCrypt container on your cloud drive, then the real downside of this system makes itself readily apparent.  Uploading a 500MB file to the server at (average speed) 77Kb/sec is going to take a LONG time.  Changing the cache options to minimize the amount of disk space used locally didn't impact this -- the software still sees a single, 500MB file.

While I love the concept of ZumoDrive -- MobileMe without the bandwidth constraints -- the local caching of the TrueCrypt volume murders the concept since it doesn't see the files within the TrueCrypt container.
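Just how LONG is that upload?  A back-of-the-envelope sketch, assuming the quoted rate means kilobytes per second and 1MB = 1024KB (if the client was actually reporting kilobits, multiply the result by eight).  The helper upload_time is a name invented for this example.

```shell
# Estimate how long it takes to move SIZE_MB megabytes at RATE_KB
# kilobytes per second.  Integer arithmetic; fractions of a second dropped.
# Usage: upload_time SIZE_MB RATE_KB
upload_time() {
    size_mb="$1"
    rate_kb="$2"
    secs=$(( size_mb * 1024 / rate_kb ))
    printf '%dh %dm %ds\n' $(( secs / 3600 )) $(( secs % 3600 / 60 )) $(( secs % 60 ))
}

upload_time 500 77   # prints 1h 50m 49s
```

And because the container is one opaque file, that is the cost of every sync, no matter how small the change inside it.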

SugarSync is an online cloud storage system that offers a 30-day free trial.  You can get 30GB of online cloud storage for $5/month or $50/year.  They have a 250GB plan for $250/year which is sort of funny when you look at what Memopal was charging: $50 for 200GB...

I'm not going to incur another monthly charge for online storage so, to be honest, I didn't even bother to download and install the product for evaluation.

TeamDrive offers 2GB of free cloud storage.  From reviewing the product on the website, I knew it wouldn't meet my requirements, but there were enough enticements to the feature-set that I went ahead and downloaded and installed the 100MB file anyway.  TeamDrive is accessed through a custom-application that's Finder-like in its UX.  You can also access the application by clicking on the relevant icon in the menubar.  TeamDrive offers collaboration and synchronization as its main features.

The UX is intuitive although Windows-like.  Since TeamDrive is primarily collaboration software, it keeps track of the users who are in your team.  Although I didn't think much of the product for what I need, I was encouraged to evaluate the offering because it had a feature I'd not seen before -- the ability to create and host your own TeamDrive server.

I've been using Dropbox for over a year now.  It's my primary means of transferring files between home and work.  I also like the fact that 1Password uses it automagically to synchronize itself.  Dropbox offers 2GB of free cloud storage that is accessible from pretty much every known appliance available on the market today.  Dropbox provides you with a file-system mount point so that you can access your files via Finder, making it perfect for what I need.

The only downside, for me, is that I depend on Dropbox for file storage for other things such that my available space is only about 500MB.  The upgrade costs are prohibitively expensive, especially when compared with other offerings in the industry.  Sorry, Dropbox, but $200/year for 100GB is not a value-add.

The final note, for Dropbox, is that it, too, views an encrypted file container as a single file.  Dropbox's upload/synchronization speed was even worse than ZumoDrive's at 55K/sec.
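To put the paid plans mentioned so far side by side, here is a quick sketch that turns the yearly prices quoted above into dollars per gigabyte per year.  The function name cost_per_gb and the three-column input format are assumptions made for this example; the prices are the ones from this article.

```shell
# Read "name yearly_price_usd size_gb" rows on stdin and print
# "name dollars_per_gb_per_year", cheapest first.
cost_per_gb() {
    awk '{ printf "%s %.3f\n", $1, $2 / $3 }' | sort -k2,2n
}

cost_per_gb <<'EOF'
Memopal 49 200
SugarSync 250 250
Dropbox 200 100
EOF
```

That works out to roughly a quarter per gigabyte for Memopal against a dollar for SugarSync and two dollars for Dropbox.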

I also have an Amazon Cloud account -- which is 20GB of free storage -- because I use Amazon's MP3 cloud on my Android device.  I mention it here because of the phenomenal amount of storage that you get for FREE.  I can only access my cloud storage through the web, which is ok since all I'm storing there are my MP3 files that I've downloaded (DRM Free - hear that, Apple?) from their music service. Unfortunately, I can't access the storage from my desktop...

Other Mentions:

I looked at these products but didn't bother with the evaluation since I could read, from the product descriptions, that they would not satisfy my requirements.

idrive.com - 5GB basic (free) solution limited to only back-up and recovery.

syncplicity.com - web-based back-up and recovery tool.  Free, but limited to 2-computer access and 2GB storage.

spideroak.com - pretty much web-based back-up and recovery, plus file sharing through the web UX.

Conclusions

What I want to do just isn't possible at the current time because of restrictions of my DSL and the way cloud services view a TrueCrypt container.  I've got a pretty good working overview of what's available and I explored a lot of solutions that were pretty damn exciting.  I also think that we're going to see the cloud marketplace evolve rapidly and those companies which are charging significant amounts per megabyte are going to have to rethink their pricing strategies or risk becoming fossil fuel.

I'm also excited by what Apple will bring to the table with iCloud -- I think that we'll be able to have file-system level access to our cloud storage but I'm also sure that the same limitations will apply for synchronization...