Herbalife's Organic Data Growth Creates Unexpected Backup Challenges
Exposed. That was the position in which Herbalife's Principal IT Engineer, Andy Hansen, found himself more and more often in mid-2007 as he watched Herbalife's data growth explode and his backup software struggle to keep up. Much of the new data growth was driven by a corporate-wide enterprise resource planning (ERP) software initiative that increased Herbalife's production data stores from 32 TB to 240 TB. This growth, plus new backup demands, left Hansen uncertain whether Herbalife could recover from data loss or application disruption should any type of outage occur, minor or major.
The rapid explosion of data only exacerbated the issues that Hansen was already having with his existing backup software. So, in his role as Principal IT Engineer, he was tasked with identifying a solution that addressed the problems Herbalife was encountering and could scale into the new environment. The specific problems included:
- No backup reporting. Hansen had little or no insight into why his backup jobs were failing or what steps he needed to take to fix them. Because his backup software stored its indexes and backup job information in a SQL Server database, he had to write his own SQL queries to pull the information he needed out of the database and try to work out the cause of each failure.
- Backups failing in the middle of bundled backup jobs. His backup software allowed him to bundle the backups of individual servers into one queue, so that when the backup of one application server completed, the backup of the next application server in the queue would begin. The problem with this approach was twofold. First, if the backup of one application server in the queue failed, every backup queued after it would also fail. Second, it was very difficult to construct queries that revealed exactly why the backup of a particular application server in the queue had failed. As a result, Hansen was often left to guess at the cause, since he did not have the time to research and diagnose each backup failure.
- No Linux agents. Herbalife's new ERP application used an Oracle database that ran on a Linux platform. However, Hansen's backup software did not offer Linux agents for Oracle backup at that time.
- No integration between different instances of the same backup software. As part of its ERP initiative, Herbalife was consolidating the management of data at its central and remote sites so that all of its resources could be centrally tracked and managed. When Hansen checked how well his backup software would support this new configuration, he discovered that there was no way to combine the catalogs of the backup software instances at the central and remote sites so that they functioned as one.
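
To make the bundled-queue problem described above more concrete, here is a minimal Python sketch of a sequential backup queue. It is an illustration only and not the behavior or API of any particular backup product: the names `run_bundled`, `JobResult`, `fake_backup`, the server names, and the "blocked" status are all hypothetical. The sketch contrasts a queue where one failure takes down every later job with one where each job's failure is recorded separately and the queue continues.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class JobResult:
    server: str
    status: str       # "ok", "failed", or "blocked" (hypothetical labels)
    detail: str = ""  # failure reason kept per job, so no guessing later


def run_bundled(queue: List[str],
                backup: Callable[[str], None],
                isolate_failures: bool) -> List[JobResult]:
    """Run backups one after another, as in a bundled queue.

    With isolate_failures=False, the first failure blocks every later job,
    which models the cascading behavior described in the article.
    With isolate_failures=True, later jobs still run.
    """
    results: List[JobResult] = []
    aborted = False
    for server in queue:
        if aborted:
            results.append(JobResult(server, "blocked",
                                     "earlier job in queue failed"))
            continue
        try:
            backup(server)
            results.append(JobResult(server, "ok"))
        except Exception as exc:
            # Record the reason alongside the job that actually failed.
            results.append(JobResult(server, "failed", str(exc)))
            if not isolate_failures:
                aborted = True
    return results


def fake_backup(server: str) -> None:
    """Stand-in for a real backup call; fails for one made-up server."""
    if server == "app02":
        raise RuntimeError("agent unreachable")


queue = ["app01", "app02", "app03", "app04"]
cascading = run_bundled(queue, fake_backup, isolate_failures=False)
isolated = run_bundled(queue, fake_backup, isolate_failures=True)

for label, results in (("cascading", cascading), ("isolated", isolated)):
    print(label)
    for r in results:
        print(f"  {r.server}: {r.status} {r.detail}".rstrip())
```

In the cascading run, the two servers queued after `app02` never get a backup attempt, while in the isolated run only `app02` is affected and its failure reason is stored with the job, which is the kind of per-job reporting Hansen lacked.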