TechSoup.org The place for nonprofits, charities, and libraries

Data Warehousing for Nonprofits

Is a data warehouse right for you?

Data Warehousing for Nonprofits 
Laura S. Quinn - October 12, 2009

Many organizations use multiple software packages that each track some aspect of their constituents. Even though each system works well by itself, it may be difficult to look across the organization to get a full picture. Some organizations consider a single constituent relationship management system (CRM), one software package that replaces all of the current systems to create a single, integrated view.

This article is courtesy of Idealware, which provides candid information to help nonprofits choose effective software. For more articles and reviews, go to www.idealware.org.

Many organizations use multiple software packages that each track some aspect of their constituents. Even though each system works well by itself, it may be difficult to look across the organization to get a full picture.

Some organizations consider a single constituent relationship management system (CRM), also known as a customer relationship management system – one software package that replaces all of the current systems to create a single, integrated view. This can be a compelling option, especially for smaller nonprofits and charities. However, for larger organizations that are happy with their existing constituent management software and use their specialized features, no single software is likely to address everything needed.

Enter Data Warehousing

Data warehousing is a centralized repository of data that automatically pulls information from each of your systems in order to report on it or analyze it in a central place. Each of your existing software packages continues to work in exactly the same way, but you also have the ability to see all of their data together in one place.

This isn't a low-cost or a low-impact solution — it's likely to take someone experienced between 80 to 120 hours, or potentially far more, to plan out, set up, and build the data exchange processes for a warehouse. You'll need internal IT staff or a strong continuing relationship with a consultant to support it. For larger organizations, however, it can be an interesting option to help understand how the organization is really performing, even across similar types of programmatic function. It can also help identify opportunities to improve service to constituents and understand data in a deeper and more meaningful way.

Components of Data Warehousing

A data warehouse is much more of a set of processes than an actual piece of software you can buy off the shelf. What will you need in order to create a data warehouse?

Defining Your Reporting Needs

Let's begin with the end in mind. Consider the following questions:

  • What reports are you trying to create out of your data warehouse?
  • What information should be included?
  • What analysis is necessary?
  • How flexible do reports need to be over time?

Your goals in this area affect everything else. Take the time to carefully analyze your needs, and determine what systems you'll need to pull data from, what fields to extract, and how you'd ultimately like to see and use the data.

For example, say you want to identify your super-users from a specific region who may be ideal to help procure bigger donation sources. You may need to cross-reference multiple databases to see which supporters have given large donations and participated in a set number of action campaigns in the past year. Think of how you plan to use this data to enhance your work and plan backward from there.

Extracting Data from Your Existing Software

Once your needs are nailed down, you'll need to define a process to get data out of each of your existing systems, ideally in some sort of automated way. This would mean, for instance, the ability for a programmer to access the data via code with an API, to access the database directly, or to prompt the system to automatically export data into a file at the same time every week.

If your software package doesn't allow you to access the data, there's unfortunately little recourse. This means that the ability to access data should be a core consideration when choosing any system. As a last resort, you could plan to manually export the data into a defined file format at the same time each week, but this would introduce a large possibility of human error into the process.

Creating a Data Repository

You'll need somewhere to put the data you'll need to store. It should be a structured database with some flexibility to define the fields that you want to track and the relationships between them. As a minimal solution, you might even be able to set up one of your existing systems to take in information from your other systems — perhaps using custom field functionality. Microsoft Access could also serve as a lightweight data repository.

You could also consider using a flexible constituent management tool with strong reporting for your data repository. SugarCRM or Microsoft Dynamics may serve this purpose. Using one of these tools streamlines the reporting process, as you can use the same software package both to store the data and to create your reports.

However, if one of your key goals is to facilitate complex reporting and data mining and you have tens of thousands of rows of data, these lightweight solutions may slow the system's response time. More robust databases — like MySQL, Microsoft SQL Server, or Oracle – can provide more powerful tools to let you optimize the way your database works so as to provide reports more quickly.

Amazon's SimpleDB can also provide you with a virtual data repository. Its cost is based on the file space and bandwidth you use, but is likely to require considerable technology expertise to implement.

Gathering Data

After you've defined where the data is coming from and where it's going, you'll need to work through the most complicated steps: matching up the data across different systems and defining the business rules to dictate how it should go into the repository.

Before you can even think about combining data from multiple systems, it's critical to ensure that the data itself is defined and stored in standard formats. If a member ID is a six-digit alphanumeric code in one system, and an eight digit number in another, no automated process is going to be able to match them.

Similarly, the data may be messy and undependable; for instance, the names in your volunteer management systems may be spelled incorrectly. Trying to combine it will only lead to trouble. Organizations frequently have to spend considerable amounts of time standardizing and cleaning their data before they're able to view it in a consolidated way.

Matching and Mapping Data

The next step is to think through the mapping and matching process. Which field stores the member ID in each system? The email address? If the different systems have different email addresses for the same constituent, which should win? Define a "system of record" for each field, to define which is likely to have the best data.

You'll then need to define a set of business rules to specify how data should be transformed. For example, do you need to convert spelled out states to state abbreviations to make the data match? Also, what data overrides other data? These business rules are often very complex and the most time consuming part of a data warehousing project.

Moving Data

Once your business rules are defined, you'll need to set up processes to actually transform the data, carry out the rules, and move the data from their starting systems into the repository. These data processing programs can be written by hand in a number of different programming languages, or you might be able to find some software tools that can help. This class of tool is typically called ETL software (Extract, Transform, and Load).

It's worth looking at the package you're going to use for your data repository to see what ETL functions it provides. For instance, SQL Server and SugarCRM both include some ETL support. There are also stand-alone tools which can provide powerful (but complicated) support — for instance, Talend, Jitterbit, and Pentaho are all open-source ETL packages.

Creating Reports and Analyzing Data

Finally, consider the repository and reporting solution that's most likely to meet your needs. It's convenient to store the data in a package that has its own reporting tools. For example, one of your existing systems, Access, SugarCRM, or using Microsoft's Reporting Services or Analysis Services that come bundled with SQL Server. With these options, though, you're likely to run into functional limitations if you have more complex needs.

A specialized reporting tool, like Crystal Reports or JasperReports, can sit on top of your data repository and provided more sophisticated reporting support. Tools like these will also be more complicated to implement and train staff to use. Alternatively, if your reporting needs won't change much over time, you could create reports as web pages that pull data directly from your repository.

There's also a whole category of powerful and very expensive reporting and analysis tools primarily used by the corporate world. Tools like Business Objects and Cognos provide very flexible and powerful support with relatively user friendly interfaces, but can cost hundreds of thousands of dollars to license and implement.

Is a Data Warehouse for You?

Data warehousing is certainly not a solution for every organization. It's not likely to be a great option for tiny organizations or those without any IT support structure. It's a more advanced technique which will require technical skills. It also requires considerable investment: a minimum of 80 hours for modest systems and up to hundreds or even thousands of hours for very complex systems.

Consider whether you've defined your reporting needs carefully enough to really take advantage of a data warehouse. Are you sure you're doing everything you can with your existing systems? Have you thought about the data dashboards that help executives, board members, and staff members easily see how the organization is performing?

Building a warehousing and reporting infrastructure on top of your existing systems will also increase your dependence on them and make it harder to swap out. In fact, many organizations may find more value in replacing every system with one large software package. Integrated packages like these can handle all of the processes and constituents. This is especially true for smaller organizations.

For an organization of 75 staff members or more, a data warehouse may be the right solution. These larger organizations tend to have greater IT support structures, a well-thought-out reporting strategy, and a number of existing systems that that work well. It's a robust and often powerful way to support cross-organizational reporting and data analysis.

For More Information

Salvation Army CIO uses IT to support nonprofit: A fairly detailed case study of Salvation Army's data tracking infrastructure.

A Web-Based Data Warehouse: A look at the data warehouse developed by the Elizabeth Glaser Pediatric AIDS Foundation.

Many thanks to the experts who contributed their time and expertise:

  • Paul Hagen, a consultant with Kairos Strategies who provides business, marketing, sustainability, and technology strategy consulting to help organizations, with an emphasis on Constituent Relationship Management.
  • Peter Davis, a consultant with Freeflow Digital which helps organizations free up and simplify data.
  • Eric Leland, a consultant with Five Paths, provides a wide variety of technology services to small to mid-sized nonprofits, including software selection and data management.

Image: Analytics, Shutterstock