Backgrounding non-profits, companies and people


In the U.S., non-profit organizations can be considered as charities, exempt from tax, if their activities fall under exempt purposes specified by the Internal Revenue Service (IRS). These purposes are summarized by the IRS as “charitable, religious, educational, scientific, literary, testing for public safety, fostering national or international amateur sports competition, and preventing cruelty to children or animals.”

Charities meeting these requirements are known as 501(c)(3) organizations, after the section of the U.S. tax code that governs their operations. As we will see, this includes some organizations with large turnovers. Some organizations have both commercial arms that are treated as for-profits, and non-profit arms that fall under 501(c)(3).

While 501(c)(3) organizations do not have to pay tax, they do have to submit an annual report to the IRS, known as a Form 990. If you are a journalist trying to gather basic information on a non-profit, the idiosyncratic Form 990 is your friend.

There are several versions of this form, including a condensed version for smaller non-profits; see here. 501(c)(3) organizations with for-profit arms must report their taxable business income using a form called 990-T.

We will consider the full Form 990, also known as 990-EO, filed by organizations with annual earnings of $200,000 or more, or assets of $500,000 or more.

Two websites provide a good starting point to search for non-profits and view their 990s:


Guidestar

From the homepage, you can search for non-profits by name. The advanced search page is here. Once on the page for an individual non-profit, for example here for the American Chemical Society, you will see a green button to the right to download the three most recent 990s:

You will need to sign up for a free account to view these forms. You can also pay for access to more reports and certain information extracted from them — however, I have never found any need to do this.


CitizenAudit

This is a project set up by the data journalist Luke Rosiak, originally with backing from the Sunlight Foundation. It is systematically digitizing 990s using optical character recognition, so that they become fully searchable. So not only can you use the search box on the home page to search for an organization and its 990s, but you can also search the documents for mentions of people, payments from one non-profit to another, and so on. There is a good write-up of its uses in journalism here.

For comparison with Guidestar, here is the page for the American Chemical Society:

The Tax Documents (Form 990s) tab should have links to 990-EO and 990-T documents going back over 15 years. The most recent ones should have searchable Raw text. (You can also download the scanned PDFs and search them directly in a PDF viewer such as Adobe Acrobat or Preview.)

When CitizenAudit launched, it was entirely free. It has now become self-supporting, which means that after 40 searches or page views from a particular IP address, you will be asked to sign up for a paid account. This limits its utility. However, if you have a project for which you need to gather a lot of information on non-profits, and particularly if you need to look at how money passes from one to another, buying a 30-day pass may be a worthwhile option. The full Funders and Grantees section, which lists organizations giving money to and receiving money from the non-profit in question, is only available with a paid account.

What’s in a Form 990?

Schedules of interest:

An example Form 990

Here is the latest (2014) Form 990 for the American Chemical Society:

In class, we will use this form to answer the following questions:


More resources for reporting on nonprofits

For excellent reporting on charities behaving badly, read America’s Worst Charities, by the Center for Investigative Reporting and the Tampa Bay Times. Kendall Taggart, now with BuzzFeed News, has a great slide deck explaining the resources she used to report these stories. She explains more about how the reporting was done here.

America’s Worst Charities is particularly interesting because the reporters weren’t just analyzing the spending of one or two charities: They looked at thousands, focusing on those that used the services of professional fundraising companies, and then highlighting those that spent proportionately the least on their supposed charitable function. To do this, they downloaded bulk financial data, extracted from 990s, provided by the IRS. The slide deck explains that they used Open Refine for cleaning this data, and SQLite for analyzing it — tools we’ve explored in previous weeks.
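As a rough illustration of that kind of analysis, the sketch below uses Python's built-in sqlite3 module to rank charities by the share of their spending that went to the charitable function. The table name, the column names and the three rows are all made up for illustration; they are not taken from the IRS bulk data or from the reporters' own database.

```python
import sqlite3

# Hypothetical schema and made-up rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE charities (
        name TEXT,
        total_expenses REAL,
        program_expenses REAL
    )
""")
conn.executemany(
    "INSERT INTO charities VALUES (?, ?, ?)",
    [
        ("Charity A", 1000000, 850000),
        ("Charity B", 2000000, 100000),
        ("Charity C", 500000, 200000),
    ],
)

# Percentage of spending that reached the charitable function, lowest first.
rows = conn.execute("""
    SELECT name,
           ROUND(100.0 * program_expenses / total_expenses, 1) AS pct_program
    FROM charities
    WHERE total_expenses > 0
    ORDER BY pct_program ASC
""").fetchall()

for name, pct in rows:
    print(name, pct)
```

Sorting in ascending order puts the charities that spent proportionately the least on their stated mission at the top of the list, which is where a reporter would start asking questions.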

The slide deck also recommends resources to track charity finances available in some states, and a database of disciplinary actions taken by states against charities, compiled by CIR and the Tampa Bay Times.


If a company is public, meaning its shares are traded on a stock exchange, then the value of its shares, and the volume of their trading, can provide useful information about its general health, and reveal significant events affecting the company. Google and Yahoo both operate finance sites at which you can view these metrics over time.

In class, we will use Google Finance to explore the history of trading in the stock of Valeant Pharmaceuticals, which has become controversial after being accused of price gouging.

At Google Finance, you can view the graphs of stock value and trading volume over various periods (up to ten years), and compare both to Related companies, listed below the graph, or standard indices such as the Dow Jones Industrial Average. Companies can be searched by their full names, or by their abbreviated stock market ticker — VRX in the case of Valeant.

Yahoo Finance also allows you to download data on the stock value history of companies by constructing URLs of the following form:


Dow Jones Industrial Average: ^DJI

Search here for the tickers/symbols for companies and indices to use in constructing these URLs.

Securities and Exchange Commission

Public companies are regulated by the Securities and Exchange Commission (SEC), which describes its role as to “protect investors, maintain fair, orderly, and efficient markets, and facilitate capital formation.” Traded companies are obliged to report certain information to the SEC, which you can view by searching for them at the SEC’s main company search page.

At the page for each company, you will find its filings to the SEC over time. Here is a guide to the codes for some of the most important documents:

You can search for SEC enforcement actions here. Try searching for Sequenom and see what you find! (This brief story will provide some background to why Sequenom got into trouble.)

Here is the SEC’s full text search, which can be useful, for example, for finding references to individuals, and privately-held companies, in documents filed by publicly-traded companies.


Sqoop is a new site that aims to provide a one-stop-shop for information on companies. Sign up for a free account here.

You can search using the names of companies, individual corporate officers, or keywords, and it will return links to SEC filings, patents and cases in federal courts. (You will still need a login for the PACER federal court search system to access federal court documents.)

Here are the results for a search on Valeant:

You can save a search to receive updates by email of new documents.

If you are a local reporter, Sqoop also provides the option to narrow your SEC filing searches by geography. Click on the Locations button after running your search to activate this option.

Sqoop is currently made freely available to journalists, as it tries to expand its user base. It is a for-profit, however, so it is possible that charges will be introduced in future.

Privately-held companies

It is much harder to do background research on privately-held companies, which do not have to file reports to the SEC. So think about the agencies (state and federal) that have some oversight or regulatory role. For example, a company running animal experiments will have to file reports about its use of animals to the U.S. Department of Agriculture’s Animal and Plant Health Inspection Service, which can be searched here.

While somewhat dated, here is a useful guide to backgrounding privately-held companies.

In the U.S., companies are registered with the authorities in the state they call home. You can search for businesses registered in California here. You can order basic documents about companies’ registration using this form.

But be aware that the name a company trades under may not be its registered name. To find out the registered name, you may need to run a “fictitious business name” search at the county level. Here is the search site for Santa Cruz County, and here is a useful site from which you can navigate down to all sorts of public records at state and county levels, including fictitious business name searches.

OpenCorporates is an ambitious effort to build a database of information on companies throughout the world, public and private.

Finding former employees

Any time you are investigating an organization, whether a company or a nonprofit, former employees can be an invaluable resource. (Of course, you need to be wary of embittered ex-employees settling scores.) In recent years, LinkedIn has emerged as the best way of finding them.

If you are a professional journalist (so you may have to wait until you’re out in the world of employment!) you can ask to join the LinkedIn for Journalists group. You should then get an invitation to attend a webinar, and once you’ve done that you’ll be granted a free premium account, which allows you to use the site’s advanced search functions and send InMail messages to people you want to contact.


Searching for information on individual people can be difficult, without access to a commercial research tool such as Accurint. In my experience, most online search sites that offer background checks, contact information, and so on, are more effective at parting you from your money than providing good information.

Again, it helps to think about official agencies to which individuals must report information. Who Is John Doe, a site put together by investigative reporter Duff Wilson, provides a very useful guide to potential sources of information. Here is another site that provides links to various people-searching resources.

Making maps with CartoDB

The data we will use

Download from here, unzip the folder and place on your desktop. It contains the following:

mapping Folder containing:

Getting started with CartoDB, and uploading data

CartoDB is a cloud-based mapping application that makes it easy to produce interactive, online maps. These maps can include animations of data over time. It is also a geospatial database, allowing you to process geodata using Structured Query Language.

Login to a new CartoDB account, and you should see a screen like this:

Open the drop-down menu under Maps at top left and switch to Your datasets. Then click the green NEW DATASET button at top right:

You should now see the following screen:

With the Data file tab selected, click the Browse button, navigate to the zipped shapefile and click Open. Then click the green Connect dataset button.

CartoDB can import geodata in a variety of formats, including CSV, KML, GeoJSON and zipped shapefiles. See here for more on imports and supported data formats.

Once the data has imported, you will see the uploaded data table in DATA VIEW:

Notice that, in addition to the fields from the original data, each row has been given a cartodb_id, which is a unique identifier for each. The table also has a field called the_geom which has the tag GEO. This field is central to how CartoDB works, defining the geometry of any map you make. These geometries can be points, lines or polygons (areas) — which is what we have here.

You can rename fields, sort the table by the data in them, or change their data type (for example from numbers to strings of text), by clicking the downward-pointing triangle next to the header of each.

The important column in this data is acc_val, which represents the peak ground acceleration expected over 50 years, with a two-percent probability. The numbers are expressed as a percentage of g, the acceleration due to gravity.
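To get a feel for what these numbers mean, they can be converted into metres per second squared. The short sketch below does this in Python; the function name pct_g_to_ms2 is just an illustrative helper, and it assumes the standard value of g (9.80665 m/s²).

```python
STANDARD_GRAVITY = 9.80665  # m/s^2, the standard value of g (an assumption here)


def pct_g_to_ms2(acc_val):
    """Convert an acceleration given as a percentage of g into m/s^2."""
    return acc_val / 100 * STANDARD_GRAVITY


# An acc_val of 40 means 40% of g.
print(round(pct_g_to_ms2(40), 2))
```

So an acc_val of 40 corresponds to a peak ground acceleration of a little under 4 m/s².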

Switch to MAP VIEW to see the basic, unstyled map:

Click the small return arrow at top left to go back to the overview of your datasets.

Notice that the top menu has a link to DOCUMENTATION, which has links to CartoDB’s technical manuals. The Data library link contains useful datasets that you can import into your own account. Take a few minutes to explore what’s there, before returning to your Datasets.

Now click the NEW DATASET button again and import the file oregon_dams.csv, which should look like this in the DATA VIEW:

Notice that the_geom for points is given by their longitude and latitude co-ordinates.
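To make the idea of a point geometry concrete, here is a minimal Python sketch that turns a row with longitude and latitude columns into a GeoJSON point, one of the formats CartoDB can import. The sample row is made up, and the column names latitude and longitude are assumptions; the real oregon_dams.csv may name them differently.

```python
import csv
import io
import json

# Made-up sample row; the real oregon_dams.csv may use different column names.
sample = io.StringIO(
    "dam_name,latitude,longitude\n"
    "Example Dam,44.5,-122.25\n"
)

features = []
for row in csv.DictReader(sample):
    features.append({
        "type": "Feature",
        # GeoJSON positions are written [longitude, latitude], in that order.
        "geometry": {
            "type": "Point",
            "coordinates": [float(row["longitude"]), float(row["latitude"])],
        },
        "properties": {"dam_name": row["dam_name"]},
    })

print(json.dumps(features[0]["geometry"]))
```

Note the order of the co-ordinates: longitude comes first, which is the reverse of how latitude and longitude are usually spoken.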

Notice that there is a column called hazard, with values of H for high, S for significant, and L for low. H means a dam could cause loss of human life if it failed; S means failure could cause significant economic or environmental damage. However, these ratings do not mean that a dam is likely to fail.

Click on the MAP VIEW to see the locations of all of the dams:

Run a SQL query to select potentially hazardous dams in high seismic risk zones

We will now run the following query, to filter the dams to return those with a hazard rating of H or S only in the zones of highest seismic risk, where acc_val is 40 or more:

SELECT oregon_dams.*
FROM oregon_dams, oregon_seismic_risk
WHERE ST_WITHIN(oregon_dams.the_geom, oregon_seismic_risk.the_geom) AND oregon_seismic_risk.acc_val >= 40 AND (oregon_dams.hazard = 'H' OR oregon_dams.hazard='S')

CartoDB is based on a PostgreSQL database. It works similarly to SQLite, although the syntax for some queries is a little different. Here is a PostgreSQL tutorial, if you would like to learn more.

Importantly, CartoDB allows you to run spatial/geographic queries using an extension to PostgreSQL called PostGIS. PostGIS functions can, for instance, calculate distances or areas, and all begin with the prefix ST_. See here for a full list of PostGIS special functions.

The query above uses ST_Within (see here) to select everything from the oregon_dams dataset where the geometry for those points falls within the geometry of the oregon_seismic_risk zones, if acc_val is 40 or more. When running a query like this, referencing the_geom from two datasets/tables, both tables must appear in the FROM clause, separated by a comma.
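If it helps to see the logic spelled out, the Python sketch below mimics what the query does on a tiny made-up example: it tests each dam against each zone, and keeps the dam only if it falls inside a zone with acc_val of 40 or more and has a hazard rating of H or S. The point-in-polygon test uses a simple ray-casting method, which is a stand-in for illustration and not how PostGIS implements ST_Within; the zones, dams and co-ordinates are all invented.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test: is the point (lon, lat) inside the polygon ring?

    ring is a list of (lon, lat) vertices. Points lying exactly on an
    edge are not handled specially.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Made-up zones and dams, for illustration only.
zones = [
    {"acc_val": 60, "ring": [(0, 0), (4, 0), (4, 4), (0, 4)]},
    {"acc_val": 20, "ring": [(4, 0), (8, 0), (8, 4), (4, 4)]},
]
dams = [
    {"dam_name": "Dam 1", "hazard": "H", "lon": 1, "lat": 1},
    {"dam_name": "Dam 2", "hazard": "L", "lon": 2, "lat": 2},
    {"dam_name": "Dam 3", "hazard": "S", "lon": 6, "lat": 2},
]

# Pair every dam with every zone, then keep only the matches,
# just as the comma-separated FROM clause and WHERE conditions do.
selected = [
    dam["dam_name"]
    for dam in dams
    for zone in zones
    if point_in_polygon(dam["lon"], dam["lat"], zone["ring"])
    and zone["acc_val"] >= 40
    and dam["hazard"] in ("H", "S")
]
print(selected)
```

Here only Dam 1 survives: Dam 2 sits in the high-risk zone but has a low hazard rating, and Dam 3 has a significant hazard rating but sits in the lower-risk zone.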

Click Apply query and you will be prompted to create dataset from query. Click on this link, and rename the new dataset as oregon_dams_hazard by clicking on its name at top left.

Select the MAP VIEW to see the filtered dataset on a map:

Create a map combining two datasets

Exit this map and reopen the oregon_seismic_risk dataset. Then click the VISUALIZE button at top right.

You will then see a prompt to create a new map. Click the green OK, CREATE MAP button. Rename this map oregon_seismic_dams by clicking on its name at top left.

Now add oregon_dams_hazard to the map, by clicking on the blue + button to the right. At the dialog box, select the oregon_dams_hazard dataset and click the ADD LAYER button.

Now select MAP VIEW to see both layers on the same map:

Select a basemap

Close the panel at bottom left suggesting interesting maps.

Now choose a basemap for your visualization, by clicking Select basemap at bottom left. Take a few minutes to explore the built-in basemap options.

Style the maps using the CartoDB wizard

Notice that the toolbar at right has tabs numbered 1 and 2. If you hover over them, you will see that they correspond to the oregon_seismic_risk and oregon_dams_hazard layers respectively.

Click on 1 to expose the Map layer wizard for the oregon_seismic_risk layer, which can also be reached by clicking the paintbrush icon:

(You can collapse the wizards at any time by clicking to the left of any of the icons.)

Notice that opening the wizard has also exposed blue toggle controls for each layer, which can be used to turn the visibility for each on and off. Hide the oregon_dams_hazard layer so we can see what we are doing.

Scroll from left to right through the visualization options, and select CHOROPLETH to make a map where larger values for seismic risk correspond to more intense colors.

Set acc_val as the data Column, select 5 Buckets, and set them by Quantile. The map should now look like this:
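Quantile buckets split the data so that each bucket holds roughly the same number of rows. The Python sketch below shows the general idea with made-up acc_val values, using the standard library's statistics.quantiles; CartoDB's own calculation of the break points may differ in detail.

```python
from bisect import bisect_right
from statistics import quantiles

# Made-up acc_val values, for illustration only.
acc_vals = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# Four cut points divide the data into five quantile buckets.
breaks = quantiles(acc_vals, n=5)

# Assign each value to a bucket numbered 0 (lowest) to 4 (highest).
buckets = [bisect_right(breaks, value) for value in acc_vals]

# Count how many values landed in each bucket.
counts = [buckets.count(b) for b in range(5)]
print(breaks)
print(counts)
```

Because the breaks follow the data rather than being evenly spaced, a quantile map shows contrast even when most of the values are bunched together in a narrow range.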

Now click 2 to switch to the oregon_dams_hazard layer, and turn on its visibility.

In the Map layer wizard, select CATEGORY and color the circles by their hazard level, by selecting hazard as the data Column. The map should now look like this:

Go back into the Map layer wizard and manually edit the colors, by clicking on the colored boxes, selecting new colors so that high risk dams are emphasized with a more intense color:

Edit the legend

The meaning of the legend will not immediately be obvious to someone who does not know what the numbers mean, and what H and S refer to.

For the oregon_dams_hazard layer, click on the legend icon:

Change the Title from empty to Dam hazard:

Now click on the </> link to the HTML for the legend, edit it to the following, then click Apply:

<div class='cartodb-legend category'>
    <div class="legend-title">Dam hazard</div>
    <div class="bullet" style="background: #0F3B82"></div> High
    <div class="bullet" style="background: #5CA2D1"></div> Significant
</div>

Edit the legend for the oregon_seismic_risk layer, adding Seismic risk for title, and replacing the numbers for Left label and Right label with Low and High. Click Apply and the map should look like this:

Configure tooltips

Select the oregon_dams_hazard layer, and click the infowindow icon:

In the Hover tab, select the dam_name toggle control, and uncheck title. Now the dam’s name should appear when you hover over each point.

Configure the map options, and publish

We are almost ready to publish the map, but before doing so, click Options at the bottom left of the map to select the controls and other items you want to include. Here the Search box, which geocodes locations entered by the user and zooms to them, is disabled; the option to switch to a Fullscreen view of the map is enabled:

Also explore the Add Element button at top left, which allows you to add a title and other annotations to your map.

Having finished working on the visualization, click the PUBLISH button at top right. This will call up the following options:

Copy the code from Embed it to obtain an iframe which will allow you to embed the map on any web page, in the following format:

<iframe width="100%" height="520" frameborder="0" src="" allowfullscreen webkitallowfullscreen mozallowfullscreen oallowfullscreen msallowfullscreen></iframe>

(Note that you can edit the dimensions of the iframe, here set at 100% of the width of the div in which it appears and 520 pixels high, as required.)

Further reading/resources

CartoDB tutorials