Distinctive Birds

This chapter uses VisiData to answer this question: What birds account for the largest percentages of their state’s collisions?

In the process, we’ll cover the following skills:

  • Selecting rows that match a phrase
  • Toggling selected rows
  • Creating filtered sheets
  • Renaming sheets and columns
  • Creating frequency tables
  • Creating and removing key columns
  • Creating new columns
  • Joining sheets

Note

If you haven’t already, download faa-wildlife-strikes.csv, a dataset of all aircraft-wildlife collisions reported to the Federal Aviation Adminsitration between 2010 and mid-2016.

Open the wildlife-strikes dataset in VisiData

Run this command in your terminal:

vd faa-wildlife-strikes.csv

If it worked, you should see something like this:

 OPERATOR           | ATYPE          | INCIDENT_DATE      | STATE   | AIRPORT            | PHASE_O> 
 BUSINESS           | PA-28          | 05/22/15 00:00:00  | FL      | VERO BEACH MUNICIP…| APPROACH 
 BUSINESS           | BE-1900        | 06/18/15 00:00:00  | AK      | KENAI MUNICIPAL AR…| APPROACH 
 BUSINESS           | PA-46 MALIBU   | 09/20/15 00:00:00  | TX      | DAVID WAYNE HOOKS …|          
 DELTA AIR LINES    | B-717-200      | 11/07/15 00:00:00  | MO      | LAMBERT-ST LOUIS I…| APPROACH 
 BUSINESS           | BE-90 KING     | 12/17/15 00:00:00  | FL      | POMPANO BEACH AIRP…| LANDING… 
 DELTA AIR LINES    | B-757          | 07/17/15 00:00:00  | VI      | HENRY E ROHLSEN AR…|          
 DELTA AIR LINES    | B-717-200      | 08/02/15 00:00:00  | TX      | SAN ANTONIO INTL   | APPROACH 
 BUSINESS           | C-414          | 08/03/15 00:00:00  | TX      | LONE STAR EXECUTIV…| DEPARTU… 
 ALLEGIANT AIR      | MD-80          | 09/02/15 00:00:00  | FL      | TAMPA INTL         | APPROACH 
 TRANS STATES AIRLI…| EMB-145        | 09/07/15 00:00:00  | MO      | LAMBERT-ST LOUIS I…| APPROACH 
 BUSINESS           | C-172          | 11/28/15 00:00:00  | FL      | OPA-LOCKA EXECUTIV…| APPROACH 
 GOVERNMENT         | EC120          | 12/08/15 00:00:00  | CA      | NORMAN Y. MINETA S…|          
 AMERICAN AIRLINES  | A-321          | 05/06/15 00:00:00  | FL      | FORT LAUDERDALE/HO…| APPROACH 
 EXPRESSJET AIRLINES| CRJ100/200     | 05/06/15 00:00:00  | AR      | FORT SMITH REGIONA…| CLIMB    
 MESA AIRLINES      | CRJ900         | 05/08/15 00:00:00  | AR      | BILL AND  HILLARY …| LANDING… 
 BUSINESS           | HELICOPTER     | 05/06/15 00:00:00  |         | UNKNOWN            | En Route 
 DELTA AIR LINES    | A-320          | 05/07/15 00:00:00  | CA      | METRO OAKLAND INTL |          
 DELTA AIR LINES    | A-320          | 05/08/15 00:00:00  | UT      | SALT LAKE CITY INTL|          
 LUFTHANSA          | A-380          | 05/10/15 00:00:00  | TX      | GEORGE BUSH INTERC…| CLIMB    
 BUSINESS           | C-172          | 05/08/15 00:00:00  | FL      | ORLANDO SANFORD IN…| APPROACH 
 SPIRIT AIRLINES    | A-319          | 05/10/15 00:00:00  | IL      | CHICAGO O'HARE INT…| CLIMB    
 EXPRESSJET AIRLINES| EMB-145        | 05/11/15 00:00:00  | AL      | BIRMINGHAM-SHUTTLE…| LANDING… 
faa-wildlife-strikes| saul.pw/VisiData v1.0 | opening faa-wildlife-strikes as csv |      73448 rows 

Select only known species

For many of the wildlife strikes in the dataset, species is “unknown”. We want to focus just on the known species, so we’ll filter out the unknowns in this step.

First, navigate to the SPECIES column. Then, do the following:

  • Press | to raise the select-by-search prompt
  • Type unknown
  • Press Enter

Once you do that, you should see something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 FL      | VERO BEACH MUNICIP…| APPROACH       |          |         | Unknown bird       | 1        
 AK      | KENAI MUNICIPAL AR…| APPROACH       |          |         | Unknown bird       | 1        
 TX      | DAVID WAYNE HOOKS …|                |          |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Unknown bird       | 1        
 FL      | POMPANO BEACH AIRP…| LANDING ROLL   | 0        |         | Unknown bird       | 1        
 VI      | HENRY E ROHLSEN AR…|                |          |         | Unknown bird       | 1        
 TX      | SAN ANTONIO INTL   | APPROACH       |          |         | Unknown bird       | 1        
 TX      | LONE STAR EXECUTIV…| DEPARTURE      |          |         | Unknown bird       | 1        
 FL      | TAMPA INTL         | APPROACH       | 6000     |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 FL      | FORT LAUDERDALE/HO…| APPROACH       | 1500     |         | Unknown bird - sma…| 1        
 AR      | FORT SMITH REGIONA…| CLIMB          |          |         | Unknown bird - sma…| 1        
 AR      | BILL AND  HILLARY …| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
         | UNKNOWN            | En Route       |          |         | Unknown bird       | 1        
 CA      | METRO OAKLAND INTL |                |          |         | Unknown bird       | 1        
 UT      | SALT LAKE CITY INTL|                |          |         | Unknown bird       | 1        
 TX      | GEORGE BUSH INTERC…| CLIMB          |          |         | Unknown bird       | 1        
 FL      | ORLANDO SANFORD IN…| APPROACH       |          |         | Unknown bird       | 1        
 IL      | CHICAGO O'HARE INT…| CLIMB          | 12000    |         | Unknown bird       | 1        
 AL      | BIRMINGHAM-SHUTTLE…| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
faa-wildlife-strikes| "unknown" | 25988 matches for /unknown/ | selected 25988 row |     73448 rows 

Now, all the unknown species are selected. But we want the opposite of that: only known species selected. To do that, let’s toggle the selection-ing for all rows, by typing gt (mnemonic: “global toggle”). Once you do that, you should see something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 FL      | VERO BEACH MUNICIP…| APPROACH       |          |         | Unknown bird       | 1        
 AK      | KENAI MUNICIPAL AR…| APPROACH       |          |         | Unknown bird       | 1        
 TX      | DAVID WAYNE HOOKS …|                |          |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Unknown bird       | 1        
 FL      | POMPANO BEACH AIRP…| LANDING ROLL   | 0        |         | Unknown bird       | 1        
 VI      | HENRY E ROHLSEN AR…|                |          |         | Unknown bird       | 1        
 TX      | SAN ANTONIO INTL   | APPROACH       |          |         | Unknown bird       | 1        
 TX      | LONE STAR EXECUTIV…| DEPARTURE      |          |         | Unknown bird       | 1        
 FL      | TAMPA INTL         | APPROACH       | 6000     |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 FL      | FORT LAUDERDALE/HO…| APPROACH       | 1500     |         | Unknown bird - sma…| 1        
 AR      | FORT SMITH REGIONA…| CLIMB          |          |         | Unknown bird - sma…| 1        
 AR      | BILL AND  HILLARY …| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
         | UNKNOWN            | En Route       |          |         | Unknown bird       | 1        
 CA      | METRO OAKLAND INTL |                |          |         | Unknown bird       | 1        
 UT      | SALT LAKE CITY INTL|                |          |         | Unknown bird       | 1        
 TX      | GEORGE BUSH INTERC…| CLIMB          |          |         | Unknown bird       | 1        
 FL      | ORLANDO SANFORD IN…| APPROACH       |          |         | Unknown bird       | 1        
 IL      | CHICAGO O'HARE INT…| CLIMB          | 12000    |         | Unknown bird       | 1        
 AL      | BIRMINGHAM-SHUTTLE…| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
faa-wildlife-strikes|                                                             gt     73448 rows 

Now that we’ve selected our desired columns, let’s create a new sheet containing only those columns, by pressing ". The result should look something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 TX      | GEORGE BUSH INTERC…| LANDING ROLL   | 0        |         | Sparrows           | 1        
 FL      | VERO BEACH MUNICIP…| DEPARTURE      |          |         | Sandhill crane     | 1        
 CA      | ONTARIO INTL ARPT  |                |          |         | Owls               | 1        
 TN      | MC GHEE TYSON ARPT | APPROACH       |          |         | Barn swallow       | 1        
 SC      | CHARLESTON AFB/INT…| DEPARTURE      |          |         | Gulls              | 1        
 CA      | WHITEMAN AIRPORT   | LANDING ROLL   | 0        |         | Rock pigeon        | 1        
 WA      | SEATTLE-TACOMA INTL|                |          |         | Barn owl           | 1        
 HI      | KAHULUI ARPT       | LANDING ROLL   | 0        |         | Gulls              | 1        
 TX      | CORPUS CHRISTI INT…| TAKE-OFF RUN   | 0        |         | Gulls              | 1        
 FL      | TALLAHASSEE REGION…| APPROACH       |          |         | Gulls              | 2-10     
 TX      | JACK BROOKS REGION…| APPROACH       |          |         | Owls               | 1        
 PA      | CAPITAL CITY ARPT …| LANDING ROLL   | 0        |         | Bank swallow       | 1        
 AL      | MOBILE REGIONAL    | APPROACH       | 2000     | 200     | Gulls              | 1        
 PA      | HARRISBURG INTL    | DEPARTURE      |          |         | Swallows           | 1        
 CT      | BRADLEY INTL       |                |          |         | Horned lark        | 2-10     
 MA      | GENERAL EDWARD LAW…|                |          |         | Horned lark        | 1        
 CA      | SACRAMENTO INTL    | Climb          | 1500     | 210     | Northern pintail   | 2-10     
 OH      | RICKENBACKER INTL  |                |          |         | American robin     | 1        
 PA      | HARRISBURG INTL    |                |          |         | American kestrel   | 1        
faa-wildlife-strikes_selectedref|                                                  "     47460 rows 

Finally, let’s un-select all the rows in our filtered sheet, by typing gu (mnemonic: “global unselect”):

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 TX      | GEORGE BUSH INTERC…| LANDING ROLL   | 0        |         | Sparrows           | 1        
 FL      | VERO BEACH MUNICIP…| DEPARTURE      |          |         | Sandhill crane     | 1        
 CA      | ONTARIO INTL ARPT  |                |          |         | Owls               | 1        
 TN      | MC GHEE TYSON ARPT | APPROACH       |          |         | Barn swallow       | 1        
 SC      | CHARLESTON AFB/INT…| DEPARTURE      |          |         | Gulls              | 1        
 CA      | WHITEMAN AIRPORT   | LANDING ROLL   | 0        |         | Rock pigeon        | 1        
 WA      | SEATTLE-TACOMA INTL|                |          |         | Barn owl           | 1        
 HI      | KAHULUI ARPT       | LANDING ROLL   | 0        |         | Gulls              | 1        
 TX      | CORPUS CHRISTI INT…| TAKE-OFF RUN   | 0        |         | Gulls              | 1        
 FL      | TALLAHASSEE REGION…| APPROACH       |          |         | Gulls              | 2-10     
 TX      | JACK BROOKS REGION…| APPROACH       |          |         | Owls               | 1        
 PA      | CAPITAL CITY ARPT …| LANDING ROLL   | 0        |         | Bank swallow       | 1        
 AL      | MOBILE REGIONAL    | APPROACH       | 2000     | 200     | Gulls              | 1        
 PA      | HARRISBURG INTL    | DEPARTURE      |          |         | Swallows           | 1        
 CT      | BRADLEY INTL       |                |          |         | Horned lark        | 2-10     
 MA      | GENERAL EDWARD LAW…|                |          |         | Horned lark        | 1        
 CA      | SACRAMENTO INTL    | Climb          | 1500     | 210     | Northern pintail   | 2-10     
 OH      | RICKENBACKER INTL  |                |          |         | American robin     | 1        
 PA      | HARRISBURG INTL    |                |          |         | American kestrel   | 1        
faa-wildlife-strikes_selectedref|                                                 gu     47460 rows 

Rename the filtered sheet

By default, our sheet will be titled “faa-wildlife-strikes_selectedref”. To make it slightly easier to distinguish from other sheets, let’s rename it. To rename a sheet, do the following:

  • Press Shift-S to open the Sheets Sheet
  • Press _ to widen the name column so that we read all the names
  • Navigate down to our sheet of interest
  • Press e to begin editing the name
  • Type the new name we want; in this case known_species

At this point, you should see something like this:

 name                             | progressPct   | nRows  #| nCols  #| nVisibleCols  #| cursorDis> 
 sheets                           | 0            #|       3 |       9 |              9 | faa-wildl… 
known_species_____________________| 0            #|   47460 |      16 |             16 | Owls       
 faa-wildlife-strikes             | 0            #|   73448 |      16 |             16 | Unknown b… 
                                                                                                    
                                                                                                    

When you’ve entered the name, press Enter to complete the edit (or Control-c to cancel the edit).

Then, either (a) press q to leave the Sheets Sheet, or (b) navigate back to the row representing our known_species and press Enter to return to it.

Count the number of collisions per state

To get the denominator for our calculations, we’ll want to know the total number of reported collisions for each state.

Back in our known_species sheet, navigate to the STATE column:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 TX      | GEORGE BUSH INTERC…| LANDING ROLL   | 0        |         | Sparrows           | 1        
 FL      | VERO BEACH MUNICIP…| DEPARTURE      |          |         | Sandhill crane     | 1        
 CA      | ONTARIO INTL ARPT  |                |          |         | Owls               | 1        
 TN      | MC GHEE TYSON ARPT | APPROACH       |          |         | Barn swallow       | 1        
 SC      | CHARLESTON AFB/INT…| DEPARTURE      |          |         | Gulls              | 1        
 CA      | WHITEMAN AIRPORT   | LANDING ROLL   | 0        |         | Rock pigeon        | 1        
 WA      | SEATTLE-TACOMA INTL|                |          |         | Barn owl           | 1        
 HI      | KAHULUI ARPT       | LANDING ROLL   | 0        |         | Gulls              | 1        
 TX      | CORPUS CHRISTI INT…| TAKE-OFF RUN   | 0        |         | Gulls              | 1        
 FL      | TALLAHASSEE REGION…| APPROACH       |          |         | Gulls              | 2-10     
 TX      | JACK BROOKS REGION…| APPROACH       |          |         | Owls               | 1        
 PA      | CAPITAL CITY ARPT …| LANDING ROLL   | 0        |         | Bank swallow       | 1        
 AL      | MOBILE REGIONAL    | APPROACH       | 2000     | 200     | Gulls              | 1        
 PA      | HARRISBURG INTL    | DEPARTURE      |          |         | Swallows           | 1        
 CT      | BRADLEY INTL       |                |          |         | Horned lark        | 2-10     
 MA      | GENERAL EDWARD LAW…|                |          |         | Horned lark        | 1        
 CA      | SACRAMENTO INTL    | Climb          | 1500     | 210     | Northern pintail   | 2-10     
 OH      | RICKENBACKER INTL  |                |          |         | American robin     | 1        
 PA      | HARRISBURG INTL    |                |          |         | American kestrel   | 1        
known_species| "STATE"                                                             c     47460 rows 

Then, to create a frequency table for the column, press Shift-F. Once you do, you should see something like this:

 STATE    count  #| percent  %| histogram                                                         
 TX      ‖    4670 |      9.84 | ************************************************…‖                 
             4428 |      9.33 | ************************************************…                 
 CA          3391 |      7.14 | ************************************************…                 
 FL          3096 |      6.52 | ************************************************…                 
 CO          2914 |      6.14 | *************************************************                 
 NY          2626 |      5.53 | ********************************************                      
 IL          2153 |      4.54 | ************************************                              
 NJ          1681 |      3.54 | ****************************                                      
 PA          1665 |      3.51 | ****************************                                      
 OH          1515 |      3.19 | *************************                                         
 MI          1234 |      2.60 | *********************                                             
 MO          1140 |      2.40 | *******************                                               
 UT          1049 |      2.21 | *****************                                                 
 WA           874 |      1.84 | **************                                                    
 HI           873 |      1.84 | **************                                                    
 GA           824 |      1.74 | **************                                                    
 NC           813 |      1.71 | *************                                                     
 MA           801 |      1.69 | *************                                                     
 TN           774 |      1.63 | *************                                                     
 IN           773 |      1.63 | *************                                                     
 LA           737 |      1.55 | ************                                                      
 OR           734 |      1.55 | ************                                                      
known_species_STATE_freq|                                                          F        63 bins 

Spruce up the frequency table

Because we’ll later be joining this sheet to another sheet, let’s spruce it up a bit by doing the following:

  • Navigate to the count column. Rename it by pressing ^, typing state_total, and then pressing Enter
  • Press _ to resize column slightly, so that the full column name can be seen
  • Hide the percent and histogram columns by navigating to each and pressing -.

After those steps, the sheet should look something like this:

 STATE    state_total#                                                                            
 TX              4670                                                                             
                 4428                                                                             
 CA              3391                                                                             
 FL              3096                                                                             
 CO              2914                                                                             
 NY              2626                                                                             
 IL              2153                                                                             
 NJ              1681                                                                             
 PA              1665                                                                             
 OH              1515                                                                             
 MI              1234                                                                             
 MO              1140                                                                             
 UT              1049                                                                             
 WA               874                                                                             
 HI               873                                                                             
 GA               824                                                                             
 NC               813                                                                             
 MA               801                                                                             
 TN               774                                                                             
 IN               773                                                                             
 LA               737                                                                             
 OR               734                                                                             
known_species_STATE_freq|                                                          -        63 bins 

Count the number of collisions per state and species

Now that we have the denominator — collisions per state — let’s calculate the numerator: collisions per species per state.

To do that, we’ll want to create a frequency table for the combination of the STATE and SPECIES columns. Here’s how:

  • Use the Sheets Sheet (Shift-S) to navigate back to the known_species sheet
  • Navigate to the STATE column, and press ! to make it a “key” column
  • Do the same thing for the SPECIES columns

At this point, you should see something like this:

 STATE   | SPECIES            ‖<PHASE_OF_FLT   | HEIGHT   | SPEED   | BIRDS_STRUCK   | EFFECT     > 
 MO      | Owls               ‖ APPROACH       |          |         | 1              | NONE         
 FL      | Hawks               APPROACH       |          |         | 1              | NONE         
 CA      | Gulls                              |          |         | 1              | NONE         
 TX      | Sparrows            LANDING ROLL   | 0        |         | 1              | NONE         
 FL      | Sandhill crane      DEPARTURE      |          |         | 1              | NONE         
 CA      | Owls                               |          |         | 1              | NONE         
 TN      | Barn swallow        APPROACH       |          |         | 1              | NONE         
 SC      | Gulls               DEPARTURE      |          |         | 1              | NONE         
 CA      | Rock pigeon         LANDING ROLL   | 0        |         | 1              | NONE         
 WA      | Barn owl                           |          |         | 1              | NONE         
 HI      | Gulls               LANDING ROLL   | 0        |         | 1              | NONE         
 TX      | Gulls               TAKE-OFF RUN   | 0        |         | 1              | NONE         
 FL      | Gulls               APPROACH       |          |         | 2-10           | NONE         
 TX      | Owls                APPROACH       |          |         | 1              | NONE         
 PA      | Bank swallow        LANDING ROLL   | 0        |         | 1              | NONE         
 AL      | Gulls               APPROACH       | 2000     | 200     | 1              | NONE         
 PA      | Swallows            DEPARTURE      |          |         | 1              | NONE         
 CT      | Horned lark                        |          |         | 2-10           |              
 MA      | Horned lark                        |          |         | 1              |              
 CA      | Northern pintail    Climb          | 1500     | 210     | 2-10           | Precautiona… 
 OH      | American robin                     |          |         | 1              |              
 PA      | American kestrel                   |          |         | 1              |              
known_species|                                                                     !     47460 rows 

Now, type gF, which will create a frequency table of all keyed columns. Once you do, you should see something like this:

 STATE   | SPECIES             count  #| percent  %| histogram                                     
 CO      | Horned lark        ‖    1117 |      2.35 | ********************************************… 
 TX      | Mourning dove           984 |      2.07 | ********************************************… 
 TX      | Rock pigeon             339 |      0.71 | ************************                      
 HI      | Pacific golden-plo…     329 |      0.69 | ***********************                       
 FL      | Barn swallow            316 |      0.67 | **********************                        
 CO      | Western meadowlark      307 |      0.65 | *********************                         
 FL      | Mourning dove           304 |      0.64 | *********************                         
 CO      | Mourning dove           282 |      0.59 | ********************                          
         | Perching birds (y)      274 |      0.58 | *******************                           
 IL      | American kestrel        273 |      0.58 | *******************                           
 IL      | Killdeer                266 |      0.56 | *******************                           
 TX      | Killdeer                260 |      0.55 | ******************                            
 UT      | Horned lark             260 |      0.55 | ******************                            
 CA      | American kestrel        258 |      0.54 | ******************                            
 FL      | Killdeer                236 |      0.50 | ****************                              
 CA      | Red-tailed hawk         229 |      0.48 | ****************                              
 TX      | Barn swallow            219 |      0.46 | ***************                               
 CA      | Cliff swallow           213 |      0.45 | ***************                               
 NY      | Barn swallow            208 |      0.44 | **************                                
 CA      | Barn owl                200 |      0.42 | **************                                
 NY      | American kestrel        196 |      0.41 | **************                                
         | Mourning dove           195 |      0.41 | *************                                 
known_species_STATE-SPECIES_freq|                                                 gF      5135 bins 

Just like we did with the state-frequency table, let’s simplify this table by removing the percent and histogram columns; navigate to each of those columns and press -, which should result in something like this:

 STATE   | SPECIES             count  #                                                           
 CO      | Horned lark            1117                                                            
 TX      | Mourning dove           984                                                            
 TX      | Rock pigeon             339                                                            
 HI      | Pacific golden-plo…     329                                                            
 FL      | Barn swallow            316                                                            
 CO      | Western meadowlark      307                                                            
 FL      | Mourning dove           304                                                            
 CO      | Mourning dove           282                                                            
         | Perching birds (y)      274                                                            
 IL      | American kestrel        273                                                            
 IL      | Killdeer                266                                                            
 TX      | Killdeer                260                                                            
 UT      | Horned lark             260                                                            
 CA      | American kestrel        258                                                            
 FL      | Killdeer                236                                                            
 CA      | Red-tailed hawk         229                                                            
 TX      | Barn swallow            219                                                            
 CA      | Cliff swallow           213                                                            
 NY      | Barn swallow            208                                                            
 CA      | Barn owl                200                                                            
 NY      | American kestrel        196                                                            
         | Mourning dove           195                                                            
known_species_STATE-SPECIES_freq|                                                  -      5135 bins 

Prepare the state-species frequency table for joining

There’s just one step left before we can join the tables: making sure that the two frequency tables share the exact same key column. (VisiData uses each sheet’s “key” columns to determine which rows to join.)

Because the key for the state-counts table is the STATE column, this table should also have STATE as its only key column. That means we need to un-key the SPECIES column. Luckily, that’s easy. Just navigate to the SPECIES column and press ! to toggle it’s status from keyed to un-keyed:

 STATE    SPECIES            | count  #                                                           
 CO      ‖ Horned lark        |    1117                                                            
 TX       Mourning dove      |     984                                                            
 TX       Rock pigeon        |     339                                                            
 HI       Pacific golden-plo…|     329                                                            
 FL       Barn swallow       |     316                                                            
 CO       Western meadowlark |     307                                                            
 FL       Mourning dove      |     304                                                            
 CO       Mourning dove      |     282                                                            
          Perching birds (y) |     274                                                            
 IL       American kestrel   |     273                                                            
 IL       Killdeer           |     266                                                            
 TX       Killdeer           |     260                                                            
 UT       Horned lark        |     260                                                            
 CA       American kestrel   |     258                                                            
 FL       Killdeer           |     236                                                            
 CA       Red-tailed hawk    |     229                                                            
 TX       Barn swallow       |     219                                                            
 CA       Cliff swallow      |     213                                                            
 NY       Barn swallow       |     208                                                            
 CA       Barn owl           |     200                                                            
 NY       American kestrel   |     196                                                            
          Mourning dove      |     195                                                            
known_species_STATE-SPECIES_freq|                                                  !      5135 bins 

Join the two frequency tables

Now, for the moment we’ve all been waiting for: Let’s join the tables!

First, press Shift-S to open the Sheets Sheet, and then press _ to widen the name column so that we read all the names:

 name                             | progressPct   | nRows  #| nCols  #| nVisibleCols  #| cursorDis> 
 sheets                           | 0            #|       5 |       9 |              9 | sheets     
 known_species_STATE-SPECIES_freq | 0            #|    5135 |       5 |              3 | 1117       
 known_species                    | 0            #|   47460 |      16 |             16 | 1          
 known_species_STATE_freq         | 0            #|      63 |       4 |              2 | 4670       
 faa-wildlife-strikes             | 0            #|   73448 |      16 |             16 | Unknown b… 
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
sheets|                                                                          _         5 sheets 

Then navigate to the known_birds_STATE-SPECIES_freq row, and press s to select it. Do the same for the known_birds_STATE_freq, so that the Sheets Sheet now looks like this:

 name                             | progressPct   | nRows  #| nCols  #| nVisibleCols  #| cursorDis> 
 sheets                           | 0            #|       5 |       9 |              9 | faa-wildl… 
 known_species_STATE-SPECIES_freq | 0            #|    5135 |       5 |              3 | 1117       
 known_species                    | 0            #|   47460 |      16 |             16 | 1          
 known_species_STATE_freq         | 0            #|      63 |       4 |              2 | 4670       
 faa-wildlife-strikes             | 0            #|   73448 |      16 |             16 | Unknown b… 
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
sheets| selected 1 more rows                                                     s         5 sheets 

Press & to raise the sheet-joining prompt, and type inner to indicate that we want to perform an inner join (a.k.a. “left join”, a.k.a. “INNER LEFT JOIN”):

 name                             | progressPct   | nRows  #| nCols  #| nVisibleCols  #| cursorDis> 
 sheets                           | 0            #|       5 |       9 |              9 | faa-wildl… 
 known_species_STATE-SPECIES_freq | 0            #|    5135 |       5 |              3 | 1117       
 known_species                    | 0            #|   47460 |      16 |             16 | 1          
 known_species_STATE_freq         | 0            #|      63 |       4 |              2 | 4670       
 faa-wildlife-strikes             | 0            #|   73448 |      16 |             16 | Unknown b… 
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
inner/outer/full/diff/append: inner                                                        5 sheets 

Press Enter to complete the action, after which you should see something like this:

 STATE    SPECIES            | count  #| state_total#                                             
 CO      ‖ Horned lark        |    1117 |        2914 ‖                                             
 CO       Western meadowlark |     307 |        2914                                              
 CO       Mourning dove      |     282 |        2914                                              
 CO       Cliff swallow      |     148 |        2914                                              
 CO       Lark bunting       |      91 |        2914                                              
 CO       Rock pigeon        |      82 |        2914                                              
 CO       American kestrel   |      78 |        2914                                              
 CO       Red-tailed hawk    |      73 |        2914                                              
 CO       Black-tailed jackr…|      61 |        2914                                              
 CO       Killdeer           |      41 |        2914                                              
 CO       Great horned owl   |      31 |        2914                                              
 CO       Black-tailed prair…|      31 |        2914                                              
 CO       Sparrows           |      30 |        2914                                              
 CO       Western kingbird   |      25 |        2914                                              
 CO       Desert cottontail  |      24 |        2914                                              
 CO       Perching birds (y) |      20 |        2914                                              
 CO       Burrowing owl      |      20 |        2914                                              
 CO       Mountain bluebird  |      16 |        2914                                              
 CO       Vesper sparrow     |      16 |        2914                                              
 CO       Red-winged blackbi…|      15 |        2914                                              
 CO       Little brown bat   |      15 |        2914                                              
 CO       Barn owl           |      15 |        2914                                              
known_species_STATE-SPECIES_freq+known_species_STATE_freq| "inner"                 &      5135 rows 

Calculate each species’ state-level percentages

Now that we have the numerator and denominator in the same sheet, let’s calculate the percentage of known-species collisions to each species in each state.

Let’s say we want the new column to appear as the last column in the sheet, so let’s navigate to the currently-last column by typing gl. Then let’s create the new column by pressing =, typing count * 100 / state_total, and then pressing Enter.

Once you do that, you should see something like this:

 STATE    SPECIES            | count  #| state_total#| count * 100 / state_total                  
 CO      ‖ Horned lark        |    1117 |        2914 | 38.33218943033631          %               
 CO       Western meadowlark |     307 |        2914 | 10.535346602608099         %               
 CO       Mourning dove      |     282 |        2914 | 9.67741935483871           %               
 CO       Cliff swallow      |     148 |        2914 | 5.078929306794784          %               
 CO       Lark bunting       |      91 |        2914 | 3.1228551818805763         %               
 CO       Rock pigeon        |      82 |        2914 | 2.8140013726835966         %               
 CO       American kestrel   |      78 |        2914 | 2.676733013040494          %               
 CO       Red-tailed hawk    |      73 |        2914 | 2.5051475634866165         %               
 CO       Black-tailed jackr…|      61 |        2914 | 2.0933424845573096         %               
 CO       Killdeer           |      41 |        2914 | 1.4070006863417983         %               
 CO       Great horned owl   |      31 |        2914 | 1.0638297872340425         %               
 CO       Black-tailed prair…|      31 |        2914 | 1.0638297872340425         %               
 CO       Sparrows           |      30 |        2914 | 1.029512697323267          %               
 CO       Western kingbird   |      25 |        2914 | 0.8579272477693891         %               
 CO       Desert cottontail  |      24 |        2914 | 0.8236101578586136         %               
 CO       Perching birds (y) |      20 |        2914 | 0.6863417982155113         %               
 CO       Burrowing owl      |      20 |        2914 | 0.6863417982155113         %               
 CO       Mountain bluebird  |      16 |        2914 | 0.5490734385724091         %               
 CO       Vesper sparrow     |      16 |        2914 | 0.5490734385724091         %               
 CO       Red-winged blackbi…|      15 |        2914 | 0.5147563486616334         %               
 CO       Little brown bat   |      15 |        2914 | 0.5147563486616334         %               
 CO       Barn owl           |      15 |        2914 | 0.5147563486616334         %               
known_species_STATE-SPECIES_freq+known_species_STATE_freq| "count * 100 / state_to =      5135 rows 

It worked! But the column name is a bit wonky and literal. Let’s make the following tweaks:

  • Rename the column by navigating to it, pressing ^ to enter the column-name-editing mode, typing pct_of_state, and then pressing Enter.
  • Press % to tell VisiData that it’s a “float”-type numeric column. (By default, VisiData assumes that newly created columns are just plain-old text.)
  • Press _ to resize the column to fit its contents more precisely

Now the sheet should look something like this:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%                              
 CO      ‖ Horned lark        |    1117 |        2914 |        38.33                               
 CO       Western meadowlark |     307 |        2914 |        10.54                               
 CO       Mourning dove      |     282 |        2914 |         9.68                               
 CO       Cliff swallow      |     148 |        2914 |         5.08                               
 CO       Lark bunting       |      91 |        2914 |         3.12                               
 CO       Rock pigeon        |      82 |        2914 |         2.81                               
 CO       American kestrel   |      78 |        2914 |         2.68                               
 CO       Red-tailed hawk    |      73 |        2914 |         2.51                               
 CO       Black-tailed jackr…|      61 |        2914 |         2.09                               
 CO       Killdeer           |      41 |        2914 |         1.41                               
 CO       Great horned owl   |      31 |        2914 |         1.06                               
 CO       Black-tailed prair…|      31 |        2914 |         1.06                               
 CO       Sparrows           |      30 |        2914 |         1.03                               
 CO       Western kingbird   |      25 |        2914 |         0.86                               
 CO       Desert cottontail  |      24 |        2914 |         0.82                               
 CO       Perching birds (y) |      20 |        2914 |         0.69                               
 CO       Burrowing owl      |      20 |        2914 |         0.69                               
 CO       Mountain bluebird  |      16 |        2914 |         0.55                               
 CO       Vesper sparrow     |      16 |        2914 |         0.55                               
 CO       Red-winged blackbi…|      15 |        2914 |         0.51                               
 CO       Little brown bat   |      15 |        2914 |         0.51                               
 CO       Barn owl           |      15 |        2914 |         0.51                               
known_species_STATE-SPECIES_freq+known_species_STATE_freq|                         _      5135 rows 

Sort by percentage

Of course, to answer our main question, we’ll need to sort the column. To sort it descendingly, press ]. Now you should see something like this:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%                              
 SK      ‖ Gulls              |       1 |           1 |       100.00                               
 NS       Striped skunk      |       1 |           1 |       100.00                               
 NL       Snow bunting       |       1 |           1 |       100.00                               
 MB       Mallard            |       1 |           2 |        50.00                               
 MB       Cedar waxwing      |       1 |           2 |        50.00                               
 AB       Canada goose       |       1 |           2 |        50.00                               
 AB       Perching birds (y) |       1 |           2 |        50.00                               
 PI       Yellow bittern     |      72 |         173 |        41.62                               
 CO       Horned lark        |    1117 |        2914 |        38.33                               
 HI       Pacific golden-plo…|     329 |         873 |        37.69                               
 AZ       Mourning dove      |     134 |         438 |        30.59                               
 MS       Mourning dove      |      60 |         197 |        30.46                               
 WV       Killdeer           |      21 |          78 |        26.92                               
 VT       American kestrel   |      29 |         109 |        26.61                               
 IA       Killdeer           |      49 |         189 |        25.93                               
 UT       Horned lark        |     260 |        1049 |        24.79                               
 WY       Greater sage-grouse|      10 |          41 |        24.39                               
 IN       Mourning dove      |     179 |         773 |        23.16                               
 LA       Barn swallow       |     168 |         737 |        22.80                               
 MD       European starling  |      90 |         405 |        22.22                               
 BC       Red-tailed hawk    |       2 |           9 |        22.22                               
 TX       Mourning dove      |     984 |        4670 |        21.07                               
known_species_STATE-SPECIES_freq+known_species_STATE_freq|                         ]      5135 rows 

Limit to rows with at least 20 collisions

Hmmm, many of the highest-ranking species-state combinations seem to come from “states” — like the striped skunk that was struck in Nova Scotia — with very few reported collisions. So let’s limit the results to species-state combinations with at least 20 reports.

To do that, we’ll create a new True/False that indicates whether a row represents 20 or more reports, and then select only the rows for which that’s true. Here’s how:

First, press = to raise the new-column prompt. Then, type count >= 20 at the prompt, and then press Enter to complete the action. Now you should see something like this:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%| count >= 20                 
 SK      ‖ Gulls              |       1 |           1 |       100.00 | False         ‖              
 NS       Striped skunk      |       1 |           1 |       100.00 | False                       
 NL       Snow bunting       |       1 |           1 |       100.00 | False                       
 MB       Mallard            |       1 |           2 |        50.00 | False                       
 MB       Cedar waxwing      |       1 |           2 |        50.00 | False                       
 AB       Canada goose       |       1 |           2 |        50.00 | False                       
 AB       Perching birds (y) |       1 |           2 |        50.00 | False                       
 PI       Yellow bittern     |      72 |         173 |        41.62 | True                        
 CO       Horned lark        |    1117 |        2914 |        38.33 | True                        
 HI       Pacific golden-plo…|     329 |         873 |        37.69 | True                        
 AZ       Mourning dove      |     134 |         438 |        30.59 | True                        
 MS       Mourning dove      |      60 |         197 |        30.46 | True                        
 WV       Killdeer           |      21 |          78 |        26.92 | True                        
 VT       American kestrel   |      29 |         109 |        26.61 | True                        
 IA       Killdeer           |      49 |         189 |        25.93 | True                        
 UT       Horned lark        |     260 |        1049 |        24.79 | True                        
 WY       Greater sage-grouse|      10 |          41 |        24.39 | False                       
 IN       Mourning dove      |     179 |         773 |        23.16 | True                        
 LA       Barn swallow       |     168 |         737 |        22.80 | True                        
 MD       European starling  |      90 |         405 |        22.22 | True                        
 BC       Red-tailed hawk    |       2 |           9 |        22.22 | False                       
 TX       Mourning dove      |     984 |        4670 |        21.07 | True                        
known_species_STATE-SPECIES_freq+known_species_STATE_freq| "count >= 20"           =      5135 rows 

Then, navigate to the new column and press Shift-F to create a frequency table of it, and navigate to the True row:

 count >= 20    count  #| percent  %| histogram                                                   
 False             4637 |     90.30 | ************************************************…           
 True          ‖     498 |      9.70 | ********                                         ‖           
                                                                                                    
                                                                                                    
                                                                                                    

Finally, press Enter to see just the rows for which count is at least 20:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%| count >= 20                 
 PI      ‖ Yellow bittern     |      72 |         173 |        41.62 | True                        
 CO       Horned lark        |    1117 |        2914 |        38.33 | True                        
 HI       Pacific golden-plo…|     329 |         873 |        37.69 | True                        
 AZ       Mourning dove      |     134 |         438 |        30.59 | True                        
 MS       Mourning dove      |      60 |         197 |        30.46 | True                        
 WV       Killdeer           |      21 |          78 |        26.92 | True                        
 VT       American kestrel   |      29 |         109 |        26.61 | True                        
 IA       Killdeer           |      49 |         189 |        25.93 | True                        
 UT       Horned lark        |     260 |        1049 |        24.79 | True                        
 IN       Mourning dove      |     179 |         773 |        23.16 | True                        
 LA       Barn swallow       |     168 |         737 |        22.80 | True                        
 MD       European starling  |      90 |         405 |        22.22 | True                        
 TX       Mourning dove      |     984 |        4670 |        21.07 | True                        
 NH       American kestrel   |      24 |         117 |        20.51 | True                        
 PI       Pacific golden-plo…|      35 |         173 |        20.23 | True                        
 SC       Killdeer           |      45 |         229 |        19.65 | True                        
 ID       American kestrel   |      30 |         172 |        17.44 | True                        
 VA       Mourning dove      |      92 |         544 |        16.91 | True                        
 VA       Killdeer           |      89 |         544 |        16.36 | True                        
 NE       Barn swallow       |      68 |         419 |        16.23 | True                        
 AR       Mourning dove      |      30 |         187 |        16.04 | True                        
 RI       American kestrel   |      35 |         219 |        15.98 | True                        
known_species_STATE-SPECIES_freq+known_species_STATE_freq_True|                   ^J       498 rows 

There you have it! The Yellow Bittern accounted for more than 40% of the reported, known-species collisions in “PI” (the FAA’s abbreviation for “USA-possessed Pacific Islands,” such as Guam). In Colorado, the Horned Lark has been nearly as collision-dominating, as has the Pacific Golden-Plover in Hawaii, and Mourning Doves in Arizona and Missouri.

Take it one step further

What if we want to find the species that are disproportionately involved in collisions in their state? How would you do that? (Hint: It involves just one more frequency table and one more join.)