Distinctive Birds

This chapter uses VisiData to answer this question: What birds account for the largest percentages of their state’s collisions?

In the process, we’ll cover the following skills:

  • Selecting rows that match a phrase
  • Toggling selected rows
  • Creating filtered sheets
  • Renaming sheets and columns
  • Creating frequency tables
  • Creating and removing key columns
  • Creating new columns
  • Joining sheets

Note

If you haven’t already, download faa-wildlife-strikes.csv, a dataset of all aircraft-wildlife collisions reported to the Federal Aviation Adminsitration between 2010 and mid-2016.

Open the wildlife-strikes dataset in VisiData

Run this command in your terminal:

vd faa-wildlife-strikes.csv

If it worked, you should see something like this:

 OPERATOR           | ATYPE          | INCIDENT_DATE      | STATE   | AIRPORT            | PHASE_O> 
 BUSINESS           | PA-28          | 05/22/15 00:00:00  | FL      | VERO BEACH MUNICIP…| APPROACH 
 BUSINESS           | BE-1900        | 06/18/15 00:00:00  | AK      | KENAI MUNICIPAL AR…| APPROACH 
 BUSINESS           | PA-46 MALIBU   | 09/20/15 00:00:00  | TX      | DAVID WAYNE HOOKS …|          
 DELTA AIR LINES    | B-717-200      | 11/07/15 00:00:00  | MO      | LAMBERT-ST LOUIS I…| APPROACH 
 BUSINESS           | BE-90 KING     | 12/17/15 00:00:00  | FL      | POMPANO BEACH AIRP…| LANDING… 
 DELTA AIR LINES    | B-757          | 07/17/15 00:00:00  | VI      | HENRY E ROHLSEN AR…|          
 DELTA AIR LINES    | B-717-200      | 08/02/15 00:00:00  | TX      | SAN ANTONIO INTL   | APPROACH 
 BUSINESS           | C-414          | 08/03/15 00:00:00  | TX      | LONE STAR EXECUTIV…| DEPARTU… 
 ALLEGIANT AIR      | MD-80          | 09/02/15 00:00:00  | FL      | TAMPA INTL         | APPROACH 
 TRANS STATES AIRLI…| EMB-145        | 09/07/15 00:00:00  | MO      | LAMBERT-ST LOUIS I…| APPROACH 
 BUSINESS           | C-172          | 11/28/15 00:00:00  | FL      | OPA-LOCKA EXECUTIV…| APPROACH 
 GOVERNMENT         | EC120          | 12/08/15 00:00:00  | CA      | NORMAN Y. MINETA S…|          
 AMERICAN AIRLINES  | A-321          | 05/06/15 00:00:00  | FL      | FORT LAUDERDALE/HO…| APPROACH 
 EXPRESSJET AIRLINES| CRJ100/200     | 05/06/15 00:00:00  | AR      | FORT SMITH REGIONA…| CLIMB    
 MESA AIRLINES      | CRJ900         | 05/08/15 00:00:00  | AR      | BILL AND  HILLARY …| LANDING… 
 BUSINESS           | HELICOPTER     | 05/06/15 00:00:00  |         | UNKNOWN            | En Route 
 DELTA AIR LINES    | A-320          | 05/07/15 00:00:00  | CA      | METRO OAKLAND INTL |          
 DELTA AIR LINES    | A-320          | 05/08/15 00:00:00  | UT      | SALT LAKE CITY INTL|          
 LUFTHANSA          | A-380          | 05/10/15 00:00:00  | TX      | GEORGE BUSH INTERC…| CLIMB    
 BUSINESS           | C-172          | 05/08/15 00:00:00  | FL      | ORLANDO SANFORD IN…| APPROACH 
 SPIRIT AIRLINES    | A-319          | 05/10/15 00:00:00  | IL      | CHICAGO O'HARE INT…| CLIMB    
 EXPRESSJET AIRLINES| EMB-145        | 05/11/15 00:00:00  | AL      | BIRMINGHAM-SHUTTLE…| LANDING… 
faa-wildlife-strikes| saul.pw/VisiData v1.5 | opening faa-wildlife-strikes as csv       73448 rows… 

Select only known species

For many of the wildlife strikes in the dataset, species is “unknown”. We want to focus just on the known species, so we’ll filter out the unknowns in this step.

First, navigate to the SPECIES column. Then, do the following:

  • Press | to raise the select-by-search prompt
  • Type unknown
  • Press Enter

Once you do that, you should see something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 FL      | VERO BEACH MUNICIP…| APPROACH       |          |         | Unknown bird       | 1        
 AK      | KENAI MUNICIPAL AR…| APPROACH       |          |         | Unknown bird       | 1        
 TX      | DAVID WAYNE HOOKS …|                |          |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Unknown bird       | 1        
 FL      | POMPANO BEACH AIRP…| LANDING ROLL   | 0        |         | Unknown bird       | 1        
 VI      | HENRY E ROHLSEN AR…|                |          |         | Unknown bird       | 1        
 TX      | SAN ANTONIO INTL   | APPROACH       |          |         | Unknown bird       | 1        
 TX      | LONE STAR EXECUTIV…| DEPARTURE      |          |         | Unknown bird       | 1        
 FL      | TAMPA INTL         | APPROACH       | 6000     |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 FL      | FORT LAUDERDALE/HO…| APPROACH       | 1500     |         | Unknown bird - sma…| 1        
 AR      | FORT SMITH REGIONA…| CLIMB          |          |         | Unknown bird - sma…| 1        
 AR      | BILL AND  HILLARY …| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
         | UNKNOWN            | En Route       |          |         | Unknown bird       | 1        
 CA      | METRO OAKLAND INTL |                |          |         | Unknown bird       | 1        
 UT      | SALT LAKE CITY INTL|                |          |         | Unknown bird       | 1        
 TX      | GEORGE BUSH INTERC…| CLIMB          |          |         | Unknown bird       | 1        
 FL      | ORLANDO SANFORD IN…| APPROACH       |          |         | Unknown bird       | 1        
 IL      | CHICAGO O'HARE INT…| CLIMB          | 12000    |         | Unknown bird       | 1        
 AL      | BIRMINGHAM-SHUTTLE…| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
faa-wildlife-strikes| "unknown" | search wrapped | 25988 matches for /unknown/ |  |     73448 rows… 

Now, all the unknown species are selected. But we want the opposite of that: only known species selected. To do that, let’s toggle the selection-ing for all rows, by typing gt (mnemonic: “global toggle”). Once you do that, you should see something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 FL      | VERO BEACH MUNICIP…| APPROACH       |          |         | Unknown bird       | 1        
 AK      | KENAI MUNICIPAL AR…| APPROACH       |          |         | Unknown bird       | 1        
 TX      | DAVID WAYNE HOOKS …|                |          |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Unknown bird       | 1        
 FL      | POMPANO BEACH AIRP…| LANDING ROLL   | 0        |         | Unknown bird       | 1        
 VI      | HENRY E ROHLSEN AR…|                |          |         | Unknown bird       | 1        
 TX      | SAN ANTONIO INTL   | APPROACH       |          |         | Unknown bird       | 1        
 TX      | LONE STAR EXECUTIV…| DEPARTURE      |          |         | Unknown bird       | 1        
 FL      | TAMPA INTL         | APPROACH       | 6000     |         | Unknown bird       | 1        
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 FL      | FORT LAUDERDALE/HO…| APPROACH       | 1500     |         | Unknown bird - sma…| 1        
 AR      | FORT SMITH REGIONA…| CLIMB          |          |         | Unknown bird - sma…| 1        
 AR      | BILL AND  HILLARY …| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
         | UNKNOWN            | En Route       |          |         | Unknown bird       | 1        
 CA      | METRO OAKLAND INTL |                |          |         | Unknown bird       | 1        
 UT      | SALT LAKE CITY INTL|                |          |         | Unknown bird       | 1        
 TX      | GEORGE BUSH INTERC…| CLIMB          |          |         | Unknown bird       | 1        
 FL      | ORLANDO SANFORD IN…| APPROACH       |          |         | Unknown bird       | 1        
 IL      | CHICAGO O'HARE INT…| CLIMB          | 12000    |         | Unknown bird       | 1        
 AL      | BIRMINGHAM-SHUTTLE…| LANDING ROLL   | 0        |         | Unknown bird - sma…| 1        
faa-wildlife-strikes|                                                            gt     73448 rows  

Now that we’ve selected our desired rows, let’s create a new sheet containing only those rows, by pressing ". The result should look something like this:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 TX      | GEORGE BUSH INTERC…| LANDING ROLL   | 0        |         | Sparrows           | 1        
 FL      | VERO BEACH MUNICIP…| DEPARTURE      |          |         | Sandhill crane     | 1        
 CA      | ONTARIO INTL ARPT  |                |          |         | Owls               | 1        
 TN      | MC GHEE TYSON ARPT | APPROACH       |          |         | Barn swallow       | 1        
 SC      | CHARLESTON AFB/INT…| DEPARTURE      |          |         | Gulls              | 1        
 CA      | WHITEMAN AIRPORT   | LANDING ROLL   | 0        |         | Rock pigeon        | 1        
 WA      | SEATTLE-TACOMA INTL|                |          |         | Barn owl           | 1        
 HI      | KAHULUI ARPT       | LANDING ROLL   | 0        |         | Gulls              | 1        
 TX      | CORPUS CHRISTI INT…| TAKE-OFF RUN   | 0        |         | Gulls              | 1        
 FL      | TALLAHASSEE REGION…| APPROACH       |          |         | Gulls              | 2-10     
 TX      | JACK BROOKS REGION…| APPROACH       |          |         | Owls               | 1        
 PA      | CAPITAL CITY ARPT …| LANDING ROLL   | 0        |         | Bank swallow       | 1        
 AL      | MOBILE REGIONAL    | APPROACH       | 2000     | 200     | Gulls              | 1        
 PA      | HARRISBURG INTL    | DEPARTURE      |          |         | Swallows           | 1        
 CT      | BRADLEY INTL       |                |          |         | Horned lark        | 2-10     
 MA      | GENERAL EDWARD LAW…|                |          |         | Horned lark        | 1        
 CA      | SACRAMENTO INTL    | Climb          | 1500     | 210     | Northern pintail   | 2-10     
 OH      | RICKENBACKER INTL  |                |          |         | American robin     | 1        
 PA      | HARRISBURG INTL    |                |          |         | American kestrel   | 1        
faa-wildlife-strikes_selectedref|                                                 "     47460 rows  

Rename the filtered sheet

By default, our sheet will be titled “faa-wildlife-strikes_selectedref”. To make it slightly easier to distinguish from other sheets, let’s rename it. To rename a sheet, do the following:

  • Press Space to raise the type-a-command prompt
  • Type rename-sheet (the command we want to use) and press Enter
  • At the next prompt, type the new name we want; in this case known_species

At this point, you should see something like this:

 CA      | SACRAMENTO INTL    | Climb          | 1500     | 210     | Northern pintail   | 2-10     
 OH      | RICKENBACKER INTL  |                |          |         | American robin     | 1        
 PA      | HARRISBURG INTL    |                |          |         | American kestrel   | 1        
rename sheet to: known_species                                                          47460 rows  

When you’ve entered the name, press Enter to complete the edit (or Control-c to cancel the edit).

Count the number of collisions per state

To get the denominator for our calculations, we’ll want to know the total number of reported collisions for each state.

Back in our known_species sheet, navigate to the STATE column:

<STATE   | AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | SPECIES            | BIRDS_S> 
 MO      | LAMBERT-ST LOUIS I…| APPROACH       |          |         | Owls               | 1        
 FL      | OPA-LOCKA EXECUTIV…| APPROACH       |          |         | Hawks              | 1        
 CA      | NORMAN Y. MINETA S…|                |          |         | Gulls              | 1        
 TX      | GEORGE BUSH INTERC…| LANDING ROLL   | 0        |         | Sparrows           | 1        
 FL      | VERO BEACH MUNICIP…| DEPARTURE      |          |         | Sandhill crane     | 1        
 CA      | ONTARIO INTL ARPT  |                |          |         | Owls               | 1        
 TN      | MC GHEE TYSON ARPT | APPROACH       |          |         | Barn swallow       | 1        
 SC      | CHARLESTON AFB/INT…| DEPARTURE      |          |         | Gulls              | 1        
 CA      | WHITEMAN AIRPORT   | LANDING ROLL   | 0        |         | Rock pigeon        | 1        
 WA      | SEATTLE-TACOMA INTL|                |          |         | Barn owl           | 1        
 HI      | KAHULUI ARPT       | LANDING ROLL   | 0        |         | Gulls              | 1        
 TX      | CORPUS CHRISTI INT…| TAKE-OFF RUN   | 0        |         | Gulls              | 1        
 FL      | TALLAHASSEE REGION…| APPROACH       |          |         | Gulls              | 2-10     
 TX      | JACK BROOKS REGION…| APPROACH       |          |         | Owls               | 1        
 PA      | CAPITAL CITY ARPT …| LANDING ROLL   | 0        |         | Bank swallow       | 1        
 AL      | MOBILE REGIONAL    | APPROACH       | 2000     | 200     | Gulls              | 1        
 PA      | HARRISBURG INTL    | DEPARTURE      |          |         | Swallows           | 1        
 CT      | BRADLEY INTL       |                |          |         | Horned lark        | 2-10     
 MA      | GENERAL EDWARD LAW…|                |          |         | Horned lark        | 1        
 CA      | SACRAMENTO INTL    | Climb          | 1500     | 210     | Northern pintail   | 2-10     
 OH      | RICKENBACKER INTL  |                |          |         | American robin     | 1        
 PA      | HARRISBURG INTL    |                |          |         | American kestrel   | 1        
known_species| "STATE"                                                            c     47460 rows  

Then, to create a frequency table for the column, press Shift-F. Once you do, you should see something like this:

 STATE    count  #| percent  %| histogram                                         ~               
 TX      ‖    4670 |      9.84 | ************************************************** ‖               
         ‖    4428 |      9.33 | ***********************************************                   
 CA      ‖    3391 |      7.14 | ************************************                             
 FL      ‖    3096 |      6.52 | *********************************                                
 CO      ‖    2914 |      6.14 | *******************************                                   
 NY      ‖    2626 |      5.53 | ****************************                                      
 IL      ‖    2153 |      4.54 | ***********************                                           
 NJ      ‖    1681 |      3.54 | *****************                                                 
 PA      ‖    1665 |      3.51 | *****************                                                 
 OH      ‖    1515 |      3.19 | ****************                                   ‖               
 MI      ‖    1234 |      2.60 | *************                                                     
 MO      ‖    1140 |      2.40 | ************                                                      
 UT      ‖    1049 |      2.21 | ***********                                                       
 WA      ‖     874 |      1.84 | *********                                                         
 HI      ‖     873 |      1.84 | *********                                                         
 GA      ‖     824 |      1.74 | ********                                                          
 NC      ‖     813 |      1.71 | ********                     
 MA      ‖     801 |      1.69 | ********                                                          
 TN      ‖     774 |      1.63 | ********                                                          
 IN      ‖     773 |      1.63 | ********                                                          
 LA      ‖     737 |      1.55 | *******                                                           
 OR      ‖     734 |      1.55 | *******                                                           
known_species_STATE_freq|                                                         F        63 bins  

Spruce up the frequency table

Because we’ll later be joining this sheet to another sheet, let’s spruce it up a bit by doing the following:

  • Navigate to the count column. Rename it by pressing ^, typing state_total, and then pressing Enter
  • Press _ to resize column slightly, so that the full column name can be seen
  • Hide the percent and histogram columns by navigating to each and pressing -.

After those steps, the sheet should look something like this:

 STATE    state_total#                                                                            
 TX              4670                                                                             
                 4428                                                                             
 CA              3391                                                                             
 FL              3096                                                                             
 CO              2914                                                                             
 NY              2626                                                                             
 IL              2153                                                                             
 NJ              1681                                                                             
 PA              1665                                                                             
 OH              1515                                                                             
 MI              1234                                                                             
 MO              1140                                                                             
 UT              1049                                                                             
 WA               874                                                                             
 HI               873                                                                             
 GA               824                                                                             
 NC               813                                                                             
 MA               801                                                                             
 TN               774                                                                             
 IN               773                                                                             
 LA               737                                                                             
 OR               734                                                                             
known_species_STATE_freq|                                                         -        63 bins  

Count the number of collisions per state and species

Now that we have the denominator — collisions per state — let’s calculate the numerator: collisions per species per state.

To do that, we’ll want to create a frequency table for the combination of the STATE and SPECIES columns. Here’s how:

  • Use the Sheets Sheet (Shift-S) to navigate back to the known_species sheet
  • Navigate to the STATE column, and press ! to make it a “key” column
  • Do the same thing for the SPECIES columns

At this point, you should see something like this:

 STATE   | SPECIES            ‖<AIRPORT            | PHASE_OF_FLT   | HEIGHT   | SPEED   | BIRDS_S> 
 MO      | Owls               ‖ LAMBERT-ST LOUIS I…| APPROACH       |          |         | 1        
 FL      | Hawks              ‖ OPA-LOCKA EXECUTIV…| APPROACH       |          |         | 1        
 CA      | Gulls              ‖ NORMAN Y. MINETA S…|                |          |         | 1        
 TX      | Sparrows           ‖ GEORGE BUSH INTERC…| LANDING ROLL   | 0        |         | 1        
 FL      | Sandhill crane     ‖ VERO BEACH MUNICIP…| DEPARTURE      |          |         | 1        
 CA      | Owls               ‖ ONTARIO INTL ARPT  |                |          |         | 1        
 TN      | Barn swallow       ‖ MC GHEE TYSON ARPT | APPROACH       |          |         | 1        
 SC      | Gulls              ‖ CHARLESTON AFB/INT…| DEPARTURE      |          |         | 1        
 CA      | Rock pigeon        ‖ WHITEMAN AIRPORT   | LANDING ROLL   | 0        |         | 1        
 WA      | Barn owl           ‖ SEATTLE-TACOMA INTL|                |          |         | 1        
 HI      | Gulls              ‖ KAHULUI ARPT       | LANDING ROLL   | 0        |         | 1        
 TX      | Gulls              ‖ CORPUS CHRISTI INT…| TAKE-OFF RUN   | 0        |         | 1        
 FL      | Gulls              ‖ TALLAHASSEE REGION…| APPROACH       |          |         | 2-10     
 TX      | Owls               ‖ JACK BROOKS REGION…| APPROACH       |          |         | 1        
 PA      | Bank swallow       ‖ CAPITAL CITY ARPT …| LANDING ROLL   | 0        |         | 1        
 AL      | Gulls              ‖ MOBILE REGIONAL    | APPROACH       | 2000     | 200     | 1        
 PA      | Swallows           ‖ HARRISBURG INTL    | DEPARTURE      |          |         | 1        
 CT      | Horned lark        ‖ BRADLEY INTL       |                |          |         | 2-10     
 MA      | Horned lark        ‖ GENERAL EDWARD LAW…|                |          |         | 1        
 CA      | Northern pintail   ‖ SACRAMENTO INTL    | Climb          | 1500     | 210     | 2-10     
 OH      | American robin     ‖ RICKENBACKER INTL  |                |          |         | 1        
 PA      | American kestrel   ‖ HARRISBURG INTL    |                |          |         | 1        
known_species|                                                                    !     47460 rows  

Now, type gF, which will create a frequency table of all keyed columns. Once you do, you should see something like this:

 STATE   | SPECIES             count  #| percent  %| histogram                                     
 CO      | Horned lark        ‖    1117 |      2.35 | ********************************************… 
 TX      | Mourning dove      ‖     984 |      2.07 | ********************************************  
 TX      | Rock pigeon        ‖     339 |      0.71 | ***************                               
 HI      | Pacific golden-plo…‖     329 |      0.69 | **************                                
 FL      | Barn swallow       ‖     316 |      0.67 | **************                                
 CO      | Western meadowlark ‖     307 |      0.65 | *************                                 
 FL      | Mourning dove      ‖     304 |      0.64 | *************                                 
 CO      | Mourning dove      ‖     282 |      0.59 | ************                                  
         | Perching birds (y) ‖     274 |      0.58 | ************                                  
 IL      | American kestrel   ‖     273 |      0.58 | ************                                  
 IL      | Killdeer           ‖     266 |      0.56 | ***********                                   
 TX      | Killdeer           ‖     260 |      0.55 | ***********                                   
 UT      | Horned lark        ‖     260 |      0.55 | ***********                                   
 CA      | American kestrel   ‖     258 |      0.54 | ***********                                   
 FL      | Killdeer           ‖     236 |      0.50 | **********                                    
 CA      | Red-tailed hawk    ‖     229 |      0.48 | **********                                    
 TX      | Barn swallow       ‖     219 |      0.46 | *********                                     
 CA      | Cliff swallow      ‖     213 |      0.45 | *********                                     
 NY      | Barn swallow       ‖     208 |      0.44 | *********                                     
 CA      | Barn owl           ‖     200 |      0.42 | ********                                      
 NY      | American kestrel   ‖     196 |      0.41 | ********                                      
         | Mourning dove      ‖     195 |      0.41 | ********                                      
known_species_STATE-SPECIES_freq|                                                gF      5135 bins  

Just like we did with the state-frequency table, let’s simplify this table by removing the percent and histogram columns; navigate to each of those columns and press -, which should result in something like this:

 STATE   | SPECIES             count  #                                                           
 CO      | Horned lark            1117                                                            
 TX      | Mourning dove           984                                                            
 TX      | Rock pigeon             339                                                            
 HI      | Pacific golden-plo…     329                                                            
 FL      | Barn swallow            316                                                            
 CO      | Western meadowlark      307                                                            
 FL      | Mourning dove           304                                                            
 CO      | Mourning dove           282                                                            
         | Perching birds (y)      274                                                            
 IL      | American kestrel        273                                                            
 IL      | Killdeer                266                                                            
 TX      | Killdeer                260                                                            
 UT      | Horned lark             260                                                            
 CA      | American kestrel        258                                                            
 FL      | Killdeer                236                                                            
 CA      | Red-tailed hawk         229                                                            
 TX      | Barn swallow            219                                                            
 CA      | Cliff swallow           213                                                            
 NY      | Barn swallow            208                                                            
 CA      | Barn owl                200                                                            
 NY      | American kestrel        196                                                            
         | Mourning dove           195                                                            
known_species_STATE-SPECIES_freq|                                                 -      5135 bins  

Prepare the state-species frequency table for joining

There’s just one step left before we can join the tables: making sure that the two frequency tables share the exact same key column. (VisiData uses each sheet’s “key” columns to determine which rows to join.)

Because the key for the state-counts table is the STATE column, this table should also have STATE as its only key column. That means we need to un-key the SPECIES column. Luckily, that’s easy. Just navigate to the SPECIES column and press ! to toggle it’s status from keyed to un-keyed:

 STATE    SPECIES            | count  #                                                           
 CO       Horned lark        |    1117 ‖                                                           
 TX       Mourning dove      |     984 ‖                                                           
 TX       Rock pigeon        |     339 ‖                                                           
 HI       Pacific golden-plo…|     329 ‖                                                           
 FL       Barn swallow       |     316 ‖                                                           
 CO       Western meadowlark |     307 ‖                                                           
 FL       Mourning dove      |     304 ‖                                                           
 CO       Mourning dove      |     282 ‖                                                           
          Perching birds (y) |     274 ‖                                                           
 IL       American kestrel   |     273 ‖                                                           
 IL       Killdeer           |     266 ‖                                                           
 TX       Killdeer           |     260 ‖                                                           
 UT       Horned lark        |     260 ‖                                                           
 CA       American kestrel   |     258 ‖                                                           
 FL       Killdeer           |     236 ‖                                                           
 CA       Red-tailed hawk    |     229 ‖                                                           
 TX       Barn swallow       |     219 ‖                                                           
 CA       Cliff swallow      |     213 ‖                                                           
 NY       Barn swallow       |     208 ‖                                                           
 CA       Barn owl           |     200 ‖                                                           
 NY       American kestrel   |     196 ‖                                                           
          Mourning dove      |     195 ‖                                                           
known_species_STATE-SPECIES_freq|                                                 !      5135 bins  

Join the two frequency tables

Now, for the moment we’ve all been waiting for: Let’s join the tables!

First, press Shift-S to open the Sheets Sheet:

 name                              nRows  #| nCols  #| nVisibleCols  #| cursorDisplay   | keyColN> 
 sheets                           ‖       5 |       8 |              8 | sheets          | name     
 known_species_STATE-SPECIES_freq ‖    5135 |       5 |              3 | Horned lark     | STATE    
 known_species                    ‖   47460 |      16 |             16 |                 | STATE S… 
 known_species_STATE_freq         ‖      63 |       4 |              2 | 4670            | STATE    
 faa-wildlife-strikes             ‖   73448 |      16 |             16 | Unknown bird    |          
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
sheets|                                                                        gk         5 sheets  

Then navigate to the known_birds_STATE-SPECIES_freq row, and press s to select it. Do the same for the known_birds_STATE_freq, so that the Sheets Sheet now looks like this:

 name                              nRows  #| nCols  #| nVisibleCols  #| cursorDisplay   | keyColN> 
 sheets                           ‖       5 |       8 |              8 | faa-wildlife-st…| name     
 known_species_STATE-SPECIES_freq ‖    5135 |       5 |              3 | Horned lark     | STATE    
 known_species                    ‖   47460 |      16 |             16 |                 | STATE S… 
 known_species_STATE_freq         ‖      63 |       4 |              2 | 4670            | STATE    
 faa-wildlife-strikes             ‖   73448 |      16 |             16 | Unknown bird    |          
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
sheets| selected 1 more sheets                                                  s         5 sheets  

Press & to raise the sheet-joining prompt, and type inner to indicate that we want to perform an inner join (a.k.a. “left join”, a.k.a. “INNER LEFT JOIN”):

 name                              nRows  #| nCols  #| nVisibleCols  #| cursorDisplay   | keyColN> 
 sheets                           ‖       5 |       8 |              8 | faa-wildlife-st…| name     
 known_species_STATE-SPECIES_freq ‖    5135 |       5 |              3 | Horned lark     | STATE    
 known_species                    ‖   47460 |      16 |             16 |                 | STATE S… 
 known_species_STATE_freq         ‖      63 |       4 |              2 | 4670            | STATE    
 faa-wildlife-strikes             ‖   73448 |      16 |             16 | Unknown bird    |          
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
                                                                                                    
inner/outer/full/diff/append/extend: inner                                      &         5 sheets  

Press Enter to complete the action, after which you should see something like this:

 STATE    SPECIES            | count  #| state_total#                                             
 CO      ‖ Horned lark        |    1117 |        2914 ‖                                             
 CO      ‖ Western meadowlark |     307 |        2914 ‖                                             
 CO      ‖ Mourning dove      |     282 |        2914 ‖                                             
 CO      ‖ Cliff swallow      |     148 |        2914 ‖                                             
 CO      ‖ Lark bunting       |      91 |        2914 ‖                                             
 CO      ‖ Rock pigeon        |      82 |        2914 ‖                                             
 CO      ‖ American kestrel   |      78 |        2914 ‖                                             
 CO      ‖ Red-tailed hawk    |      73 |        2914 ‖                                             
 CO      ‖ Black-tailed jackr…|      61 |        2914 ‖                                             
 CO      ‖ Killdeer           |      41 |        2914 ‖                                             
 CO      ‖ Great horned owl   |      31 |        2914 ‖                                             
 CO      ‖ Black-tailed prair…|      31 |        2914 ‖                                             
 CO      ‖ Sparrows           |      30 |        2914 ‖                                             
 CO      ‖ Western kingbird   |      25 |        2914 ‖                                             
 CO      ‖ Desert cottontail  |      24 |        2914 ‖                                             
 CO      ‖ Perching birds (y) |      20 |        2914 ‖                                             
 CO      ‖ Burrowing owl      |      20 |        2914 ‖                                             
 CO      ‖ Mountain bluebird  |      16 |        2914 ‖                                             
 CO      ‖ Vesper sparrow     |      16 |        2914 ‖                                             
 CO      ‖ Red-winged blackbi…|      15 |        2914 ‖                                             
 CO      ‖ Little brown bat   |      15 |        2914 ‖                                             
 CO      ‖ Barn owl           |      15 |        2914 ‖                                             
known_species_STATE-SPECIES_freq+known_species_STATE_freq| "inner"                &      5135 rows  

Calculate each species’ state-level percentages

Now that we have the numerator and denominator in the same sheet, let’s calculate the percentage of known-species collisions to each species in each state.

Let’s say we want the new column to appear as the last column in the sheet, so let’s navigate to the currently-last column by typing gl. Then let’s create the new column by pressing =, typing count * 100 / state_total, and then pressing Enter.

Once you do that, you should see something like this:

 STATE    SPECIES            | count  #| state_total#| count * 100 / state_total                  
 CO      ‖ Horned lark        |    1117 |        2914 | 38.33218943033631          %               
 CO      ‖ Western meadowlark |     307 |        2914 | 10.535346602608099         %               
 CO      ‖ Mourning dove      |     282 |        2914 | 9.67741935483871           %               
 CO      ‖ Cliff swallow      |     148 |        2914 | 5.078929306794784          %               
 CO      ‖ Lark bunting       |      91 |        2914 | 3.1228551818805763         %               
 CO      ‖ Rock pigeon        |      82 |        2914 | 2.8140013726835966         %               
 CO      ‖ American kestrel   |      78 |        2914 | 2.676733013040494          %               
 CO      ‖ Red-tailed hawk    |      73 |        2914 | 2.5051475634866165         %               
 CO      ‖ Black-tailed jackr…|      61 |        2914 | 2.0933424845573096         %               
 CO      ‖ Killdeer           |      41 |        2914 | 1.4070006863417983         %               
 CO      ‖ Great horned owl   |      31 |        2914 | 1.0638297872340425         %               
 CO      ‖ Black-tailed prair…|      31 |        2914 | 1.0638297872340425         %               
 CO      ‖ Sparrows           |      30 |        2914 | 1.029512697323267          %               
 CO      ‖ Western kingbird   |      25 |        2914 | 0.8579272477693891         %               
 CO      ‖ Desert cottontail  |      24 |        2914 | 0.8236101578586136         %               
 CO      ‖ Perching birds (y) |      20 |        2914 | 0.6863417982155113         %               
 CO      ‖ Burrowing owl      |      20 |        2914 | 0.6863417982155113         %               
 CO      ‖ Mountain bluebird  |      16 |        2914 | 0.5490734385724091         %               
 CO      ‖ Vesper sparrow     |      16 |        2914 | 0.5490734385724091         %               
 CO      ‖ Red-winged blackbi…|      15 |        2914 | 0.5147563486616334         %               
 CO      ‖ Little brown bat   |      15 |        2914 | 0.5147563486616334         %               
 CO      ‖ Barn owl           |      15 |        2914 | 0.5147563486616334         %               
known_species_STATE-SPECIES_freq+known_species_STATE_freq| "count * 100 / state_t =      5135 rows  

It worked! But the column name is a bit wonky and literal. Let’s make the following tweaks:

  • Rename the column by navigating to it, pressing ^ to enter the column-name-editing mode, typing pct_of_state, and then pressing Enter.
  • Press % to tell VisiData that it’s a “float”-type numeric column. (By default, VisiData assumes that newly created columns are just plain-old text.)
  • Press _ to resize the column to fit its contents more precisely

Now the sheet should look something like this:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%                              
 CO      ‖ Horned lark        |    1117 |        2914 |        38.33                               
 CO      ‖ Western meadowlark |     307 |        2914 |        10.54                               
 CO      ‖ Mourning dove      |     282 |        2914 |         9.68                               
 CO      ‖ Cliff swallow      |     148 |        2914 |         5.08                               
 CO      ‖ Lark bunting       |      91 |        2914 |         3.12                               
 CO      ‖ Rock pigeon        |      82 |        2914 |         2.81                               
 CO      ‖ American kestrel   |      78 |        2914 |         2.68                               
 CO      ‖ Red-tailed hawk    |      73 |        2914 |         2.51                               
 CO      ‖ Black-tailed jackr…|      61 |        2914 |         2.09                               
 CO      ‖ Killdeer           |      41 |        2914 |         1.41                               
 CO      ‖ Great horned owl   |      31 |        2914 |         1.06                               
 CO      ‖ Black-tailed prair…|      31 |        2914 |         1.06                               
 CO      ‖ Sparrows           |      30 |        2914 |         1.03                               
 CO      ‖ Western kingbird   |      25 |        2914 |         0.86                               
 CO      ‖ Desert cottontail  |      24 |        2914 |         0.82                               
 CO      ‖ Perching birds (y) |      20 |        2914 |         0.69                               
 CO      ‖ Burrowing owl      |      20 |        2914 |         0.69                               
 CO      ‖ Mountain bluebird  |      16 |        2914 |         0.55                               
 CO      ‖ Vesper sparrow     |      16 |        2914 |         0.55                               
 CO      ‖ Red-winged blackbi…|      15 |        2914 |         0.51                               
 CO      ‖ Little brown bat   |      15 |        2914 |         0.51                               
 CO      ‖ Barn owl           |      15 |        2914 |         0.51                               
known_species_STATE-SPECIES_freq+known_species_STATE_freq|                        _      5135 rows  

Sort by percentage

Of course, to answer our main question, we’ll need to sort the column. To sort it descendingly, press ]. Now you should see something like this:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%                              
 SK      ‖ Gulls              |       1 |           1 |       100.00                               
 NS      ‖ Striped skunk      |       1 |           1 |       100.00                               
 NL      ‖ Snow bunting       |       1 |           1 |       100.00                               
 MB      ‖ Mallard            |       1 |           2 |        50.00                               
 MB      ‖ Cedar waxwing      |       1 |           2 |        50.00                               
 AB      ‖ Canada goose       |       1 |           2 |        50.00                               
 AB      ‖ Perching birds (y) |       1 |           2 |        50.00                               
 PI      ‖ Yellow bittern     |      72 |         173 |        41.62                               
 CO      ‖ Horned lark        |    1117 |        2914 |        38.33                               
 HI      ‖ Pacific golden-plo…|     329 |         873 |        37.69                               
 AZ      ‖ Mourning dove      |     134 |         438 |        30.59                               
 MS      ‖ Mourning dove      |      60 |         197 |        30.46                               
 WV      ‖ Killdeer           |      21 |          78 |        26.92                               
 VT      ‖ American kestrel   |      29 |         109 |        26.61                               
 IA      ‖ Killdeer           |      49 |         189 |        25.93                               
 UT      ‖ Horned lark        |     260 |        1049 |        24.79                               
 WY      ‖ Greater sage-grouse|      10 |          41 |        24.39                               
 IN      ‖ Mourning dove      |     179 |         773 |        23.16                               
 LA      ‖ Barn swallow       |     168 |         737 |        22.80                               
 MD      ‖ European starling  |      90 |         405 |        22.22                               
 BC      ‖ Red-tailed hawk    |       2 |           9 |        22.22                               
 TX      ‖ Mourning dove      |     984 |        4670 |        21.07                               
known_species_STATE-SPECIES_freq+known_species_STATE_freq|                        ]      5135 rows  

Limit to rows with at least 20 collisions

Hmmm, many of the highest-ranking species-state combinations seem to come from “states” — like the striped skunk that was struck in Nova Scotia — with very few reported collisions. So let’s limit the results to species-state combinations with at least 20 reports.

To do that, we’ll use z|, VisiData’s “select by expression” command.

First, type z| to raise the selection prompt. Then, type count >= 20:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%                              
 SK      ‖ Gulls              |       1 |           1 |       100.00                               
 NS      ‖ Striped skunk      |       1 |           1 |       100.00                               
 NL      ‖ Snow bunting       |       1 |           1 |       100.00                               
 MB      ‖ Mallard            |       1 |           2 |        50.00                               
 MB      ‖ Cedar waxwing      |       1 |           2 |        50.00                               
 AB      ‖ Canada goose       |       1 |           2 |        50.00                               
 AB      ‖ Perching birds (y) |       1 |           2 |        50.00                               
 PI      ‖ Yellow bittern     |      72 |         173 |        41.62                               
 CO      ‖ Horned lark        |    1117 |        2914 |        38.33                               
 HI      ‖ Pacific golden-plo…|     329 |         873 |        37.69                               
 AZ      ‖ Mourning dove      |     134 |         438 |        30.59                               
 MS      ‖ Mourning dove      |      60 |         197 |        30.46                               
 WV      ‖ Killdeer           |      21 |          78 |        26.92                               
 VT      ‖ American kestrel   |      29 |         109 |        26.61                               
 IA      ‖ Killdeer           |      49 |         189 |        25.93                               
 UT      ‖ Horned lark        |     260 |        1049 |        24.79                               
 WY      ‖ Greater sage-grouse|      10 |          41 |        24.39                               
 IN      ‖ Mourning dove      |     179 |         773 |        23.16                               
 LA      ‖ Barn swallow       |     168 |         737 |        22.80                               
 MD      ‖ European starling  |      90 |         405 |        22.22                               
 BC      ‖ Red-tailed hawk    |       2 |           9 |        22.22                               
 TX      ‖ Mourning dove      |     984 |        4670 |        21.07                               
select by expr: count >= 20                                                      z|      5135 rows  

Next, press Enter to complete the action. Now you should see something like this:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%                              
 SK      ‖ Gulls              |       1 |           1 |       100.00                               
 NS      ‖ Striped skunk      |       1 |           1 |       100.00                               
 NL      ‖ Snow bunting       |       1 |           1 |       100.00                               
 MB      ‖ Mallard            |       1 |           2 |        50.00                               
 MB      ‖ Cedar waxwing      |       1 |           2 |        50.00                               
 AB      ‖ Canada goose       |       1 |           2 |        50.00                               
 AB      ‖ Perching birds (y) |       1 |           2 |        50.00                               
 PI      ‖ Yellow bittern     |      72 |         173 |        41.62                               
 CO      ‖ Horned lark        |    1117 |        2914 |        38.33                               
 HI      ‖ Pacific golden-plo…|     329 |         873 |        37.69                               
 AZ      ‖ Mourning dove      |     134 |         438 |        30.59                               
 MS      ‖ Mourning dove      |      60 |         197 |        30.46                               
 WV      ‖ Killdeer           |      21 |          78 |        26.92                               
 VT      ‖ American kestrel   |      29 |         109 |        26.61                               
 IA      ‖ Killdeer           |      49 |         189 |        25.93                               
 UT      ‖ Horned lark        |     260 |        1049 |        24.79                               
 WY      ‖ Greater sage-grouse|      10 |          41 |        24.39                               
 IN      ‖ Mourning dove      |     179 |         773 |        23.16                               
 LA      ‖ Barn swallow       |     168 |         737 |        22.80                               
 MD      ‖ European starling  |      90 |         405 |        22.22                               
 BC      ‖ Red-tailed hawk    |       2 |           9 |        22.22                               
 TX      ‖ Mourning dove      |     984 |        4670 |        21.07                               
known_species_STATE-SPECIES_freq+known_species_STATE_freq| "count >= 20" | searc z|      5135 rows… 

Finally, press " to create a new sheet containing only the selected rows:

 STATE    SPECIES            | count  #| state_total#| pct_of_state%                              
 PI      ‖ Yellow bittern     |      72 |         173 |        41.62                               
 CO      ‖ Horned lark        |    1117 |        2914 |        38.33                               
 HI      ‖ Pacific golden-plo…|     329 |         873 |        37.69                               
 AZ      ‖ Mourning dove      |     134 |         438 |        30.59                               
 MS      ‖ Mourning dove      |      60 |         197 |        30.46                               
 WV      ‖ Killdeer           |      21 |          78 |        26.92                               
 VT      ‖ American kestrel   |      29 |         109 |        26.61                               
 IA      ‖ Killdeer           |      49 |         189 |        25.93                               
 UT      ‖ Horned lark        |     260 |        1049 |        24.79                               
 IN      ‖ Mourning dove      |     179 |         773 |        23.16                               
 LA      ‖ Barn swallow       |     168 |         737 |        22.80                               
 MD      ‖ European starling  |      90 |         405 |        22.22                               
 TX      ‖ Mourning dove      |     984 |        4670 |        21.07                               
 NH      ‖ American kestrel   |      24 |         117 |        20.51                               
 PI      ‖ Pacific golden-plo…|      35 |         173 |        20.23                               
 SC      ‖ Killdeer           |      45 |         229 |        19.65                               
 ID      ‖ American kestrel   |      30 |         172 |        17.44                               
 VA      ‖ Mourning dove      |      92 |         544 |        16.91                               
 VA      ‖ Killdeer           |      89 |         544 |        16.36                               
 NE      ‖ Barn swallow       |      68 |         419 |        16.23                               
 AR      ‖ Mourning dove      |      30 |         187 |        16.04                               
 RI      ‖ American kestrel   |      35 |         219 |        15.98                               
known_species_STATE-SPECIES_freq+known_species_STATE_freq_selectedref|            "       498 rows  

There you have it! The Yellow Bittern accounted for more than 40% of the reported, known-species collisions in “PI” (the FAA’s abbreviation for “USA-possessed Pacific Islands,” such as Guam). In Colorado, the Horned Lark has been nearly as collision-dominating, as has the Pacific Golden-Plover in Hawaii, and Mourning Doves in Arizona and Missouri.

Take it one step further

What if we want to find the species that are disproportionately involved in collisions in their state? How would you do that? (Hint: It involves just one more frequency table and one more join.)