Skip to content

Instantly share code, notes, and snippets.

@amdevine
Last active October 19, 2021 18:20
Show Gist options
  • Save amdevine/1d676407f7b904da0903e1151a4cbcaa to your computer and use it in GitHub Desktop.
Save amdevine/1d676407f7b904da0903e1151a4cbcaa to your computer and use it in GitHub Desktop.
Count unique, non-blank taxa in Excel

1. Make table

Make tabular data into a table (Insert → Table). Assign a short name to the table (Table Design → Table Name).

Table: taxa

A B C D E F
1 kingdom phylum class order family genus
2 Animalia Mollusca Gastropoda Neogastropoda Fasciolariidae Fasciolaria
3 Animalia Arthropoda Insecta Lepidoptera Gelechiidae Fasciotina
4 Bacteria Firmicutes Clostridia Clostridiales Ruminococcaceae Fastidiosipila
5 Plantae Tracheophyta Magnoliopsida Apiales Araliaceae Fatsia
6 Bacteria Proteobacteria Gammaproteobacteria Pseudomonadales Moraxellaceae Faucicola
7 Animalia Mollusca Gastropoda Pachychilidae Faunus
8 Plantae Tracheophyta Magnoliopsida Proteales Proteaceae Faurea
9 Animalia Annelida Polychaeta Terebellida Fauveliopsidae Fauveliopsis
10 Fungi Basidiomycota Agaricomycetes Agaricales Mycenaceae Favolaschia

2. Create extra columns

Create new column, family_genus, which populates if the genus column is not empty. (This column can be used to avoid genus homonym problem.)

G2: =IF([@genus]<>"", [@family]&"_"&[@genus], [@genus])

A B C D E F G
1 kingdom phylum class order family genus family_genus
2 Animalia Mollusca Gastropoda Neogastropoda Fasciolariidae Fasciolaria Fasciolariidae_Fasciolaria
3 Animalia Arthropoda Insecta Lepidoptera Gelechiidae Fasciotina Gelechiidae_Fasciotina
4 Bacteria Firmicutes Clostridia Clostridiales Ruminococcaceae Fastidiosipila Ruminococcaceae_Fastidiosipila
5 Plantae Tracheophyta Magnoliopsida Apiales Araliaceae Fatsia Araliaceae_Fatsia
6 Bacteria Proteobacteria Gammaproteobacteria Pseudomonadales Moraxellaceae Faucicola Moraxellaceae_Faucicola
7 Animalia Mollusca Gastropoda Pachychilidae Faunus Pachychilidae_Faunus
8 Plantae Tracheophyta Magnoliopsida Proteales Proteaceae Faurea Proteaceae_Faurea
9 Animalia Annelida Polychaeta Terebellida Fauveliopsidae Fauveliopsis Fauveliopsidae_Fauveliopsis
10 Fungi Basidiomycota Agaricomycetes Agaricales Mycenaceae Favolaschia Mycenaceae_Favolaschia

3. Get unique taxa in each rank

Open a new tab and add a header for each rank.

A B C D E F
1 kingdom phylum class order family family_genus
2

In cell A2, use FILTER() to filter out all blank values, UNIQUE() to get list of all unique entries (as a spilled array), and SORT() to sort alphabetically.

A2: =SORT(UNIQUE(FILTER(taxa[kingdom], taxa[kingdom]<>"")))

A B C D E F
1 kingdom phylum class order family family_genus
2 Animalia
3 Bacteria
4 Fungi
5 Plantae

Drag the formula from A2 into B2:E2. Copy the formula into F2 - may need to go in and manually edit the column names so it pulls from family_genus, not genus.

A B C D E F
1 kingdom phylum class order family family_genus
2 Animalia Annelida Agaricomycetes Agaricales Araliaceae Araliaceae_Fatsia
3 Bacteria Arthropoda Clostridia Apiales Fasciolariidae Fasciolariidae_Fasciolaria
4 Fungi Basidiomycota Gammaproteobacteria Clostridiales Fauveliopsidae Fauveliopsidae_Fauveliopsis
5 Plantae Firmicutes Gastropoda Lepidoptera Gelechiidae Gelechiidae_Fasciotina
6 Mollusca Insecta Neogastropoda Moraxellaceae Moraxellaceae_Faucicola
7 Proteobacteria Magnoliopsida Proteales Mycenaceae Mycenaceae_Favolaschia
8 Tracheophyta Polychaeta Pseudomonadales Pachychilidae Pachychilidae_Faunus
9 Terebellida Proteaceae Proteaceae_Faurea
10 Ruminococcaceae Ruminococcaceae_Fastidiosipila

4. Count unique taxa

Use the COUNTA() function to count non-blank values. To refer to a spilled array cell, use the cell name and a pound sign (A1#).

Rank Formula Count
kingdom =COUNTA(A2#) 4
phylum =COUNTA(B2#) 7
class =COUNTA(C2#) 7
order =COUNTA(D2#) 8
family =COUNTA(E2#) 9
family_genus =COUNTA(F2#) 9
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment