The Elephant(s) In the Room – Just How Many Are There?

Counting Elephants – Using Big Data to Solve Big Problems, it’s as easy as 1-2-3

One … two … three … four hundred … five hundred thousand …. How many elephants can you count before it’s too many (or too much)?

Counting is one of the first skills we learn as children. Before addition and subtraction, the numerical building blocks of 1-2-3 are right there with the ABCs. Whether it’s your blessings or the dealer’s cards or your money, counting comes in handy. How many or how much is core to decision making. How much money or how many resources do I have? How many does the enemy have?

Counting to 10 or maybe 100 is easy, but as more needs to be counted, it becomes tedious and time intensive. The practice loses its return on energy. That’s where math, probability and statistics come in. What can be counted easily can be leveraged not only to count more but also to add meaning to what is counted.

Enter the classic “what to wear” problem (poignant for math geeks).

http://www.intmath.com/counting-probability/2-basic-principles-counting.php

Instead of laying out each combination and “counting” it, you multiply the number of shirts by the number of pants, and you know how many outfits you have. This simple example can be extended to far greater combinations…to the nth degree.
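For the code-minded, here is a minimal sketch of the outfit shortcut in Python. The wardrobe is invented purely for illustration. It counts the outfits the slow way (laying out every pair) and the fast way (multiplying the choices) and shows that both agree.

```python
from itertools import product

# Hypothetical wardrobe, invented for illustration only.
shirts = ["white", "blue", "striped"]
pants = ["jeans", "khakis"]

# The slow way: lay out every shirt/pants pair and count them.
outfits = list(product(shirts, pants))
counted = len(outfits)

# The counting principle: multiply the number of choices at each step.
calculated = len(shirts) * len(pants)

print(counted, calculated)  # both are 6
```

Add a third item, say shoes, and the same multiplication still works, while the laid-out list grows quickly.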

But what if you don’t know how many shirts and pants you have?

Big Data Counting – The Next Generation

Counting is another excellent venue for exploring Big Data.  Just as math saves hours of manipulating closet items, so Big Data can help with Big Problems, providing greater choices and better decision capability as a result.

Let’s look at a Big counting problem –

Just how many elephants are there in Africa?  (and why does anyone care?)

A shocking increase in poaching has ripped through African elephant populations. In the past 10 years, the African elephant population has taken a dramatic hit, with estimates that more than 12,000 a year have been slaughtered since 2006.

 “The threat of local extinction feels very real. In October 2013, Elephants Without Borders flew a survey over a park where we had previously counted more than 2,000 elephants. We counted just 33 live elephants and 55 elephant carcasses. That is why this research is so important.”

Dr. Mike Chase, director and founder, Elephants Without Borders.  http://www.elephantswithoutborders.org

Wildlife preservation is a delicate entente in the best of circumstances, but the lucrative draw of poaching in the many African countries where these animals live has challenged several of its iconic epicenters. Lions, rhinos and elephants are the majestic leaders of a rich wildlife pyramid whose dramatic loss crushes the whole ecological system, including the native peoples who live off that balance.

Poaching itself is lucrative. The transit from Africa to Asia raises the price of ivory from $200 to over $2,000. Because international standards outlaw this black-market material, poaching profits only illicit activity and, most dangerously, terrorism.

Elephants Without Borders

Except in South Africa

South Africa “suffers” from too many elephants. Here the growing numbers continue to roam and forage as is their nature. That means knocking over even the sturdiest of trees and stripping them of the best digestible leaves. Just imagine an elephant walking through your yard or the neighborhood park, taking down a couple of trees that look tasty. Imagine what a herd of 20-30 can do. They don’t just stand still either. They keep on the move, journeying for miles in a day and carving a pachyderm hurricane path.

In any amount, this is nature’s process, culling the forest for new vegetation. Their trails create natural fire breaks, and they dig for water which other animals use. But where farm and urban sprawl encroach on this roaming territory, it quickly becomes man versus animal. The number of touch points is growing too. The nature of elephants, their survival, is roaming. Their legendary memory too has them cross paths where man’s development has erased the past.

To strike that delicate balance, game parks in South Africa have taken to birth control and water management methods in order to keep their numbers in check.

[Figure: elephant numbers. Source: eNCA]

So What Numbers Are We Talking?

Anyone who has tried to count children at a birthday party or get all the students back into a classroom after recess knows the challenges of counting live bodies. Counting crowds is actually a science. And Science isn’t about Knowing so much as Getting a Good Estimate. Here’s how they counted President Obama’s inauguration crowd.

https://www.youtube.com/watch?v=3AwMmEYLWt0

Although a several-ton elephant is noticeably slower than a child and harder to miss, expand the search over the wildernesses of half a continent divided into several countries, some war-torn, and accurate counting is hard to imagine. But someone is trying.

Not Just Throwing Money at the Issue

That effort is the Great Elephant Census, the largest pan-African aerial survey since the 1970s, and it’s backed by one of the world’s smartest-guy-in-the-room icons, Microsoft co-founder Paul Allen. Not only does this count have deep pockets, it also has expert guidance.

ABOUT THE COUNT

The Great Elephant Census is applying a strategic, consistent approach to counting elephants in numerous countries in varying climate and terrain, with an integrated audit program in situ.

The Great Elephant Census is designed to provide accurate and up-to-date data about the number and distribution of African elephants by using standardized aerial surveys of tens or hundreds of thousands of square miles. Dozens of researchers flying in small planes will capture comprehensive observational data of elephants and elephant carcasses. Our standardized method of data collection, which is validated by an independent TAT advisor, ensures all data is impartial and accurate.

It’s somewhat like counting the crowds for President Obama’s inauguration.  Even with such meticulous effort though, most elephant accounting is predicated on “known” and “estimated” numbers.
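To make “estimated” concrete, here is a minimal Python sketch of one generic way a sample count can be scaled up: count the animals in a few surveyed strips, then extend that density to the whole area. This is an illustration only and is not claimed to be the Great Elephant Census procedure. The function name and every number below are hypothetical.

```python
def estimate_population(strip_counts, strip_areas_km2, total_area_km2):
    """Scale the animals counted in surveyed strips up to the whole area."""
    counted = sum(strip_counts)
    surveyed_km2 = sum(strip_areas_km2)
    if surveyed_km2 <= 0:
        raise ValueError("surveyed area must be positive")
    # Same density assumed everywhere: counted / surveyed * total.
    return counted * total_area_km2 / surveyed_km2


# Hypothetical survey: four 50 km^2 strips flown over a 2,000 km^2 park.
strip_counts = [12, 0, 7, 21]
strip_areas_km2 = [50.0, 50.0, 50.0, 50.0]
total_area_km2 = 2000.0

estimate = estimate_population(strip_counts, strip_areas_km2, total_area_km2)
print(round(estimate))  # 40 animals in 200 km^2 scales to 400
```

The estimate is only as good as the assumption that the unsurveyed land looks like the surveyed strips, which is why the numbers are reported as estimates rather than counts.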

Elephant Database

But … Back to Big Data

So that’s how the experts are counting elephants. Let’s explore counting elephants instead with a Big Data lens. An elephant census doesn’t have to solely be tallying head counts, albeit a magnificent head with flowing ears and strong tusks. The count can be created from a variety of data volumes that exist already and grow by the minute. In a Big Data Elephant Census, information is created by the community and also serves the communities in return.

A Big Data Elephant Census begins with a data lake of information collected from the prevalent sources: cell phone usage, transactional data, weather, heat signatures, game warden activity and reports, international shipping and markets, and of course, social media. Big Data ingests the volume, velocity and variety of the data to look for patterns that emerge. Like trying to count moving children, Big Data can exploit information that is too complex for “naked” human observation. Like picking outfits from the mathematically derived wardrobe, a Big Data Elephant Census answers how many elephants there are while eliciting the holistic picture of what that number means.

THE POINT OF COUNTING ELEPHANTS IS NOT TO KNOW HOW MANY ELEPHANTS THERE ARE. THE POINT OF COUNTING ELEPHANTS IS TO LEARN THE ELEPHANT POPULATION’S EFFECTS ON OUR LIVES – DESIRABLE AND UNDESIRABLE.

The point of counting elephants is not just to know how many elephants there are; we want to know all the factors that evolve in the elephant environment.  How does the diverse animal and vegetation habitat ebb and flow with the tramping of elephant feet?  How are indigenous and foreign humans influencing and being influenced by the elephant footprint? (har har) How are poaching and anti-poaching efforts impacting the community as well as the elephants?  How are farming and native livelihoods affecting and being affected?  What other passive economic factors, weather, and politics shift accordingly?

CONSERVATION IS NOT A STASIS …

So let’s stop trying to capture a picture and instead focus on what the flow is. Pulling a Big Data elephant count from a volume, velocity and variety of data sources articulates how the system (man and animal) manifests. Instead of trying to chase the right number of elephants, Big Data Elephant Counts observe the evolving energy to find the signals in the noise. Gentle shifts or environmental shocks are recorded in situ with all the elements and players. Big Data Elephant Counting is less reactive and more predictive, and it keeps evolving, like the trails we all carve through the forest.

Does your Doctor Know It’s Safe to Take That? Big Data Replies

Friday’s post contended that the scientific method has many holes in its application. Ben Goldacre’s “Battling Bad Science” TED Talk explains one facet of this concept.

Big Data addresses:

Why should the ancient practice of scientific method be questioned?

AUTHORITY.

As individuals in society, we hold others in regard for accomplishments that give them authority, such as a doctor for a medical degree. With the internet at our fingertips we have gained access to ever-greater amounts of information and learned some skepticism along the way, but we still retain some sheep mentality.

Goldacre points out we still have a retained awe for authority. With a simple example, he explains how authority can be accepted by a large, popular audience when the authority is actually less than ideal.

With the ubiquity of the internet, authority will only continue to be an issue for any organization or society at large. Big Data is more of an open source platform which involves creating data lakes. These draw together what currently sits in an organization’s data silos or, in the case of drug efficacy, behind corporate secrets.

“SCIENTIFIC” STUDIES

Goldacre expounds upon how cause and effect studies are “published” with basic flaws in even the simplest cases. The testing environment does not accurately, or sometimes even remotely, simulate the results touted. In addition, the plethora of factors involved is rarely accounted for. The test sample sets are representative of general or specific populations, but are these representative of YOU?

Because Big Data is able to consume a vast variety of data, not adhering to the strict control methods of the traditional scientific method frees the data to more readily present a viable pattern. Trying to hold all other variables constant in a scientific experiment is challenging at best and completely unrealistic at worst. (In real life, you can’t hold all of the experiment’s environmental factors constant to obtain the same favorable results.)

OUTCOMES

Goldacre somberly explains then that these simple examples are just that – simple. Drug studies that are the basis of doctors’ “knowledge” of treating YOU and society are based upon far more complex … and jaded processes.

Our beliefs and expectations of a drug’s efficacy shape the outcome. He gives several examples of how data is effectively rigged to produce a carefully prepared outcome, thus making the result look … like what they want you to see.

One of the premises of Big Data is finding patterns in the data, not looking to prove or disprove a theory. Therefore, trying to rig an outcome one direction or the other is not a Big Data practice.

(…so would a drug company ever want to use it?)

MISSING DATA

Goldacre’s final, sobering point was actually the jumping off point for his next TED Talk on how drug trials have dangerously biased results.

Missing data is one of the greater challenges to Big Data execution. Several methods are in practice to compensate for gaps such as null values or incongruous data sets. The difference with Big Data is that it readily addresses missing data as opposed to discounting it, as Ben Goldacre explains in his examples. Because Big Data involves huge volumes of data points, the missing data compensation practices more readily present an accurate representation of the information.
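As one small illustration of compensating for gaps, here is a minimal Python sketch of mean imputation: each null value is replaced with the average of the values that were observed. The post does not name a specific method, so this is just one simple, common choice, and the function name and readings are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical readings with two gaps (None marks a missing value).
readings = [4.0, None, 6.0, 8.0, None]
filled = impute_mean(readings)
print(filled)  # gaps filled with 6.0, the mean of 4.0, 6.0 and 8.0
```

Filling gaps this way keeps every record in play instead of throwing it out, but it also smooths over whatever made the data go missing, so it is worth checking why the gaps exist before trusting the filled values.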