An edtech CEO shares his perspective on all the big data buzz.
GUEST COLUMN | by Jose Ferreira
I was on a panel at SXSWedu this year. At the beginning, our moderator, Dr. Rod Berger, invited the audience to suggest buzzwords that the panelists shouldn’t be allowed to use. Someone said “big data.” (I tried to comply for about five minutes before I gave up.) Calling big data a buzzword is like calling “anatomy” or “supply and demand” buzzwords. How can a branch of science be a buzzword?
Someone else at SXSWedu said, “I still have yet to hear anyone explain what big data is!” as if that were an indictment of big data. To my ears, that sounds like saying, “I still have yet to hear anyone explain what the Higgs boson is!” And, fair enough, both are relatively new science. So let’s explain big data now. And then let’s examine the arguments against using this powerful personalization science to help kids learn.
Like the Higgs boson, big data has been around forever, in that it’s part of the natural law of the universe. And similarly, it has come to prominent public notice only in recent decades. But human beings have been using big data for a long time, just in bricks-and-mortar applications. Insurance and standardized tests are both examples of big data from before the Internet.
The data have always been there. What’s changed is that technology has recently made them easy to capture. The Internet and mobile make everything done on a device capturable. Scanning technology is becoming mainstream, whether biometric scanners you wear on your body or hold in your phone, or logistical scanners embedded in packages and products.
So now the mathematics of big data are suddenly usable at scale. And that’s all big data is: a type of mathematics. Just like calculus is the mathematics of change, and probability is the mathematics of likelihood, big data is the mathematics of effectiveness. It aggregates data at scale — it doesn’t work at small scale. But when you’ve got enough data in one place, and if those data are “normalized” (meaning they can be made to adhere to some central rules, standards, or taxonomies), then you can start finding interesting patterns and outliers. And therein is the payoff: some of those patterns will turn into powerful strategies that can be used to discover flaws in and improve a given process. Big data can optimize giant complex processes, both centrally and for each individual user, in a way that bricks-and-mortar data gathering — or simply ignoring data — never could. The bigger the system, the more big data adds value. And there are few systems bigger than the global education system.
Big data in education has huge potential to improve learning materials. Education by its nature produces tremendous amounts of data thanks to a) the extended amount of time students spend working with learning materials and b) the strong correlations between educational concepts, which generate cascade effects of insights. Up until now those data were not remotely capturable at scale. Now they are.
We can use these data to generate concept-level proficiency measurements. These can in turn optimize outside-of-class work, so that each student receives a constantly updating personalized textbook optimized down to the concept. Optimizing outside-of-class work means kids come to class better prepared. Big data can also support teachers by helping answer questions like “We covered some tough material last week. How well do my students understand it?” “What else do they need to do to master a given concept?” “What score would a particular student get on Friday’s quiz if he took it today?” “How much productive time has a student spent working this week?”
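To make “concept-level proficiency” a little more concrete, here is a minimal sketch in Python. It is only an illustration under simple assumptions, not a description of how any real product works: the question-to-concept mapping (`CONCEPT_OF`), the student names, and the sample responses are all hypothetical, and proficiency is taken to be nothing more than the fraction of answers a student got right on each concept.

```python
from collections import defaultdict

# Hypothetical taxonomy: maps raw question IDs onto shared concept names.
# This is the "normalizing" step; real taxonomies would be far larger.
CONCEPT_OF = {
    "q1": "fractions",
    "q2": "fractions",
    "q3": "decimals",
    "q4": "decimals",
}


def concept_proficiency(responses):
    """Return {student: {concept: fraction of answers correct}}.

    `responses` is an iterable of (student, question_id, is_correct) tuples.
    """
    correct = defaultdict(lambda: defaultdict(int))
    attempts = defaultdict(lambda: defaultdict(int))
    for student, question_id, is_correct in responses:
        concept = CONCEPT_OF[question_id]  # look up the concept for this question
        attempts[student][concept] += 1
        correct[student][concept] += int(is_correct)
    return {
        student: {c: correct[student][c] / n for c, n in by_concept.items()}
        for student, by_concept in attempts.items()
    }


# Made-up sample data, for illustration only.
responses = [
    ("ana", "q1", True),
    ("ana", "q2", True),
    ("ana", "q3", False),
    ("ana", "q4", True),
    ("ben", "q1", False),
    ("ben", "q2", True),
]
scores = concept_proficiency(responses)
```

With this toy data, a teacher could see that one student has fractions down but is shakier on decimals. A real system would need far more data and a far richer model than a simple average, which is exactly why scale matters.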
Such insight should only be a positive thing for teachers, students, parents, and others. So why are some people so scared of big data in education? There seem to be a few main arguments they use to try to scare everyone else away from big data:
- Students’ privacy will be violated. Google, Facebook, and other consumer web companies violate our privacy. But that’s only because they have an ad-based business model. They can only make money by selling your data — and degrading the product experience with ads. How do they get away with this? They give away the product. They give you the choice to watch that TV show or search for that page, and in exchange they mine your data to sell you things. I agree wholeheartedly that the idea of huge corporations selling and reselling children’s personal data to make money is totally unacceptable. But there is no particular reason to believe that this will occur in education. I don’t know of a single ad-supported business model in the traditional education space. It makes no sense for an education company to sell data, and any company that decided otherwise would be immediately destroyed by the resulting outcry. However, though I think the risk is low, the stakes are high. Everybody in the education ecosystem should be held to the highest custodial standards, and there should be little forgiveness for transgressions.
- Data will replace teachers. It won’t. This idea doesn’t even make any sense to me. Teachers and data do totally different things. Data are only additive, like x-rays in hospitals and instant replay in sports. Data add concrete information to a teacher’s observations and intuition, but they will never replace experience, personal relationships, and cultural understanding. Think of Moneyball. Statistics haven’t replaced talent scouts; they’ve just armed them with more than intuition.
- Data will be used to judge teachers. This is very unlikely, for the simple reason that classrooms don’t produce much data. A lecture produces no data at all. A flipped classroom produces a little data, but not nearly as much as people think. However, students reading textbooks and doing practice questions produce extraordinary amounts of data. So big data can optimize outside-of-class work like you wouldn’t believe, and help better prepare students for class. But because the data are largely being generated outside of class, it would be impossible to produce an algorithm for measuring teachers that is remotely as effective as simply observing them directly.
- Data dehumanizes students. This is less an argument than it is a general slur, but one does hear it used increasingly (and increasingly hysterically) by opponents of big data. “Your child is now a data point,” they say, hoping to end the debate before it starts (as if angrily denouncing a new science had a promising historical track record). Human beings produce huge quantities of data when they study. We also have a unique genetic code and measurable body chemistry, and we are subject to the laws of physics. The idea of students and teachers not taking advantage of educational data for this reason makes as much sense as doctors not prescribing medicine because it “reduces” patients to chemicals and biological processes. Can you imagine a parent screaming, “You’re reducing my child to her chemical composition!”? Um, no, we’re just giving her Tylenol. Because she has a headache. Human beings can’t be perfectly understood by science, but we are governed by science. Those who would deny data-driven personalized education materials to children are no different from religious fundamentalists who deny their children access to modern medicine because they think prayer works better.
- Big data tools will just enrich for-profit corporations. This is a classic ad hominem argument. It casts vague aspersions on everyone who works for a for-profit company, as if incorporation status alone were sufficient to judge the character of a group of people or the quality of a product. Virtually every component of the education system is made by a for-profit company — the building, the lunches, the materials. Sure, public schools themselves are not-for-profit. But teachers aren’t; they take salaries. Data opponents argue that “education is too sacred for people to profit from.” By that logic, teachers should work for free. Education is sacred. To me, it’s the most important thing in the world. So I’m delighted that the full creative energies of society, regardless of incorporation status, are focused on it. Working at a for-profit doesn’t make you a bad person and working at a non-profit doesn’t make you a saint. Let’s focus on judging only what works best for kids, not the tax status of organizations.
These arguments are the kind that one expects in the very early days of a big new movement. They will seem increasingly naïve as time goes on.
It’s early days still, but sustained, large-scale efforts to use data to improve education are currently underway. And because of the very nature of big data, the results will be highly measurable. Exactly two logical possibilities follow: Big data will either improve net learning outcomes or it won’t. And when it’s proven that it does, no school will be able to ignore big data, as doing so will give their students a structural disadvantage versus other schools, other states, or other countries. Parents and students won’t stand for it.