Science Is Real. Measurement Is Real. Improvement Is Real

Bill Gates, the co-founder of the company I work for and a personal hero of mine, has an op-ed in the Wall Street Journal titled “My plan to fight the world’s biggest problems.” It’s an exciting piece because it ties together several of my recent posts very well.

Science allows us to predict, control, and improve variation in the world. In order to actually make progress toward these goals, it’s important to establish exemplars of great work. This is enabled through operational definitions that allow concepts to be measured. The quest for progress in science collapses when measurement becomes too difficult or too expensive.

But the reverse is also true: progress in science begins when measurement becomes accessible.

Bill Gates’ op-ed is so awesome because he brings us back to the real world. When someone says “science,” others think of some cartoon view of men in white coats in a laboratory. When someone says that the goal of science is the prediction, improvement, and control of variation, someone else will say that such is a “very narrow definition of science, downgrading as it does understanding and explanation.”

But the person who writes like Bill Gates does — who never even bothers with the word “science” and hammers home that improvements are real:

Such measuring tools, Mr. Rosen writes, allowed inventors to see if their incremental design changes led to the improvements—such as higher power and less coal consumption—needed to build better engines. There’s a larger lesson here: Without feedback from precise measurement, Mr. Rosen writes, invention is “doomed to be rare and erratic.” With it, invention becomes “commonplace.”

In the past year, I have been struck by how important measurement is to improving the human condition. You can achieve incredible progress if you set a clear goal and find a measure that will drive progress toward that goal—in a feedback loop similar to the one Mr. Rosen describes.

This may seem basic, but it is amazing how often it is not done and how hard it is to get right. Historically, foreign aid has been measured in terms of the total amount of money invested—and during the Cold War, by whether a country stayed on our side—but not by how well it performed in actually helping people. Closer to home, despite innovation in measuring teacher performance world-wide, more than 90% of educators in the U.S. still get zero feedback on how to improve.

An innovation—whether it’s a new vaccine or an improved seed—can’t have an impact unless it reaches the people who will benefit from it. We need innovations in measurement to find new, effective ways to deliver those tools and services to the clinics, family farms and classrooms that need them.

… that’s the sort of person who can make a difference. The theory of science, measurement, and improvement is all left below the surface. What remains is a how-to guide for building a better world.

I write this blog for selfish reasons: I enjoy learning about the world. Bill Gates does what he’s doing to change the world.

This Too Shall Pass

The Big Think has a rather poorly worded article, “Can we reach the end of knowledge?”

The article borders on incomprehensibility, because it confuses three things: ways of knowing, which are how we understand the world; science, one way of knowing based on testing falsifiable hypotheses; and normal science, which is a social phenomenon capable of scientific progress through the exemplars of good research.

[Figure: ways of knowing]

Humans will have “ways of knowing” as long as we exist, and science as long as we desire it, so the only sensible way to ask the question is how normal science will end: how will we stop making scientific progress?

Assuming no nuclear holocaust or other calamity, we will stop making progress in science for the same reason that we stopped making progress in the construction of propeller planes (a technology that has been in decline since the 1940s): the costs will exceed the benefits.

Three broad possible mechanisms for the end of normal science, therefore, are:

1. An increase in the costs of normal science, all other things being equal, or
2. A decrease in the benefits of normal science, all other things being equal, or
3. Some external change, in other words, all things stop being equal.

One way the costs of normal science might increase is if non-scientific fields outbid scientific fields for workers whose skills are essential to science. We may already be seeing this happen. A while ago, Razib Khan had a much better written article, “The Real End of Science,” in which he noted the increase in scientific cheating. This cheating presumably goes undetected because there are too few scientists relative to the work available to them, and relative to how much we are paying them.

[Figure: article retractions, via GNXP]

Related to this, normal science may end because of a decrease in its benefits. Perhaps the economic return on capital in the short, medium, and long terms will be relatively low for scientific investments as opposed to capital improvements, and so it will not make sense to pay enough for scientists to engage in research that can make progress.

Thirdly, the ecosystem that supports normal science might collapse, changing the costs and benefits simultaneously. For instance, folks like Diane Ravitch are openly hostile to normal science and the federal-academic complex that supports it. A coalition of leftists and rightists could take down or deform the Large Research Universities and the Grant Funding Agencies to greatly retard normal science, subjecting them to the same lobotomy of low wages that has destroyed the American teaching profession.

Of course normal science will end. The important questions are when it will end, and who will miss it?

The Search for Academic Utility

Over the past few weeks I talked a lot about “paradigms.” Paradigms are “research programs” that focus on a few exemplars of high quality work. This allows science to make progress, and breaks up “old boy” networks by privileging results over connections. The need for progress also allows students to have better lives after they graduate. Professors, like all people, crave money, power and respect.

Thus, normal science, paradigms, research programs, and exemplars align the need for progress, students’ need for good lives, and professors’ need for professional accomplishment. This is how academia works. Science is not a cartoon. It is a great human achievement that lets human beings predict, control, and improve variation in the objects scientists study.

Normal Science is good because it is useful, not because it is True.

Consider three areas of work: race-based explanations for school performance, my UFO theory, and the ancient astronaut theory of Great Pyramid construction. If I had to bet, I would bet a great deal of money that there is a very strong impact of race on academic performance not explainable by income; I would bet a small amount of money that the aliens at Roswell were from Japan; and I would bet against someone claiming that the logistics of Great Pyramid construction were designed by creatures from beyond the Moon.

[Figure: ancient astronauts]

But attempting to found an academic career on any of these theories would lead to a failure to gain tenure. The reason is that all are currently outside the normal science of educational psychology, digital humanities, and Egyptology. None of these theories is currently useful in its field, so none are pursued.

[Figure: ways of knowing]

Normal Science is just part of Science; Science and Inquiry, in turn, are two of the ways of knowing about the world. There’s more to this world than is captured in data sets. My friend Mark Safranski recently cautioned readers after linking to a data set, stating “there are hidden qualitative decisions in who did the counting, how and by what yardstick.” Indeed, Normal Science has even more limitations than that.

A lot of grief is caused by considering Science the search for Truth. It may be that, but Normal Science is the search for utility in an academic context.

A Lucid Visit

Yesterday I had a lucid dream of visiting my grandparents.

That is, I had a dream of it, but I was aware that I was dreaming, so I could make the most of my time.

Lucid dreaming requires being in the hypnagogic state, where you possess consciousness without wakefulness. You can enter a hypnagogic state from wakefulness, or from dreamland. The problem in either case is maintaining consciousness, as it’s easy to lose in dreamland.

Yesterday, I entered the hypnagogic state by counting to myself while falling asleep. I first began counting sheep, but that was too cartoony, so then I imagined trying to count sheep in a pen, then cattle in a field, and that became cattle in my grandfather’s field. Soon I was counting the steps to his house. Then I was in a hypnagogic state.

I did not want to lose consciousness, so I then looked down while moving. In software terms, the human mind has a “known bug” in the graphics driver while sleeping: if you look down while you’re walking in a dream, your feet will either be invisible or look very, very strange. This is so noticeable that even in dreamland, it alerts your consciousness. So you can stay lucid even in dreamland by looking at your feet while walking.

In a lucid dream you can control your environment (unlike a normal dream, which is like watching a movie). You can also warp your environment if you want to, though this requires a noticeable act of will. Yesterday, I just controlled what I did and where I went, but I let dreamland unfold as it wanted to.

I visited the garage, and saw the things inside vividly and individually. “There are things here I never asked about,” I thought. Outside the garage, I saw the sod house flicker into and out of existence.

I entered the farmhouse through the front door. I saw the little entryway, and heard all the sounds inside: WNAX on the radio, my grandfather sitting down by the table, my grandma standing; my dad was there too. I heard them all. I felt the shadows of the living, but I only heard my grandpa, my grandma, and my dad.

The sounds and the textures were hyper-real, though visually everything was like a ‘progressive render,’ where it became noticeably clearer as I focused. I saw the little TV on the refrigerator. I walked from the kitchen to the dining room. I saw the old phone, the desk with the recorder that my idiot uncle gave my grandparents, the plants, and the cabinet with the radio. (I knew there was an Atari in there somewhere, though I did not look for it.)

I passed through the glass portico into the living room. I felt the tape on the large comfortable chairs. I felt the shadows of the living again. I saw the couch, the painting above the couch, and the chairs on each side. The old television (that I caused havoc with when I was young). The long table with the storage area underneath, which I once hid in. The bull.

I saw the loveseat, the window, and walked to the back entryway. I was hopeful because there was a building set I loved, that belonged to my uncle when he was young, and I wanted to see the brand name, but I could not make it out. I could see the pieces vividly, see the army men and the home-made Parcheesi set, but I could not make out the brand name.

Disappointed, I walked through the remaining rooms of the house. Each was vivid. The downstairs bathroom, my grandfather’s room (in which I had a nightmarish flashback to reality, back after my grandfather died, going through his things with my mom, then back to dreamland). Then the hall again, then up the stairs. I felt the texture, again hyperreal. I saw the old fire alarm / extinguisher / whatever it was — the least safe home-safety device ever created, seemingly constructed to explode glass outward during a fire. “I knew that would kill us all one day,” I thought.

Then I woke.

New Blog Recommendation

You read tdaxp, so you’re smart.

That’s self-serving flattery, but not untrue. If you’re reading this post, you like intelligent, thought-provoking, and unexpectedly combative posts, like my posts on education reform, scientific research programs, the rise of Christianity, and the Chinese Civil War. You’re the sort of person who probably has also read Tom Barnett on globalization, Mark Safranski on history, Razib Khan on genetics, Catholicgauze on geography, and Lion of the Blogosphere on social class.

With this in mind, I recommend Miss Nurse, RN.

(Parenthetical note: I am writing this from China, and the only ones of these sites to be blocked are Lion of the Blogosphere and Miss Nurse, RN. Make of that what you will.)

I have little tolerance for health-advice pseudoscience or sophistry, but I like how “Miss Nurse” takes a critical but educated view of personal health. Her posts don’t treat the reader as an idiot, and they offer no easy answers.

In other words, they are what we need as we become a sicker and sicker society.

Read Miss Nurse, RN.

The Progress of the Humanities

Over on Facebook, my friend Adam Elkus linked me to an article by David Lake, titled “Theory is Dead, Long Live Theory: The End of the Great Debates and the Rise of Eclecticism in International Relations. [pdf]” The piece is extremely strong, and describes International Relations as split into two camps: one engaged in normal science capable of progress, and the other a tiresome collection of “Great Debates” that are never answered.

There’s a lot to love in Lake’s paper — it really is very high quality — but the most evocative image from it is the threat that International Relations will split into two fields that do not even study the same phenomenon. What if the scientists focus on experiments (and quasi-experiments) that can be conducted in the here and now, while the Great Debaters retreat into history and just-so stories?

To me, the best hope to save International Relations from such a fate lies in the “digital humanities.” The digital humanities are not just a method for those interested in the past to escape the “humanities ghetto” of low employment and low wages — rather, the “Digital Humanities” use Big Data techniques to understand our common past, in the same way that companies like Facebook use many of the same techniques to understand many private pasts. (Some more information on the digital humanities is available on the personal site of Jason Heppler of Stanford University.)

As an example, take Lake’s discussion of Zara Steiner’s Triumph of the Dark, a narrative history of the outbreak of the Second World War. Lake notes the rigor of the book, but sadly states that such a work can generate no hypotheses or tests. But a digital humanities approach — say, working from the massive newspaper, magazine, book, and census corpora at our disposal — is not so limited. It is easy to imagine hypotheses that explain the motives of leaders with that amount of data to work with. Perhaps the degree to which “Hitler wanted war” can be tracked by measuring the day-to-day bellicosity of the written works of those he met with? Or might the locations where we know Neville Chamberlain spent certain parts of his life be linked to pro-peace inflections in the lives of others?
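The sort of corpus-based measurement imagined above can be sketched in a few lines. This is a toy illustration, not a real digital-humanities pipeline: the lexicon, function names, and scoring rule here are all invented for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical lexicon; a real study would use a validated dictionary.
BELLICOSE_TERMS = {"war", "attack", "invade", "mobilize", "rearm", "conquer"}

def bellicosity(text: str) -> float:
    """Fraction of tokens in `text` that come from the bellicose lexicon."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[term] for term in BELLICOSE_TERMS) / len(tokens)

def daily_bellicosity(corpus: list[tuple[date, str]]) -> dict[date, float]:
    """Average bellicosity per day across a corpus of (date, text) documents."""
    by_day: dict[date, list[float]] = {}
    for day, text in corpus:
        by_day.setdefault(day, []).append(bellicosity(text))
    return {day: sum(scores) / len(scores) for day, scores in by_day.items()}
```

Run over dated diplomatic correspondence, a time series like this is exactly the kind of operationalized measure that turns a narrative question into a testable one.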

International Relations is the science dedicated to predicting, controlling, and improving the behavior of States. This should be done through hypothesis testing, model building, and the methods of the digital humanities. There are many ways to advance science.

In the chaos of old boy networks – in stagnant fields with no progress – it sucks to be young. But in all those areas where science and technology march hand-in-hand toward progress, it is a joy to be young!

Predicting “Null Results” with Science

Chris Blattman and Daniel Nexon both link to a paper, “Oil and Conflict: What Does the Cross Country Evidence Really Show?,” published in AEJ: Macro (the article is gated, but an older version is available from SSRN).

The paper purports to show no correlation between the presence of oil and violence. This may or may not be true — the issue seems complicated and as long as North Dakota doesn’t erupt into another round of midwest violence, my life won’t change much one way or another.

What was interesting was Dan Nexon’s commentary on it. I once would have agreed with Nexon’s comment. That was before I learned the tools that allow one to conduct research. Anyway, Nexon’s comment:

Indeed, no one — and I mean no one — who has ever invoked “Popper” or “falsification” as a standard for scientific inquiry should be allowed into a proseminar while graduate students remain actively dissuaded from pursuing research with null results because everyone knows you can’t get null results published.

By “null result,” Dr. Nexon actually means a lack of significant correlation. Nexon’s comment only makes sense if you assume he has been exclusively exposed to fully-saturated models, of the r2 = .15, p = .04 sort beloved by first-year graduate students.

The rigorous way to test for a lack of correlation is in a structural model. For instance, take this structural model from my dissertation:

[Figure: structural model from my dissertation]

Each line with a number on it is analogous to an r2 finding in a fully saturated model. (You can read the whole thing if you want details ;-) ). But more importantly, every box-box, box-circle, or circle-circle pair with no line is assumed to be a “null result,” apart from the variation explained by following the lines of correlation between them. There are some easy introductions to structural equation modeling available. Structural equation modeling allows you not just to have science — but even normal science — while testing for a lack of correlation between variables.
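The idea that a missing path is itself a testable claim can be illustrated without SEM software. The sketch below (a toy simulation, not part of the dissertation model) builds a chain X → Y → Z with no direct X → Z path; the marginal X-Z correlation is substantial, while the partial correlation controlling for Y is near zero, which is precisely the "null result" the model assumes.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulate a causal chain X -> Y -> Z with NO direct X -> Z path.
# In a structural model, the missing X-Z line is itself a testable claim.
x = rng.normal(size=n)
y = 0.7 * x + rng.normal(size=n)
z = 0.5 * y + rng.normal(size=n)

def partial_corr(a, b, given):
    """Correlation of a and b after regressing `given` out of each."""
    resid_a = a - np.polyval(np.polyfit(given, a, 1), given)
    resid_b = b - np.polyval(np.polyfit(given, b, 1), given)
    return np.corrcoef(resid_a, resid_b)[0, 1]

r_xz = np.corrcoef(x, z)[0, 1]        # marginal correlation: clearly nonzero
r_xz_given_y = partial_corr(x, z, y)  # partial correlation: near zero
```

A fully saturated model would report the marginal correlation and stop; the structural view asks whether the association survives once the hypothesized paths are accounted for.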

Karl Popper’s work on falsification is a hallmark of science. It is required for science to fulfill its objective of predicting, controlling, and improving behavior of the objects we study.

Controversies in Normal Science

tl;dr: “Normal Science” refers to science when scientists focus on making progress, not just arguing in circles. The difference between frequentist and bayesian — or “go-up” and “go-next” — statistics matters much more to the future of normal science than the nonsense that Walt and Mearsheimer came up with in a recent article.

Will F. Moore, at his blog A Second Mouse, has a really funny post up titled “Commentary on Mearsheimer and Walt.” Not surprisingly, it is a commentary on Mearsheimer’s and Walt’s (d’uh! :-) ) recent post, which I also criticized. Basically, Mearsheimer and Walt wrote a piece in which they demonstrated deep confusion about scientific methods, and lamented the decline of the “old boys network” and its replacement by objective methods of evaluation.

One of the ridiculous parts of Mearsheimer and Walt’s column is their inability to distinguish substantive from non-substantive divisions in normal science. For instance, a large part of Mearsheimer and Walt’s piece is dedicated to a discussion of “scientific realism,” which appears to be a confused discussion of instrumental validity. Mearsheimer and Walt completely miss the division of scientific research into frequentist and bayesian camps, which Will Moore humorously emphasizes:

Quantitative approaches—particularly the misapplication of hypothesis testing methods which make complete sense in the context of survey research but no sense whatsoever in the context of the analysis of one-off populations—may be wrong, but at least we can systematically say why they are wrong.[4] Grand theory?—welcome to the narrative fallacy and that wonderful little hit of dopamine that your brain gives you in response to any coherent story. And that’s all they’ve got to work with.

..

[4] And again, we know of plenty of alternatives, including the rapid emergence of Bayesian model averaging which is likely to wipe out the cult of incremental frequentist garbage can models. The cult otherwise known by the initials APSR and AJPS.

Previously on this blog, I’ve referred to “frequentist” and “bayesian” statistics as “go-up” and “go-next,” because frequentist work tends to emphasize building a model of reality, while bayesian models tend to focus on predicting what will happen next. As I wrote previously:

The Go-Up view of statistics is that statistics measures the population from which an observation comes. The appropriate way to go-up is to wait until you have a sufficient number of observations, and then generalize about the population from those observations. This is the method that Derbyshire was describing in 2010. A large number of observations of academic performance show consistent gaps between black and white learners. Because we’re “going-up” from observations to populations, we can conclude some things about the population, and how outcomes in the population should work out overall, but it makes no sense to try to predict any given student’s success based on this. We’re going-up, not going-next.

The Go-Next view of statistics is that statistics gives us the likelihood of something being true, based on what has come before. In Go-Next statistics, population averages are beside the point. What matters is guessing what’s going to happen next, based on what you’ve seen before. The whole point is to guess what’s going to work for individuals you know only a few things about, based on your experience with other individuals who shared some things with the new strangers.

..

The superstructure of science changes as the infrastructure of the economy changes. The Go-Next philosophy of statistics, once the peasant stepchild of the serene Go-Up interpretation, now reigns supreme.
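The go-up/go-next contrast can be made concrete with a coin-flipping sketch. This is a minimal illustration under an assumed uniform prior; the function names are mine, not from the post.

```python
# A coin of unknown bias, observed as counts of heads and tails.
# Function names and the Beta(1, 1) prior are illustrative assumptions.

def go_up_estimate(heads: int, tails: int) -> float:
    """'Go-up' (frequentist): describe the population rate behind the data."""
    return heads / (heads + tails)

def go_next_probability(heads: int, tails: int) -> float:
    """'Go-next' (bayesian): posterior predictive probability that the NEXT
    flip is heads, under a uniform Beta(1, 1) prior (Laplace's rule of
    succession)."""
    return (heads + 1) / (heads + tails + 2)
```

The go-up number describes the coin; the go-next number is a bet on the very next flip, and the two only converge as the data piles up.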

The unfolding victory of Go-Next Statistics matters much, much more than, say, the Copernican Revolution. The number of people whose daily conversations were actually impacted by Copernicus may have been a few dozen, all involved in the Papal-Academic complex.

How many times a day does Facebook’s decision of which news to share impact you?

There are real controversies and real research programs in normal science.

Too bad Walt and Mearsheimer know so little about normal science they were unable to identify either.

Money, Power, and Normal Science

Fabio Rojas has a post up titled Theory Death in Political Science. It links to a post by Stephen Saideman, “Leaving Grand Theorists Behind,” which was published at Saideman’s Semi-Spew. (A companion piece was also published at Duck of Minerva and discussed by me earlier.)

Here’s the beginning of the post:

A definition: theory death is when some intellectual group tires of theory based on armchair speculation. Of course, that doesn’t mean that people stop producing theory. Rather, it means that “theory” no longer means endless books based on the author’s insights. Instead, people produce theory that responds to, or integrates, or otherwise incorporates a wealth of normal science research. In sociology, theory death seems to have happened sometime in the 1980s or 1990s. For example, recent theory books like Levi-Martin’s Social Structures or McAdam and Fligstein’s A Theory of Fields are extended discussions of empirical research that culminate in broader statements. The days of endlessly quoting and reinterpreting Weber are over. :(

Now, it seems, theory death is hitting some areas of political science.

What Fabio Rojas calls “theory death” is the “normalization of science.” That is, the establishment of methods that allow for progress in the prediction, control, and improvement of behavior of some object of study (molecule, person, State, etc.) over time.

The next line is particularly important:

Science becomes normalized when the power the Old Boys network achieves through limiting competition is overtaken by the money available for creating progress.

There have been two great flowerings of science in American history. Both emerged from the establishment of the great American University System in the late 19th century, but they accelerated at different times. As I wrote previously:

Following the Second World War science boom, the federal government accelerated the rise of the American research universities. From the Second World War to the Vietnam War, physics was a favorite area for funding. From this we received many new physical inventions, such as the transistor. After the Vietnam War, medicine became the favorite area for funding. Now we have great medical breakthroughs.

While social science research funding is only a fraction of medical research funding, the federal-academic complex ensures that there is bleed-through from the health sciences to the social sciences as well, through the bureaucratic momentum behind peer-reviewed scientific research funding. Such funding requires that researchers seek to achieve progress in some area, which of course privileges normal science (which is capable of achieving progress) relative to non-paradigmatic science (which is not).

[Figure: ways of knowing]

The reason that Political Science is late to normalization — why it is experiencing “theory death” later than other fields — comes from the obvious exception to this general rule for how academia works:

Professors, like most people, respond to the incentives of power, influence, and money.

The institution of tenure reduces uncertainty regarding money, and focuses the incentives on power and influence.

Power in academia comes from the number of bodies a professor has under him. These bodies might be apprentices (graduate students he advises), journeymen (post-docs who have a PhD and work at the lab, or staff researchers), or simple workers (lab technicians, etc).

Influence in academia comes from the extent to which one is successful in influencing one’s peers. This is typically measured in terms of influence scores, which are a product of how often the academic is cited, weighted by how important a publication he is cited in.
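As a toy illustration of such a score (the venue weights and function name here are invented, and real influence metrics are far more sophisticated):

```python
# Toy influence score: citation counts weighted by venue importance.
# The venue weights are invented for illustration.
VENUE_WEIGHT = {"APSR": 3.0, "AJPS": 2.5, "blog": 0.1}

def influence_score(citations: list[tuple[str, int]]) -> float:
    """Sum of citation counts, each weighted by the citing venue (default 1.0)."""
    return sum(VENUE_WEIGHT.get(venue, 1.0) * count
               for venue, count in citations)
```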

Unlike professors elsewhere in academia, political scientists hope to influence national policy makers, and so are relatively immune to academic discipline. This actually hurts scholarship. For instance, Victor Cha’s otherwise great book on North Korea, The Impossible State, is pretty much ruined by his analysis of Kim Jong Il, which was basically a job application. Likewise, Stephen Walt and John Mearsheimer (who began this discussion by defending the Old Boys network) basically produce political propaganda for the Old Right (pessimistic, Army-focused, and anti-Zionist). The lack of academic discipline has allowed political science to get away with graduating students into the “humanities ghetto” — because skills don’t matter in political science as much as connections, those without connections are left with high unemployment and bitter job prospects:

[Figure: wages and employment by major, the humanities ghetto]

The way forward is probably for grant-funding organizations to support normal science in political science research, and for political agitators to coalesce within agenda-driven “think tanks.” Educational sciences have already experienced this split. It’s time for Political Science to normalize, too.

Definitions and Progress

A couple days ago a post on Duck of Minerva linked to a working paper called “I can has IR theory?” [pdf]. The title was funny, but something about the contents bothered me.

I Can Has IR Theory appears to have two components:
1. It is an extended hit piece against “neopositivism,” which appears to be a methodology (or something) disliked by the authors. It is difficult to know if this is true, however, because the authors do not bother to define their terms.
2. It includes a discussion of “scientific ontology,” which likewise is never defined.

Unlike “neopositivism,” though (about which the only thing I can tell is that the authors — Patrick Jackson and Daniel Nexon — dislike it, and that it appears to be related to quantitative methods), the article includes numerous descriptions of “scientific ontology.” It is these descriptions that bothered me.

“Scientific ontology” appears to be a synonym for “nomological network,” an antiquated and simplistic form of modeling that is prone to error.

First, some passages from Jackson and Nexon’s working paper:

To be more precise, we think that international-relations theory is centrally involved with scientific ontology, which is to say, a catalog—or map—of the basic substances and processes that constitute world politics. International-relations theory as “scientific ontology” concerns:
• The actors that populate world politics, such as states, international organizations, individuals, and multinational corporations;
• Their relative significance to understanding and explaining international outcome
• How they fit together, such as parts of systems, autonomous entities, occupying locations in one or more social fields, nodes in a network, and so forth;
• What processes constitute the primary locus of scholarly analysis, e.g., decisions, actions, behaviors, relations, and practices; and
• The inter-relationship among elements of those processes, such as preferences, interests, identities, social ties, and so on.

(Note that how they are measured is left out.)

And this passage (as mentioned above, “neopositivism” is never defined and only loosely described, so focus on the passages related to “scientific ontology”):

The Dominance of Neopositivism
This line of argument suggests that neopositivist hegemony, particularly in prestige US journals, undermines international-relations theorization via a number of distinct mechanisms:
• It reduces the likelihood that international-relations theory pieces will be published in “leading” journals because neopositivism devalues debate over scientific ontology in favor of moving immediately to middle-range theoretic implications;
• It reduces the quality of international-relations theorization by requiring it to be conjoined to middle-range theorizing and empirical adjudication; and
• It forces derivative middle-range theories to be evaluated through neopositivist standards.

(Note that scientific ontology thus excludes “middle-range theoretical implications.”)

In an earlier work, I wrote that:

As a measure of construct validity, nomothetic span is more inclusive than Cronbach and Meehl’s (1955) concept of the nomological network, as nomothetic span includes not only how a construct relates to other constructs, but also how measures of the same construct relate to each other (Messick, 1989).

The undefined concept of “scientific ontology” appears to be more or less identical to the idea of the nomological network, which was described a half century ago. Without incorporating measurement into a model, it’s impossible to have a functional definition, a method of falsifying the model, or even a way to make useful predictions. And without this ability, it’s impossible to make progress.

Operational definitions are absent from Jackson’s and Nexon’s piece, both from their primary terms, and their view of “scientific ontology.”