Tag Archives: measurement

Structural Equations — or — Translating Theories into Models

My friend Adam Elkus recently asked what made Structural Equation Modeling (SEM) powerful for testing theories, besides the ability to test for null results.

This is my answer.

What I like about SEM is that it allows you to create models that reflect theories more faithfully than any other method I know. Other methods introduce a greater source of unmeasurable error (model error) than SEM does, because those methods force you to take your theory, translate it into another form, and then test that translation.

Take the example theory (which is crazy):

“While democracy exhibits substantial inertia (more democratic places stay more democratic, less democratic places stay less democratic), communication technology forces us to reshape our understanding of how democracy grows or declines. Within any community, outside international pressure affects the growth or decline in the strength of democratic institutions entirely through smartphone connectivity.

“By strength of democratic institutions I mean such things as the average turnover of political offices, the number of political questions per year voters are asked to consider, and the percentage of major editorials that are critical of government policy. By international pressure I mean UN resolutions that mention a country, statements by foreign ministers that reference a country, and the number of applications for McDonald’s franchises that were rejected. By smartphone connectivity I mean the fraction of the population that has smartphones, the average number of web impressions per person to Wikipedia, and the average number of hours per day individuals spend playing Angry Birds.”

OK, let’s create the SEM for it. The generic measurement model for democracy, at time0 and time1, is (converted to a pseudo-Mplus language):

LATENT democracy0 (float);
MANIFEST democracy0 ONTO politicalTurnover0 (float); // [0…1]
MANIFEST democracy0 ONTO politicalQuestions0 (int); // [0…n]
MANIFEST democracy0 ONTO criticalEditorials0 (int); // [0…n]

LATENT democracy1 (float);
MANIFEST democracy1 ONTO politicalTurnover1 (float); // [0…1]
MANIFEST democracy1 ONTO politicalQuestions1 (int); // [0…n]
MANIFEST democracy1 ONTO criticalEditorials1 (int); // [0…n]

// … And

LATENT smartphoneConnectivity (float);
MANIFEST smartphoneConnectivity ONTO ownershipRate (float); // [0…1]
MANIFEST smartphoneConnectivity ONTO wikipediaRate (float); // [0…n]
MANIFEST smartphoneConnectivity ONTO angryBirds (float); // [0…24]

// AND

LATENT internationalPressure (float);
MANIFEST internationalPressure ONTO unResolutions (int); // [0…n]
MANIFEST internationalPressure ONTO fmCriticisms (int); // [0…n]
MANIFEST internationalPressure ONTO mcRejections (int); // [0…n]

// no time0/time1 distinction here, because we’re assuming that smartphoneConnectivity completely mediates the change in democratic trajectory that isn’t the result of inertia

// OK, now we’d create our latent model

democracy0 LOADS ONTO democracy1; // the inertia of democracy

internationalPressure LOADS ONTO smartphoneConnectivity; // smartphones mediate international pressure…
smartphoneConnectivity LOADS ONTO democracy1; // …onto democracy
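
(For concreteness, here is a minimal sketch of the same model in actual runnable form, assuming Python’s semopy package and its lavaan-style model syntax. The indicator names are the hypothetical measures from the theory above, and country_panel.csv is a made-up file name standing in for a dataset with one row per country.)

import pandas as pd
from semopy import Model

MODEL_DESC = """
# measurement models: each latent factor loads onto its manifest indicators
democracy0 =~ politicalTurnover0 + politicalQuestions0 + criticalEditorials0
democracy1 =~ politicalTurnover1 + politicalQuestions1 + criticalEditorials1
smartphoneConnectivity =~ ownershipRate + wikipediaRate + angryBirds
internationalPressure =~ unResolutions + fmCriticisms + mcRejections

# latent (structural) model: inertia, plus full mediation through smartphones
smartphoneConnectivity ~ internationalPressure
democracy1 ~ democracy0 + smartphoneConnectivity
"""

data = pd.read_csv("country_panel.csv")  # hypothetical dataset
model = Model(MODEL_DESC)
model.fit(data)
print(model.inspect())  # loadings, path coefficients, standard errors

Note how the full-mediation claim is encoded structurally: there is no direct path from internationalPressure to democracy1. Adding that path and comparing model fit would be one way to test the mediation claim itself.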

Because SEM allows us to so faithfully translate the model of our theories into the model of the code, we now face serious questions that aren’t obvious from the paragraph, but become obvious in the process of writing the model down.

Because you see your theory in “code” (or matrices, if you insist on the algebraic way to do this, which I’ve only used for class), assumptions and mistakes jump out more. For instance, in the first draft of this email I said that smartphones mediate onto democracy, but I didn’t define what they mediated; hence the inclusion of internationalPressure.

Using graphviz/dot, here is a picture of what our model “looks” like:

[Figure: graphviz rendering of the example SEM]

(A more common format is a left-to-right flow of manifest predictor indicators, latent predictor factors, latent outcome factors, and manifest outcome indicators. The format above was chosen to fit well on my blog page, and out of impatience with making it look better.)
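
(For completeness, here is a rough sketch of how such a diagram could be generated with Python’s graphviz bindings, using the common convention of ellipses for latent factors and boxes for manifest indicators; the layout details are guesswork on my part, not the exact dot file behind the image above.)

from graphviz import Digraph

measurement = {
    "democracy0": ["politicalTurnover0", "politicalQuestions0", "criticalEditorials0"],
    "democracy1": ["politicalTurnover1", "politicalQuestions1", "criticalEditorials1"],
    "smartphoneConnectivity": ["ownershipRate", "wikipediaRate", "angryBirds"],
    "internationalPressure": ["unResolutions", "fmCriticisms", "mcRejections"],
}
structural = [
    ("democracy0", "democracy1"),                        # inertia
    ("internationalPressure", "smartphoneConnectivity"), # pressure into the mediator
    ("smartphoneConnectivity", "democracy1"),            # the mediator onto democracy
]

g = Digraph("examplesem")
for latent, indicators in measurement.items():
    g.node(latent, shape="ellipse")     # latent factors as ellipses
    for indicator in indicators:
        g.node(indicator, shape="box")  # manifest indicators as boxes
        g.edge(latent, indicator)
for source, target in structural:
    g.edge(source, target)
g.render("examplesem", format="png")    # writes examplesem.png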

As I said, the theory’s crazy — but SEM allows that theory to be translated into a model that can be directly tested, and frees us from having to waste our time with hacks like ANOVA, multiple regression, or dead theory disconnected from reality.

Science Is Real. Measurement Is Real. Improvement Is Real.

Bill Gates, the co-founder of the company I work for and a personal hero of mine, has an op-ed in the Wall Street Journal titled “My plan to fight the world’s biggest problems.” It’s an exciting piece because it ties together several of my recent posts very well.

Science allows us to predict, control, and improve variation in the world. In order to actually make progress toward these goals, it’s important to establish exemplars of great work. This is enabled through operational definitions that allow concepts to be measured. The quest for progress in science collapses when measurement becomes too difficult or too expensive.

But the reverse is also true: progress in science begins when measurement becomes accessible.

Bill Gates’ op-ed is so awesome because he brings us back to the real world. When someone says “science,” others think of some cartoon view of men in white coats in a laboratory. When someone says that the goal of science is the prediction, improvement, and control of variation, someone else will say that such is a “very narrow definition of science, downgrading as it does understanding and explanation.”

But the person who writes like Bill Gates does, who never even bothers with the word “science” and hammers in that improvements are real:

Such measuring tools, Mr. Rosen writes, allowed inventors to see if their incremental design changes led to the improvements—such as higher power and less coal consumption—needed to build better engines. There’s a larger lesson here: Without feedback from precise measurement, Mr. Rosen writes, invention is “doomed to be rare and erratic.” With it, invention becomes “commonplace.”

In the past year, I have been struck by how important measurement is to improving the human condition. You can achieve incredible progress if you set a clear goal and find a measure that will drive progress toward that goal—in a feedback loop similar to the one Mr. Rosen describes.

This may seem basic, but it is amazing how often it is not done and how hard it is to get right. Historically, foreign aid has been measured in terms of the total amount of money invested—and during the Cold War, by whether a country stayed on our side—but not by how well it performed in actually helping people. Closer to home, despite innovation in measuring teacher performance world-wide, more than 90% of educators in the U.S. still get zero feedback on how to improve.

An innovation—whether it’s a new vaccine or an improved seed—can’t have an impact unless it reaches the people who will benefit from it. We need innovations in measurement to find new, effective ways to deliver those tools and services to the clinics, family farms and classrooms that need them.

… that’s the sort of person who can make a difference. The theories of science, measurement, and improvement are all left below the surface. What is left is a how-to guide to building a better world.

I write this blog for selfish reasons: I enjoy learning about the world. Bill Gates does what he does to change the world.

Definitions and Progress

A couple days ago a post on Duck of Minerva linked to a working paper called “I can has IR theory?” [pdf]. The title was funny, but something about the contents bothered me.

“I can has IR theory?” appears to have two components:
1. It is an extended hit piece against “neopositivism,” which appears to be a methodology (or something) disliked by the authors. It is difficult to know if this is true, however, because the authors do not bother to define their terms.
2. It includes a discussion of “scientific ontology,” which likewise is never defined.

Unlike “neopositivism,” though (the only things I can tell about it are that the authors, Patrick Jackson and Daniel Nexon, dislike it, and that it appears to be related to quantitative methods), the article includes numerous descriptions of “scientific ontology.” It is these descriptions that bothered me.

“Scientific ontology” appears to be synonymous with “nomological network,” an antiquated and simplistic form of modeling that is prone to error.

First, some passages from Jackson and Nexon’s working paper:

To be more precise, we think that international-relations theory is centrally involved with scientific ontology, which is to say, a catalog—or map—of the basic substances and processes that constitute world politics. International-relations theory as “scientific ontology” concerns:
• The actors that populate world politics, such as states, international organizations, individuals, and multinational corporations;
• Their relative significance to understanding and explaining international outcomes;
• How they fit together, such as parts of systems, autonomous entities, occupying locations in one or more social fields, nodes in a network, and so forth;
• What processes constitute the primary locus of scholarly analysis, e.g., decisions, actions, behaviors, relations, and practices; and
• The inter-relationship among elements of those processes, such as preferences, interests, identities, social ties, and so on.

(Note that how they are measured is left out.)

And consider this passage (as mentioned above, “neopositivism” is never defined and only loosely described, so focus on the parts related to “scientific ontology”):

The Dominance of Neopositivism
This line of argument suggests that neopositivist hegemony, particularly in prestige US journals, undermines international-relations theorization via a number of distinct mechanisms:
• It reduces the likelihood that international-relations theory pieces will be published in “leading” journals because neopositivism devalues debate over scientific ontology in favor of moving immediately to middle-range theoretic implications;
• It reduces the quality of international-relations theorization by requiring it to be conjoined to middle-range theorizing and empirical adjudication; and
• It forces derivative middle-range theories to be evaluated through neopositivist standards.

(Note that scientific ontology thus excludes “middle-range theoretic implications.”)

In an earlier work, I wrote:

As a measure of construct validity, nomothetic span is more inclusive than Cronbach and Meehl’s (1955) concept of the nomological network, as nomothetic span includes not only how a construct relates to other constructs, but also how measures of the same construct relate to each other (Messick, 1989).

The undefined concept of “scientific ontology” thus appears to be more or less identical to the idea of the nomological network, which was described a half-century ago. Without incorporating measurement into a model, it’s impossible to have a functional definition, a method of falsifying the model, or even a way to make useful predictions. And without these, it’s impossible to make progress.

Operational definitions are absent from Jackson’s and Nexon’s piece, both for their primary terms and for their view of “scientific ontology.”

The Language of Theory, or, How to Escape the Humanities Ghetto

This morning I read an article by Patrick Thaddeus Jackson and Daniel Nexon, titled “Paradigmatic Faults in International-Relations Theory.” This piece originally appeared in a 2009 edition of International Studies Quarterly.

I like it when people agree with me, so I was glad to see my words echoed across time (it’s as if Jackson and Nexon read my post, built a time machine, and told their former selves what a great idea they had read on tdaxp). Yesterday, I said it was ridiculous to describe the International Relations cliques of “Realism,” “Liberalism,” and such as paradigms. I wrote:

The highlighted passage, originally by Daniel Maliniak, simply means that empirical research is increasing, and that non-empirical research is declining, within political science. But Maliniak, and thus Walt and Mearsheimer, bizarrely use “paradigmatic” to refer to less paradigmatic (that is, less capable of progress) fields, and “non-paradigmatic” to refer to more paradigmatic (that is, more capable of progress) fields.

Political science has been in the fever swamp for so long that the notion of progress as an outcome of normal science has almost entirely been lost. If Walt and Mearsheimer had their way, it might be lost for good, and the field simply divided into a stationary oligarchy of old-boys’ networks.

As Jackson and Nexon write:

The terminology of “paradigms” and “research programmes” produces a number of deleterious effects in the field. It implies that we need to appeal to criteria of the kind found in MSRP in order to adjudicate disputes that require no such procedures. In order to do so, we spend a great deal of time specifying the “boundaries” of putative research programmes and, in effect, unfairly and misleadingly holding scholars accountable for the status of theories they often view as rivals to their own.

Perhaps the most well-known instance of this kind of boundary-demarcation occurs in the debates surrounding “realism” in international relations theory. The proliferation of countless lists of the “core commitments” of a realist “paradigm”—by adherents and critics alike—shifts the focus of scholarship away from any actual investigation of whether these commitments give us meaningful leverage on the phenomenal world, and instead promotes endless border skirmishes about who is and is not a realist (Legro and Moravcsik 1999), whether predictions of balancing are central to the “realist paradigm” (Vasquez 1998:261–65), and so forth. Such debates and demarcations not only distract us from the actual study of world politics, but also harm disputes over international relations theory by solidifying stances that ought to remain open to debate and discussion.

So I enjoyed Jackson’s and Nexon’s takedown of the so-called “paradigms” in International Relations.

But they don’t go far enough.

Their piece ends with an appeal to Max Weber (how non-progressive can you get?!?) and an unfalsifiable taxonomy that I won’t go into:

[Figure: Jackson and Nexon’s ideal-typical taxonomy]

A more useful conclusion to the paper would have been to recognize that statistics is the language of theory, the language of modeling. Instead of inviting international relations scholars to chase their own tails and bow to Max Weber and the dead, how much more useful would a positive theory of research programs in International Relations have been? For instance, a citation-indexing method such as PageRank [pdf] could be used to determine whether there are “clusters” in which certain articles were influential (exemplars?) and others were not.
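
(To be concrete about what I mean, here is a toy sketch using Python’s networkx; the article names and citation edges are placeholders I made up, not real data.)

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# hypothetical citation network: an edge A -> B means article A cites article B
citations = nx.DiGraph([
    ("liberal1", "liberalExemplar"), ("liberal2", "liberalExemplar"),
    ("liberal2", "liberal1"),
    ("realist1", "realistExemplar"), ("realist2", "realistExemplar"),
    ("realist2", "realist1"),
])

# PageRank flows along citation edges, so heavily cited articles score highest;
# the top scorers are candidate exemplars
for article, score in sorted(nx.pagerank(citations).items(), key=lambda kv: -kv[1]):
    print(article, round(score, 3))

# modularity-based clustering of the undirected graph suggests "clusters" of
# mutually citing articles: candidate research programmes
for cluster in greedy_modularity_communities(citations.to_undirected()):
    print(sorted(cluster))

Did Jackson and Nexon really have no one available to sketch even a proposed methodology for testing their claim?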

The answer is probably “no.” My purpose isn’t to pick on Jackson and Nexon, but to point out the weakness of International Relations as a whole. In a related post by Patrick Musgrave, titled “The Crass Argument for Teaching More Math In Poli Sci Courses,” the following diagram is shown:

[Figure: wages and employment by college major]

The chart clearly displays a “humanities ghetto” that includes political science:

[Figure: the same chart, with the humanities ghetto highlighted]

How can this be, if International Relations is the disciplined extraction of meaning from data, which is the same focus as the high-paying, well-employed fields?

The obvious answer is that International Relations does not teach actually useful methods for the disciplined extraction of meaning from data. It does not teach critical thinking or logical reasoning. It teaches something that apes these skills: a rhetorical ability that impresses old scholars and does not help society.

International Relations is a non-progressive field where, by and large, it sucks to be young.

[Figure: ways of knowing]

In an evocative comment that ties the article and the blog post together, Patrick Thaddeus Jackson states:

I don’t think that it is our job as university faculty to increase students’ future earning potential. Nor do I think that it is our job in teaching PoliSci undergrads to make sure that they can read APSR in the 1980s and 1990s. Our job is to teach students to think critically about politics, and while I am perfectly fine with the suggestion that some statistical literacy can be useful to that end, I am not prepared to give that higher pride of place than things like reading closely, writing cogently, and disagreeing with one another civilly.

The dichotomy that Jackson notes is entirely false. In his own piece, he was not able to express a constructive critical thought about paradigms: the original Nexon and Jackson article is devoid of the model specification or operationalization that would be needed to turn his criticisms and taxonomy into something capable of progress. Any competent graduate from the humanities ghetto can read “closely” or write “cogently.” What’s needed is to think usefully, and for this statistical literacy is required.

Be Resilient, Part IV: The Importance of Measurement

“SOA, Resiliency & Consiliency,” by Stephen DeAngelis, Enterprise Resilience Management Blog, 16 May 2006, http://enterpriseresilienceblog.typepad.com/enterprise_resilience_man/2006/05/the_blogger_wig.html.

“Child Labor & Resilient Nations,” by Stephen DeAngelis, Enterprise Resilience Management Blog, 7 September 2006, http://enterpriseresilienceblog.typepad.com/enterprise_resilience_man/2006/09/child_labor_res.html.

But why measure? Why not just wax poetic about social OODA loops, revised OODA loops, and other unfalsifiable concepts? Just because those are unscientific concepts, of course, does not make them wrong.

Maybe we should just think

that resilience can’t be developed sector by sector. It must be developed holistically, with challenges in each sector attacked simultaneously. Otherwise, advances in one sector are cancelled out by setbacks in others.

The answer is: a “holistic” view of resilience is operationally worthless. Holism replaces action with an ephemeral philosophy that is not relevant for Development-in-a-Box, or anything “in-a-Box.”

I don’t think I am saying anything controversial here. Enterra CEO Steve DeAngelis, who gave the above quote about holistic approaches, earlier qualified his speech by emphasizing that his words should not be taken precisely:

Both Safranski and Weeks are correct that resilience, strictly defined, refers only to a bouncing back. Unfortunately, I live in the business world where words are used to “sell” not just explain. In Enterra Solution sales pitches we try to make the point that resilience (i.e., bouncing back) is no longer sufficient if organizations want to thrive, not just survive, when faced with emerging 21st century challenges.

In business, science, or any progressive enterprise that focuses on development, selling is critical. It is crucial to generate theories and objective facts that can be understood, even without some deeper philosophical harmony between partners.

There are times and places for subjective arguments. I’ve lauded subjective perspectives, such as interpretivism and constructivism, on this blog before. Great scientific theories, such as the Wary Cooperator Model, are built from horizontal thinking. Positivism will never explain everything to us, and it may not even explain much that matters to us. When we try to induce meaning from brute facts we may even be deceived.

But that does not detract from the insistence that developmental, progressive fields of study need measurement. That’s how we build useful bodies of knowledge. That’s how we create useful fields for engineers, such as resilient software development.

That’s how science works.


Be Resilient, a tdaxp series
1. How to Measure Resilience
2. How to Measure Agility
3. How to Measure Resiliency
4. The Importance of Measurement