Can We Measure Progress or Failure in War?
U.S. Army Major Josh Thiel's recent article, The Statistical Irrelevance of American SIGACT Data: Iraq Surge Analysis Reveals Reality, challenges the contemporary notion that you can "add more (forces) and then you win" in a protracted insurgency. Josh, an Army Special Forces officer assigned to 1st SFG, studied economics at USMA and defense analysis at NPS. His concise work illustrates what is known in econometrics as a red flag: a simple linear regression of two variables that shows substantial deviation is enough to suggest the problem is far more complicated than the tested independent variable alone.
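The "red flag" idea can be illustrated with a toy regression (the numbers below are entirely synthetic and purely illustrative, not drawn from Josh's SIGACT data): if an outcome is driven mostly by factors your model omits, a two-variable fit will leave most of the variance unexplained, and a low R-squared is the warning sign.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic illustration: regress monthly "violence" on troop levels alone.
# The series is driven mostly by an unmodeled factor, so the single-variable
# fit explains very little of the variance.
n = 48
troops = rng.uniform(100, 170, n)             # notional troop strength (thousands)
hidden = rng.normal(0, 1, n)                  # unmodeled driver (e.g., local politics)
violence = 500 - 0.5 * troops + 200 * hidden  # troops matter, but only weakly

# Ordinary least squares with an intercept
X = np.column_stack([np.ones(n), troops])
beta, *_ = np.linalg.lstsq(X, violence, rcond=None)
resid = violence - X @ beta
r2 = 1 - resid.var() / violence.var()

print(f"slope = {beta[1]:.2f}, R^2 = {r2:.2f}")  # low R^2 is the "red flag"
```

The point is not the particular coefficients but the diagnostic: when R-squared collapses like this, the honest conclusion is that the tested variable (force levels) cannot by itself explain the outcome.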
So what? Josh's work is the first part of extensive research being conducted at the CORE Lab in Monterey, CA to determine if we can ever accurately measure causation in war. This type of research is funded throughout the country by grants such as the Minerva Initiative, an investment endorsed by the SECDEF as a 21st century effort to promote collaboration between the military and social sciences to find better solutions in modern conflict.
Rigorous analysis and collaboration on quantifying war go back at least to 1948, when British scientist Lewis Fry Richardson sought to find patterns in conflict; later, RAND's game theorists developed the concept of Mutually Assured Destruction (MAD), which eventually translated into a foreign policy of nuclear deterrence. In the 21st century, we have yet to find a definitive model that provides a broader understanding. To date, it remains undetermined whether such a model is possible.
As Josh noted, it is naïve to blindly look at collected metrics from the occupying force, counterinsurgent force, or host nation in order to glean the current situation. For example, as I noted in the Break Point, from November 2006 to February 2007, over 126 SIGACTs occurred in the small village of Zaganiyah; however, this data will not be found in the SIGACT database because there was no coalition observer there to collect the information at the time of occurrence. If you looked at the available data instead, it would show only a small amount of violence in the area. This finding led me to conclude,
Control, while hard to quantify, is often a case of "you know it when you see it," but it can be somewhat measured by the strength of the counterinsurgent security force OR the absence of violence in areas with significant government presence. Areas without violence that lie outside the reach of the government OR outside external monitor observation (press, peacekeepers) should be assumed to be under the insurgents' control.
Several other qualitative measures must be addressed when testing or collecting data.
What is the interest of the reporting officer? There is usually a political, economic, or personal interest involved. For instance, in Iraq, Shia would often minimize Shia-on-Sunni violence while exaggerating Sunni-on-Shia violence.
How accurate is the data? While many of us call it Iraqi math, exaggeration extends to many small wars. Often, a partnering Army officer or frightened civilian would report that 100 armed men had attacked a patrol or a village; upon investigation, no evidence supported the report. Over time we learned to drop a zero, so 100 meant 10. How do we input this data into the SIGACT database? It remains a tough judgment call. In a perfect world, the higher command would serve as the referee, but that does not necessarily make the information accurate.
When analyzing the data, one must consider the following:
What was the coalition force's mission? What were they actually doing? Were they constrained to advisory missions on a large forward operating base? Were they conducting primarily direct action raids? Were they living in patrol bases in contested areas?
What was the enemy's mission? Were they actively massing to attack coalition forces? Had they temporarily laid down their arms to wait for the coalition to move elsewhere? Were they conducting selective assassinations of key leaders, acts of intimidation, or overt population control measures to pacify the populace into not supporting the host-nation government?
How did the people react? What was the number of intelligence tips to coalition forces? Was the information valid? How many local leaders publicly supported the coalition force mission? How many religious leaders actively preached resistance in Friday prayers? How active were the local markets when coalition forces were present?
Again, we have merely begun to dig into studying the quantifiable aspects of small wars. There remain many more unknowns than known facts. Major Thiel's paper should serve as a caution that there are no bumper-sticker solutions, and hopefully his work will encourage others to collaborate to help find better answers.
Comments
<EM>Eric Larson:
I would only add that - contrary to Grant's take - while we can't be metric-focused (as in, collecting data for data's sake) small wars actually do lend themselves to the heavy use of metrics to explain the situation on the ground. In fact, the permissive environment of COIN's OE yields itself to collecting massive amounts of data that could be used to demonstrate security as an objective metric.
That said, data is just numbers. I think what Grant may actually be meaning to say is that data without structure, analysis, and explanation is not understandable to politicians, or commanders.</EM>
Eric-
I don't have as much confidence that metrics in COIN can explain anything on the ground. Maybe it is because we don't "do" metrics well, but it could also be because complex environments can't be easily broken into pieces/parts and then better understood.
And I wasn't saying that data without structure, etc., isn't understandable by politicians (it might not be, but I wasn't trying to say that). I was saying that in Afghanistan we fed data to the pols that backed up our assumptions (not out of any malice- many (most?) believed our assumptions). I think that was an improper use of metrics.
I favor, instead, using metrics to test our assumptions- so we can attempt to figure out if we're chasing our tails. I personally got the feeling that we didn't think we had the time (political will) to do that.
From Bill M:
"First things first, and that is fixing our staff culture and that is commander's business. Don't expect your intell folks to change the way they do business when you tell them the most important thing they do is put together the daily/twice daily briefing slides. You'll get what you ask for. Try getting away from powerpoint for awhile and have everyone brief off a map, it actually provides better context."
Target.
I am guilty of frequently pestering our intelligence analysts for better intelligence, but the reality is that what they know is amazing; sadly, it isn't enough to facilitate good operational-level planning. Then again, we can't expect a whole lot more out of them when we as leaders have them focused on preparing the commander's brief. We can't expect much more out of them when the expectation of a product is a few powerpoint slides with summaries and stoplight bubbles to indicate progress or regression. I would also argue we can't expect too much more out of them when the focus by some commanders is on "counting" events and geospatially correlating them on a map with various color codes, instead of understanding the enemy's strategy and its relationship with the locals. Unfortunately, this type of data-based analysis is incomplete and all too often misleading.
Technology changes society and organizations, and although the claim is overstated, that doesn't lessen the truth of it: powerpoint has dumbed us down. Many units have dumbed down their Annex B (Intelligence) from their operations orders to a few powerpoint slides, and then commanders and staffs complain when they get inadequate intelligence without the necessary context. That's hard to avoid when you give the S2 15 minutes to brief.
I think in most cases they received what the commander and operations officer wanted: a summarized version that could be easily presented via powerpoint. For irregular warfare (all war actually, but since everyone is Afghan-focused at the moment I'll stick with IW), understanding context is important, and understanding context requires an investment of time in reading and long discussions (not just one-way briefings). The discussions between the S/J2 and staff will identify the gaps in understanding that operators and planners need addressed, which can then be added to future collection plans (it really isn't that complex); but instead, staffs (S2s included) live to prepare their daily or twice-daily intelligence briefs. This consumes a lot of man-hours in return for limited value.
I don't think iPhones will ever provide context from lower to higher. Context is more complex than a short report, and you're not going to write more than a short report on an iPhone or smartphone. That type of information can only be passed effectively in longer, well-written reports, or better yet via free-flowing discussions that develop a common understanding. The intelligence community will adapt if they get a different demand signal from their commanders; so far they haven't, because it is true that commanders like getting their information on powerpoint slides in accordance with the daily battle rhythm that all staffs are held hostage to.

This isn't a slap against the commanders; it is the culture all the younger ones (LTC and below) have been brought up in. They're really busy, so under the current leadership style and processes they don't have time for more effective intelligence discussions, but I think we need to ask: busy doing what? Perhaps preparing their own briefs for their higher's battle rhythm briefs? How did we get to this point? Is it possible to reassess how we do business and relieve, to a large extent, the burden of the never-ending demand for information from higher, from company level to the Pentagon? Commanders and staffs need more time to interact with their units and to explore alternative approaches, and less time preparing briefs. However, our current processes are deeply embedded in our staff culture.
On a side note, there is valuable data that can be collected with smartphones and iPhones to support analysis, and if it is transmitted near real time to a portal where analysts can collate the data in meaningful ways, using good old-fashioned paper and pencil or advanced analytical tools, it can provide useful insights that were otherwise not readily apparent. SIGACT data (further refined by type of act), individual information, biometric data, SSE data, etc., transmitted via smartphone with pictures, voice descriptions, and reports, can all be valuable; however, this isn't a cure-all.
First things first, and that is fixing our staff culture and that is commander's business. Don't expect your intell folks to change the way they do business when you tell them the most important thing they do is put together the daily/twice daily briefing slides. You'll get what you ask for. Try getting away from powerpoint for awhile and have everyone brief off a map, it actually provides better context.
Charles,
I shouldn't have limited my description of the Intel Community to just structure. As you noted, it's the whole mechanistic system (inputs, structure, culture, processes, and outputs/outcomes) that needs reform.
Madhu,
To the first part, that's my crude attempt to describe how Iraq personally felt.
To the second part, I prefer that type of methodology, and I think that's one of the overall goals of SWJ. We provide that every day. One point of caution: the cumulative examples and data show us the American view of the war. Eventually, that needs to be complemented by the Iraqi view and those of other stakeholders in order to have a broader understanding of the conflict.
With that, I'm gonna withdraw from commenting further on this thread and get back to my job of reviewing and publishing others' work here. Please continue the discussion, and thanks for taking the time to read and comment.
@ MikeF -
Regarding your comment:
<blockquote>Just like one may get an uneasy feeling patrolling down a certain road or trail that ends up being an ambush, things just felt different in Iraq during each passing year. In the most simplest of terms, I'd say 2003 felt surreal then incomplete, 2005 felt awkward and depressing, 2006 felt scary and like we were losing, and 2007 felt hopeless until the summer. Then, when the violence ended, everyone was just shocked.</blockquote>
Is this sort of personal narrative, or mini personal history, captured in some way institutionally? Would fielding iPhones allow people in very difficult and demanding situations to keep a recorded diary of impressions, even if just a few words, even if only one entry? Impressions of the local environment, how it all feels, etc.? Do military historians do this kind of thing already?
At <strong>Zenpundit (Mark Safranski's blog)</strong>, there was a story about DARPA and a story-net. While the aim, I thought, was to aid in analysis of the enemy, why not a story-net of mini-narratives for our own side, in order to capture the qualitative data in a different way? People are already overstretched, so I was thinking of something that wouldn't add to the burden or become a requirement. It would also be easier for a generation of digital natives.
Then, theoretically, you could place the qualitative data within a context, which must be done at any rate. Data is only as good as its generation, and then too, what you do with it. Numbers are just numbers.
Reading MikeF's comments made me consider some items.
1) iPhones, applications, and the technology panacea are not going to get us any closer to solving the current shortcomings in intelligence. From my perspective, we have added, refined, created, and utilized technology at an increasing pace over the past ten years or so. And to show for all of this, we have intelligence professionals in theater who offer little to no value to the Soldiers on the ground. As a combat arms Soldier, I receive virtually nothing from my S2, and when I bring information to the S2, I see virtually nothing done with it. I do see charts, graphs, link diagrams, fancy SIGACT slides, and lots of ISR stuff. However, I rarely if ever see the significance of these significant acts presented by the S2. Where's the meaning? It's certainly not going to come from the addition of more and more technology. It's going to come from good intelligence professionals. Our focus should rest on the development of our analysts, not on the distribution of smartphones.
2) Regarding the intelligence institution and whether or not it's capable of handling the tasks: that's not really the question we should be asking. The question is whether or not the current batch of MI Cdrs are going to allow or push for a realignment of their community. I would guess that anything likely to change the status quo will not be supported by more than a couple of "rogues".
Until there's some real change within the MI community, I would offer that we'll continue to see endless charts about SIGACTs, link diagrams, and other fancy things. We'll get no analysis. Those of us on the ground will develop the situation and we'll be supplemented by SIGINT and IMINT, allowing us to pinpoint cell leaders etc. The S2 will just be a relay for this information.
S2/G2s work for maneuver commanders for the most part. These maneuver commanders are partially to blame for our current situation and they are the reason we currently see a lot of "commander driven" operations and not "intelligence driven" operations. But the MI community is also to blame. They have divorced themselves from warfighting, for the most part. Unless and until maneuver commanders demand better, and develop their intelligence staffs to provide better analysis - and - unless and until Ft. Huachuca changes its ways of doing business, we'll just repeat this conversation over and over.
JT,
I can only answer one part of your post from my point of view. As a company executive officer, one is expected to be able to provide an accurate assessment of the company's maintenance status at any given point in time. I think the same expectation extends to the intelligence assessment for a commander. He or she should be able to rattle off a current assessment in their sleep. That's their job.
The problem, IMO, is how do we capture that for the institutional knowledge? A daily Intelligence Summary (INTSUM) describing events with the commander's assessment is one way. In the right format, it'll track both quantitatively and qualitatively what is going on. Moreover, a good commander will also list what he doesn't know. This is extremely important. These unknowns drive his CCIR.
The biggest takeaway from MG Flynn's piece was questioning whether our institution is capable of collecting and processing this information/intelligence. Currently, the answer is no, because our structure was designed for top-down, not bottom-up, flow. That has to change.
Additionally, fielding iPhones with apps that can conduct surveys, censuses, spot reports, and DNA collection will help streamline the system, but we still need to create an intelligence network capable of collecting and analyzing from the bottom up.
That's on the military side. Historians or social scientists studying past data must learn to understand the limitations, constraints, and failures in the existing data in order not to draw the wrong conclusions.
Some thoughts from one of my favorite sources on how to think about seemingly complex problems, be it the nature of the universe or why one's intervention in the nationalist insurgency of another is not going as well as one might hope, given one's knowledge of such problems and the efforts dedicated to them:
"Everything should be made as simple as possible, but not simpler."
"If the facts don't fit the theory, change the facts."
"If you can't explain it simply, you don't understand it well enough."
"No problem can be solved from the same level of consciousness that created it."
"We can't solve problems by using the same kind of thinking we used when we created them."
"A man should look for what is, and not for what he thinks should be."
"A perfection of means, and confusion of aims, seems to be our main problem."
And last, but certainly not least when talking of Metrics:
"Everything that can be counted does not necessarily count; everything that counts cannot necessarily be counted."
Special thanks to Albert Einstein for his brilliant insights on how to think and how to understand.
Cheers!
Bob
Ahhh, the questions raised here have plagued us for nearly 200 years.
Auguste Comte, a French philosopher, claimed it was only a matter of time before we would have a positive "social science" equivalent to the hard sciences of physics, physiology, and so forth. It is remarkable that we haven't given up on the idea (for a free translation of his book, published in 1853, see): http://books.google.com/books?id=Zx4OAAAAYAAJ&printsec=frontcover&sourc…
Just finished a very good Vietnam War novel (offered by one of my students) by Josiah Bunting -- "The Lionheads." Bunting was an S3 in one of the 9th Inf Div BCTs -- the one that was "waterborne" jointly with the Navy in the Mekong Delta. The story is much about the quantitative theory of victory and the associated intellectual vacuity. I'd commend the book to anyone endeavoring to study war through a quantitative lens.
MikeF - You provide a good list of 'indicators' in your response above for generating an assessment of a local area; however, I question the requirement to provide the type of causation analysis discussed by MAJ Thiel. As you know, each local area has unique considerations, and each unit operating in those areas will affect them in ways that are extremely difficult to standardize and quantify in terms of causation. Further, MG Flynn's SOIC paper accurately highlighted the difficulties unit intelligence sections are already having in properly defining and analyzing the operational environment to drive effective operations in COIN. Is it realistic to drop another rock in the BCT-Bn-Co level intel sections' pack, given the massive load they are already carrying?
Assessment (really read as intelligence) in a small-wars environment is an art, not a science. I think we need a paradigm shift in this discussion toward intelligence analysis and military judgment of that analysis vs. 'quantifiable metrics' for data that cannot reasonably be attained or standardized.
Example:
-How do we capture the statistical significance, for the population of Marjeh, when a LCpl assigned to that area publicly converts to Islam in the early phases of Moshtarak?
-How do we capture the statistical significance of LtCol X's or Capt X's relationship with Tribal Elders Y and Z?
-etc.
Good points all around. Please continue to add your thoughts. They help academics and policy makers understand things from a practitioner's perspective instead of trying to interpret numbers on paper.
As I tried to articulate above, I view Josh's work as a starting point for a collaborative effort to question and challenge assumptions about modern COIN. Much of the other research is actually on other small wars, with larger data sets to consider.
In a perfect world, this research may help us hypothesize when to intervene in a conflict. For example, one hypothesis would be that a FID effort during Phase Zero under conditions x would provide the greatest probability of success y.
Bill M- great point about measuring what I call heart (emotions) and soul (will). Others would call it intuition. Is it possible to gauge and measure? Some sociologists and psychologists are attempting to figure that out. But, you're right. Just like one may get an uneasy feeling patrolling down a certain road or trail that ends up being an ambush, things just felt different in Iraq during each passing year. In the most simplest of terms, I'd say 2003 felt surreal then incomplete, 2005 felt awkward and depressing, 2006 felt scary and like we were losing, and 2007 felt hopeless until the summer. Then, when the violence ended, everyone was just shocked.
I was hoping Josh's article would address more of the complexities instead of simply applying a different set of metrics to the same data (SIGACTs are not credible data, for the reasons Mike listed and many others not listed). I found his arguments flawed, but I still applaud the effort to challenge standing assumptions and what I believe is our inappropriate use of metrics to demonstrate either success or failure. Although I agreed with GEN Mattis's directive to kill EBO (at least the way it was being practiced), it is worth mentioning that EBO conceptually was about influencing the behavior of systems (somewhat easy to measure if you're trying to reduce ammunition production capacity, a lot harder to "measure" if you're talking about human behavior). Looking at the surge and its associated tactics from a behavior perspective instead of a SIGACT perspective, one can come to different conclusions.
I personally think the surge worked in Iraq (I was there in '07 also), based on my intuition informed by observations and comments shared by Iraqi security personnel, Iraqi citizens, and insurgent detainees. Where we surged, the insurgents sometimes pushed back (and usually died), or withdrew immediately instead of later. When they pushed back it was a test of wills, and in my opinion the surge overall was a strategic-level test of wills (abstract, hard to measure) more than anything else. The media was painting a dire picture, AQ media was predicting a win, and attacks were increasing on the strength of the insurgents' growing confidence. The will, and the tactics associated with actually having enough forces to push into and "hold" the populated areas, both to protect the populace and to deny freedom of movement to the insurgents, made a difference. The coalition's willingness to fight gave many civilians enough confidence to report on insurgent activity, because they believed the coalition would actually act on their information. Having civilians put their lives at risk to report on insurgents, and then doing nothing about it, clearly didn't send the right signal to the local populace.
None of these were decisive, but rather they were several different behaviors that were enabled by the surge (my opinion) and other changes implemented by GEN P that allowed us to suppress the insurgency long enough to pull out the majority of our combat forces.
To some extent we won in Iraq when Saddam was captured (a limited but achievable objective); however, I am not convinced we achieved much else subsequent to that action that was in our national interest. Furthermore, the fact that we felt compelled to surge with U.S. and other foreign (non-Iraqi) forces was a clear indication (again, in my opinion) that we failed in our stated goal of enabling the Iraqis to hold their own. I don't buy into the "it takes time" myth. Why is it that insurgents with far fewer resources and less training time can in many cases field more effective fighters than the coalition-trained Iraqi forces? Obviously that didn't apply in all cases, definitely not in the case of Iraqi SOF, but it applied in too many cases for regular Iraqi Army and police forces. This is a lot more complicated than just the surge, but in Iraq (and Iraq only) I think the surge had merit, for reasons that are hard to measure.
Mike, thanks for posting this - it seems to confirm the "it's not the numbers, it's the tactics" hypothesis that many of us in the small wars business have had, using some good research and presentation on Josh's part.
I would only add that - contrary to Grant's take - while we can't be metric-focused (as in, collecting data for data's sake) small wars actually do lend themselves to the heavy use of metrics to explain the situation on the ground. In fact, the permissive environment of COIN's OE yields itself to collecting massive amounts of data that could be used to demonstrate security as an objective metric.
That said, data is just numbers. I think what Grant may actually be meaning to say is that data without structure, analysis, and explanation is not understandable to politicians, or commanders. Numbers do not explain, and Josh has done a good job attempting to put some meaning to the Surge numbers.
I guess we should just hope that someone out there at ISAF is reading this and thinking six months out, rather than six days out.
Great points, Mike. Imagine my surprise to find out we were so "metrics-focused" when I served in Afghanistan. I figured since books like <EM>Prodigal Soldiers, On Strategy</EM>, and <EM>Dereliction of Duty</EM> had pointed out the issues with data collection during a "small war" that we wouldn't have been so metric-hungry. I was told it was because that was what the politicians understood.
Apparently missing is the most important input of all: that of an adjacent, competing power and its decisive successes in bringing about a more stable condition (and not the so-called "surge narrative"):
--The Iran-backed Shia victory in the civil strife in Baghdad, effectively securing the capital.
--A peace brokered by Iran's IRGC/Quds Force between the IA and Shia militias.
Without these important inputs, the analysis is incomplete and flawed.