Category Archives: Thesis

A.8 Mapper in Use

Note: This is an excerpt from a draft of my thesis, A Computer Model of National Behavior. The introduction and table of contents are also available.


The following is a screenshot showing Mapper in use on May 26, 2004. The image shows the page output from Mapper in the Firefox web browser. The screenshot was taken and cropped with the GIMP. The map (green and red lines) was generated by MapMaker, a helper application for this thesis. The algorithm for determining the shape of the places is courtesy of David Norman.

Computer Science Thesis Index

Appendix D. Objective Tests


D.1 Objective Test Descriptions

The simulation was checked against three objective tests. They measured nation displacement, state displacement, and internal validity. Nation displacement is a measure of the degree to which the predominant nation in different places changed. State displacement is a measure of the degree to which the predominant state in different places changed. Internal validity measures the difference between runs, and therefore how consistent the output is.
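The excerpt does not give a formula for displacement, so the following is only a minimal sketch of one plausible reading: the fraction of places whose predominant group differs between two snapshots. The `Snapshot` layout, the function names, and the sample values are all assumptions made for illustration. The same function would serve for nations or states.

```python
from typing import Dict

# Hypothetical layout: place name -> {group name -> share of that place}.
Snapshot = Dict[str, Dict[str, float]]


def predominant(shares: Dict[str, float]) -> str:
    """Return the group holding the largest share of one place."""
    return max(shares, key=lambda group: shares[group])


def displacement(before: Snapshot, after: Snapshot) -> float:
    """Fraction of shared places whose predominant group differs."""
    places = before.keys() & after.keys()
    if not places:
        raise ValueError("the two snapshots share no places")
    changed = sum(
        1 for place in places
        if predominant(before[place]) != predominant(after[place])
    )
    return changed / len(places)


# Illustrative values only; these are not results from the thesis.
simulated = {"place_a": {"X": 0.7, "Y": 0.3}, "place_b": {"X": 0.6, "Y": 0.4}}
historical = {"place_a": {"X": 0.8, "Y": 0.2}, "place_b": {"X": 0.4, "Y": 0.6}}
print(displacement(simulated, historical))  # 0.5
```

Under this reading, a value of zero would correspond to no variance between the simulation and reality, and a value of one to total variation.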

Four nations were considered for every test: the British, German, Italian, and Polish nations. These nations were chosen before the simulation code was written.

The displacement tests were considered successful if the simulation results matched known history. Three of the national displacement tests were fully successful and one was marginally successful, while only one of the state displacement tests was successful. In matrix form:

  Nation     Nation Displacement Test    State Displacement Test
  British    Pass                        Pass
  German     Pass                        Fail
  Italian    Pass                        Fail
  Polish     Marginally Pass             Fail


For the national displacement test for the British, German, and Italian nations, no variance between reality and the simulation developed during the run.

For the state displacement test for German, Italian, and Polish, there was a great deal of variance. Specifically, total variation for those nations occurred by 1962, 1962, and 1961 respectively.

Regarding the national displacement test, it is argued that all nations pass the test, three completely and one marginally, since the model accurately captures the behavior of nations. The simulation was specifically designed to be a simulation of nations. An explanation of the special case of the Polish nation is on the following pages.

Regarding the state displacement test, it is clear that the model does not accurately model the behavior of states. That is, it does not correctly show how state behaviors emerge from national behaviors. The sole exception is the British nation, whose behavior is completely correct throughout the simulation. The report then discusses this surprising result.

The report also investigates internal validity. Internal validity measures how much the simulation agrees with itself. A simulation that gives very different results each time would have little internal validity. The simulation exhibits internal validity, though issues with two specific nations, Italian and Polish, are troublesome. They pass by an absolute measure, but are much less internally valid than the British and German nations. The report then analyzes how changing the interpretation of the data affects the internal validity results.

Finally, the report mentions two unanticipated objective findings. Both point to areas of improvement. One relates to incomplete data in 5.7% of places, and the other relates to incomplete data reporting. The consequences of both are quickly discussed and it is recommended that these issues be resolved before future research is attempted.


Appendix C. Subjective Tests


C.1 Subjective Test Descriptions

The two subjective tests described in the thesis were given to three experts for evaluation. Two of the experts arrived at informed criticisms, while one believed that the precise nature of the task was outside his area of expertise. Both experts who were able to review the material found it to be reasonable, and both suggested areas for further improvement.

Each expert was given instructions, a 44-page report, and then a questionnaire. These are attached in this appendix. Finally, an interesting subjective finding is discussed.

Lastly, the author analyzes the output for one other nation. This nation behaves differently from others in the model, and the author speculates as to reasons and gives paths to future research.


Chapter VI. Conclusions


Full test results are available in Appendix C and Appendix D. A brief summary of the results is given.

The first objective test measured the displacement of nations. It was completely successful. Every nation evaluated was within the limits determined in the thesis proposal. Additionally, all but one matched real-world results for the 1960s perfectly.

The second objective test measured state displacement. It was less successful, as state displacement was greater than expected. Every nation except one failed the state displacement test. Additionally, these failures occurred within only a few years. The full cause and implications of this are described within Appendix D.

Internal validity with respect to density and health was also tracked. The health test was more successful, with an average standard deviation of 0.08. The density test was somewhat less consistent, with an average standard deviation of 0.12. Both of these tests are considered successes. A more complete discussion of the internal validity tests is found in Appendix D.
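How the average standard deviation was computed is not stated in this excerpt. As a hedged sketch, one way to obtain such a figure is to take the standard deviation of an attribute across runs for each observation slot, then average over the slots. The function name, the slot layout, and the choice of population rather than sample standard deviation are assumptions.

```python
from statistics import mean, pstdev
from typing import Sequence


def average_std(runs: Sequence[Sequence[float]]) -> float:
    """Mean, over observation slots, of the standard deviation across runs.

    Each run lists the same attribute (for example health) for the same
    slots, in the same order.
    """
    if len(runs) < 2:
        raise ValueError("need at least two runs to compare")
    if len({len(run) for run in runs}) != 1:
        raise ValueError("runs must cover the same slots")
    # zip(*runs) regroups the values so each tuple holds one slot across runs.
    return mean(pstdev(slot) for slot in zip(*runs))


# Illustrative values only; these are not results from the thesis.
runs = [[0.5, 0.25], [0.5, 0.75]]
print(average_std(runs))  # 0.125
```

A lower value means the runs agree more closely with each other.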

The two subjective tests asked expert reviewers to view traces and animations of nations. Three expert reviewers were involved, each of whom had a recognized doctorate in a specialty relating to the nature of this simulation. One was in political science, one in psychology, and one in social anthropology. After the presentation of the trace and animation (available in Appendix C), the social anthropologist indicated that he did not believe he was equipped to properly judge the results. However, both the political scientist and the psychologist believed almost all delineated issues were reasonable. For more information on the subjective tests, see Appendix C.

Overall, the objective and subjective tests support the proposition that the simulation model accurately reflects reality. Every expert who gave an analysis was positive in his comments. Additionally, the central objective test, whether this simulation of nations accurately simulates nations, was completely successful. The general failure of the secondary objective test is distressing. However, the cause of this aberrant behavior has been isolated.


Chapter VII. Future Research


As mentioned previously, there is a lack of other models that examine nations. The use, in this simulation, of genetic algorithms and fuzzy logic also sets this simulation apart. Therefore, there are many areas where future research and modification would be fruitful. Analysis of different theaters, distributed computing, genocide and holocaust studies, genetic programming, and the world census are discussed as possible areas of future research.


The model simulates European nations during the 1960s. The choice of setting was heavily influenced by the need for reliable census data, which made Europe a great fit. However, political scientists have been able to produce estimates of populations in a variety of areas. It therefore would be possible and profitable to model different theaters or areas that are currently of greater concern than Europe. The Middle East, for example, with its mixture of national identities, would be a prime candidate for consideration as a new theater.

A related exploration would be into ethnic groups within the United States. This would require a reworking of definitions, as the simulation typically treats groups with a shared language as the same ethnicity. Nonetheless, ample information is available. Very detailed ethnological information is available from the U.S. Census. It would be fascinating to move the simulation from a world of sovereign states and wars to the United States and a peaceful democracy.

Running the simulation as it currently stands is processor intensive. Several times algorithms were modified to allow for quicker execution. Additionally, a local firm donated the use of a four-CPU Intel Xeon system with 2 GB of RAM, which allowed the runs to be completed in a reasonable amount of time as several simulations could execute simultaneously. Even then there were issues relating to processor usage.

A potential solution is moving away from monolithic code. The current programming of the simulation assumes there is only one CPU to work on the problem. Distributed computing allows several CPUs to be involved. While a high performance server may be difficult to reserve, a larger number of less powerful computers may be readily available. The code currently evaluates the next decision of each nation sequentially. Instead, it could offload computations to a collection of computers. This distribution of work could occur in a computer lab that is not currently being used, but as “SETI@home” and “Seventeen or Bust” have demonstrated, this can also occur over the internet.
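The idea of handing each nation's next decision to a pool of workers, rather than looping over nations one at a time, can be sketched as follows. This is an illustration only: the simulation itself was written in Perl, `next_decision` is a hypothetical stand-in for the real decision step, and the sketch assumes that decisions within one time step do not depend on each other.

```python
from concurrent.futures import Executor, ThreadPoolExecutor
from typing import Dict, List


def next_decision(nation: Dict[str, float]) -> str:
    """Hypothetical stand-in for the per-nation decision step."""
    return "expand" if nation["aggressiveness"] > 0.5 else "hold"


def decide_all(nations: List[Dict[str, float]], executor: Executor) -> List[str]:
    """Evaluate every nation's next decision through the given executor.

    Results come back in the same order as the input list.
    """
    return list(executor.map(next_decision, nations))


# Illustrative values only.
nations = [{"aggressiveness": 0.9}, {"aggressiveness": 0.2}]
with ThreadPoolExecutor(max_workers=2) as pool:
    print(decide_all(nations, pool))  # ['expand', 'hold']
```

Because `decide_all` accepts any `Executor`, the same call could be given a `concurrent.futures.ProcessPoolExecutor` to use several local CPUs. Spreading the work over a lab of machines or over the internet would need a networked work queue, which this sketch does not attempt.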

Bauer reports that “genocide” was first defined by Raphael Lemkin as the “destruction of a nation or ethnic group” and that “[generally] speaking, genocide does not necessarily mean the immediate destruction of a nation.” This model’s relevance to genocide studies is obvious, especially as the willingness to commit genocide by this definition is very close to the simulation’s definition of “aggressiveness.” With respect to genocide and holocaust studies, three avenues of research are immediately available.

First, the parallel between this model and existing research could be embraced, and the simulation could be further explored to better inform existing research. Robert Melson views genocide as of basically two types: “total domestic genocide,” which is the complete annihilation of a subgroup by their countrymen, and “genocide in general,” which can exist without state planning and with little bloodshed. This broad range, from mass murder on the one hand to multicultural assimilation on the other, is not distinguished in the current code. Adding details that allow closer examination of how a nation is reduced would make it possible to study genocide with the model.

Second, differences could be explored. Especially in the context of the nation-centric concept of genocide, Bauer’s claim that “One can change one’s religion or one’s political color. One cannot change one’s ethnicity or nationality or ‘race’” (page 11) is striking. An alternate simulation could be used. The population of places could be disaggregated into age categories. Working under Bauer’s assumption, the national identity of a population could be fixed after a certain age. These changes would let the model represent this view of genocide.

Third, Appendix D discusses anomalous results for the Jewish nation in the simulation. The voluntary relocation of a substantial number of Jews into the former mandate Palestine is one of the most visible national movements of the 20th century. Expanding the theater into the Middle East and North Africa would allow more research into this finding.

Genetic algorithms are used in the simulation to model the genetic relationship between nations. Allowing nations to inherit not only attributes but even logic and ways of dealing with problems would make the model more realistic.

As discussed in Appendix B, acquiring a complete and internally consistent data set for the model was a challenge. However, new developments present an opportunity. The Minnesota Population Center at the University of Minnesota is currently funding IPUMS, the “Integrated Public Use Microdata Series.” From a project web page,

Large machine-readable census microdata samples exist for many countries, but access to these data has been limited and the documentation is often inadequate. Even where such microdata are available for scholarly research, comparisons across countries or time periods are difficult because of inconsistencies in both data and documentation. IPUMS-International addresses these issues by converting census microdata for multiple countries into a consistent format, supplying comprehensive documentation, and making the data and documentation available through a web-based data dissemination system.

Such a resource would be invaluable. The database of census information supporting this model came from numerous sources and often there were gaps in the data. Additionally, the lack of census data in other theaters narrowed the study being conducted. More data and better data would help make the simulation’s output more reliable. IPUMS is still very incomplete, but as more information comes online these areas of the study can be strengthened.


Chapter V. Verification and Validation


5.1 Verification and Validation

In order for the simulation to be a useful explanation of the behavior of nations, users need to feel confident in the model. The model should either work and be proven correct, or it must be shown that it is inadequate so future corrections can be made. Verification and validation techniques are the tools used to achieve this goal, and they are explained below.

In 2000, Sargent defined three basic tests for determining the accuracy of a simulation model, which may be approached in two ways. The three tests are judgments by the designers, independent verification and validation (IVV), and scoring. Designer judgment is the most popular test, though it relies on iterative design and having experts work closely with the rest of the team. IVV is also heavily expert based. It relies on judges who are “independent of both the model development team and the model sponsor/user(s).” Scoring uses subjectively determined weights to give objective solutions. Throughout this section, “objective” or “objectively” will be defined as “using some type of statistical test or procedure,” a definition Sargent pioneered in 1994. “Subjective” or “subjectively” will mean not objective or not objectively.


C.2.1 Simulation Reporting Document Tutorial


This is a short introduction for the simulation reports. It also explains the concepts behind the simulation. The information is organized the same way in both formats. The model runs from 1959 to 1970. The 1959 data is historical data while every year after that is calculated by the simulation according to certain rules and assumptions.

There is a data reporting page for every year. It will look something like this:

[Image: french_gauge_1960_md]

There is a lot of data on this screen, but it is organized logically. The following pages explain the details.


A.3 Open Source and Free Software


According to the Open Source Initiative, the license for open source software,

“…shall not restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources. The license shall not require a royalty or other fee for such sale.”

The proliferation of open source software means a wide collection of powerful software applications is freely available. This has allowed the author to develop the software for this thesis with high quality tools. Without these programs, the cost of development would have been much higher, and much of what has been accomplished would have been cost prohibitive.


The following open-source applications were used in the building of the model. They are grouped by category, and a short commentary follows each category.

Servers
Apache HTTP Server
MySQL Database Server

Apache and MySQL were the base over which almost everything else was built. Commonly used together, the world’s most popular web server (according to Netcraft) and the world’s most popular open source database (according to its vendor) were invaluable. They allowed the author to leverage his experience to rapidly build the tools needed for this thesis. They also ensured that the thesis was platform independent. Development occurred mostly on a Windows XP home desktop, while the final simulation was run on a 4-way Xeon industrial server machine.

Programming Languages
PHP
Perl

PHP was used in conjunction with Apache to build web-based tools for this thesis, especially Mapper and Merge. Additionally it served as the basis of the content managers. Perl was the language of choice for the simulation itself, as well as MapMaker and numerous small tools.

Content Managers
Geeklog
phpMyAdmin

Geeklog served as a machine-independent development journal and an ad hoc change control mechanism. With Geeklog, the author was able to comment on changes, make back-ups of key functions, and track his progress through time. phpMyAdmin is a visual front-end for the MySQL database server, written in PHP and served through Apache.

Text Editors
Jext
Syn

Jext and Syn replaced Microsoft WordPad and Microsoft Notepad respectively. Jext’s color highlighting and careful selection of fonts and colors make it a friendly environment for coding. Syn is able to open and search very large files quickly, which is especially important when examining large log files.

Office Suite
OpenOffice.org

OpenOffice.org, or OOo, is composed of Writer (word processor), Calc (spreadsheet), Impress (presentation), and Draw (illustration). OOo has very good Microsoft Office import and export filters. It also has an open and simple file format, which allowed reports to be generated quickly by custom written scripts. Every component of OOo has been used during this thesis.

Image Manipulation
The Gimp
Image Magick

The Gimp assisted in basic image editing, such as cropping screen captures and preparing images for display. Image Magick, through its Image::Magick module for Perl, was critical in allowing images to be drawn from a computer program. The maps and dials used in the reports, for example, were drawn with Image::Magick.

File Management Utilities
Zip
7zip

Zip is a Windows port of the GNU zip utility for Unix-like systems. It was necessary for the operation of OOOlib with minimal changes, and was incorporated into dhaOOO. 7zip is a visual compressed file manager similar to Winzip. It proved to be the most user-friendly and stable compression tool when compressing files larger than a gigabyte.

Other
OOOlib
Mozilla Firefox
Wikipedia

OOOlib is a small utility for generating OpenOffice.org documents from a Perl program. It is the basis for dhaOOO. Mozilla Firefox is a web browser that was used to view Merge and Mapper. Wikipedia is an open source encyclopedia used for reference in the writing of this thesis.

4.3.5 States


The last entity type to consider is states. States emerge from the behavior of nations over places. Like places, they are just entities without any methods of their own. Nonetheless, the output of the model should be intelligible and should follow historical patterns even if only state information, and no nation information, is displayed. This is because the central thesis of this model is that the behavior of states is actually a side-effect of the behavior of nations.

Nonetheless, the behavior of states is very important. The state level is a great test to see if a nation-based model can predict not just the evolution of nations, but also other activities by other entities. If it does not, the argument that other behavior merely emerges out of national behavior is severely undermined.

The structure of the state entity type is kept purposefully simple. The objective in designing it is to ensure that a meaningful and important problem in political science, why states behave in the way they do, can be solved by this model. However, this model is not an all-inclusive attempt to see how parliamentary styles, entrenched bureaucracies, and other details affect state behavior. Though an extended version of this model should be able to explain these behaviors, doing so is simply out of the scope of this project.

Entity 4 (States)

  • UID
  • Name
  • Power

Figure 10. States Entity
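As a rough illustration of how small this entity is, the three attributes in Figure 10 could be held in a plain record with no behavior of its own. The field types and sample values are assumptions, since the figure lists only the attribute names.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Plain record mirroring Figure 10; it deliberately has no methods."""
    uid: int      # unique identifier (integer type is an assumption)
    name: str
    power: float  # numeric scale is an assumption


# Illustrative values only.
example = State(uid=1, name="Example State", power=0.5)
print(example.name)  # Example State
```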


4.3.4 Nations-in-Places (NPs)


NPs are considered next. An NP is not a true entity type because it is really the relation between a nation and a place. However, it has its own attributes, and because of the important role it plays in the model, it is typically viewed as if it were an entity. Like the other entities it has a UID, but because it is a relation it has foreign keys that correspond to its nation UID and place UID. Similar to the nation UID and the parents of nations is national history. This list tracks the nations of which this nation-in-place has been part. National history, like parents, does not affect the simulation, but it makes the model more useful by letting the system rapidly determine the history of an NP, and where it fits in the great march of nations.

The rest of the NP attributes are subjective. They are the familiar foursome of assertiveness, aggressiveness, health, and magnitude, plus density. Because NPs can be thought of as inheriting from both nations and places, care should be taken to ensure that the precise purpose of these attributes is not confused and that every attribute adds something valuable to the system.

Specifically, both magnitude and density are needed, in spite of similar definitions. The common thread of both nation magnitude and place magnitude is that each indicates importance, with a value of zero making that thing irrelevant. The differences are clear once each is explained, however. The overlap in terms is a result of the originality of this thesis, meaning there is little established jargon to fall back on.

Density is a measure of an NP’s existence, or put another way, a nation’s existence in that particular place. It is analogous to the density of an electron in a region of space. A density of one shows that the nation completely and definitely exists in a place, while a density of zero shows that a nation in no way exists in that place. The NP will struggle for more density if it has any will to live. The density index affects calculations for the nation, because it serves as a weight in the weighted average.

Magnitude shows the importance of a place for a nation. This can be seen as the emotional bond between a nation and a particular place. An example of this is Split, Croatia, which early in the twentieth century was very important to Italian politicians while being little cared for by anyone else, including Croatia. Magnitude is set to an appropriate value at the beginning of a run, and factors such as the effort a nation has put into securing a place and its length of occupancy there may affect this variable.

The final three attributes, assertiveness, aggressiveness, and health, operate just as they did with nations. Assertiveness and aggressiveness directly affect competition, while health tells if the NP is in decline in the given place. The striving between NPs occurs entirely within a place, so an NP in one place cannot affect an NP in another. So while an NP may be super-aggressive, it is only super-aggressive to other NPs in its same place.

As has been mentioned, the other purpose of these attributes is to affect the weighted averages of the nation to which the NPs belong. Assertiveness is the weighted average of assertiveness, magnitude is the weighted average of magnitude and density, etc. So another reason why so many attributes have to be analogs of those in nations is that they are needed to keep the model coherent.

The NP entity type can be visualized as follows:

Entity 3 (Nations-in-Places)

  • UID
  • Place ID
  • Nation ID
  • National History
  • Density
  • Name
  • Parents
  • Assertiveness
  • Aggressiveness
  • Health
  • Magnitude

Figure 9. Nations-in-Places Relation
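The relation in Figure 9 and the weighted averages described above can be sketched as follows. This is a minimal illustration, not the thesis code, which was written in Perl. The field types are assumptions, as is the reading that density is the weight used for every nation-level attribute.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class NationInPlace:
    """Record mirroring Figure 9; the types are assumptions."""
    uid: int
    place_id: int   # foreign key to the place
    nation_id: int  # foreign key to the nation
    density: float  # 0 = absent from the place, 1 = fully present
    assertiveness: float
    aggressiveness: float
    health: float
    magnitude: float
    name: str = ""
    national_history: List[int] = field(default_factory=list)
    parents: List[int] = field(default_factory=list)


def nation_attribute(nps: Iterable[NationInPlace], attribute: str) -> float:
    """Density-weighted average of one attribute over a nation's NPs."""
    rows = list(nps)
    total_density = sum(row.density for row in rows)
    if total_density == 0:
        return 0.0  # the nation exists nowhere, so nothing to average
    weighted = sum(row.density * getattr(row, attribute) for row in rows)
    return weighted / total_density


# Illustrative values only.
nps = [
    NationInPlace(1, 10, 100, density=0.75, assertiveness=0.5,
                  aggressiveness=0.5, health=0.5, magnitude=0.5),
    NationInPlace(2, 11, 100, density=0.25, assertiveness=0.5,
                  aggressiveness=0.5, health=1.0, magnitude=0.5),
]
print(nation_attribute(nps, "health"))  # 0.625
```

With density as the weight, an NP that barely exists in a place contributes almost nothing to its nation's overall values, which matches the description of a zero density as making the nation absent from that place.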
