Statistical Baseball Research Bibliography

Charlie Pavitt

The goal of this essay is to introduce the Statistical Baseball Research Bibliography and explain its use. The Bibliography is the result of a comprehensive survey of baseball literature. In addition to many books, it includes articles in baseball journals (such as the Baseball Analyst, Baseball Research Journal, and Sabermetric Review) and in academic journals (for example, Operations Research). Using the Bibliography requires access to and familiarity with Lotus 1-2-3 or a compatible spreadsheet program.

Articles have been included in the Bibliography if they meet the following criteria:

1 - They are intended to make a contribution to our knowledge about baseball as a statistical science. This does not mean that the article must include statistical analysis. Many worthy articles have made theoretical or critical contributions without performing statistical analyses.

2 - While articles that present methods for evaluating or ranking teams or players are included if they make a meaningful contribution, articles that do nothing more than evaluate or rate teams or players are not included. Books that appear to be intended to do little more than exploit the recent popular market for books on baseball statistics have not been included and will not be added in the future.

3 - Articles must have been published either in conventional markets (academic or trade) or by SABR. Self-published works will only be included if they have made an unusual contribution (such as Cook's Percentage Baseball and the Computer). The only exceptions are the Baseball Analyst and By The Numbers, as these represent the efforts of the statistical research community at large.

The entries are arranged alphabetically according to the last name of the author of the article or book. Each entry has eleven columns of information. These are as follows:

Column A - Last name of author.
If the article is credited to two or more authors, in most cases it will be entered under each author and noted as a multiple-authored article in Column K.

Column B - First name and middle initial of author.

Columns C, D, and E present a code system identifying the content of the article. If an article includes more than one clearly different content area, it will be entered under each of them. In the case of books, it will be entered under each content area covered. The code system consists of three hierarchically organized levels, respectively called the supercode, macrocode, and microcode. Each of these codes is symbolized by a capital letter.

To begin, each article is categorized within a general subject area. This general subject area is indicated by the article's supercode, which can be found in Column C.

Each general subject area is divided into more specific content areas. Each specific content area is indicated by the article's macrocode, which can be found in Column D. It is important to remember that the same macrocode may symbolize a different category for different general areas. For example, the macrocode S indicates Stolen base within the supercode category Batting, Starter/reliever within the supercode category Pitching, and Succession within the supercode category Managing.

Finally, each article is categorized according to whether it is mainly about Players, Teams, or Leagues. These three categories make up the system of microcodes, which are found in Column E. For example, an article on how to evaluate the ability of individual base stealers would be given the microcode P. An article on how to evaluate the contribution of stolen bases to a team's offense would be given the microcode T. An article on the long-term value of base stealing on run production in baseball would be given the microcode L. As articles may discuss issues at more than one of these levels, the microcode should not be blindly trusted.
The user interested in learning about the value of base stealing in general ought to look at all of the articles coded B-S.

Column F - The title of the article or book. Due to space limitations, the title may be shortened or paraphrased. If the title is not clearly indicative of the article's content and space permits, an indication of the content may be included within { } brackets.

Column G - If journal article, title of journal. If book, name of publisher. If contributed chapter in a book edited or mostly written by someone else, the editor/author is listed here; look for the listing of the book under the editor or primary author for title/publisher/year.

Column H - If journal article, volume or issue of journal. If book, location of publisher.

Column I - Date of publication.

Column J - Pages that the article is on. If the article/book includes more than one subject area, pages will be specific to the discussion of the subject area.

Column K - Comments. If the article is multi-authored, coauthors will generally be listed here. If the article is part of a debate, an extension of an earlier article, etc., the other article(s) in the series will be cited. When Column F consists of a book title, the title of the book chapter may be listed here.

I intend to update the Statistical Baseball Research Bibliography annually with both new material and old material that I find. Therefore, I would be interested in seeing any statistically-based articles that anyone is familiar with that are not in the present version of the Bibliography, and in considering them for inclusion in future versions if they meet the criteria listed above.

The supercode and macrocode system is presented below.

Coding System

The code is indicated by the capitalized letter in each entry. Comments include the most popular areas of research under each category.

Supercode       Macrocode             Comments

Batting                               All nonsituational aspects of offense
                Baserunning           Methods for measuring
                Clutch                Does it exist? If so, how to measure it
                Evaluation            Methods for measuring how good
                Hot/cold streaks      Do they exist?
                bLack/white/Latin     Differences in performance
                Minor/major           Relation between the two in performance
                Pinchhitting          What type of player is best?
                Ranking               Methods for measuring who is best
                Stolen bases          Impact on offense; measuring ability
                Traded etc.           Effect of changing team on performance
                Walks                 Impact on offense; measuring batting eye

Fielding
                Catcher               Methods for evaluating
                Double play           Impact on overall defense
                Evaluation            Methods for measuring how good
                bLack/white/Latin     Differences in performance
                Ranking               Methods for measuring who is best

General                               Introductions to baseball research.
                                      Should be required preliminary reading
                                      for everyone. No macro or microcodes.

Managing
                Evaluation            Methods for measuring how good
                Succession            Effects of changing managers on team

Overall                               Total performance
                Evaluation            Methods for measuring how good
                Hot/cold streaks      Do they exist? Do they matter?
                Ranking               Methods for measuring who is best
                World series          Predicting winners

Pitching
                Age                   Its effects
                Clutch                Does it exist? If so, how to measure it
                Evaluation            Methods for measuring how good
                Fly ball/ground ball  Methods for measuring
                bLack/white/Latin     Differences in performance
                Minor/major           Relation between the two in performance
                wOrkload              Analysis of ideal load
                Power/finesse         Methods for measuring
                Ranking               Methods for measuring who is best
                Starter/reliever      Comparisons; impact of relievers
                Traded etc.           Effect of changing team on performance
                Walks                 Their implications
                laYoff                Its effects

oRganizational                        Evaluation of how teams are run
                Age                   Effects on team performance
                Talent                Evaluation of how teams develop players

Situational
                At bat                Impact of ball/strike count
                Batting order         Impact on performance
                Day/night             Impact on performance
                Fly ball/Ground ball  Impact on performance
                Game                  Runs scored by winners versus losers
                Home/away             Impact on performance
                Inning                Runs expected given base/out situations
                Left/right            Implications of platoon differentials
                Opposition            Implications of who team is playing
                ball Park             Impact on performance
                Season                Tendencies from month to month
                Turf/grass            Impact on performance

Umpire
                Performance           Comparisons among them
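
For readers who export the spreadsheet and want to work with it in a program, the following Python sketch illustrates two points made in the essay: a macrocode only has meaning together with its supercode, and a search for everything on a topic (such as the B-S articles on base stealing) should ignore the microcode. This is only an illustration under stated assumptions, not part of the Bibliography itself. The supercode letter B for Batting follows the essay's own "B-S" example; the letters P for Pitching and M for Managing are inferred from the capitalized-letter rule. The names Entry, MACROCODES, macro_label, and select are hypothetical, and the sample rows are invented placeholders rather than real entries.

```python
from dataclasses import dataclass

# A macrocode letter means different things under different supercodes,
# so the lookup key is the (supercode, macrocode) pair.
# "B" = Batting follows the essay's "B-S" example; "P" = Pitching and
# "M" = Managing are assumptions based on the capitalized-letter rule.
MACROCODES = {
    ("B", "S"): "Stolen bases",
    ("P", "S"): "Starter/reliever",
    ("M", "S"): "Succession",
}

# Microcodes as described in the essay.
MICROCODES = {"P": "Players", "T": "Teams", "L": "Leagues"}


@dataclass
class Entry:
    """One row of the Bibliography, mirroring Columns A through K."""
    last_name: str        # Column A
    first_name: str       # Column B
    supercode: str        # Column C
    macrocode: str        # Column D
    microcode: str        # Column E
    title: str            # Column F
    source: str           # Column G: journal, publisher, or editor/author
    volume_or_place: str  # Column H: volume/issue, or publisher location
    date: str             # Column I
    pages: str            # Column J
    comments: str         # Column K


def macro_label(entry):
    """Look up the macrocode's meaning within the entry's supercode."""
    return MACROCODES.get((entry.supercode, entry.macrocode), "unknown")


def select(entries, supercode, macrocode):
    """Return entries with the given supercode and macrocode.

    The microcode is deliberately ignored, since an article may discuss
    players, teams, and leagues at once.
    """
    return [e for e in entries
            if e.supercode == supercode and e.macrocode == macrocode]


# Invented placeholder rows, not real Bibliography entries.
SAMPLE = [
    Entry("Author1", "A.", "B", "S", "P", "Placeholder title 1",
          "Placeholder Journal", "1", "1990", "1-5", ""),
    Entry("Author2", "B.", "B", "S", "T", "Placeholder title 2",
          "Placeholder Journal", "2", "1991", "6-9", ""),
    Entry("Author3", "C.", "P", "S", "P", "Placeholder title 3",
          "Placeholder Journal", "3", "1992", "10-12", ""),
]

if __name__ == "__main__":
    for e in select(SAMPLE, "B", "S"):
        print(e.last_name, macro_label(e), MICROCODES[e.microcode])
```

Keying the lookup table on the pair rather than on the macrocode letter alone keeps the three meanings of S from colliding, which is the pitfall the essay warns about.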