Rating the players
------------------

Table of contents
-----------------

General concepts
Rounding
Left/right splits
Estimating missing statistics
Rating the position players
Defensive ratings
Pitcher hitting
Rating the pitchers
Information common to batters and pitchers

General concepts
----------------

Diamond Mind Baseball players have always been rated on the basis of statistics that are adjusted to reflect their home park and the era in which they played. This is what enables you to play meaningful games between teams from vastly different eras.

Without adjusting for era, there would be a serious bias in favor of hitters from the eras when offense was sky-high (e.g. early 1930s, late 1990s) and pitchers from the times when pitching dominated the game (e.g. 1900s, 1960s). Without adjusting for parks, we would tend to over-rate hitters who were lucky enough to play in Coors Field and we would over-rate pitchers who did much of their work in Dodger Stadium, just to name a couple of examples.

Most of our season disks are based on single seasons, so we evaluate the players relative to the league averages for that season after adjusting for park effects. This is a simple process that gets a little trickier only when a player changes teams during the year, in which case we evaluate him separately for each team and then combine the results.

For this project, players were evaluated based on their performance across several seasons. As with our annual season disks, each individual stat line was evaluated relative to the league averages after adjusting for park effects. Then those individual stat lines were combined, using a weighted average based on the amount of playing time for each stat line.

At this stage of the process, we had a set of values for each player that were expressed in terms of park-adjusted differences from the league average.
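
As a rough illustration of the weighted-average step described above, the sketch below combines several season stat lines into one park-adjusted figure relative to the league. It is only a sketch under stated assumptions: these notes do not say how the park adjustment is applied or whether the differences are additive or multiplicative, so the names `StatLine`, `relative_rate`, and `peak_relative_rate`, the division by a park factor, the ratio form, and the sample numbers are all hypothetical.

```python
# Hypothetical sketch: names, formulas, and numbers are illustrative
# assumptions, not the method actually used to build the disk.
from dataclasses import dataclass


@dataclass
class StatLine:
    plate_appearances: int  # playing time, used as the weight
    player_rate: float      # e.g. home runs per plate appearance
    league_rate: float      # league average for the same season
    park_factor: float      # 1.0 = neutral; above 1.0 inflates the event


def relative_rate(line: StatLine) -> float:
    """Park-adjusted rate relative to the league (assumed ratio form)."""
    park_adjusted = line.player_rate / line.park_factor
    return park_adjusted / line.league_rate


def peak_relative_rate(lines: list[StatLine]) -> float:
    """Combine stat lines with a weighted average based on playing time."""
    total_pa = sum(line.plate_appearances for line in lines)
    if total_pa == 0:
        raise ValueError("no playing time to weight by")
    weighted = sum(relative_rate(line) * line.plate_appearances for line in lines)
    return weighted / total_pa


# Made-up example: two full seasons and one short one.
peak = [
    StatLine(650, 0.050, 0.025, 1.00),
    StatLine(600, 0.044, 0.020, 1.10),
    StatLine(50, 0.020, 0.020, 1.00),
]
combined = peak_relative_rate(peak)
```

Whatever the exact formula, the output of this step is a set of park-adjusted values relative to the league average for each player.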
This levels the playing field, because these values indicate how each player performed relative to the norms for his day and with park effects removed.

We then turned those era- and park-neutral differences back into a set of statistics by applying them to an era and park of our choosing.

Choosing the park was easy. We just left it as Neutral. We could have projected the Boston players into Fenway Park, the Los Angeles players into Dodger Stadium, and so on. But that would have made it harder for you to compare the players.

Because we chose the Neutral Park, you can generate a Batting Register and sort on various statistics to see how the players rank. If we had chosen a different park for each player, you could still do that, but you'd have to keep those park factors in mind as you looked over the rankings. One player may rank ahead of another solely because he was projected into a more favorable park, not because he was a better performer during his peak years.

For the baseline era, we averaged the NL seasons from 1955 to 2002. We focused on one league because we didn't want to mix results for DH and non-DH leagues. We wanted as large a block of seasons as possible, but we also wanted a consistent set of statistics. Prior to 1955, there were many seasons when stats like intentional walks and sacrifice flies were not recorded, or were not recorded in the same manner, so we chose to stop there.

To sum up, the statistics on this disk reflect how each player would have performed in a Neutral Park and in an era that was like the average NL season from roughly the past fifty years. Players who dominated their leagues by the largest margin wound up with the best statistics on this disk.

Rounding
--------

We've defined our peak periods to include a lot of playing time. For most players, that amounts to at least 6000 plate appearances or batters faced.
And because each selected season is weighted based on how much playing time was involved, short seasons can have almost no impact on the overall evaluation of a player. This can create the appearance of a problem in how we selected the peak seasons, but it isn't really a problem.

For example, suppose a pitcher made his debut in a September call-up and faced a grand total of 40 batters that year. When placed in a pool of 7000 batters faced, the 40 can make a difference, but the differences might appear only in the fourth decimal place. When we scale things to 1000 batters faced for the purposes of presenting a normal-looking line of stats for the pitcher, those minute differences can disappear.

In other words, short seasons sometimes have an impact that is so small that it doesn't matter whether we do or don't include that short season in the peak period. That can be true even if the pitcher had a 1.50 ERA or a 5.50 ERA during that short season.

Because it makes no difference, our system sometimes includes one of these short seasons in the peak period for a player. If that short season happens to be the player's first year, you'll notice it, because we identify each player by the first year of their peak period. And you may wonder why that short season was deemed worthy of inclusion.

The answer is that it doesn't make a difference either way, so his performance on this disk wouldn't be affected if we were to change our system to exclude these seasons that don't have any meaningful impact on the overall stats for a player.

Left/right splits
-----------------

These splits are known for only about 30 of the 100+ seasons that we evaluated for this disk. That left us with a difficult choice. We could use the splits for modern players and assign standard splits to the older players, or we could treat everyone equally and use standard splits for all players. We chose to use standard splits for everyone.
Even though you won't see left/right splits on the displays and reports, the event tables for every player do reflect our standard left/right adjustment, so left/right strategy is still a big part of the game using these players.

Estimating missing statistics
-----------------------------

In the early years of baseball history, certain statistics that we now take for granted were not kept. As a result, we were forced to make estimates where the actual data was missing.

We hoped to develop good estimates of the missing stats by looking at other stats that have been recorded for all of baseball history. For example, perhaps we could estimate sacrifice flies by looking at things like how often a batter put the ball in play, homered, and grounded into double plays.

So we compiled a database consisting of thousands of modern player-seasons with a minimum of 400 plate appearances or batters faced. Using scatter plots, correlation tests, and multiple regression, we looked for relationships between (a) the stats that are available for all seasons and (b) the stats that are now available but were missing for some of the early years.

When strong relationships were found, we used those relationships to estimate the missing stats. When we were unable to find any relationship at all, or where the relationship was very weak, we estimated the missing stats based on league averages.

Unfortunately, and to our surprise, we found only a few strong relationships despite all of our effort. As a result, many of the missing stats from the late 1800s and early 1900s are estimated based on league averages.

The good news is that most of the missing stats don't have a large impact on the value of the player, so the use of league averages doesn't significantly alter how these players will perform in your Diamond Mind Baseball games.
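
The estimation approach described above can be sketched roughly as follows. This is a minimal single-predictor illustration, not the multiple regression mentioned in these notes. The function names, the 0.7 correlation cutoff, and the sample numbers are all hypothetical, since the notes do not say what counted as a strong relationship.

```python
# Hypothetical sketch: a single-predictor stand-in for the multiple
# regression mentioned in the notes. Cutoff and data are made up.
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # no variation means no usable relationship
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


def estimate_missing(known_x, modern_xs, modern_ys, league_avg, min_r=0.7):
    """Use the fitted relationship if it is strong, else the league average."""
    # min_r must be positive so that a zero-variance predictor never reaches fit_line.
    if abs(pearson_r(modern_xs, modern_ys)) >= min_r:
        slope, intercept = fit_line(modern_xs, modern_ys)
        return slope * known_x + intercept
    return league_avg


# Made-up data: one strong relationship, one weak one.
strong = estimate_missing(3.0, [1, 2, 3, 4], [2, 4, 6, 8], league_avg=5.0)
weak = estimate_missing(3.0, [1, 2, 3, 4], [5, 1, 6, 2], league_avg=3.5)
```

In this toy data the first relationship is perfectly linear, so the fitted line is used; the second is weak, so the estimate falls back to the league average.
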
Players who were great in the early part of the 20th century will still be great on this disk, even if some of their stats weren't tracked by the record keepers of their time.

Rating the position players
---------------------------

In the Diamond Mind Baseball game, certain statistics are used to determine the likelihood that different events will occur in a particular batter-pitcher matchup. Those statistics include singles, doubles, triples, homers, hit batsmen, walks, and strikeouts. In another section of these notes, we've already explained how we computed park- and era-adjusted values for these statistics, so we won't repeat that here.

But some other real-life statistics are not used to determine play outcomes. For example, it doesn't matter how many games a player appeared in or how many games he started. And the game engine doesn't use runs or runs batted in to generate play results. Instead, it generates fundamental events like singles and doubles, then lets the runs and RBI flow naturally from those events.

Even though these non-fundamental stats are not used to generate play results, the disk looks a lot better if we include those stats, so we'll take a moment to discuss how we did that.

Games and games started were estimated based on each player's real-life relationship between games played and plate appearances. Because this doesn't affect play outcomes, we didn't spend a lot of time on this, but we wanted these figures to look reasonable.

Runs and runs batted in were computed in the same way as the fundamental events, by aggregating era- and park-adjusted values from each player's peak years. This means that players who were lucky enough to be on very good offensive teams will project for higher run/RBI totals than players who were surrounded by weaker hitters. And leadoff hitters will have more runs than RBI, while middle of the order hitters will have more RBI than runs.
These figures are interesting to look at because they tell us how many runs and RBI each player would have had if they played for similar teams in a neutral environment. But they don't affect play outcomes, and you can expect to see these players post different run and RBI totals in your DMB games because you may be using them in different batting order positions and with different teammates.

Stolen bases and caught stealing figures are based on peak-year attempt rates and success rates. Our era- and park-adjustments can cause a player's batting stats to change, either for better or worse, so it was necessary to first determine how many times each player was projected to get on base in our neutral environment, and adjust the steals accordingly.

Intentional walks were challenging. For more than half of baseball history, they were not recorded, so we estimated the breakdown of unintentional walks and intentional walks for hitters from that period.

In Diamond Mind Baseball, intentional walks are generated by your decisions or by the computer manager, while unintentional walks flow from the batter-pitcher confrontation. When we rated the players, we set them up to earn the right number of unintentional walks during those plate appearances in which they are pitched to.

Having done that, we were left with the question of whether to add in some intentional walks for each player. Remember, adding them in wouldn't change their walk rates in the game, because the manager makes those decisions. We chose to leave the intentional walk values at zero for all players, partly because we lack the historical data for many of those players, and partly because the rate at which these players were intentionally walked in their real-life games has no bearing on how often they will be intentionally walked as part of these potent lineups.
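
Returning to the stolen base figures discussed earlier in this section, the adjustment can be sketched as follows. This is only an illustration: it assumes the attempt rate is measured per time on base, and the function name `project_steals`, its parameters, and the sample numbers are hypothetical rather than taken from the actual rating process.

```python
# Hypothetical sketch: assumes attempts are measured per time on base.
def project_steals(peak_attempts, peak_steals, peak_times_on_base,
                   projected_times_on_base):
    """Return (stolen bases, caught stealing) for the neutral environment."""
    if peak_times_on_base <= 0 or peak_attempts <= 0:
        return 0, 0
    attempt_rate = peak_attempts / peak_times_on_base   # attempts per time on base
    success_rate = peak_steals / peak_attempts          # share of attempts that succeed
    attempts = attempt_rate * projected_times_on_base   # rescaled to the new on-base total
    steals = round(attempts * success_rate)
    caught = round(attempts) - steals
    return steals, caught


# Made-up example: a player who reaches base less often after adjustment.
sb, cs = project_steals(peak_attempts=60, peak_steals=48,
                        peak_times_on_base=240, projected_times_on_base=200)
```

Keeping the attempt and success rates fixed while rescaling the times on base means a player whose batting line shrinks in the neutral environment also loses a proportional share of his steal attempts.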

The values for game-winning RBI were left at zero for everyone because this was an official stat for only a short time and it has no impact on play outcomes.

The values for catcher's interference were left at zero for everyone because these events are extremely rare (less than one per ten thousand plate appearances, if memory serves) and historical data is not available.

Every position player has a Clutch rating of Normal. With players of this caliber, it seemed unnecessary to give them a boost in clutch situations, and every study that we know of has concluded that players are unable to sustain clutch performances over the long term.

Defensive ratings
-----------------

Our defensive ratings for players are biased toward the peak years we selected. You may think of a certain player as a brilliant fielder, but if his peak period occurred during a period when his defensive performance was lower due to age or injury, our ratings reflect that.

We also had minimum playing time requirements for a player to earn a rating at a position, and those requirements were based on how often he played during his peak period. Generally speaking, we gave ratings only to players with at least 100 games at a position during their peak periods. You may think of a player as a right fielder, but if he had moved to first base before or early in his peak period, he may be rated only as a first baseman.

Pitcher hitting
---------------

Because pitchers don't bat all that often, we decided to base their hitting stats on their entire careers rather than choosing only the hitting stats for their peak pitching seasons. We believe this larger sample of plate appearances provides us with a better picture of their hitting ability.

To avoid unnecessary rounding errors, the pitcher hitting stats on the disk are scaled to about 100 plate appearances for pitchers who batted at least that many times in their careers.
For relief pitchers and DH-league pitchers who didn't bat 100 times, we scaled their hitting stats to match their career plate appearances as a hitter.

Rating the pitchers
-------------------

As we stated in the position player section, certain statistics are used to determine play results, other statistics are accumulated as a result of those play results, and some are more for decoration.

Games pitched, games started, and games finished were estimated based on each pitcher's real-life relationship between games and batters faced. These values don't affect play outcomes, but they give you a feel for how the pitcher was used during his peak period.

Runs and earned runs were computed in the same way as the fundamental events, by aggregating era- and park-adjusted values from each pitcher's peak years. An alternative would have been to use a formula like Runs Created to generate run and earned run values that are consistent with the rest of the stats, but we thought it would be more interesting to use the adjusted peak-period stats.

The complete game figures are also based on differences from the league average. Complete game percentages have dropped sharply in the last hundred years. To level the playing field, it was necessary to evaluate pitchers relative to the norms for their time. Otherwise, everyone who pitched a hundred years ago would have Excellent durability, regardless of whether their durability was above-average by the standards of that era. Starter durability ratings were derived from these adjusted complete-game figures.

The values for many modern stats (holds, inherited runners, pickoffs) were left at zero for all pitchers because we have no historical data and because they do not affect play outcomes.

Hold ratings are extremely hard to assign for historical players because there isn't a lot of data on steals allowed by pitchers. We used the available data for the last thirty years and used more subjective methods for the rest.

We weren't comfortable with any of our options for assigning ground ball percentages. Our play-by-play data covers only the last thirty years. We could have used that information to assign real GB% values for modern pitchers, but we had nothing to go on for the rest. We made several attempts to come up with a reliable way to estimate GB% from other pitching and fielding stats, but found no strong relationships. Rather than put out a disk with some pitchers having accurate GB% values and others having standard GB% values, we chose to go for uniformity and assign everyone a standard GB% of 50.

Every pitcher has a Jam rating of Normal. With players of this caliber, it seemed unnecessary to give them a boost in clutch situations, and every study that we know of has concluded that players are unable to sustain clutch performances over the long term.

Information common to batters and pitchers
------------------------------------------

For each player, we set the year to match the first season of his peak period. This means that the player ages that are displayed are the seasonal ages (as of July 1) at the beginning of the period for which he's rated. This gives you a quick way to see which players peaked earlier or later than the norm.

All players were assigned an injury rating of Normal. We thought about researching real-life injuries, but that would have delayed the release of the disk by weeks or months without necessarily making it a better product. If one player happened to suffer a serious injury during his peak period, while another player tended to get hurt in seasons other than his peak years, is that really enough of a reason to make the first player Prone and the second one Normal? We're using the best years for the best players in history, and we thought it would be best to give every player the same risk of injury.

The disk does not include projected fielding statistics. The game engine does not use fielding stats to determine the outcome of any plays.
Instead, it uses the range ratings, throwing ratings, and error rates to control defensive performances.