Rating the players
------------------

Table of contents
-----------------

General concepts
Rounding
Left/right splits
Estimating missing statistics
Rating the position players
Defensive ratings
Pitcher hitting
Rating the pitchers
Information common to batters and pitchers


General concepts
----------------

Diamond Mind Baseball players have always been rated on the basis of statistics 
that are adjusted to reflect their home park and the era in which they played.  
This is what enables you to play meaningful games between teams from vastly 
different eras.

Without adjusting for era, there would be a serious bias in favor of hitters 
from the eras when offense was sky-high (e.g. early 1930s, late 1990s) and 
pitchers from the times when pitching dominated the game (e.g. 1900s, 1960s).

Without adjusting for parks, we would tend to over-rate hitters who were lucky 
enough to play in Coors Field and we would over-rate pitchers who did much of 
their work in Dodger Stadium, just to name a couple of examples.

Most of our season disks are based on single seasons, so we evaluate the players 
relative to the league averages for that season after adjusting for park 
effects.  This is a simple process that gets a little trickier only when a 
player changes teams during the year, in which case we evaluate him separately 
for each team and then combine the results.

For this project, players were evaluated based on their performance across 
several seasons.  As with our annual season disks, each individual stat line was 
evaluated relative to the league averages after adjusting for park effects.  
Then those individual stat lines were combined, using a weighted average based 
on the amount of playing time for each stat line. 

At this stage of the process, we had a set of values for each player that were 
expressed in terms of park-adjusted differences from the league average.  This 
levels the playing field, because these values indicate how each player 
performed relative to the norms for his day and with park effects removed.

We then turned those era- and park-neutral differences back into a set of 
statistics by applying them to an era and park of our choosing.  
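
As a rough illustration of the three steps described above (evaluate each stat
line against its league after removing park effects, combine the lines weighted
by playing time, then apply the result to a baseline era), here is a minimal
Python sketch.  It is not the actual rating code.  The StatLine fields, the use
of a ratio to the league rate, the single multiplicative park factor, and all of
the numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StatLine:
    """One player-season (or one team stint within a season)."""
    pa: int                 # plate appearances; used as the weight
    hr: int                 # home runs hit in this stat line
    league_hr_rate: float   # league home runs per PA that season
    park_hr_factor: float   # hypothetical park factor, 1.0 = neutral


def relative_rate(line: StatLine) -> float:
    """Park-neutral HR rate expressed as a ratio to the league rate."""
    park_neutral = (line.hr / line.pa) / line.park_hr_factor
    return park_neutral / line.league_hr_rate


def combined_relative_rate(lines: list[StatLine]) -> float:
    """Average of the per-line ratios, weighted by playing time."""
    total_pa = sum(line.pa for line in lines)
    return sum(relative_rate(line) * line.pa for line in lines) / total_pa


def project(lines: list[StatLine], baseline_rate: float, target_pa: int) -> float:
    """Apply the era- and park-neutral ratio to a baseline era."""
    return combined_relative_rate(lines) * baseline_rate * target_pa


# Made-up numbers: two seasons, the second in a hitter-friendly park.
peak = [
    StatLine(pa=600, hr=30, league_hr_rate=0.025, park_hr_factor=1.0),
    StatLine(pa=400, hr=24, league_hr_rate=0.030, park_hr_factor=1.2),
]
projected_hr = project(peak, baseline_rate=0.024, target_pa=600)
print(round(projected_hr, 2))
```

The same pattern would apply to any rate statistic.  Only the home run column
is shown to keep the sketch short.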

Choosing the park was easy.  We just left it as Neutral.  We could have 
projected the Boston players into Fenway Park, the Los Angeles players into 
Dodger Stadium, and so on.  But that would have made it harder for you to 
compare the players.

Because we chose the Neutral Park, you can generate a Batting Register and sort 
on various statistics to see how the players rank.  If we had chosen a different 
park for each player, you could still do that, but you'd have to keep those park 
factors in mind as you looked over the rankings.  One player may rank ahead of 
another solely because he was projected into a more favorable park, not because 
he was a better performer during his peak years.

For the baseline era, we averaged the NL seasons from 1955 to 2002.  We focused 
on one league because we didn't want to mix results for DH and non-DH leagues.  
We wanted as large a block of seasons as possible, but we also wanted a 
consistent set of statistics.  Prior to 1955, there were many seasons when stats 
like intentional walks and sacrifice flies were not recorded, or were not 
recorded in the same manner, so we chose to stop there.

To sum up, the statistics on this disk reflect how each player would have 
performed in a Neutral Park and in an era that was like the average NL season 
from roughly the past fifty years.  Players who dominated their leagues by the  
largest margin wound up with the best statistics on this disk.


Rounding
--------

We've defined our peak periods to include a lot of playing time.  For most 
players, that amounts to at least 6000 plate appearances or batters faced.  
And because each selected season is weighted by the amount of playing time 
involved, a very short season may have no visible impact on the overall 
evaluation of a player.  This can create the appearance of a problem in how 
we selected the peak seasons, but it isn't really a problem.

For example, suppose a pitcher made his debut in a September call-up and faced 
a grand total of 40 batters that year.  When placed in a pool of 7000 batters 
faced, the 40 can make a difference, but the differences might appear only in 
the fourth decimal place.  When we scale things to 1000 batters faced for the 
purposes of presenting a normal-looking line of stats for the pitcher, those 
minute differences can disappear.

In other words, short seasons sometimes have an impact that is so small that 
it doesn't matter whether we do or don't include that short season in the peak 
period.  That can be true even if the pitcher had a 1.50 ERA or a 5.50 ERA 
during that short season.
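
The arithmetic behind this can be sketched in a few lines of Python.  The hit
totals below are invented for the example, and raw counts are used in place of
the adjusted values that the real process works with.

```python
def per_1000_bf(events: int, batters_faced: int) -> int:
    """Scale a counting stat to 1000 batters faced and round to a whole number."""
    return round(events / batters_faced * 1000)


# Peak period without the September call-up: 6960 batters faced, 1600 hits.
without_short_season = per_1000_bf(1600, 6960)

# Add a rough 40-batter debut in which the pitcher allowed 12 hits.
with_short_season = per_1000_bf(1600 + 12, 6960 + 40)

print(without_short_season, with_short_season)  # prints: 230 230
```

The unrounded values differ (about 229.9 versus 230.3), but the displayed
line of stats is the same either way.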

Because it makes no difference, our system sometimes includes one of these 
short seasons in the peak period for a player.  If that short season happens 
to be the player's first year, you'll notice it, because we identify each 
player by the first year of his peak period.  And you may wonder why that 
short season was deemed worthy of inclusion.

The answer is that it doesn't make a difference either way, so his performance 
on this disk wouldn't be affected if we were to change our system to exclude 
these seasons that don't have any meaningful impact on the overall stats for 
a player.


Left/right splits
-----------------

These splits are known for only about 30 of the 100+ seasons that we evaluated
for this disk.  That left us with a difficult choice.  We could use the splits
for modern players and assign standard splits to the older players, or we could
treat everyone equally and use standard splits for all players.  We chose to
use standard splits for everyone.

Even though you won't see left/right splits on the displays and reports, the 
event tables for every player do reflect our standard left/right adjustment,
so left/right strategy is still a big part of the game using these players.


Estimating missing statistics
-----------------------------

In the early years of baseball history, certain statistics that we now take for 
granted were not kept.  As a result, we were forced to make estimates where the 
actual data was missing.

We hoped to develop good estimates of the missing stats by looking at other 
stats that have been recorded for all of baseball history.  For example, perhaps 
we could estimate sacrifice flies by looking at things like how often a batter 
put the ball in play, homered, and grounded into double plays.

So we compiled a database consisting of thousands of modern player-seasons with 
a minimum of 400 plate appearances or batters faced.  Using scatter plots, 
correlation tests, and multiple regression, we looked for relationships between 
(a) the stats that are available for all seasons and (b) the stats that are now 
available but were missing for some of the early years.

When strong relationships were found, we used those relationships to estimate 
the missing stats.  When we were unable to find any relationship at all, or 
where the relationship was very weak, we estimated the missing stats based on 
league averages.
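
The decision rule in the previous paragraph can be sketched as follows.  This is
a simplified Python illustration, not the method actually used: it fits a
single-predictor least-squares line instead of a multiple regression, and the
0.7 correlation threshold for a "strong" relationship is a hypothetical value
chosen for the example.

```python
from statistics import mean


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float, float]:
    """Least-squares slope and intercept, plus the correlation coefficient r.

    Both xs and ys must contain some variation, or the divisions below fail.
    """
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


def estimate_missing(x: float, xs: list[float], ys: list[float],
                     league_average: float, min_abs_r: float = 0.7) -> float:
    """Use the fitted line when the relationship is strong, else the league average.

    xs: a stat recorded in every season, taken from modern player-seasons.
    ys: the stat that is missing for early seasons, same player-seasons.
    min_abs_r: hypothetical cutoff for calling a relationship "strong".
    """
    slope, intercept, r = fit_line(xs, ys)
    if abs(r) >= min_abs_r:
        return intercept + slope * x
    return league_average
```

With a perfectly linear sample such as xs = [1, 2, 3, 4] and ys = [2, 4, 6, 8],
the function extrapolates along the line.  With an unrelated sample it returns
the league average unchanged.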

Unfortunately, and to our surprise, we found only a few strong relationships 
despite all of our effort.  As a result, many of the missing stats from the late 
1800s and early 1900s are estimated based on league averages.

The good news is that most of the missing stats don't have a large impact on the 
value of the player, so the use of league averages doesn't significantly alter 
how these players will perform in your Diamond Mind Baseball games.  Players who 
were great in the early part of the 20th century will still be great on this 
disk, even if some of their stats weren't tracked by the record keepers of their 
time.


Rating the position players
---------------------------

In the Diamond Mind Baseball game, certain statistics are used to determine the 
likelihood that different events will occur in a particular batter-pitcher 
matchup.  Those statistics include singles, doubles, triples, homers, hit 
batsmen, walks, and strikeouts.

In the General concepts section, we explained how we computed park- and 
era-adjusted values for these statistics, so we won't repeat that here.

But some other real-life statistics are not used to determine play outcomes.  
For example, it doesn't matter how many games a player appeared in or how many 
games he started.  And the game engine doesn't use runs or runs batted in to 
generate play results.  Instead, it generates fundamental events like singles 
and doubles, then lets the runs and RBI flow naturally from those events.

Even though these non-fundamental stats are not used to generate play results, 
the disk looks a lot better if we include those stats, so we'll take a moment to 
discuss how we did that.

Games and games started were estimated based on each player's real-life 
relationship between games played and plate appearances.  Because this doesn't 
affect play outcomes, we didn't spend a lot of time on this, but we wanted these 
figures to look reasonable.

Runs and runs batted in were computed in the same way as the fundamental events, 
by aggregating era- and park-adjusted values from each player's peak years.  
This means that players who were lucky enough to be on very good offensive teams 
will project for higher run/RBI totals than players who were surrounded by 
weaker hitters.  And leadoff hitters will have more runs than RBI, while 
middle-of-the-order hitters will have more RBI than runs.

These figures are interesting to look at because they tell us how many runs and 
RBI each player would have had if they played for similar teams in a neutral 
environment.  But they don't affect play outcomes, and you can expect to see 
these players post different run and RBI totals in your DMB games because you 
may be using them in different batting order positions and with different 
teammates.

Stolen bases and caught stealing figures are based on peak-year attempt rates 
and success rates.  Our era- and park-adjustments can cause a player's batting 
stats to change, either for better or worse, so it was necessary to first 
determine how many times each player was projected to get on base in our neutral 
environment, and adjust the steals accordingly.
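
A minimal Python sketch of that adjustment is shown below.  It is an
illustration only.  The notes do not define exactly which events count as
getting on base for this purpose, so times_on_base is an assumed input, and
the numbers in the example are invented.

```python
def project_steals(sb: int, cs: int, times_on_base: int,
                   projected_times_on_base: int) -> tuple[int, int]:
    """Carry peak-year attempt and success rates over to a new on-base total.

    Returns (projected stolen bases, projected caught stealing).
    """
    attempts = sb + cs
    if attempts == 0 or times_on_base == 0:
        return 0, 0
    attempt_rate = attempts / times_on_base   # attempts per time on base
    success_rate = sb / attempts              # share of attempts that succeed
    projected_attempts = attempt_rate * projected_times_on_base
    projected_sb = round(projected_attempts * success_rate)
    projected_cs = round(projected_attempts) - projected_sb
    return projected_sb, projected_cs


# A runner with 40 steals in 50 attempts over 250 times on base, projected
# to reach base 200 times in the neutral environment.
print(project_steals(sb=40, cs=10, times_on_base=250, projected_times_on_base=200))
# prints: (32, 8)
```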

Intentional walks were challenging.  For more than half of baseball history, 
they were not recorded, so we estimated the breakdown of unintentional walks and 
intentional walks for hitters from that period.  

In Diamond Mind Baseball, intentional walks are generated by your decisions or 
by the computer manager, while unintentional walks flow from the batter-pitcher 
confrontation.  When we rated the players, we set them up to earn the right 
number of unintentional walks during those plate appearances in which they are 
pitched to.  

Having done that, we were left with the question of whether to add in some 
intentional walks for each player.  Remember, adding them in wouldn't change 
their walk rates in the game, because the manager makes those decisions.

We chose to leave the intentional walk values at zero for all players, partly 
because we lack the historical data for many of those players, and partly 
because the rate at which these players were intentionally walked in their real-
life games has no bearing on how often they will be intentionally walked as part 
of these potent lineups.

The values for game-winning RBI were left at zero for everyone because this was 
an official stat for only a short time and it has no impact on play outcomes.

The values for catcher's interference were left at zero for everyone because 
these events are extremely rare (fewer than one per ten thousand plate 
appearances, if memory serves) and historical data is not available.

Every position player has a Clutch rating of Normal.  With players of this 
caliber, it seemed unnecessary to give them a boost in clutch situations, and 
every study that we know of has concluded that players are unable to sustain 
clutch performances over the long term.


Defensive ratings
-----------------

Our defensive ratings for players are biased toward the peak years we selected.  
You may think of a certain player as a brilliant fielder, but if his peak period 
occurred during a period when his defensive performance was lower due to age or 
injury, our ratings reflect that.  

We also had minimum playing time requirements for a player to earn a rating at a 
position, and those requirements were based on how often he played during his 
peak period.  Generally speaking, we gave a rating only to a player with at 
least 100 games at a position during his peak period.  You may think of a player 
as a right fielder, but if he had moved to first base before or early in his 
peak period, he may be rated only as a first baseman.


Pitcher hitting
---------------

Because pitchers don't bat all that often, we decided to base their hitting
stats on their entire careers rather than choosing only the hitting stats for
their peak pitching seasons.  We believe this larger sample of plate appearances
provides us with a better picture of their hitting ability.

To avoid unnecessary rounding errors, the pitcher hitting stats on the disk are
scaled to about 100 plate appearances for pitchers who batted at least that
many times in their careers.  For relief pitchers and DH-league pitchers who
didn't bat 100 times, we scaled their hitting stats to match their career
plate appearances as a hitter.
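
The scaling rule can be sketched in Python as follows.  This is an illustration
under simplifying assumptions: the stat names in the dictionary are hypothetical,
raw career totals stand in for the adjusted values, and a pitcher with fewer
than 100 plate appearances is simply left at his career totals.

```python
def scale_pitcher_batting(career: dict[str, int], target_pa: int = 100) -> dict[str, int]:
    """Scale a pitcher's career batting line to about target_pa plate appearances.

    career must include a "PA" entry.  Pitchers below target_pa are returned
    unchanged, so their line matches their career plate appearances.
    """
    pa = career["PA"]
    if pa < target_pa:
        return dict(career)
    factor = target_pa / pa
    return {stat: round(value * factor) for stat, value in career.items()}


starter = {"PA": 800, "H": 120, "HR": 8, "BB": 40, "SO": 240}
reliever = {"PA": 60, "H": 6, "HR": 0, "BB": 2, "SO": 30}

print(scale_pitcher_batting(starter))
# prints: {'PA': 100, 'H': 15, 'HR': 1, 'BB': 5, 'SO': 30}
print(scale_pitcher_batting(reliever))
# prints: {'PA': 60, 'H': 6, 'HR': 0, 'BB': 2, 'SO': 30}
```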


Rating the pitchers
-------------------

As we stated in the position player section, certain statistics are used to 
determine play results, other statistics are accumulated as a result of those 
play results, and some are more for decoration.

Games pitched, games started, and games finished were estimated based on each 
pitcher's real-life relationship between games and batters faced.  These values 
don't affect play outcomes, but they give you a feel for how the pitcher was 
used during his peak period.

Runs and earned runs were computed in the same way as the fundamental events, by 
aggregating era- and park-adjusted values from each pitcher's peak years.  An 
alternative would have been to use a formula like Runs Created to generate run 
and earned run values that are consistent with the rest of the stats, but we 
thought it would be more interesting to use the adjusted peak-period stats.

The complete game figures are also based on differences from the league average.  
Complete game percentages have dropped sharply in the last hundred years.  To 
level the playing field, it was necessary to evaluate pitchers relative to the 
norms for their time.  Otherwise, everyone who pitched a hundred years ago would 
have Excellent durability, regardless of whether their durability was above-
average by the standards of that era.  Starter durability ratings were derived 
from these adjusted complete-game figures.  
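
To see why the league-relative approach matters, consider the Python sketch
below.  It is an illustration, not the actual formula: it assumes the comparison
is a simple ratio of the pitcher's complete-game percentage to the league's, and
the league and baseline percentages are invented round numbers.

```python
def adjusted_complete_games(cg: int, gs: int, league_cg_pct: float,
                            baseline_cg_pct: float, projected_gs: int) -> int:
    """Re-express complete games relative to the league norm, then apply a baseline.

    league_cg_pct:   share of starts completed in the pitcher's own league
    baseline_cg_pct: share of starts completed in the baseline era
    """
    relative = (cg / gs) / league_cg_pct
    projected = round(relative * baseline_cg_pct * projected_gs)
    return min(projected, projected_gs)   # cannot complete more games than he starts


# An early pitcher who completed 32 of 40 starts when the league completed 80%.
early = adjusted_complete_games(32, 40, league_cg_pct=0.80,
                                baseline_cg_pct=0.20, projected_gs=40)

# A modern pitcher who completed 10 of 40 starts when the league completed 10%.
modern = adjusted_complete_games(10, 40, league_cg_pct=0.10,
                                 baseline_cg_pct=0.20, projected_gs=40)

print(early, modern)  # prints: 8 20
```

The early pitcher finished far more games in absolute terms, but he was only
average for his time, so the modern pitcher comes out ahead after adjustment.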

The values for many modern stats (holds, inherited runners, pickoffs) were left 
at zero for all pitchers because we have no historical data and because they do 
not affect play outcomes.

Hold ratings are extremely hard to assign for historical players because there 
isn't a lot of data on steals allowed by pitchers.  We used the available data 
for the last thirty years and used more subjective methods for the rest.

We weren't comfortable with any of our options for assigning ground ball 
percentages.  Our play-by-play data covers only the last thirty years.  We could 
have used that information to assign real GB% values for modern pitchers, but we 
had nothing to go on for the rest.  We made several attempts to come up with a 
reliable way to estimate GB% from other pitching and fielding stats, but found 
no strong relationships.  Rather than put out a disk with some pitchers having 
accurate GB% values and others having standard GB% values, we chose to go for 
uniformity and assign everyone a standard GB% of 50.

Every pitcher has a Jam rating of Normal.  With players of this caliber, it 
seemed unnecessary to give them a boost in clutch situations, and every study 
that we know of has concluded that players are unable to sustain clutch 
performances over the long term.


Information common to batters and pitchers
------------------------------------------

For each player, we set the year to match the first season of his peak period.  
This means that the player ages that are displayed are the seasonal ages (as of 
July 1) at the beginning of the period for which he's rated.  This gives you a 
quick way to see which players peaked earlier or later than the norm.

All players were assigned an injury rating of Normal.  We thought about 
researching real-life injuries, but that would have delayed the release of the 
disk by weeks or months without necessarily making it a better product.  

If one player happened to suffer a serious injury during his peak period, while 
another player tended to get hurt in seasons other than his peak years, is that 
really enough of a reason to make the first player Prone and the second one 
Normal?  

We're using the best years for the best players in history, and we thought it 
would be best to give every player the same risk of injury.

The disk does not include projected fielding statistics.  The game engine does 
not use fielding stats to determine the outcome of any plays.  Instead, it uses 
the range ratings, throwing ratings, and error rates to control defensive 
performances.