Explaining Normalization

Digital Diamond Baseball supports two types of normalization: historical normalization and minimum playing time normalization.

Historical Normalization

Historical normalization makes it possible to play games with players from wildly different historical eras.  For example, thanks to historical normalization, players who played during the "dead-ball" era can compete fairly against modern-day players.  Historical normalization requires a destination year, which is the year to which all players in the library will be normalized.  This normalization year is specified in the Normalize Options dialog.

Event probabilities and ratings for each player in the library are automatically adjusted using historical normalization, and these normalized probabilities are displayed in all views in the game (e.g., Player Card pane, Browse Real Life Stats pane, Matchups dialog).  The adjustment made to each event probability is determined based on the ratio between the average probabilities for the normalization year and the average probabilities for the player's year (see League Averages for information).  More specifically, Bill James's Log5 method is used.  The Log5 formula is:

 

Combined event probability = ( (a * b) / c ) / ( ( (a * b) / c ) + ( (1 - a) * (1 - b) ) / (1 - c) )

Where a = player's event probability; b = average event probability in the normalization year; and c = average event probability in the player's year
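
As an illustration only, the Log5 adjustment can be sketched in Python as shown here.  The function name log5_normalize and the sample probabilities are assumptions made for this sketch and are not taken from Digital Diamond Baseball itself.  The first term of the denominator is the same quantity as the numerator, (a * b) / c.

```python
def log5_normalize(a: float, b: float, c: float) -> float:
    """Return a Log5-adjusted event probability.

    a = player's event probability
    b = average event probability in the normalization year
    c = average event probability in the player's year
    (hypothetical helper, written only to illustrate the formula)
    """
    if not 0.0 < c < 1.0:
        raise ValueError("c must be strictly between 0 and 1")
    numerator = (a * b) / c
    return numerator / (numerator + ((1.0 - a) * (1.0 - b)) / (1.0 - c))


# Made-up numbers: a 2% event for the player, in a year when the league
# average was 1%, normalized to a year when the league average was 3%.
adjusted = log5_normalize(0.02, 0.03, 0.01)
print(f"adjusted probability: {adjusted:.4f}")
```

When the normalization year and the player's year have the same league average (b = c), the formula returns the player's own probability, so a player is left unchanged when normalized to a year with the same averages.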

 

Minimum Playing Time Normalization

Minimum playing time normalization adjusts event probabilities for players who don't meet a minimum number of plate appearances, batters faced, steal attempts, fielding chances, and so on.  This type of normalization is important because it prevents a player's performance from being overly influenced by a small statistical sample.  For example, without normalization a batter who went 1 for 1 in real life may end up leading the simulated league in batting average, or a fielder who never made an error in real life will never make an error in a simulated game.

The adjustment made by minimum playing time normalization moves the probabilities towards a reduced league average.  In other words, if a player did not play much in real life, normalization will adjust his performance by "sliding" it towards a value below the league average.

The normalization algorithm used by Digital Diamond Baseball predicts what a player's statistics would be if they met the minimum play requirements.  The best way to explain how this is done is to give an example.

Let's assume we want to predict how many doubles a batter would have if they had met the minimum number of plate appearances specified in the Normalize Options dialog.  To determine the total number of doubles, the minimum playing time normalization algorithm simulates the missing plate appearances, assuming that the batter will perform worse than the average hitter.  Just how much worse is determined by the normalization penalties specified in the Normalize Options dialog (there are two different penalties, one for pitchers and one for batters/runners/fielders).  This normalization is achieved using the following algorithm:

 

For batters:

 

normalized number of SO =

      actual SO  + ROUNDUP[ ( (min PA - actual PA) * ( (1.0 / batter's normalization penalty) * league average  for SO/PA) ) ]

 

normalized number of BB =

      actual BB  + ROUNDDOWN[ ( (min PA - actual PA) * (batter's normalization penalty * league average  for BB/PA) ) ]

 

normalized number of 1B =

      actual 1B  + ROUNDDOWN[( (min PA - actual PA) * (batter's normalization penalty * league average  for 1B/PA) ) ]

 

normalized number of 2B =

      actual 2B  + ROUNDDOWN[( (min PA - actual PA) * (batter's normalization penalty * league average  for 2B/PA) ) ]

 

normalized number of 3B =

      actual 3B  + ROUNDDOWN[( (min PA - actual PA) * (batter's normalization penalty * league average  for 3B/PA) ) ]

 

normalized number of HR =

      actual HR  + ROUNDDOWN[( (min PA - actual PA) * (batter's normalization penalty * league average  for HR/PA) ) ]

 

For pitchers:

 

normalized number of SO =

      actual SO  + ROUNDDOWN[( (min PA - actual PA) * (pitcher's normalization penalty * league average  for SO/PA) ) ]

 

normalized number of BB =

      actual BB  + ROUNDUP[ ( (min PA - actual PA) * ( (1.0 / pitcher's normalization penalty) * league average  for BB/PA) ) ]

 

normalized number of 1B =

      actual 1B  + ROUNDUP[ ( (min PA - actual PA) * ( (1.0 / pitcher's normalization penalty) * league average  for 1B/PA) ) ]

 

normalized number of 2B =

      actual 2B  + ROUNDUP[ ( (min PA - actual PA) * ( (1.0 / pitcher's normalization penalty) * league average  for 2B/PA) ) ]

 

normalized number of 3B =

      actual 3B  + ROUNDUP[ ( (min PA - actual PA) * ( (1.0 / pitcher's normalization penalty) * league average  for 3B/PA) ) ]

 

normalized number of HR =

      actual HR + ROUNDUP[ ( (min PA - actual PA) * ( (1.0 / pitcher's normalization penalty) * league average  for HR/PA) ) ]

 

For the purposes of this example, assume the minimum number of plate appearances before normalization is defined as 100.  We can also assume that the batter only had 50 plate appearances and had just 1 double.  Finally, let's assume that the average batter (as specified in the current league averages file) hit a double in 4.7% of their plate appearances, and the normalization penalty for batters is 50%.  Using these numbers, the normalization formula would predict a total of 2 doubles:

 

normalized number of 2B =

      1 + ROUNDDOWN[( (100 - 50) * (0.50 * 0.047) ) ] = 2 doubles
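
The batter and pitcher formulas above can be sketched in Python as follows.  This is a minimal illustration, not the program's actual code: the function name normalized_count and the favorable flag are assumptions introduced here.  A "favorable" event is one that helps the player (for a batter: BB, 1B, 2B, 3B, HR; for a pitcher: SO), so the league rate is multiplied by the penalty and rounded down.  An unfavorable event (SO for a batter; BB, 1B, 2B, 3B, HR for a pitcher) has the league rate divided by the penalty and is rounded up.  The sketch also assumes that a player who already meets the minimum is left unchanged.

```python
import math


def normalized_count(actual: int, actual_pa: int, min_pa: int,
                     league_rate: float, penalty: float,
                     favorable: bool) -> int:
    """Predict an event total as if the player had reached min_pa.

    league_rate = league average for the event per plate appearance
    penalty     = normalization penalty as a fraction (e.g. 0.50)
    favorable   = True if the event helps the player being normalized
    (hypothetical helper, written only to illustrate the formulas)
    """
    # Assumption: no adjustment once the minimum has been met.
    missing_pa = max(min_pa - actual_pa, 0)
    if favorable:
        return actual + math.floor(missing_pa * (penalty * league_rate))
    return actual + math.ceil(missing_pa * ((1.0 / penalty) * league_rate))


# The doubles example from the text:
# 1 + floor(50 * (0.50 * 0.047)) = 1 + floor(1.175) = 2
batter_doubles = normalized_count(1, 50, 100, 0.047, 0.50, favorable=True)
print(batter_doubles)

# The same inputs treated as doubles allowed by a pitcher (unfavorable):
# 1 + ceil(50 * ((1.0 / 0.50) * 0.047)) = 1 + ceil(4.7) = 6
pitcher_doubles = normalized_count(1, 50, 100, 0.047, 0.50, favorable=False)
print(pitcher_doubles)
```

Rounding down for favorable events and up for unfavorable ones means the simulated plate appearances never credit the player with more than the penalized league rate would give.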

 

Minimum playing time normalization is used by Digital Diamond Baseball for calculating SO, BB, 1B, 2B, 3B, HR, SB, and E for any player that does not meet the minimum play requirements as defined in the Normalize Options dialog.  When browsing players in the Browse Real Life Stats pane, the names of players with normalized statistics are shown using a red, italicized font (see Browsing Players in a Library). 

Finally, after a player's probabilities have been adjusted using minimum playing time normalization, they will also be adjusted using historical normalization.