
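| Number of prognostic variables | Categories | Predictability at P = 1.0, % | N^{a} | Recommended P value | Predictability at this P value, % | Reduction in predictability, % | P value at which imbalance occurs |
|---|---|---|---|---|---|---|---|
| One variable | Two categories | 33.0 | 100 | 0.8^{b} | 32.0 | 1.0 | 0.7 |
| | | | 200 to 300 | 0.7^{b} | 33.0 | 0 | 0.5 |
| | | | ≥400 | 0.7^{b} | 33.0 | 0 | |
| | Two categories, unequal prevalence | | 100 | 1.0 | 33.0 | 0 | 1.0 |
| | | | 200 | 0.8^{b} | 33.0 | 0 | 0.7 |
| | | | 300 | 0.7^{b} | 33.0 | 0 | 0.5 |
| | | | ≥400 | 0.7^{b} | 33.0 | 0 | |
| | Three categories | | 100 | 1.0 | 32.0 | 0 | 0.9 |
| | | | 200 | 0.7^{b} | 32.5 | <1.0 | 0.6 |
| | | | 300 | 0.7^{b} | 33.0 | <1.0 | 0.5 |
| | | | ≥400 | 0.7^{b} | 33.0 | <1.0 | |
| | Four categories | | 100 | 1.0 | 32.0 | 0 | 0.9 |
| | | | 200 | 0.9^{b} | 32.5 | <1.0 | 0.8 |
| | | | 300 | 0.7^{b} | 32.7 | <1.0 | 0.6 |
| | | | ≥400 | 0.7^{b} | 32.5 | <1.0 | 0.5 |
| Two variables | Both with 2 categories | 56.0 | 100 | 0.9 | 54.0 | 2.0 | 0.8 |
| | | | 200 | 0.7 | 48.0 | 8.0 | 0.6 |
| | | | 300 | 0.7 | 48.0 | 8.0 | 0.5 |
| | | | ≥400 | 0.7 | 48.0 | 8.0 | |
| | Both with 2 categories, unequal prevalence | | 100 | 1.0 | 56.0 | 0 | 1.0 |
| | | | 200 | 0.9 | 54.0 | 2.0 | 0.8 |
| | | | 300 | 0.7 | 49.0 | 7.0 | 0.6 |
| | | | ≥400 | 0.7 | 48.0 | 8.0 | 0.5 |
| Three variables | All with 2 categories, equal prevalence | 67.0 | 100 | 1.0 | 67.0 | 0 | 0.9 |
| | | | 200 to 300 | 0.7 | 55.0 | 12.0 | 0.6 |
| | | | ≥400 | 0.7 | 56.0 | 11.0 | |
| | All with 2 categories, unequal prevalence | | 100 | 1.0 | 67.0 | 0 | 0.9 |
| | | | 200 to 300 | 0.7 | 55.0 | 12.0 | 0.6 |
| | | | ≥400 | 0.7 | 55.0 | 12.0 | 0.5 |
| Four variables | All with 2 categories, equal prevalence | 74.0 | 100 | 1.0 | 74.0 | 0 | 1.0 |
| | | | 200 | 0.8 | 64.0 | 10.0 | 0.7 |
| | | | 300 | 0.7 | 57.0 | 17.0 | 0.6 |
| | | | ≥400 | 0.7 | 58.0 | 16.0 | 0.5 |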

^{a}The categories of N depend on the point at which imbalance is observed (the value of the probability of assignment, P).

^{b}For one prognostic variable, the reduction in predictability as the probability of assignment P is reduced is so small that the recommended P value is 1.0.
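
For the rows with two or more prognostic variables, the "Reduction in predictability" column appears to be the plain difference between the predictability at P = 1.0 and the predictability at the recommended P value. The source does not state this formula, so the following Python sketch treats it as an assumption and only checks it against the tabulated figures. It also assumes that the unequal-prevalence rows share the P = 1.0 predictability printed in the first row of their variable group. The names `ROWS`, `reduction`, and `mismatches` are illustrative, not taken from the source.

```python
# Each tuple: (predictability at P = 1.0, predictability at recommended P,
#              reduction reported in the table), all in percent.
# Values are transcribed from the rows for two, three, and four variables.
ROWS = [
    (56.0, 54.0, 2.0),
    (56.0, 48.0, 8.0),
    (56.0, 49.0, 7.0),
    (67.0, 55.0, 12.0),
    (67.0, 56.0, 11.0),
    (74.0, 64.0, 10.0),
    (74.0, 57.0, 17.0),
    (74.0, 58.0, 16.0),
]


def reduction(baseline, at_recommended):
    """Assumed definition: percentage-point drop relative to P = 1.0."""
    return round(baseline - at_recommended, 1)


def mismatches(rows):
    """Return the rows whose reported reduction differs from the assumed formula."""
    return [row for row in rows if reduction(row[0], row[1]) != row[2]]


if __name__ == "__main__":
    for baseline, at_recommended, reported in ROWS:
        print(baseline, at_recommended, reduction(baseline, at_recommended), reported)
    print("rows that disagree:", mismatches(ROWS))
```

The one-variable rows are left out on purpose: several of them report the reduction only as "<1.0", and the P = 1.0 predictability is printed only once for that group, so the same check cannot be applied to them reliably.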