12. Multinomial Logistic Regression

Multinomial logistic regression is defined for a response variable with three or more discrete outcomes. It is an extension of logistic regression based on the binomial distribution (i.e., where the response has only two outcomes). The multinomial model handles data analysis situations where a response variable is ordinal (the order of the response categories is important) or nominal (the order of the response categories does not matter). Examples of each type will be illustrated here.

1a. Ordinal Logistic Regression with a Categorical Explanatory Variable

In a study of how victims respond to emotional abuse, subjects were grouped into two levels by how they described their situation: the victim was very close to the perpetrator (VC) or was not very close (NVC). To examine the impact of closeness to the perpetrator on time to disclosure, time was transformed into an ordinal variable, since the actual value contains considerable measurement error. This approach may also reduce recall bias due to imperfect memory on the part of the subject. Converting time to an ordinal outcome is one way to measure the length of time the victim took before reporting an incident of emotional abuse. The three time responses, coded as 1, 2, or 3, represent these durations:

    Response = 1: less than 1 year
    Response = 2: 1-10 years
    Response = 3: more than 10 years or never

The codes 1, 2, and 3 represent increasing time durations. This choice is arbitrary, but the results are easier to interpret than if the codes represented decreasing durations. In general, since recoding data also introduces measurement error, one would usually prefer to record the actual values of time to disclosure and model them whenever possible (i.e., if the actual disclosure latencies were known, a time variable on a continuous scale would represent a duration better than an ordinal outcome).
The variable count in the DATA step below represents the number of respondents in a recent survey of undergraduate students (roughly 20 years old) who voluntarily identified themselves as having experienced emotional abuse at some time in their life.

    DATA phy(keep=experience resp count)
         row_tot(keep=experience total);
      LABEL resp='Response';
      INPUT experience $ r1 r2 r3;
      total=SUM(OF r1-r3);
      OUTPUT row_tot;                  * save the row totals;
      resp=1; count=r1; OUTPUT phy;    * one record per response level;
      resp=2; count=r2; OUTPUT phy;
      resp=3; count=r3; OUTPUT phy;
    cards;
    NVC 10 4 12
    VC 8 8 33
    ;

When placed in a table, these data are easier to interpret. For reasons that will become obvious, the 3 levels of the response variable head the columns and the levels of the explanatory variable define the rows:

    PROC TABULATE DATA=phy NOseps;
      CLASS experience resp;
      VAR count;
      TABLE (experience all='Total'),
            (resp all='Total')*count=' '*sum=' '*f=5.0
            / rts=12 BOX='Counts';
    run;

    ------------------------------------
    |Counts    |    Response     |     |
    |          |-----------------|     |
    |          |  1  |  2  |  3  |Total|
    |----------+-----+-----+-----+-----|
    |experience|     |     |     |     |
    |NVC       |   10|    4|   12|   26|
    |VC        |    8|    8|   33|   49|
    |Total     |   18|   12|   45|   75|
    ------------------------------------

The question of interest is whether the two groups of respondents, classified as NVC or VC, responded with nearly the same proportions across the three response categories. Ordinal logistic regression reveals how the two groups may differ in this situation. With PROC GENMOD, the multinomial distribution is specified on the MODEL statement with a link that identifies an ordinal response variable (LINK=clogit):

    PROC GENMOD DATA=phy;
      CLASS experience;
      MODEL resp = experience / DIST=multinomial LINK=clogit type3;
      ESTIMATE 'NVC vs VC' experience 1 -1 / exp;
      FREQ count;
      OUTPUT out=prd2 p=pred;
    RUN;

Note that with PROC GENMOD the choice of link functions for this model is limited to the cumulative logit (CLOGIT), cumulative probit (CPROBIT), and cumulative complementary log-log (CCLL).
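For reference, the model requested by DIST=multinomial with LINK=clogit can be written out as below. This is a sketch in notation introduced here, not in SAS: Y is the response (resp), j indexes the two cut points, alpha_j are the intercepts, and beta is the experience effect. The coding x = 1 for NVC and x = 0 for VC assumes GENMOD's default of using the last CLASS level as the reference.

```latex
\operatorname{logit} P(Y \le j \mid x)
  = \log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)}
  = \alpha_j + \beta x ,
  \qquad j = 1, 2, \qquad
  x = \begin{cases} 1 & \text{NVC} \\ 0 & \text{VC} \end{cases}
```

Because beta does not depend on j, the same odds ratio exp(beta) applies at both cut points. This is the proportional odds assumption of the cumulative logit model.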
                 The GENMOD Procedure

                   Model Information

        Data Set                        WORK.PHY
        Distribution                 Multinomial
        Link Function           Cumulative Logit
        Dependent Variable                  resp
        Frequency Weight Variable          count

        Number of Observations Read            6
        Number of Observations Used            6
        Sum of Frequencies Read               75
        Sum of Frequencies Used               75

                Class Level Information

        Class          Levels    Values
        experience          2    NVC VC

                   Response Profile

        Ordered                  Total
          Value    resp      Frequency
              1    1                18
              2    2                12
              3    3                45

PROC GENMOD models the probabilities of levels of the response having LOWER Ordered Values (the first column) in the response profile table. That is, the model describes the cumulative probability of response=1 (versus 2 or 3) and of response=1 or 2 (versus 3), and the odds ratio compares these cumulative odds between groups. [You can reverse this default ordering, that is, model the probabilities of HIGHER ordered values, by specifying the DESCENDING option on the PROC statement.]

                 Parameter Information

        Parameter    Effect        experience
        Prm1         experience    NVC
        Prm2         experience    VC

                           Analysis Of Parameter Estimates

                                   Standard       Wald 95%          Chi-
    Parameter       DF  Estimate      Error   Confidence Limits   Square  Pr > ChiSq
    Intercept1       1   -1.5340     0.3422   -2.2048   -0.8633    20.09      <.0001
    Intercept2       1   -0.7491     0.3011   -1.3393   -0.1590     6.19      0.0128
    experience  1    1    0.9760     0.4801    0.0351    1.9170     4.13      0.0420
    experience  2    0    0.0000     0.0000    0.0000    0.0000      .         .
    Scale            0    1.0000     0.0000    1.0000    1.0000

    NOTE: The scale parameter was held fixed.

The rows labeled experience 1 and experience 2 correspond to Prm1 (NVC) and Prm2 (VC) in the Parameter Information table; VC is the reference level, so its estimate is fixed at zero.

             LR Statistics for Type 3 Analysis

                                Chi-
        Source          DF    Square    Pr > ChiSq
        experience       1      4.15        0.0415

The option "type3" on the MODEL statement of GENMOD produces a likelihood ratio significance test for the effect of experience on the response. The p-value of 0.0415 is below 0.05, indicating that the two groups did not report the incident with the same proportions over the three time periods.
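The parameter estimates above can be turned into cumulative probabilities by hand, which is a useful check on how the intercepts and the experience coefficient combine. The following is a minimal sketch in Python (outside SAS), assuming the model adds the coefficient to each intercept for the NVC group and uses the intercepts alone for the reference group VC. The function and variable names are made up for this illustration.

```python
import math

def inv_logit(eta):
    """Inverse of the logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

# Estimates copied from the "Analysis Of Parameter Estimates" table.
INTERCEPTS = {1: -1.5340, 2: -0.7491}  # Intercept1, Intercept2
BETA_NVC = 0.9760                      # experience NVC; VC is the reference (0)

def cumulative_probs(is_nvc):
    """Return {level: P(resp <= level)} for one experience group."""
    x = 1 if is_nvc else 0
    return {level: inv_logit(alpha + BETA_NVC * x)
            for level, alpha in INTERCEPTS.items()}

cum_nvc = cumulative_probs(True)
cum_vc = cumulative_probs(False)
odds_ratio = math.exp(BETA_NVC)  # cumulative odds ratio, NVC vs VC

for name, cum in (("NVC", cum_nvc), ("VC", cum_vc)):
    for level, p in cum.items():
        print(f"P(resp <= {level} | {name}) = {p:.4f}")
print(f"odds ratio NVC vs VC = {odds_ratio:.4f}")
```

Because the inputs are the rounded estimates printed in the table, the results can differ from the SAS output in the last decimal place.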
                              Contrast Estimate Results

                                 Standard                        Chi-
    Label             Estimate      Error   Confidence Limits   Square  Pr > ChiSq
    NVC vs VC           0.9760     0.4801    0.0351    1.9170     4.13      0.0420
    Exp(NVC vs VC)      2.6539     1.2741    1.0357    6.8005

The p-value printed from the ESTIMATE statement is the same as the p-value reported in the "Analysis of Parameter Estimates" table. In this case, it is a Wald test. The odds ratio is found by exponentiating the coefficient: EXP(0.9760) = 2.6539. It is reported in the second row under the "Estimate" column and indicates that victims who were not very close (NVC) were more likely to report an incident in a shorter duration of time than those who were very close (VC). The predicted values from this model, produced with the OUTPUT statement, help one to interpret the odds ratio.

    PROC SORT DATA=prd2;
      BY experience _level_;
    PROC PRINT DATA=prd2;
    run;

Here is the entire contents of the dataset containing the predicted values:

    Obs  experience  resp  count  _ORDER_  _LEVEL_    pred
      1  NVC            1     10        1        1  0.3640
      2  NVC            2      4        1        1  0.3640
      3  NVC            3     12        1        1  0.3640
      4  NVC            1     10        2        2  0.5565
      5  NVC            2      4        2        2  0.5565
      6  NVC            3     12        2        2  0.5565
      7  VC             1      8        1        1  0.1774
      8  VC             2      8        1        1  0.1774
      9  VC             3     33        1        1  0.1774
     10  VC             1      8        2        2  0.3210
     11  VC             2      8        2        2  0.3210
     12  VC             3     33        2        2  0.3210

Only the first record from each set of three repeated predicted values (i.e., where _LEVEL_ changes its value from the previous record) needs to be extracted from this dataset:

    DATA prd2;
      SET prd2;
      BY experience _level_;
      KEEP experience _level_ pred;
      IF first._level_ THEN OUTPUT;   * keep one record per group and level;
    RUN;

The reduced dataset is:

    PROC PRINT DATA=prd2;
    run;

    Obs  experience  _LEVEL_    pred
      1  NVC               1  0.3640
      2  NVC               2  0.5565
      3  VC                1  0.1774
      4  VC                2  0.3210

The first two rows of the column labeled pred contain cumulative probabilities for the NVC group:

    Pred = PROB(Response LE 1 | experience=NVC) = .3640
    Pred = PROB(Response LE 2 | experience=NVC) = .5565
    Pred = PROB(Response LE 3 | experience=NVC) = 1  (implied by the ordinal model)

To get the probabilities for the individual levels, subtract adjacent values.
    Prob (Response=1 | experience=NVC) = .3640
    Prob (Response=2 | experience=NVC) = 0.5565 - 0.3640 = .1925
    Prob (Response=3 | experience=NVC) = 1 - 0.5565 = .4435

Transpose the two records for each value of experience (indexed by the value of _level_) into one record:

    PROC TRANSPOSE DATA=prd2 OUT=prd3(drop=_name_ _label_) prefix=_;
      BY experience;
      VAR pred;
      ID _level_;
    RUN;

The ID statement uses _level_ to name the transposed columns, so the new variables _1 and _2 hold the cumulative probabilities for response levels 1 and 2.

    DATA prd3;
      merge prd3 row_tot;     * Merge in row totals saved earlier;
      BY experience;
      DROP _: ;               * drop the cumulative columns _1 and _2;
      prob1 = _1;             * individual category probabilities;
      prob2 = (_2 - _1);
      prob3 = (1.0 - _2);
      cnt1 = _1*total;        * estimated counts;
      cnt2 = total*(_2 - _1);
      cnt3 = total*(1 - _2);
      prb23 = prob2 + prob3;  odds1 = prob1/prb23;   * odds of 1 vs 2 or 3;
      prb12 = prob1 + prob2;  odds2 = prb12/prob3;   * odds of 1 or 2 vs 3;
    RUN;

    PROC TABULATE data=prd3 NOseps;
      CLASS experience;
      VAR prob1 prob2 prob3 prb12 odds1 prb23 odds2 cnt1-cnt3;
      TABLE experience, (cnt1 cnt2 cnt3)*sum=' '*f=7.2
            / rts=15 box='Est Counts';
      TABLE experience, (prob1 prob2 prob3)*sum=' '*f=7.4
            / rts=15 box='Est Probs';
      TABLE experience, (prob1 prb23 odds1)*sum=' '*f=7.4
            / rts=15 box='odds ratio 1';
      TABLE experience, (prb12 prob3 odds2)*sum=' '*f=7.4
            / rts=15 box='odds ratio 2';
    RUN;

    ---------------------------------------
    |Est Counts   | cnt1  | cnt2  | cnt3  |
    |-------------+-------+-------+-------|
    |experience   |       |       |       |
    |NVC          |   9.46|   5.00|  11.53|
    |VC           |   8.69|   7.04|  33.27|
    ---------------------------------------

    ---------------------------------------
    |Est Probs    | prob1 | prob2 | prob3 |
    |-------------+-------+-------+-------|
    |experience   |       |       |       |
    |NVC          | 0.3640| 0.1925| 0.4435|
    |VC           | 0.1774| 0.1436| 0.6790|
    ---------------------------------------

The odds ratio under the cumulative logit assumption is computed by combining the probabilities of adjacent categories, so that the three categories collapse into two.
That is, prb23 = prob2 + prob3:

    -------------------------------
    |odds ratio 1 | prob1 | prb23 |  odds = prob1/prb23
    |-------------+-------+-------|
    |experience   |       |       |
    |NVC          | 0.3640| 0.6360|  0.5724    Odds Ratio = 0.5724/0.2157 = 2.65
    |VC           | 0.1774| 0.8226|  0.2157
    -------------------------------

And prb12 = prob1 + prob2:

    -------------------------------
    |odds ratio 2 | prb12 | prob3 |  odds = prb12/prob3
    |-------------+-------+-------|
    |experience   |       |       |
    |NVC          | 0.5565| 0.4435|  1.2547    Odds Ratio = 1.2547/0.4728 = 2.65
    |VC           | 0.3210| 0.6790|  0.4728
    -------------------------------

Interpretation of the Odds Ratio

In the first table, the odds ratio of 2.65 indicates that the odds of disclosing within a year, rather than waiting 1 or more years or never disclosing, are 2.65 times as large for the NVC group as for the VC group. In the second table, the same odds ratio indicates that the odds of disclosing within 10 years, rather than waiting more than 10 years or never disclosing, are also 2.65 times as large for the NVC group. The two ratios agree because the cumulative logit model assumes a single odds ratio across both cut points.

The ordinal logistic regression model can also be run with PROC LOGISTIC. It is important to do so, since PROC LOGISTIC tests the proportional odds assumption:

    PROC LOGISTIC DATA=phy;
      CLASS experience / param=glm;
      MODEL resp = experience / link=clogit;
      FREQ count;
    run;

    Score Test for the Proportional Odds Assumption

        Chi-Square    DF    Pr > ChiSq
            0.4707     1      0.492646

The p-value of this test is not significant, so there is no evidence against the proportional odds assumption and the ordinal logistic model is reasonable for these data. The remainder of the output from PROC LOGISTIC for these data is essentially the same as from PROC GENMOD.
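As an informal companion to the score test, the cumulative odds ratios can also be computed directly from the observed counts, once for each cut point. The sketch below is in Python (outside SAS), uses only the counts from the "Counts" table, and is not the score test itself. The function names are made up for this illustration.

```python
# Observed counts for responses 1, 2, 3, copied from the "Counts" table.
COUNTS = {"NVC": (10, 4, 12), "VC": (8, 8, 33)}

def cumulative_odds(counts, cut):
    """Odds of response <= cut versus response > cut, from raw counts."""
    low = sum(counts[:cut])
    high = sum(counts[cut:])
    return low / high

def observed_odds_ratio(cut):
    """Observed cumulative odds ratio, NVC versus VC, at one cut point."""
    return cumulative_odds(COUNTS["NVC"], cut) / cumulative_odds(COUNTS["VC"], cut)

or_cut1 = observed_odds_ratio(1)  # response 1 vs responses 2-3
or_cut2 = observed_odds_ratio(2)  # responses 1-2 vs response 3

print(f"observed odds ratio at cut 1: {or_cut1:.3f}")
print(f"observed odds ratio at cut 2: {or_cut2:.3f}")
```

Under proportional odds the two observed ratios estimate the same quantity. Here they fall on either side of the fitted value of 2.65, which is in line with the non-significant score test.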