Conjoint Analysis: Finding Customer Preferences

Marketers are always looking for ways to identify which attributes of a product their customers really like. Customers choose whole products, but subconsciously they form their preferences from the attributes of those products. These attributes, also called stimuli, affect the customer's decision-making behavior. Conjoint analysis is a technique that allows the researcher to ascertain the value customers place on different attributes of a product or service, without specifically asking about the attributes themselves.

Let's consider the example of a set of similar products, such as cars. We want to see how customers value the attributes of the cars simply by asking them which cars they prefer. Below is a sample table of customer preferences for a series of eight cars. We ask the customers to rate each car in terms of preference (10 being the best, 1 being the worst).

Cust  car1  car2  car3  car4  car5  car6  car7  car8
1     7     9     5     6     5     4     3     6
2     3     4     3     2     6     8     5     7
3     8     10    6     4     3     2     5     2

Our profiles for each car look like the following:

      style  color  gastype
car1  1      1      1
car2  1      1      2
car3  1      2      1
car4  1      2      2
car5  2      1      1
car6  2      1      2
car7  2      2      1
car8  2      2      2

Style (1 = Sport, 2 = Luxury), Color (1 = Red, 2 = Black), Gastype (1 = Eco, 2 = Non-Eco)
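The two tables and the level key above can be typed into R directly. The sketch below is one way to do that in base R. The object names cmatrix, cprof and clevels match the ones used in the calls later in this post, but the exact shapes (a numeric matrix, a data frame of integer codes, a one-column data frame of level names) are our assumption about what the functions accept, and the files listed at the end of the post may be laid out differently.

```r
# Ratings: one row per customer, one column per car (copied from the table above)
cmatrix <- matrix(
  c(7,  9, 5, 6, 5, 4, 3, 6,
    3,  4, 3, 2, 6, 8, 5, 7,
    8, 10, 6, 4, 3, 2, 5, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, paste0("car", 1:8))
)

# Profiles: one row per car, one column per attribute (full 2 x 2 x 2 design)
cprof <- data.frame(
  style   = rep(1:2, each = 4),
  color   = rep(rep(1:2, each = 2), times = 2),
  gastype = rep(1:2, times = 4),
  row.names = paste0("car", 1:8)
)

# Level names, in the same order as the attributes and their codes
clevels <- data.frame(
  levels = c("Sport", "Luxury", "Red", "Black", "Economy", "Non-Eco")
)

print(cprof)
```

Every combination of the three two-level attributes appears exactly once, so each level is rated four times by each customer.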

At this point we wish to see how the first customer implicitly valued the attributes, based on the ratings given to the products. In R, conjoint analysis is handled through the conjoint package. Conjoint analysis makes few demands on the data, even though it uses ordinary least squares (OLS) regression to fit the model. What is necessary is that the independent variables (the attributes, or stimuli) are factors, and that their levels are clearly distinguishable, such as Sport/Luxury, Red/Black, Economy/Non-Economy.

The result of the conjoint analysis is a measure of the utility of each factor level for the respondent, in other words, which attribute levels the respondent values more. Remember, though, that we only asked them to rate each product overall.

In R, we would run the following and get the result.

library(conjoint)
caModel(y=cmatrix[1,], x=cprof)


Estimate Std. Error t value Pr(>|t|) 
(Intercept) 5.6250 0.4841 11.619 0.000314 ***
factor(x$style)1 1.1250 0.4841 2.324 0.080800 . 
factor(x$color)1 0.6250 0.4841 1.291 0.266265 
factor(x$gastype)1 -0.6250 0.4841 -1.291 0.266265 
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.369 on 4 degrees of freedom
Multiple R-squared: 0.6859, Adjusted R-squared: 0.4503 
F-statistic: 2.911 on 3 and 4 DF, p-value: 0.1643
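For readers who want to see the regression behind this output, here is a minimal base R sketch. It assumes the factors are coded with sum-to-zero contrasts (contr.sum), which is consistent with the coefficients shown but is our assumption, not a statement about the package internals. The names profiles and fit are ours, for illustration only.

```r
# First customer's ratings for car1..car8, with the profile of each car
profiles <- data.frame(
  rating  = c(7, 9, 5, 6, 5, 4, 3, 6),
  style   = factor(rep(1:2, each = 4)),
  color   = factor(rep(rep(1:2, each = 2), times = 2)),
  gastype = factor(rep(1:2, times = 4))
)

# Sum-to-zero coding: level 1 is coded +1 and level 2 is coded -1,
# so each coefficient is the utility of level 1 relative to the intercept
fit <- lm(
  rating ~ style + color + gastype,
  data = profiles,
  contrasts = list(style = "contr.sum", color = "contr.sum", gastype = "contr.sum")
)

coefs <- coef(fit)
print(round(coefs, 4))
```

With this coding the intercept is the customer's mean rating across all eight cars, and each coefficient is how far level 1 of that attribute sits above or below that mean.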

Our goal here is to examine the coefficients to see which levels carry more utility for this customer. The caModel output will look very familiar to anyone who knows regression, but we aren't necessarily interested in the regression model itself. Recall that we passed only the first row of the matrix to the function, so the result describes the first customer only. We can obtain the same result in a more readable form using the following:

caUtilities(y=cmatrix[1,], x=cprof, z=clevels)
[1]  5.625  1.125 -1.125  0.625 -0.625 -0.625  0.625

The caUtilities function returns the utility for the first customer across the six attribute levels (three attributes with two levels each). The first number is the intercept, so we read the levels starting from the second number. For this customer we compare +1.125 (Sport) with -1.125 (Luxury) and conclude that Sport has more utility than Luxury. Similarly, Red (+0.625) has more utility than Black (-0.625), and Non-Economy (+0.625) has more utility than Economy (-0.625). Remember, this is for the first customer only.
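Because every level appears in exactly four of the eight cars, these utilities can also be checked by hand: each one is the customer's mean rating for cars with that level, minus the customer's overall mean rating. The base R sketch below does that arithmetic. It relies on the balanced design in this example and is not how the package computes the values; the helper part_worth is hypothetical.

```r
# First customer's ratings for car1..car8
ratings <- c(7, 9, 5, 6, 5, 4, 3, 6)

# Attribute levels of each car, in the order used by the profile table
style   <- factor(rep(c("Sport", "Luxury"), each = 4), levels = c("Sport", "Luxury"))
color   <- factor(rep(rep(c("Red", "Black"), each = 2), times = 2), levels = c("Red", "Black"))
gastype <- factor(rep(c("Economy", "Non-Eco"), times = 4), levels = c("Economy", "Non-Eco"))

grand_mean <- mean(ratings)

# Hypothetical helper: mean rating per level, centred on the grand mean
part_worth <- function(attribute) tapply(ratings, attribute, mean) - grand_mean

utilities <- c(intercept = grand_mean,
               part_worth(style), part_worth(color), part_worth(gastype))
print(utilities)
```

The two levels of each attribute always sum to zero, which is why the second level is simply the negative of the first.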


caPartUtilities(y=cmatrix, x=cprof, z=clevels)

The call above gives the utilities for every customer in the dataset, which can be used for comparison.

Now we can measure the importance of each attribute as a whole, not just the utility of each level, by using the caImportance function.

caImportance(y=cmatrix[1,], x=cprof)
[1] 47.37 26.32 26.32

This means that style (Sport/Luxury) was more important to this customer than the other two attributes (color and gas type), which tie at 26.32%. You can run this for any customer to see how customers differ from each other.
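These percentages follow from the utilities reported earlier for the first customer. A common way to define importance, and the one that reproduces the numbers here, is the range of utilities within an attribute divided by the sum of the ranges across all attributes. The sketch below works through that arithmetic in base R; the variable names are ours.

```r
# Utilities for the first customer, as reported by caUtilities
utilities <- list(
  style   = c(Sport = 1.125, Luxury = -1.125),
  color   = c(Red = 0.625, Black = -0.625),
  gastype = c(Economy = -0.625, NonEco = 0.625)
)

# Range of utilities within each attribute
ranges <- sapply(utilities, function(u) max(u) - min(u))

# Share of the total range, as a percentage
importance <- round(100 * ranges / sum(ranges), 2)
print(importance)
```

Style spans 2.25 utility points against 1.25 each for color and gas type, so it takes 2.25 / 4.75 of the total.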

Finally, if we want a model for the full dataset and the overall importance of each attribute, we run the Conjoint function. Examine the model, and pay particular attention to the last line returned, which shows how the attributes rank in importance against each other. In this example the function returns a model and the average importance of the attributes: 38.76 (style), 30.23 (color), and 31.01 (gas type). This means that style is more important than the other two; how much more important is up to interpretation.

Conjoint(y=cmatrix, x=cprof, z=clevels)

lm(formula = frml)

 Min 1Q Median 3Q Max 
-5,350 -1,775 -0,025 1,916 4,925 

 Estimate Std. Error t value Pr(>|t|) 
(Intercept) 5,5563 0,1828 30,400 < 2e-16 ***
factor(x$style)1 -0,1312 0,1828 -0,718 0,47377 
factor(x$color)1 0,1563 0,1828 0,855 0,39393 
factor(x$gastype)1 0,5062 0,1828 2,770 0,00629 ** 
Signif. codes: 0 ‘***’ 0,001 ‘**’ 0,01 ‘*’ 0,05 ‘.’ 0,1 ‘ ’ 1

Residual standard error: 2,312 on 156 degrees of freedom
Multiple R-squared: 0,05408, Adjusted R-squared: 0,03589 
F-statistic: 2,973 on 3 and 156 DF, p-value: 0,03355

[1] "Part worths (utilities) of levels (model parameters for whole sample):"
 levnms utls
1 intercept 5,5563
2 Sport -0,1312
3 Luxury 0,1312
4 Red 0,1563
5 Black -0,1563
6 Economy 0,5062
7 Non-Eco -0,5062
[1] "Average importance of factors (attributes):"
[1] 38,76 30,23 31,01
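The last line is labelled an average, which suggests that importance is computed for each respondent separately and then averaged. The base R sketch below does this for the three customers shown in the table at the top of the post. That reading of the package output is our assumption, and the helper importance_one is hypothetical. The full-sample output above rests on 160 ratings (156 residual degrees of freedom plus 4 parameters), so the three-customer figures will not match it.

```r
# Ratings for the three customers shown in the post (rows) across car1..car8
ratings <- matrix(
  c(7,  9, 5, 6, 5, 4, 3, 6,
    3,  4, 3, 2, 6, 8, 5, 7,
    8, 10, 6, 4, 3, 2, 5, 2),
  nrow = 3, byrow = TRUE
)

# Profile codes for car1..car8
profiles <- data.frame(
  style   = rep(1:2, each = 4),
  color   = rep(rep(1:2, each = 2), times = 2),
  gastype = rep(1:2, times = 4)
)

# Hypothetical helper: importance of each attribute for one respondent.
# The range of level means equals the range of utilities in a balanced design.
importance_one <- function(y) {
  ranges <- sapply(profiles, function(attribute) {
    level_means <- tapply(y, attribute, mean)
    max(level_means) - min(level_means)
  })
  100 * ranges / sum(ranges)
}

per_customer <- t(apply(ratings, 1, importance_one))  # customers x attributes
average_importance <- colMeans(per_customer)
print(round(per_customer, 2))
print(round(average_importance, 2))
```

Averaging per-respondent importances is not the same as computing importance from the pooled coefficients, which is one reason style can rank first here even when its pooled coefficient is small.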

It is important to note that conjoint analysis should be used as a means to an end and not as an end in itself. It can provide some very useful information about product attributes to help marketers design better products and identify more specific target markets.

Files for the sample above

Conjoint2 – R Code

conjointmatrix – customer preference matrix

conjointprofiles –  profile matrix

levels – levels of attributes