

Analyzing an Advanced MaxDiff

Analyze Your MaxDiff Experiment

  1. On the Results page, click the Preference Likelihood drop-down to switch between Preference Likelihood (#/screen), Average-based PL (50% baseline), and Utility Scores.
  2. Hover over the bars to see further statistical analysis.
  3. Click the hamburger menu to download a PNG, JPEG, PDF, or SVG vector image of the current data visualization.

Preference Likelihood (#/screen)

With Preference Likelihood selected, the baseline is set at the appropriate percentage for the number of items per screen programmed in the MaxDiff. It represents the chance an item would be selected at random from a set whose size matches the number of items shown in the MaxDiff tasks respondents completed.

  • 33% if (3/screen)
  • 25% if (4/screen) 
  • 20% if (5/screen)

For example, if respondents were shown 5 alternatives per screen, the baseline appears as a black line set at 20%.
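To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the commonly used rescaling PL = e^u / (e^u + k - 1), where u is a zero-centered utility and k is the number of items per screen; the exact transformation aytm applies is covered in the methodology articles and may differ in its details. An exactly average item (utility of zero) lands on the baselines listed above.

```python
# Minimal sketch (not aytm's exact code): assumes the common rescaling
# PL = exp(u) / (exp(u) + k - 1), where u is a zero-centered utility and
# k is the number of items shown per screen.
import math

def preference_likelihood(utility: float, items_per_screen: int) -> float:
    """Chance the item is picked from a screen of otherwise-average items."""
    return math.exp(utility) / (math.exp(utility) + (items_per_screen - 1))

# An exactly average item (utility = 0) lands on the baseline for each design:
for k in (3, 4, 5):
    print(k, round(preference_likelihood(0.0, k), 2))   # 0.33, 0.25, 0.2
```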


Average-based PL (50% baseline)

For Average-based PL (50% baseline), the baseline is set at 50%: the probability an item would be chosen from a set of two, regardless of how many items per screen respondents interacted with.
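Under the same assumed rescaling, a set of two reduces to the logistic function, so an average item (utility of zero) sits exactly on the 50% baseline. This is a hypothetical sketch, not aytm's published formula:

```python
# Sketch only: the k = 2 case of the rescaling above is the logistic function.
import math

def average_based_pl(utility: float) -> float:
    """Chance the item beats a single average item (utility = 0) head to head."""
    return math.exp(utility) / (math.exp(utility) + 1)

print(round(average_based_pl(0.0), 2))    # 0.5  -> the 50% baseline
print(round(average_based_pl(1.0), 2))    # 0.73 -> preferred over an average item
print(round(average_based_pl(-1.0), 2))   # 0.27 -> less preferred than average
```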

Utility Scores

With Utility Scores visualized, values are shown on a zero-centered scale so that items' performance can be compared relative to one another. Since zero represents average performance, the more positive an item's utility, the more it is preferred by respondents; the more negative an item's utility, the less it is preferred.
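As a small, purely hypothetical illustration of what a zero-centered average means (the item names and utility values below are made up):

```python
# Hypothetical utilities; zero-centering subtracts the mean so values sum to ~0.
raw = {"Item A": 2.1, "Item B": 1.4, "Item C": 0.9, "Item D": 0.2}
mean = sum(raw.values()) / len(raw)
zero_centered = {item: round(u - mean, 2) for item, u in raw.items()}

print(zero_centered)
# {'Item A': 0.95, 'Item B': 0.25, 'Item C': -0.25, 'Item D': -0.95}
# Positive = above-average preference, negative = below-average preference.
```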

MaxDiff FAQs

What is the best metric/output to use in my analysis?

There is not a single best metric per se; it is often a matter of personal preference.

  • Preference Likelihood scores are more easily interpreted than utility scores because the values have more meaning. With preference likelihood, each percent represents the probability an item would be most preferred out of a given set.
    • If you prefer the given set to reflect the task respondents completed, use Preference Likelihood (#/screen).
    • If you prefer the given set to reflect a head-to-head comparison of one item versus another, use Average-based PL (50% baseline).
  • Utility Scores can provide an easy high-level view of what performs above average (positive value), what performs below average (negative value), and the overall rank order. For significance testing between options, we recommend using Utility Scores.

The rank order of items varies across different metrics. Which metric should I use to report rank order?

The short answer is that we recommend using Utility Scores when looking at the overall rank order of items. Without diving too far into the math, Utility Scores are preferred because they are the rawest form of the analysis and the data is normalized.

 

The baseline value of Preference Likelihood (#/screen) does not match the average of all PL values. Why is this, and what does this mean?

It comes down to the mathematical transformation applied to the raw utility scores to produce preference likelihood based on the number of items per screen. The short answer is that it is easier to move upward than downward in these calculations. When there is a clearer rank order and stronger preferences among the items, the average of these values creeps above the baseline value, which is the theoretical preference likelihood (simply thought of as chance). If all items performed equally, the average of the Preference Likelihood scores would closely align with the baseline value.
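This drift can be reproduced numerically with the rescaling assumed in the earlier sketches (an illustration only, not aytm's exact math): when utilities are spread out, the average Preference Likelihood rises above 1/k, while perfectly equal items sit exactly on the baseline.

```python
# Illustration of the upward drift, using the assumed rescaling from above.
import math

def pl(utility: float, k: int) -> float:
    return math.exp(utility) / (math.exp(utility) + (k - 1))

k = 5
spread = [1.5, 0.5, 0.0, -0.5, -1.5]   # clear preferences, utilities average to 0
flat = [0.0, 0.0, 0.0, 0.0, 0.0]       # all items equal

print(round(sum(pl(u, k) for u in spread) / len(spread), 3))  # 0.241 -> above the 0.20 baseline
print(round(sum(pl(u, k) for u in flat) / len(flat), 3))      # 0.2   -> exactly the baseline
```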

 

Some items that perform below the baseline with Average-based PL (50% baseline) perform above the baseline with Preference Likelihood (#/screen). Why is this, and what does this mean?

As mentioned above, Preference Likelihood values based on the number of items per screen can, and often do, creep above the theoretical average, so each item can move up a little. This effect is not observed with Average-based PL (50% baseline), owing to the mathematical simplicity of that metric. Since the two metrics behave differently with regard to their baselines, it is not fair to compare how items perform against the benchmark across the two scenarios. For those wanting to identify pure above-average and below-average performers, we recommend looking at the Utility Scores (positive values = above average, negative values = below average, values close to 0 = about average).
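As a hypothetical illustration, again using the assumed rescaling and assuming respondent-level (HB) utilities are rescaled individually and then averaged, a polarizing item with a slightly negative average utility can fall below the 50% baseline on Average-based PL while still clearing the 20% baseline on Preference Likelihood (5/screen):

```python
# Hypothetical two-respondent example under the assumed rescaling above.
import math

def pl(utility: float, k: int) -> float:
    return math.exp(utility) / (math.exp(utility) + (k - 1))

respondent_utilities = [2.0, -2.2]   # one fan, one detractor; mean utility < 0

avg_based  = sum(pl(u, 2) for u in respondent_utilities) / len(respondent_utilities)
per_screen = sum(pl(u, 5) for u in respondent_utilities) / len(respondent_utilities)

print(round(avg_based, 2))    # 0.49 -> just below the 50% baseline
print(round(per_screen, 2))   # 0.34 -> well above the 20% baseline
```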
