How to Assess Creativity in University Students (Without Falling Into Subjectivity)

Reading time: 15 minutes

Key authors: Guilford, Torrance, Alabbasi et al., Karunarathne & Calma, Xu & Tognolini, Jönsson & Panadero

Keywords: assessing creativity in the classroom, creativity assessment higher education, creativity rubric, divergent thinking tests, TTCT, creativity assessment tools, formative creativity assessment, student creative skills

There is a problem that almost every university educator who has tried to work with creativity in the classroom has encountered at some point: assessment time arrives, and the feeling is of standing on unmapped terrain. What criterion do you use to grade an original idea? Who decides how creative a piece of work is? Is all of this not inevitably subjective?

It is a legitimate concern. And if left unresolved, it has serious consequences: educators avoid assessing creativity explicitly, students receive vague feedback on their creative processes, and creativity ends up being a declared curricular objective that is never actually measured or developed.

The good news is that research in cognitive psychology and educational assessment has been working on this problem for decades, producing concrete, validated, and replicable tools that make it possible to assess creativity with rigor — without “rigor” meaning reducing assessment to a multiple-choice test, and without “creativity” meaning leaving everything to the subjective impression of the assessor.

This article presents the main tools available, their theoretical foundations, their honest limitations, and the concrete implications for the university educator who wants to assess their students’ creativity rigorously and fairly.

The Core Problem: What It Means to Assess Creativity

Before discussing tools, it is necessary to clarify what is actually being assessed when creativity is assessed. Confusion about this is the primary source of educator discomfort with creative assessment.

Psychological research defines creativity, in operational terms, as the production of ideas, processes, or products that are simultaneously novel (original, statistically uncommon) and appropriate (useful, relevant to a specific goal) (Amabile, 2012; Guilford, 1967). This dual definition is fundamental: an idea that is original but completely useless is not fully creative, and an idea that is useful but entirely conventional is not creative either. Genuine creativity requires both conditions.

This operational definition has a direct implication for assessment: assessing creativity is not the same as assessing the assessor’s taste. It is possible to construct explicit criteria for determining whether a product, process, or idea meets the conditions of novelty and appropriateness in a given context. Those criteria are, by definition, teachable and assessable.

Karunarathne and Calma (2024), in a study published in Studies in Higher Education (University of Melbourne, DOI: 10.1080/03075079.2023.2225532), describe the current state of the problem clearly:

“The importance of creativity for survival in modern society is well recognised. However, the development of creative thinking skills through formal education still needs more attention, and the assessment of creative thinking skills using valid models in higher education is under-researched.”

That creativity assessment in higher education is under-researched does not mean tools do not exist. It means university educators have limited access to them — and this article aims to close that gap.

The Three Dimensions That Can Be Assessed

Creativity research identifies three main dimensions that can be assessed separately: creative potential, creative process, and creative product. Understanding the difference between these three dimensions is the first step toward designing assessment that makes sense.

Dimension 1 — Creative Potential

Creative potential refers to the cognitive capacities of the student that make creative thinking possible: fluency, flexibility, originality, and elaboration — the four dimensions of divergent thinking identified by Guilford (1967). Assessing creative potential means assessing the student’s capacity to generate original ideas, not the final result of a specific project.
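The four indices can be made concrete with a small scoring sketch for an open-ended prompt such as "list uses for a brick". This is an illustration under stated assumptions, not the TTCT scoring procedure: the function name, the 5% rarity threshold, and the use of word count as a stand-in for elaboration are all hypothetical choices, and the idea categories are assumed to have been assigned by a human coder.

```python
from collections import Counter

def divergent_thinking_scores(responses, cohort_counts, cohort_size, rare_below=0.05):
    """Score one student's responses to an open-ended prompt.

    responses: list of (idea_text, category) pairs; categories come from a human coder.
    cohort_counts: Counter mapping idea_text -> number of students who gave that idea.
    cohort_size: number of students in the comparison group.
    rare_below: assumed threshold; ideas given by a smaller share count as original.
    """
    ideas = [idea for idea, _ in responses]
    # Fluency: how many ideas were produced.
    fluency = len(ideas)
    # Flexibility: how many distinct categories the ideas span.
    flexibility = len({category for _, category in responses})
    # Originality: ideas that are statistically uncommon in the cohort.
    # A Counter returns 0 for unseen ideas, so they count as rare.
    originality = sum(
        1 for idea in ideas if cohort_counts[idea] / cohort_size < rare_below
    )
    # Elaboration: mean words per idea, a crude proxy for level of detail.
    elaboration = sum(len(idea.split()) for idea in ideas) / fluency if fluency else 0.0
    return {
        "fluency": fluency,
        "flexibility": flexibility,
        "originality": originality,
        "elaboration": elaboration,
    }

cohort = Counter({"build a wall": 30, "doorstop": 12, "grind into pigment": 1})
student = [
    ("build a wall", "construction"),
    ("doorstop", "weight"),
    ("grind into pigment", "art material"),
]
print(divergent_thinking_scores(student, cohort, cohort_size=40))
```

In this example only the third idea falls under the rarity threshold, which shows why originality has to be judged against what other respondents produce rather than against the assessor's own sense of surprise.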

The most widely used and validated instrument for assessing creative potential is the Torrance Tests of Creative Thinking (TTCT), developed by E. Paul Torrance in the 1960s. Alabbasi, Paek, Kim, and Cramond (2022), in a comprehensive review published in Frontiers in Psychology (DOI: 10.3389/fpsyg.2022.1000385), describe the scope of the instrument:

“This review aims to offer school psychologists and other educators such as teachers, policymakers, and curriculum designers a comprehensive and practical guide to one of the most well-known creativity assessments — the Torrance Tests of Creative Thinking (TTCT) that was developed by E. Paul Torrance in the 1960s. The paper discusses the history, components, training, psychometric properties, and uses of the TTCT. Contrary to the notion that the TTCT is only a measure of divergent thinking skills, the current article presents its other uses.”

The same authors confirm the instrument’s robustness: the TTCT has demonstrated high reliability and validity over six decades, with validation in more than 2,000 studies worldwide, in 35 languages, including Spanish-language validations (Krumm et al., 2016).

However, the TTCT has important limitations that university educators should be aware of. As Xu and Tognolini (2022) note in a paper from the proceedings of the HEAd’22 conference at the Universitat Politècnica de València, psychometric creativity tests, including the TTCT, “are based on a norm-referenced assessment and can only provide minimal information about what students know and can do in relation to creativity.” In other words, the TTCT measures creative potential under artificial conditions and by comparing students to their cohort, but it does not show how that potential unfolds in real tasks within a disciplinary domain.

For the university educator, the TTCT can be a useful diagnostic tool at the start of a course — to understand the divergent thinking profile of students — but it is not the most appropriate instrument for formative or summative assessment of creative learning within a subject.
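The difference between a norm-referenced score and a standards-referenced one can be shown with a few lines of code. The sketch below is only illustrative: the function names, the cut scores, and the level labels are assumptions made for the example, not values taken from the TTCT or from Xu and Tognolini (2022).

```python
from bisect import bisect_right

def percentile_rank(score, cohort_scores):
    """Norm-referenced view: percentage of the cohort scoring at or below `score`."""
    at_or_below = sum(1 for s in cohort_scores if s <= score)
    return 100.0 * at_or_below / len(cohort_scores)

def standards_level(score, cut_scores, level_names):
    """Standards-referenced view: map `score` to a described level.

    cut_scores must be sorted ascending; a score equal to a cut score
    is placed in the higher level. level_names has one more entry
    than cut_scores.
    """
    return level_names[bisect_right(cut_scores, score)]

# Hypothetical cut scores and level labels for one task.
cuts = [10, 16]
levels = ["initial", "developing", "advanced"]

# The same raw score of 12 in two different cohorts.
print(percentile_rank(12, [8, 10, 12, 15, 20]))   # position within a weaker cohort
print(percentile_rank(12, [12, 14, 18, 20, 25]))  # position within a stronger cohort
print(standards_level(12, cuts, levels))          # level does not depend on the cohort
```

The percentile changes with the comparison group, while the level stays attached to a description of what the student can do, which is the kind of information a course-level assessment needs.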

Dimension 2 — Creative Process

The creative process refers to how the student works: how they frame the problem, how they generate and select ideas, how they prototype and refine their solution, how they learn from failure. Assessing the process means observing and documenting the journey, not just the destination.

This dimension is especially relevant in higher education because it directly reflects the competencies to be developed: divergent thinking, tolerance for ambiguity, the ability to reformulate problems, and a disposition toward iteration. A student may produce a mediocre final result due to external constraints (time, resources, information limitations) while having deployed a high-quality creative process. Assessing only the final product makes that process invisible.

The most appropriate tools for assessing the creative process are portfolios (documented collections of the work process over time), process journals (student records of their decisions, explorations, and learning), and process rubrics that specify criteria for each stage of the creative process.

Xu and Tognolini (2022) are explicit about where the emphasis should fall: “the priority of representing and measuring student creativity will be given to the creative process and the creative outcome.” The need to assess the process, and not just the result, is one of the most robust points of consensus in the research on creativity assessment in higher education.

Dimension 3 — Creative Product

The creative product is the tangible result of the process: the written work, the prototype, the design, the project, the solution. It is the dimension most commonly assessed in university classrooms, but also the one most frequently assessed without explicit creativity criteria — limited to technical or content-based criteria.

Assessing the creative dimension of a product requires specific criteria oriented toward novelty and appropriateness. In their research, Karunarathne and Calma (2024) used three dimensions drawn from the PISA framework for assessing creativity: creative expression, knowledge creation, and creative problem solving. The model offers a transdisciplinary structure applicable across different domains of university knowledge.

Analytic Rubrics: The Most Accessible Tool for Educators

For most university educators, the most practical tool for assessing creativity without falling into subjectivity is a well-designed analytic rubric.

An analytic rubric breaks creativity down into separately assessable dimensions and defines descriptors for different performance levels in each dimension. It does not eliminate the assessor’s judgment — no tool can do that — but it structures it, makes it explicit, and makes it communicable to students before they produce their work.

Jönsson and Panadero (2017), in their chapter on the design and use of rubrics to support assessment for learning, published in Scaling up Assessment for Learning in Higher Education (Springer), conclude that the reliability of assessing complex competencies can be improved through the use of rubrics, especially if they are analytic, domain-specific, and complemented with exemplars and rater training. This finding has a direct implication: a generic creativity rubric (one meant to work equally for design, writing, engineering, and social work) will have lower reliability than a rubric specifically designed for the domain and the concrete task.

For the university educator who wants to design their own creativity rubric, Xu and Tognolini (2022) propose a three-step process based on the standards-referenced assessment model:

Step 1 — Clearly define the construct: before designing the rubric, it is necessary to define what creativity means in the specific context of the subject. Is originality of generated ideas being assessed? The quality of the problem-reformulation process? The appropriateness of solutions to the real context? The definition guides everything else.

Step 2 — Identify the assessable dimensions: from the definition, the progress variables are identified — the specific dimensions that will be assessed. For a subject focused on creative thinking, these might include: originality of ideas (statistical novelty relative to the domain), appropriateness to the objective (usefulness in the given context), elaboration (level of development and detail), and flexibility (variety of perspectives explored).

Step 3 — Describe the performance levels: for each dimension, descriptors are written for different levels — from initial to advanced performance. Descriptors should be specific enough for different assessors to arrive at the same grade when evaluating the same work, but broad enough not to exclude unexpected creative expressions.

The Consensual Assessment Technique (CAT): When Greater Rigor Is Needed for Product Assessment

For contexts where assessing the creative product requires greater rigor — final projects, degree dissertations, jury-evaluated presentations — the most validated tool is the Consensual Assessment Technique (CAT), originally developed by Amabile (1982).

The CAT is based on a conceptually sound principle: the creativity of a product can be estimated reliably through the independent judgment of multiple domain experts, without those experts needing to reach prior agreement on criteria. Experts evaluate the creativity level of the product holistically (commonly on a 1–7 scale), and the average of their independent evaluations produces a sufficiently reliable score.

What makes the CAT particularly valuable for university assessment is that it does not require creativity to be reduced to a predetermined list of criteria. Domain experts — whether instructors, professionals, or academics — have expert judgment about what is original and appropriate in their field, and that collective judgment, when properly systematized, produces evaluations with high inter-rater consistency.
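The arithmetic behind the CAT is simple enough to sketch. The example below averages independent 1 to 7 ratings per product and estimates inter-judge consistency with Cronbach's alpha, treating each judge as an "item". Alpha is used here as one commonly reported consistency index, not as a prescribed step of the technique; the function names and the ratings are invented for illustration.

```python
from statistics import mean, variance

def cat_scores(ratings):
    """Average the independent judge ratings for each product.

    ratings: dict mapping product -> list of ratings, one per judge,
    with judges in the same order for every product.
    """
    return {product: mean(values) for product, values in ratings.items()}

def cronbach_alpha(ratings):
    """Inter-judge consistency, treating each judge as an item.

    Needs at least two judges and two products, and some variation
    in the per-product totals (otherwise the ratio is undefined).
    """
    rows = list(ratings.values())
    k = len(rows[0])  # number of judges
    # Variance of each judge's ratings across products.
    judge_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Variance of the summed ratings per product.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(judge_variances) / total_variance)

# Hypothetical data: four student projects rated by three domain experts.
panel = {
    "project_a": [6, 5, 6],
    "project_b": [3, 4, 3],
    "project_c": [5, 5, 4],
    "project_d": [2, 2, 3],
}
print(cat_scores(panel))
print(round(cronbach_alpha(panel), 3))
```

If the consistency value is low, the averaged scores should not be used for grading until the panel composition or the task instructions have been revisited.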

The limitations of the CAT are equally important. First: it requires multiple domain-experienced evaluators, which is costly in time and resources. Second: it does not provide specific feedback to the student about which aspects of their work are more or less creative — it says how much, but not what or why. Third: it is less suitable for formative assessment (during the learning process) and more suitable for summative assessment of finished products.

The PISA Framework as a Transdisciplinary Model

For educators seeking a creativity assessment framework applicable across multiple disciplines, the model developed by the OECD for creativity assessment in PISA offers a three-dimension structure that has been validated in real educational contexts.

Karunarathne and Calma (2024) used this model in their research with 150 first-year university students in economics and business at the University of Melbourne, structuring the assessment around three main axes:

Creative expression: the ability to express one’s own ideas in an original way within a given format — a text, a visual, a proposal. The originality of expression and autonomy relative to conventional models in the domain are assessed.

Knowledge creation: the ability to generate new knowledge or original perspectives from existing information. Innovative synthesis, cross-domain connections, and the generation of unconventional hypotheses or perspectives are assessed.

Creative problem solving: the ability to identify and formulate problems in an original way and to generate novel and appropriate solutions. This is the dimension that connects most directly with the Creative Problem Solving (CPS) models reviewed in earlier articles in this series.

The findings of Karunarathne and Calma are especially useful for pedagogical design: they found that first-year students had specific deficits in creative expression and creative problem solving, and that the authentic assessment task designed for the study produced measurable improvements in both dimensions over the course of the semester. This supports the view that well-designed assessment is not only a measurement tool: it is in itself a pedagogical intervention that develops creativity.

Five Common Mistakes in University Creativity Assessment

Research on creative assessment in higher education makes it possible to identify five mistakes that educators make most frequently.

Mistake 1 — Assessing only the final product, ignoring the process

A mediocre final product may be the result of a high-quality creative process interrupted by external circumstances. A brilliant final product may be the result of a conventional process well executed. Assessing exclusively the final product offers an incomplete and frequently unfair picture of the student’s creative capacities.

Mistake 2 — Using the same rubric to assess creativity and technical quality

Originality and technical quality are distinct criteria that are not necessarily correlated. A piece of work can be technically impeccable and completely conventional. A piece of work can be highly original and technically imperfect. A rubric that mixes both criteria without distinguishing them produces assessments that do not measure creativity in a differentiated way.

Mistake 3 — Penalizing originality that does not match the educator’s expectations

One of the most consistent findings in creativity assessment research is that evaluators tend to prefer ideas that confirm their prior assumptions about what a “good solution” looks like. This bias — which Jönsson and Panadero (2017) identify as one of the main risks of assessment without a rubric — can cause “creative” assessment to end up rewarding conformity with the educator’s expectations rather than genuine originality.

Mistake 4 — Not communicating assessment criteria to students before the work

If students do not know in advance the criteria by which their work’s creativity will be assessed, they cannot direct their efforts in an informed way. Worse still: they may interpret the absence of explicit criteria as a signal that what matters is “impressing” the educator, which generates aesthetically elaborate production that is not necessarily original. Explicit criteria shared before the work begins are a necessary condition for fair creativity assessment.

Mistake 5 — Assessing creativity with a single evaluator and no prior calibration

The reliability of creativity assessments improves significantly when multiple evaluators assess independently before comparing results, when evaluators are trained with examples of the expected performance level (exemplars), and when there is a discussion and calibration process among evaluators. Assessing creativity alone, without any of these processes, produces assessments more susceptible to the individual bias of the assessor.
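A calibration session can include a quick numerical check of how far two evaluators agree once each has scored the same set of works independently. The sketch below uses Cohen's kappa, a standard chance-corrected agreement statistic that the sources cited here do not specifically prescribe; the function name and the sample ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same works.

    rater_a, rater_b: equal-length lists of rubric levels, aligned by work.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of works")
    n = len(rater_a)
    # Proportion of works on which the raters chose the same level.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's level frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical level throughout
    return (observed - expected) / (1 - expected)

# Hypothetical independent ratings of eight assignments on one rubric dimension.
first = ["basic", "intermediate", "advanced", "intermediate",
         "advanced", "basic", "intermediate", "advanced"]
second = ["basic", "intermediate", "intermediate", "intermediate",
          "advanced", "basic", "advanced", "advanced"]
print(round(cohens_kappa(first, second), 3))
```

The works on which the two ratings differ are the natural starting point for the calibration discussion, since they show where the descriptors are being read differently.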

A Minimal Rubric Proposal for the University Classroom

Drawing on the evidence reviewed, it is possible to propose a minimal rubric structure for assessing creativity in university assignments across different disciplines. This proposal integrates Guilford’s (1967) dimensions, the PISA framework used by Karunarathne and Calma (2024), and the rubric design principles of Xu and Tognolini (2022).

Dimension 1 — Originality: To what extent does the idea, solution, or perspective presented differ from the conventional responses in the domain? Basic level: reproduces standard perspectives without modification. Intermediate level: introduces variations on existing perspectives. Advanced level: proposes genuinely new perspectives for the domain or context.

Dimension 2 — Appropriateness: To what extent does the idea or solution respond relevantly to the problem or challenge posed? Basic level: the solution is tangential or partially relevant. Intermediate level: the solution is pertinent but with significant limitations. Advanced level: the solution responds directly and effectively to the challenge posed.

Dimension 3 — Elaboration: To what extent is the idea or solution developed, specified, and argued? Basic level: the idea is sketched without development. Intermediate level: the idea is developed with some arguments or details. Advanced level: the idea is fully developed, specified, and grounded in evidence or reasoning.

Dimension 4 — Process (when assessed): To what extent does the documented process show exploration of multiple alternatives, problem reformulation, and learning from errors? Basic level: the process shows a single line of development with no exploration of alternatives. Intermediate level: the process shows exploration of some alternatives with implicit selection criteria. Advanced level: the process shows broad exploration, explicit problem reformulation, and clear selection criteria based on feedback received.

This structure should be adapted to the specific domain and the concrete task, incorporating exemplars — examples of work at each level — to calibrate assessor judgment and communicate it clearly to students.
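For educators who keep grades in a spreadsheet or a script, the four-dimension proposal above can be written down as a small data structure. The sketch below is one possible encoding, assuming Python; the descriptors are shortened from the text above, while the names RUBRIC, LEVELS, and feedback are invented for the example. It deliberately returns a profile of dimensions rather than a single total, so that originality is not averaged away by the other criteria.

```python
LEVELS = ("basic", "intermediate", "advanced")

# Descriptors condensed from the minimal rubric proposed above.
RUBRIC = {
    "originality": {
        "basic": "reproduces standard perspectives without modification",
        "intermediate": "introduces variations on existing perspectives",
        "advanced": "proposes genuinely new perspectives for the domain or context",
    },
    "appropriateness": {
        "basic": "the solution is tangential or partially relevant",
        "intermediate": "the solution is pertinent but with significant limitations",
        "advanced": "the solution responds directly and effectively to the challenge",
    },
    "elaboration": {
        "basic": "the idea is sketched without development",
        "intermediate": "the idea is developed with some arguments or details",
        "advanced": "the idea is fully developed and grounded in evidence or reasoning",
    },
    "process": {
        "basic": "a single line of development with no exploration of alternatives",
        "intermediate": "some alternatives explored with implicit selection criteria",
        "advanced": "broad exploration, explicit reformulation, clear selection criteria",
    },
}

def feedback(assigned_levels):
    """Turn the levels an assessor assigned into one feedback line per dimension.

    assigned_levels: dict mapping dimension -> level. Dimensions that were
    not assessed (for example, process) can simply be left out.
    """
    unknown = set(assigned_levels) - set(RUBRIC)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    lines = []
    for dimension, descriptors in RUBRIC.items():
        if dimension not in assigned_levels:
            continue  # dimension not assessed for this assignment
        level = assigned_levels[dimension]
        if level not in LEVELS:
            raise ValueError(f"unknown level for {dimension}: {level}")
        lines.append(f"{dimension.capitalize()} ({level}): {descriptors[level]}.")
    return lines

for line in feedback({"originality": "advanced", "appropriateness": "intermediate"}):
    print(line)
```

Because the descriptors live in one place, the same structure can be shared with students before the assignment and reused to generate their feedback afterwards.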

Formative Assessment as Creative Development

One of the most important conclusions of the research reviewed is that well-designed assessment is not only a measurement of students’ creative level: it is in itself an intervention that develops their creativity.

Karunarathne and Calma (2024) document this in their study: students who participated in the authentic assessment task designed to measure the three PISA framework dimensions showed measurable improvements in creative thinking over the course of the semester. The structure of the task — which required creative expression, knowledge creation, and creative problem solving — acted as scaffolding for the development of those competencies.

This has a direct pedagogical implication: designing creativity assessment with clear, shared criteria does not only allow for better measurement — it also teaches students what it means to think creatively within the context of their discipline. The rubric, in this sense, is not merely a grading tool: it is a map of the creative territory the student is learning to navigate.

Conclusion: Subjectivity Is Not Inevitable

Assessing creativity in the university classroom does not have to be a territory of arbitrary subjectivity. The theoretical frameworks of Guilford and Torrance, standards-referenced analytic rubrics, the Consensual Assessment Technique, and the PISA framework offer concrete, validated, and applicable tools for assessing creativity rigorously, fairly, and formatively.

What research consistently shows is that the key is not to eliminate the assessor’s judgment — that is neither possible nor desirable — but to structure it through explicit criteria, communicate it before the work begins, and calibrate it through training and exemplification processes. An educator who does this is assessing their students’ creativity with the same rigor they apply to any other complex competency.

And at the same time, they are teaching their students what it means to create with intention.

References

Alabbasi, A. M. A., Paek, S. H., Kim, D., & Cramond, B. (2022). What do educators need to know about the Torrance Tests of Creative Thinking: A comprehensive review. Frontiers in Psychology, 13, 1000385. https://doi.org/10.3389/fpsyg.2022.1000385

Amabile, T. M. (2012). Componential theory of creativity (Working Paper No. 12-096). Harvard Business School. https://www.hbs.edu/faculty/Pages/item.aspx?num=42469

Guilford, J. P. (1967). The nature of human intelligence. McGraw-Hill.

Jönsson, A., & Panadero, E. (2017). The use and design of rubrics to support assessment for learning. In D. Carless, S. M. Bridges, C. K. Y. Chan, & R. Glofcheski (Eds.), Scaling up assessment for learning in higher education (Vol. 5, pp. 99–111). Springer.

Karunarathne, W., & Calma, A. (2024). Assessing creative thinking skills in higher education: Deficits and improvements. Studies in Higher Education, 49(1), 157–177. https://doi.org/10.1080/03075079.2023.2225532

Xu, W., & Tognolini, J. (2022). Build an assessment rubric of student creativity in higher education. In 8th International Conference on Higher Education Advances (HEAd’22). Universitat Politècnica de València. https://doi.org/10.4995/HEAd22.2022.14695