Conversational Assistant

Client: Renaissance Learning
Role: User Researcher, Design Lead
Duration: 7 months

*Note: One of the major constraints from our client was that we could not redesign the current Goal-Setting Wizard UI in any way. Our solution therefore builds on what already exists today.

The Problem

Renaissance Learning’s standardized-testing Goal-Setting Wizard (GSW) was being used by only 5% of Renaissance’s customers, American K-12 teachers. My team and I conducted user research with teachers across three states to understand why and to identify opportunities for product enhancement. We designed, prototyped, and built a functional MVP encompassing a conversational assistant and an online professional development experience that taught and supported K-12 teachers while they set goals. This MVP showed improvements in teachers’ understanding of how to use test data to set personalized goals and in the usability of the GSW’s interface. As Design Lead, I was responsible for the UX, UI, and interaction design of our final deliverable suite. I also took a leadership role in guiding our team’s prototyping process through the low-, mid-, and high-fidelity stages.

Design Features

The Bot

We built a conversational agent or “bot” to support teachers in real-time with suggestions and targeted feedback on goal-setting decisions. Key design features include:

Scaffolds users. The bot teaches a consistent, step-by-step process for selecting a personalized student goal and understanding the underlying data. After the first couple of uses, a teacher learns the goal-setting process and can turn off the bot; the next time she needs to set goals, if she has forgotten best practices, she can turn the bot on again. Users are not left to articulate goals from scratch. Instead, to reduce cognitive load, the system provides suggested goal options, and users can create a custom goal if they so choose.

Lets users take the lead. The bot never restricts user actions, and its conversational tone is polite and suggestive so that users always feel in control. On first use of the Goal-Setting Wizard, the bot defaults to full-support mode, but at any time a user can easily choose to “minimize” the bot.

Calm technology. Even in full-support mode, the bot occupies a screen location that does not compete for primary attention. Users can focus on the original Goal-Setting Wizard interface and are never impeded by the bot. One of Renaissance’s priorities was ensuring that teachers learn to set goals and analyze testing data properly, so the bot never goes away entirely; in minimized mode, it lets users know that a best-practice suggestion is available without distracting from or interrupting their workflow.

While the agent looks and feels like a chatbot, our final deliverable to Renaissance did not leverage AI due to constraints.
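Because of that constraint, the agent’s behavior amounts to deterministic rules keyed to the teacher’s current step in the GSW and her chosen support mode. Below is a minimal TypeScript sketch of that idea; the mode names, steps, and messages are illustrative assumptions, not the shipped code.

```typescript
// Hypothetical sketch of a rule-based goal-setting assistant.
// Mode names, steps, and messages are illustrative, not the shipped implementation.

type SupportMode = "full" | "minimized" | "off";

interface Suggestion {
  step: string;    // GSW step the suggestion applies to, e.g. "select-goal"
  message: string; // canned best-practice guidance shown in the bot panel
}

const suggestions: Suggestion[] = [
  { step: "review-data", message: "Compare this score to the student's earlier tests before choosing a goal." },
  { step: "select-goal", message: "Check that the suggested goal is both achievable and appropriately challenging." },
];

function botMessage(step: string, mode: SupportMode): string | null {
  if (mode === "off") return null; // the teacher has turned the bot off
  const match = suggestions.find((s) => s.step === step);
  if (!match) return null;
  // In minimized mode the bot only signals that a suggestion exists,
  // keeping the teacher's primary attention on the original GSW screen.
  return mode === "minimized" ? "A best-practice suggestion is available." : match.message;
}

console.log(botMessage("select-goal", "minimized")); // "A best-practice suggestion is available."
```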

The Tutorial

Because the bot was designed to provide the minimum necessary level of support for teachers to set goals, we also built a tutorial that dives into the explanations and reasoning behind the goal-setting process. Key design features include:

Worked example. A worked example is a step-by-step demonstration of how to solve a problem. In this case, the worked example in the tutorial demonstrates the process of setting personalized goals for students using testing data, making the thinking and decision-making explicit. The training is rooted in example students, mirroring the skills users would apply in a real-life case.

Data visualizations. The tutorial leverages interactive data visualizations that make goal comparison and comprehension easy.

Learner control. Users have full visibility over progress in the lesson. They can control their own pacing.
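To make the learner-control idea concrete, here is a small TypeScript sketch of a lesson with visible progress and free navigation between steps; the step titles and structure are assumptions for illustration, not the actual tutorial content.

```typescript
// Hypothetical sketch of the tutorial's learner-controlled pacing.
// Step titles are illustrative; the real lesson walked through an example student.

interface TutorialStep {
  title: string;
  explanation: string; // the explicit reasoning shown at this step
}

const steps: TutorialStep[] = [
  { title: "Read the test data", explanation: "Review the example student's past scores and growth." },
  { title: "Compare goal options", explanation: "Weigh achievability against challenge for each suggested goal." },
  { title: "Justify the chosen goal", explanation: "Explain why this goal fits the student's data." },
];

class Tutorial {
  private index = 0;
  constructor(private readonly lesson: TutorialStep[]) {}

  // Learner control: users move forward or back at their own pace.
  goTo(step: number): TutorialStep {
    this.index = Math.min(Math.max(step, 0), this.lesson.length - 1);
    return this.lesson[this.index];
  }

  // Full visibility over progress in the lesson.
  progress(): string {
    return `Step ${this.index + 1} of ${this.lesson.length}`;
  }
}

const lesson = new Tutorial(steps);
lesson.goTo(1);
console.log(lesson.progress()); // "Step 2 of 3"
```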

01. Research

Pretest

We asked K-12 teachers to think aloud as they tried to use the original Goal Setting Wizard. Below is a summary of teachers’ common misconceptions:

Moderated Interviews

How do goal-setting and growth measurements impact teachers’ work in the classroom? How do teachers currently set goals for students, and how does this differ from Renaissance’s recommended best practices, which are grounded in learning psychometrics? To answer these research questions, we conducted moderated interviews with 14 classroom teachers, 6 intervention teachers, and 6 administrators across grade levels, three states, and specialties in math, reading, and general education. To synthesize our qualitative data, we used affinity mapping and coding techniques.

Building models allowed us to uncover patterns in user behavior. Vittore used three types of models: cultural models, information flow models, and sequence flow models. The cultural model helped us understand the value points, expectations, and influences among all stakeholders, including administrators, teachers, and students. The information flow model helped us visualize what types of information are conveyed between stakeholders. Sequence flow models took a detailed look at the workflow of an intervention specialist, noting the work processes and interactions between roles and tools. This model is particularly illustrative when compared with the best-practice sequence flow model provided by Renaissance for administrators, teachers, and students.

Personas are models of the users whom we interviewed in our research (Cooper, 1995). They are compiled from the behaviors, motivations, and needs of the many users we spoke with and serve to inform the product design. By assembling and concretizing user patterns into particular personas, the team has a precise way to measure and communicate how well a design concept meets the needs of a concrete type of user.

Coding is a process that categorizes quotes and notes from raw interview data into relevant and informative codes determined by the protocol and guiding research questions. Some examples of our codes include: GS (Goal Setting), IoR (Interpretation of Reports), and CwS (Communication with Students).
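As a rough illustration of how coded notes might be organized (the note text below is a placeholder, not actual interview data):

```typescript
// Hypothetical sketch of organizing coded interview notes.
// The note text is a placeholder; real entries came from the moderated interviews.

type Code = "GS" | "IoR" | "CwS"; // Goal Setting, Interpretation of Reports, Communication with Students

const codedNotes: Record<Code, string[]> = {
  GS:  ["<note about how a teacher currently sets goals>"],
  IoR: ["<note about reading a STAR report>"],
  CwS: ["<note about sharing goals with a student>"],
};

// Counting notes per code gives a quick sense of how often each theme appeared.
for (const [code, notes] of Object.entries(codedNotes)) {
  console.log(`${code}: ${notes.length} note(s)`);
}
```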

Affinity diagramming involves grouping raw data into summative, hierarchical thematic and insight categories. This is a bottom-up process in which themes and insights emerged organically as we sifted through our interview notes. Our team also used a theoretical framework informed by Ambrose et al. (2010) and Wiggins and McTighe (2005), which allowed us to consider how these categories fit into dimensions of instructional design including educational goals, instruction, and assessment.

Key Insights

Data visualizations about student performance levels are useful for teachers when deciding which students need support.

Teachers are interested in test validity and consider other sources to augment test data.

Goals reflect desired (not expected) student performance.

Teachers want to implement instruction at appropriate levels of difficulty for specific students and school groups so that all students can better their skills.

A number is not enough, but it helps. Teachers want measurable and actionable data.

Teachers want growth, but they don’t use a standardized reference metric against which to compare student test scores.

Teachers want to communicate goals to students in a clear and motivating way.

Limitations to using Student Growth Percentile (SGP) include distrust in the validity of the metric, difficulty in understanding the measurement logic, and perceived mismatch between SGP, desired goals, and realistic goals.

Teachers want to administer assessments in frequencies and durations that will maximize feedback with the smallest loss to instructional time.

Teachers distinguish performance measurements from growth measurements to make assessments more equitable across students.

02. Decision

Working with Renaissance stakeholders, we held a design workshop in which we shared our research and selected the solution ideas we would iterate on. Below is the pitch that most informed our final design.

03. Iterations

We went through seven rounds of prototyping. Each round focused on a different aspect that would make its way into our final design. Our team tested each round with users and underwent rapid design-research cycles. Below is a selection of some of the most informative iterations towards our final product.

Types of Alerts

What types of alerts are most useful to improving data usage in goal-setting?

We created a mid-fidelity prototype of the GSW experience with a conversational agent providing messages to teachers as they set goals. The goal was to see which types of messages offered by the agent would be helpful to teachers. In our design research session, we asked participants to set goals for five hypothetical students using our prototype. The first student served as a pretest, with no conversational agent present. For the next three students, the agent offered assistance messages, including definitions of terms, guides for best practices, and dummy references for how to incorporate other data sources; participants were asked to accept or dismiss each of these suggestions as they set goals. As a post-test, participants set a goal for the final student without guidance. We then asked participants to discuss which visualizations and content delivery methods best met their needs and how the agent fit into their workflow.

From this design research round, we learned that complete sample data is needed. Participants relied on having complete data for a student in order to simulate a realistic goal-setting process, and the incomplete dummy data we provided for each student confused teachers. For example, without knowing how a student’s learning rate compared to that of their academic peers, teachers were unable to evaluate whether the student might be capable of achieving a higher goal.

We also learned that skill-based concepts require continuous support. Participants quickly gained confidence with declarative concepts but required continued support with skill-based ones. Messages about declarative content such as definitions (e.g., “What is SGP?”) were only considered helpful the first time participants used the interface. Because participants often assign goals to an entire class at one time, they found that definitions can be helpful reminders but become disruptive as they set several goals in a row. Because of this finding, our final design allows users to opt out of receiving support for low-complexity content such as definitions, a choice supported by past learning science research showing that learners with high prior knowledge are not harmed by environments with high learner control.
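A hedged sketch of how that opt-out could work, assuming hypothetical message categories rather than the actual implementation:

```typescript
// Hypothetical sketch of opting out of low-complexity (definitional) support
// while keeping skill-based guidance. Categories and names are assumptions.

type MessageKind = "definition" | "best-practice" | "data-source";

interface AgentMessage {
  kind: MessageKind;
  text: string;
}

const queue: AgentMessage[] = [
  { kind: "definition", text: "SGP stands for Student Growth Percentile." },
  { kind: "best-practice", text: "Balance achievability and challenge when picking a goal." },
  { kind: "data-source", text: "Consider classroom observations alongside the test data." },
];

// Once a teacher opts out of definitions, only skill-based support is shown,
// so reminders do not become disruptive while she sets goals for a whole class.
function visibleMessages(messages: AgentMessage[], skipDefinitions: boolean): AgentMessage[] {
  return messages.filter((m) => !(skipDefinitions && m.kind === "definition"));
}

console.log(visibleMessages(queue, true).map((m) => m.kind)); // ["best-practice", "data-source"]
```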

Worked Example

If we use a worked example to pre-train teachers about best practices, are they able to apply these when setting goals for a student?

To teach deeper understanding of goal-setting concepts, best practices, and data interpretation, we wanted to build a worked example to complement the conversational agent. A worked example is a step-by-step demonstration of how to perform a task or solve a problem. At this stage, our team built a mid-fidelity prototype of a worked example on setting a goal for a student using their past STAR testing data. In the design research session, participants first went through the worked example, which explained definitions of relevant metrics and best practices for making decisions about goals and interventions, including intervention names, duration, and goal rigor. Then, participants set goals for two students with different growth rates based on their previous testing data. After each goal, participants were asked to justify it so that we could understand how they used metrics in their decision. From this round of design research, we learned:

Color coding helped participants make immediate judgements about a student’s performance. Teachers relied on color coding representing good, bad, and bubble cases to help them make decisions.

Too many details about complex practices overwhelmed participants. The worked example in this round included full details on the considerations a teacher should make when setting a goal. In our final design, we instead offered short explanatory messages hitting key points, accompanied by a “Learn more” button that led to the research-based rationale.

Teachers tend to select goals that balance the likelihood of achieving the goal with an appropriate level of challenge. Visualizations that made those trade-offs explicit and depicted benchmarks to aid comparison were incorporated into the next iteration of the prototype.

Goal Visualizations

What visualizations of data best encourage data-based decisions about goals?

In this round, we designed and tested which versions of goal visualizations most clearly communicated a student’s progress and helped teachers select goals that balanced achievability and challenge for a student.

We learned that visualizing the impact of different goal selections helps comparison. Trend-line visualizations of the projected student growth associated with different goals were helpful: plotted on a line graph, they helped teachers compare the goals directly. Refined versions made it into our final design.

We also learned that visuals supported by text explaining the progress projections were helpful; teachers used both the data visualizations and the explanatory text. In our final design, we created interactions to highlight a particular goal line and reveal its associated explanatory text.
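As a rough sketch of the data such a comparison view might sit on (the labels, scores, and explanations below are illustrative assumptions, not real STAR projections):

```typescript
// Hypothetical data shape behind the goal-comparison view. Labels, scores,
// and explanations are illustrative only, not actual STAR projections.

interface GoalProjection {
  label: string;             // e.g. "Ambitious growth"
  projectedScores: number[]; // one projected score per future testing window
  explanation: string;       // text revealed when this goal's line is highlighted
}

const projections: GoalProjection[] = [
  { label: "Typical growth", projectedScores: [520, 540, 560], explanation: "Matches the growth of academic peers." },
  { label: "Ambitious growth", projectedScores: [520, 555, 590], explanation: "Challenging, but plausible given past gains." },
];

// Highlighting a goal line surfaces its explanatory text alongside the graph.
function highlightExplanation(goals: GoalProjection[], label: string): string | undefined {
  return goals.find((g) => g.label === label)?.explanation;
}

console.log(highlightExplanation(projections, "Ambitious growth"));
```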

Personalized Workflows

With the agent on/off toggle and the visual design in place, is our prototype usable, and does it provide an optimal experience?

In this round, we built a high-fidelity prototype that incorporated all the moving pieces: the worked example and conversational agent being jointly available, contextual linking to Renaissance’s resource community, and the ability for each user to turn the agent on/off at different points to customize their own assisted experience. We also showed participants visually styled static screens of the bot interface and solicited feedback. Our participants responded positively to this prototype. As one participant said, “The ‘R’ looks more program-friendly and not cheesy. I like the way it looks. It’s very clean. It didn’t seem to be interfering, it seemed to be helpful.” If we were to continue building this experience, we learned that a next step could be:

Participants expected additional resources to be curated. When participants clicked on “Learn More” links contextually embedded in the chatbot interface, they expected the precise snippet of a resource directly relevant to what they were experiencing in the GSW at that moment to surface.

Post-Test

Before handing our design and code to Renaissance, we conducted a Product Validation session to gauge whether our product alleviated the misconceptions identified in the pretest. Improvements are represented in the image below.

We traveled to a school where a teaching team of current Renaissance users participated. Participants were divided into groups based on their role at the school (administrators vs. teachers). They first followed the worked example; afterward, they set and justified goals for two mock students with support from the conversational agent. During this process, participants were asked role-specific questions about how the data visualizations and agent matched or conflicted with practices in their classroom or school: “Did the training match what you already knew about setting goals for students? Did you learn anything new?”

Participants then completed a questionnaire on the experience of using the agent to set goals for students. The questionnaire covered product utility (e.g., “How valuable did you find the information presented by the product compared to the current system?”), questions to probe data use in goal-setting (e.g., “How do you define an appropriate goal for a student?” and “How confident do you feel that the goals you set are an appropriate level of challenge for this student?”), and overall product recommendations (e.g., “What changes would most improve the product?”).
